Interesting debugging: Specific packets cause some Intel gigabit ethernet controllers to reset

Over the years I've read some interesting (horrifying?) tales of
debugging on NANOG. It seems I finally have my own to contribute:

The strangest issue I've experienced, that's for sure.

Wow, you just solved my issue with my firewall.

FWIW, I had a similar situation crop up a couple of years ago with *five
different* Seagate SATA drives: they grew some specific type of bad spot
on the drive which, if you even tried to read it, would *knock the drive
adapter off line until powercycle*; even a reboot didn't clear it.

Nice writeup.

-- jra

In a similar vein, here's some fun reading:


Doesn't that environment have out-of-band framing, which can't be duplicated
by the data inside a framed packet?

-- jra

I have come to believe the Intel 82574L is the worst Ethernet chip in the universe. We had horrible issues with it (random bursts of dropped packets showing in ifconfig). We ended up simply putting a card based on a different chip into our systems and all our issues went away.


    Yes, I had that issue. It was a firmware problem... and a timed one
too :(

    We had a customer with a few RAID 5 arrays of 3 drives each; once one
drive went bad, he had about 20 minutes before another drive would.

    And they were bricked, btw; you couldn't just upload the new firmware.

    Wasn't a happy weekend.

Update with a response to the statement from Intel:

I just want you to know that this was the best piece of technical debugging I've read in years. Absolutely awesome. Thank you so much for sharing what I can only imagine was an endless series of nightmares.

I've done debugging like this before and I can only say: I feel your pain, and I wish I had documented my previous efforts. Great writing, sir.


Joshua Goldbard
VP of Marketing, 2600hz

116 Natoma Street, Floor 2
San Francisco, CA, 94104