Extreme + Nachi = ipfdb overflow

After battling Nachi and its flood of ICMP traffic, I've discovered
that it's not the Cisco gear that gets hit hard by it, it's the
Extreme gear. Nachi generates enough 'random' traffic to flood and
subsequently thrash the IP forwarding DB on the Summit 1i we were
using, badly enough to drop it from gigabit-capable to barely eking
out 6Mb/sec. Before I redeploy the switch, I need to find a way to
keep the ipfdb from flooding while still allowing it to be the primary
carrier of traffic. ACLs blocking ICMP on the Extreme act too late: by
the time the CPU sees the packet to drop it, it has already horned its
way into the ipfdb. Does anyone have any suggestions on ways to allow
the switch to participate as an L3 router while minimizing the chances
of a worm taking it out so easily again?

Joshua Coombs
GWI Networking
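
To make the failure mode concrete, here is a toy Python model of a
demand-populated forwarding cache under a random-destination flood.
The table size and traffic mix are invented numbers, not anything
measured on a Summit 1i; the point is only how quickly random
destinations evict legitimate entries and push traffic onto the slow
path:

# Toy model of a demand-populated forwarding cache ("ipfdb"-style)
# under a random-destination flood. Table size and traffic mix are
# invented, not measurements from a Summit 1i.
import random
from collections import OrderedDict

CACHE_SLOTS = 4096                      # assumed hardware table size
LEGIT_DESTS = ["10.0.%d.%d" % (i // 256, i % 256) for i in range(2000)]

def hit_rate(worm_fraction, packets=200000):
    cache = OrderedDict()               # LRU stand-in for the forwarding DB
    hits = 0
    for _ in range(packets):
        if random.random() < worm_fraction:
            # worm probe: effectively a random destination address
            dst = "%d.%d.%d.%d" % (random.randrange(1, 224),
                                   random.randrange(256),
                                   random.randrange(256),
                                   random.randrange(256))
        else:
            dst = random.choice(LEGIT_DESTS)
        if dst in cache:
            cache.move_to_end(dst)
            hits += 1                   # forwarded in hardware
        else:
            cache[dst] = True           # miss: punt to CPU, install entry
            if len(cache) > CACHE_SLOTS:
                cache.popitem(last=False)   # evict, often a legit flow
    return hits / float(packets)

for frac in (0.0, 0.5, 0.9):
    print("worm traffic %3d%%: hardware hit rate %.1f%%"
          % (frac * 100, hit_rate(frac) * 100))

As the worm fraction climbs, the hardware hit rate collapses and
nearly every packet takes the slow path, which is roughly the shape of
the gigabit-to-6Mb/sec cliff described above.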

This affects most layer 3 switches, including Extreme, Foundry, and
anyone else who still can't figure out how to pre-generate a FIB
instead of relying on a fast-cache style system.
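
For contrast, a pre-generated FIB looks roughly like the sketch below:
every prefix in the routing table is programmed before any traffic
arrives, and a lookup is a longest-prefix match against that fixed
structure, so random destinations can neither add nor evict anything.
This is only a minimal Python illustration with a made-up three-route
table, not how CEF or any vendor's hardware actually stores it:

# Sketch of the pre-generated FIB alternative: every installed prefix
# is programmed up front, so a lookup is a longest-prefix match against
# a fixed structure and random destinations can never add or evict
# entries. A Python list and linear scan stand in for hardware
# tries/TCAM.
import ipaddress

ROUTES = {                              # hypothetical routing table (RIB)
    "0.0.0.0/0":    "upstream",
    "10.0.0.0/16":  "vlan-cust",
    "192.0.2.0/24": "vlan-dmz",
}

# Build the FIB once, from the RIB, before any packet arrives.
FIB = sorted(((ipaddress.ip_network(p), nh) for p, nh in ROUTES.items()),
             key=lambda entry: entry[0].prefixlen, reverse=True)

def lookup(dst):
    addr = ipaddress.ip_address(dst)
    for net, nexthop in FIB:            # most-specific prefix wins
        if addr in net:
            return nexthop
    return None                         # only possible with no default route

print(lookup("10.0.3.7"))               # vlan-cust
print(lookup("203.0.113.9"))            # upstream, via the default route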

It amazes me that people still have not learned this lesson. How old is
CEF now? Then again, I suppose most of these boxes are being marketed
to enterprises anyway. As long as there's a label that says "60Gbps",
the box looks good, and it's relatively cheap, how many of their
customers are really going to notice 6Mbps first-packet performance
before they buy, right?

At least some of the other vendors have workarounds (lame as they might be
*coughnetaggcough*), or newer supervisors with FIBs, but I'm not aware of
anything you can do to make an L3 Barney Switch behave well under a random
dest flood.

I believe the old IBM "routers" used for the NSFnet
implemented fully distributed routing tables in each line card.
At that time, the commercial router vendors were still faulting
routes in to the line cards (or central accelerator cards)
on demand.

I think the good folks at Merit, Watson and ANS
were some of the early advocates of fully distributed tables,
due in part to their analysis of samples of real-world
backbone traffic.

Cisco 65xx gear suffers the same problem. SQL Slammer infected 3
neighboring customers in a colo space we use. The 6509 (used for
aggregation in that colo) dropped 10% or more of our packets, though
we were not infected. So much for claims from both of these vendors
about "wire speed" forwarding.

When testing switch gear, I think it's time to update Scott Bradner's test suites to use random source and destination IP addresses, so we can find out the true limits of the equipment.
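
A minimal sketch of that kind of test stream, using Scapy to randomize
both addresses on every packet; the interface name and packet count
are placeholders, and a Python script obviously can't push line rate
against a gigabit switch, so real testing still needs a hardware
generator:

# Sketch of a random source/destination test stream using Scapy.
# The interface name and packet count are placeholders; real RFC
# 2544-style testing needs a hardware traffic generator.
from scapy.all import Ether, IP, UDP, RandIP, RandShort, sendp

IFACE = "eth0"                          # placeholder interface
COUNT = 10000                           # placeholder packet count

# One packet per random (src, dst) pair; each packet gets its own
# Rand* fields, resolved when the packet is built.
pkts = [Ether() /
        IP(src=RandIP(), dst=RandIP()) /
        UDP(sport=RandShort(), dport=RandShort()) /
        (b"x" * 18)
        for _ in range(COUNT)]

sendp(pkts, iface=IFACE, verbose=False)  # needs root to put frames on the wire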

The options in the market that I know of in the $3k-$8k range either
have a very small routing table (the Cisco 3550, for instance) or a
large route cache (the Extreme Summit i-platform is a good example).
So it's either a 3550, with a lame low number of routes and MAC
addresses and memory, that behaves well under a random destination
flood, or it's Extreme, with a good number of MAC addresses and routes,
that normally does everything it should but behaves badly under random
destination load.