DOS attack tracing

Hi,

We recently experienced several DOS attacks which drove our backbone routers'
CPU to 100%. The routers themselves were not the target, but they just couldn't
handle the traffic. There is a plan to upgrade these routers. One criterion
is the ability to track which IP address is under attack and blackhole the
traffic quickly. Can anyone share their experience of what kind of router is
capable of doing this?

Also, besides having a powerful router which can handle a large volume of
traffic, are there any other things that we need to consider in selecting the
routers?

Thanks,
Richard

I recently wrecked my car, totaling it and running down several small
children on their way to Sunday school in the process. I plan to upgrade
my car, and one of the criteria is that it not crash and kill people. Can
you share advice on what car is capable of doing this?

This example is about as descriptive and useful at solving the problem as
your original post. Without any details it is impossible to make any
useful recommendation even if we wanted to. What type and scale of DoS are
you trying to protect against, what type and scale of traffic are you
routing, what kind of interfaces and how many, basic things like that.
Without details, the best that you're likely to get (now that Dean is gone
:P) is something akin to "go buy a volvo", namely "go buy a Juniper".

: We recently experienced several DOS attacks which drove our backbone routers
: CPU to 100%. The routers are not under attack, but the router just couldn't
: handle the traffic. There is a plan to upgrade these routers. One criteria
: is the ability to track which IP address is under attack and blackhole the
: traffic quickly. Anyone can share your experience of what kind of router is
: capable of doing this?
:
: Also besides having a powerful router which can handle large volume of
: traffic, is there any other things that we need to consider in selecting the
: routers?

You shouldn't buy a bigger router just to handle DOS attacks. There are
many ways to address these types of issues using routers and/or servers.
What is your normal CPU usage when there is no DOS attack? What does your
capacity look like on the router interface where the DOS is coming in?
We need more info.

scott

> We recently experienced several DOS attacks which drove our backbone
> routers CPU to 100%. The routers are not under attack, but the
> router just couldn't handle the traffic. There is a plan to upgrade
> these routers.

What kind of routers? We had problems like this with Cisco 7206VXRs
with NPE-300s at my last job because they just couldn't handle the
high volume of packets-per-second from certain types of attack.

> One criteria is the ability to track which IP address is under
> attack and blackhole the traffic quickly. Anyone can share your
> experience of what kind of router is capable of doing this?

Disclaimer: I'm not an expert on this stuff, and it's possible
(likely) that others on the list may have some other and / or better
suggestions.

Generally, I've seen this done by exporting flow data to another box,
and then analyzing this data. I've used ehnt (extremely happy netflow
tool) (http://ehnt.sourceforge.net/) to capture the flow data and
export it to an easily machine-parsable feed, then used a Perl script
to capture information on the top source / destination addresses. If
there's interest, I could see whether it's possible to get this code
and put it up somewhere (on an as-is basis) - the code was written by
Kenytt Avery at Willing Minds (willingminds.com).
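
The original Perl script is not reproduced in this thread. As a rough illustration of the "top destination addresses" step only, here is a minimal Python sketch. The comma-separated record layout (source, destination, packets, bytes) and the function name top_destinations are assumptions made for this example; they are not ehnt's actual output format.

```python
from collections import Counter


def top_destinations(lines, n=5):
    """Sum packet counts per destination address and return the n largest.

    Each record is assumed (hypothetically) to look like:
        src_ip,dst_ip,packets,bytes
    """
    packets = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(",")
        if len(fields) != 4:
            continue  # skip malformed records
        _src, dst, pkts, _bytes = fields
        try:
            count = int(pkts)
        except ValueError:
            continue  # skip records with a non-numeric packet count
        packets[dst] += count
    return packets.most_common(n)


# Made-up sample records using documentation address ranges.
sample = [
    "198.51.100.7,192.0.2.10,12000,768000",
    "203.0.113.9,192.0.2.10,9000,576000",
    "198.51.100.7,192.0.2.20,40,52000",
]

for dst, pkts in top_destinations(sample):
    print(dst, pkts)
```

A destination that suddenly dominates the packet count is a likely candidate for the address under attack. The same tally keyed on the source field would give the top senders instead.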

We were keeping an ongoing log of such data, in case the router itself
took a crap.

On a Cisco router, you can also look at the raw cache flow data (sh ip
cache flow), which has some summary data at the top, and then data on
each flow. By rshing into the device and capturing this output, you have
access to some other data to futz around with in some sort of script.
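
As a sketch of what such a script might look like, the Python below totals packets per destination from captured output. It assumes the per-flow lines have the columns SrcIf, SrcIPaddress, DstIf, DstIPaddress, Pr, SrcP, DstP, Pkts, with the protocol and ports printed in hex. That layout can differ between IOS versions, so treat the regular expression as an assumption to check against your own routers. The sample text is made up.

```python
import re
from collections import Counter

# Assumed per-flow line layout:
#   SrcIf SrcIPaddress DstIf DstIPaddress Pr SrcP DstP Pkts
# with Pr, SrcP and DstP in hexadecimal.
FLOW_LINE = re.compile(
    r"^(?P<src_if>\S+)\s+(?P<src>\d+\.\d+\.\d+\.\d+)\s+"
    r"(?P<dst_if>\S+)\s+(?P<dst>\d+\.\d+\.\d+\.\d+)\s+"
    r"(?P<proto>[0-9A-Fa-f]{2})\s+(?P<sport>[0-9A-Fa-f]{4})\s+"
    r"(?P<dport>[0-9A-Fa-f]{4})\s+(?P<pkts>\d+)\s*$"
)


def packets_by_destination(output):
    """Total packets per (destination address, IP protocol number)."""
    totals = Counter()
    for line in output.splitlines():
        match = FLOW_LINE.match(line.strip())
        if match:  # summary and header lines simply do not match
            key = (match.group("dst"), int(match.group("proto"), 16))
            totals[key] += int(match.group("pkts"))
    return totals


sample_output = """\
SrcIf         SrcIPaddress    DstIf         DstIPaddress    Pr SrcP DstP  Pkts
Fa0/0         198.51.100.7    Fa0/1         192.0.2.10      11 0401 0035  5000
Fa0/0         203.0.113.9     Fa0/1         192.0.2.10      11 0A2C 0035  7000
Fa0/0         198.51.100.7    Fa0/1         192.0.2.20      06 C3D1 0050    12
"""

for (dst, proto), pkts in packets_by_destination(sample_output).most_common():
    print(dst, proto, pkts)
```

Lines whose packet column is not a plain integer are skipped by this sketch, so it may undercount on platforms that abbreviate large counters.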

So I'm not sure if there are any vendors which make it easy to figure
this out while logged into the device itself (or whether this is a
practical thing to do at all or something vendors are working on
implementing), but it is possible to do using tools like netflow.

w

> We recently experienced several DOS attacks which drove our backbone
> routers CPU to 100%. The routers are not under attack, but the
> router just couldn't handle the traffic. There is a plan to upgrade
> these routers.

> What kind of routers? We had problems like this with Cisco 7206VXRs
> with NPE-300s at my last job because they just couldn't handle the
> high volume of packets-per-second from certain types of attack.

Oh... I guess that it would be a known issue then... we have exactly the same
type of routers. Our routers normally run at 35% CPU. What sucks is that the
traffic volume doesn't have to be very high to bring down the router.

> On a Cisco router, you can also look at the raw cache flow data (sh ip
> cache flow), which has some summary data at the top, and then data on
> each flow. By rshing into the device and capturing this output, you have
> access to some other data to futz around with in some sort of script.
>
> So I'm not sure if there are any vendors which make it easy to figure
> this out while logged into the device itself (or whether this is a
> practical thing to do at all or something vendors are working on
> implementing), but it is possible to do using tools like netflow.

So far we manually log in to the router and use 'sh ip cache flow' on the
router. It is OK, but not very effective. First, when the router slows to a
halt, it is not even possible to run the command most of the time.
Second, reading through the output and figuring out what's going on is not
an easy task. I will definitely look into the tools to automate this
process. I appreciate your suggestion. I just wonder if any router vendor has
any built-in tools.

Thanks,
Richard

: > > We recently experienced several DOS attacks which drove our backbone
: > > routers CPU to 100%. The routers are not under attack, but the
: > > router just couldn't handle the traffic. There is a plan to upgrade

: type of routers. Our routers normally run at 35% CPU. What sucks is that the
: traffic volume doesn't have to be very high to bring down the router.

That's because it's the number of packets per time period that it can't
handle, not the traffic level. At this point it seems most likely that
it's a simple UDP flood. If your CPU usually runs at 35% you definitely
don't need a bigger router unless you're expecting a growth spurt. You
might want to put an RRDTool or MRTG graph on the CPU usage to be sure.
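
To make the packets-versus-bandwidth point concrete, here is a small worked calculation in Python. The 100 Mbit/s Ethernet link is an assumption picked purely for illustration (the thread does not say what interfaces are involved); the 20 bytes of per-frame overhead are the standard Ethernet preamble and inter-frame gap.

```python
# Per-frame overhead on the wire: 8-byte preamble + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_bps, frame_bytes):
    """Maximum frames per second a link can carry at a given frame size."""
    bits_per_frame = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


LINK_BPS = 100_000_000  # assumed 100 Mbit/s link, for illustration only

for frame in (64, 1518):  # minimum and maximum standard Ethernet frame sizes
    pps = packets_per_second(LINK_BPS, frame)
    print(f"{frame}-byte frames: {pps:,.0f} packets/s")
```

At the same bit rate, minimum-size frames mean roughly 149,000 packets per second, against roughly 8,100 for full-size frames. A router whose forwarding cost is per packet therefore sees about eighteen times the work from a small-packet flood, even though the traffic graph in bits per second looks the same.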

Depending on the size of your network you also might put a server at a
good place where you can mirror the traffic to it and use NTop on the
server. The software is free and will show a huge amount of detail if the
server has the brawn to handle the load. More detail means more server
brawn. You'll definitely see where the DOS is going.

scott

I'll disagree here.

When you're engineering a network, what you generally need to care about is peak traffic, not average traffic. While DOS attack traffic is presumably traffic you'd rather not have, it tends to be part of the environment.

This is somewhat of an arms race, and no router will protect you from all conceivable DOS attacks. That said, designing your network around the size of attack you typically see (plus some room for growth) raises the bar, and turns attacks of the size you've designed for into non-events that you don't need to wake up in the middle of the night for.

Remember, the real goal in dealing with DOS attacks is to get to the point where you don't notice them, rather than just being able to explain why your network is down.

For those attacks that go beyond the capacity you can afford, being able to divert the traffic is a good thing. The Riverhead system (now known as Cisco Guard, I think) does reasonably well at protecting networks downstream from it without being a big point of failure, but the network upstream from it still needs to be able to take the load. And being better able to characterize the attack traffic may help you ask your upstreams to block it for you. This can be done with some of the tools others have mentioned, including your router's flow cache *if your router hasn't already fallen over and died*.

A rather dated paper on my experiences dealing with this sort of thing is at http://www.stevegibbard.com/ddos-talk.htm.

-Steve

: On Mon, 9 May 2005, Scott Weeks wrote:
: > On Mon, 9 May 2005, Richard wrote:
: >
: > : type of routers. Our routers normally run at 35% CPU. What sucks is that the
: > : traffic volume doesn't have to be very high to bring down the router.
: >
: > That's because it's the number of packets per time period that it can't
: > handle, not the traffic level. At this point it seems most likely that
: > it's a simple UDP flood. If your CPU usually runs at 35% you definitely
: > don't need a bigger router unless you're expecting a growth spurt. You
: > might want to put an RRDTool or MRTG graph on the CPU usage to be sure.
:
: I'll disagree here.

Cool! Good ol' operations discussion... :)

I took things out of order from your email, but kept the context.

: www.stevegibbard.com/ddos-talk.htm

Nice paper. However, you still say what I was saying, just in a
different sort of way. Instead of NTop and RRDTool/MRTG, you use Cricket.
RRDTool/MRTG alerts you to the problem and NTop directs you to the source
of the problem. Once you get the procedure down pat, it can go pretty
fast.

As far as putting something in front of the core router(s) (such as
Riverhead), I assumed there was nothing there for Richard; just raw
router interface(s) to the upstream and not enough budget to afford those
nice-but-expensive boxes. I was going to mention things like Riverhead or
Packeteer later in the posts if appropriate.

: When you're engineering a network, what you generally need to care about
: is peak traffic, not average traffic. While DOS attack traffic is
: presumably traffic you'd rather not have, it tends to be part of the
: environment.
:
: This is somewhat of an arms race, and no router will protect you from all
: conceivable DOS attacks. That said, designing your network around the
: size of attack you typically see (plus some room for growth) raises the
: bar, and turns attacks of the size you've designed for into non-events
: that you don't need to wake up in the middle of the night for.

This is what I was getting at. Engineering the network. That's more
than buying a Bigger Badder Router and Fatter Pipes (BBR&FP). If your
router is running at 35% during the normal peak traffic flow, you don't
need a BBR&FP. All you need to do is design the network (and train the
monkeys, as randy terms it... :) ) to deal with extraordinary peaks.

: Remember, the real goal in dealing with DOS attacks is to get to the point
: where you don't notice them, rather than just being able to explain why
: your network is down.

Yes, but a BBR&FP isn't the way to deal with this unless you've got the
big budget. I know that a bigger hammer is better if you've got the
money, but if you don't, engineering finesse can work well.

scott

1) Get 'Cisco guard' , too expensive ?
2) Get Arbor, Stealthflow, Esphion, too expensive ?
3) Use flow-tools, ntop, Silktools and open-source Netflow collectors
& analyzers
4) Apply Ingress/Egress Filtering: RFC 2827, uRPF, Team Cymru IOS template
5) Monitor CPU/Netflow table size using SNMP
6) Request a blackholing BGP community from your upstream provider.
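
For item 5, the alerting logic can be quite small once the CPU readings have been collected. The Python sketch below assumes the samples have already been polled (for example over SNMP) into a list of percentages. The function name and the threshold of 80% over three consecutive polls are arbitrary choices for this example, not recommendations from the thread.

```python
def sustained_over(samples, threshold=80, run_length=3):
    """Return True if run_length consecutive samples exceed threshold."""
    run = 0
    for value in samples:
        # Extend the current run on a high sample, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= run_length:
            return True
    return False


# Made-up CPU percentages, one per polling interval.
normal = [35, 38, 33, 90, 36, 34]      # a single spike, then back to normal
attack = [35, 40, 95, 100, 100, 100]   # CPU stays pinned

print(sustained_over(normal))
print(sustained_over(attack))
```

Requiring several consecutive high samples keeps a single busy polling interval from raising an alarm, at the cost of a few intervals of delay before a real attack is flagged.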

Quite decent suggestions

> 3) Use flow-tools, ntop, Silktools and open-source Netflow collectors
> & analyzers
> 4) Apply Ingress/Egress Filtering: RFC 2827, uRPF, Team Cymru IOS template
> 5) Monitor CPU/Netflow table size using SNMP
> 6) Request a blackholing BGP community from your upstream provider.

You start with #4, first of all. Then get #6. Then put #3 and #5 in place.

After that, you get one or the other of these, if you can push through
a budget for expensive kit.

> 1) Get 'Cisco guard' , too expensive ?
> 2) Get Arbor, Stealthflow, Esphion, too expensive ?

--srs

: 1) Get 'Cisco guard' , too expensive ?
: 2) Get Arbor, Stealthflow, Esphion, too expensive ?
: 3) Use flow-tools, ntop, Silktools and open-source Netflow collectors
: & analyzers
: 4) Apply Ingress/Egress Filtering: RFC 2827, uRPF, Team Cymru IOS template
: 5) Monitor CPU/Netflow table size using SNMP
: 6) Request a blackholing BGP community from your upstream provider.

Yep, those are some of the things I meant by:

: > Yes, but a BBR&FP isn't the way to deal with this unless you've got the
: > big budget. I know that a bigger hammer is better if you've got the
: > money, but if you don't, engineering finesse can work well.

scott