Katrina Network Damage Report

Todd Underwood wrote:

Sean Donelan wrote:

Todd Underwood wrote:
> the general idea is: take a large peerset sending you full
> routes, keep every update forever, and take a reasonably long (at
> least a month or two) time horizon. calculate a consensus view for
> each prefix as to whether that prefix is reachable by some set of
> those peers. an outaged prefix is one that used to be reachable but
> no longer is. in other words, one that has been withdrawn from
> the full table by some sufficiently large number of peers.

This describes a partitioning, not necessarily an outage.

can you explain what you mean?

I'm not sure if Sean's thinking the same thing I am,
but let me chime in with a nickel's worth of commentary.

There are some inconsistent terms used in computer
dependability research, but I prefer and use two
key definitions: failure (something is offline)
and outage (customer sees the service offline).

Various redundancy mechanisms can hide failures
from customers and keep them from becoming true outages.

Looking at the routing tables you see failures.
If a prefix goes away completely and utterly,
and is truly unreachable, then anyone trying to
see it is going to see an outage. But you can
have a lot of intermediate cases where routes are
mostly down but not completely, or where parts
of the net can see it but other parts can't
due to the vagaries of route propagation
and partial failures.

And there are situations where the route is
down but the service is still up.

There are other network monitoring groups
that do end to end connectivity tests from
geographically distributed clients out to
sample systems around the net. Some for research
and some for hire for network monitoring.

I think what they do is much closer to
identifying true outages than your method.

-george william herbert
gherbert@retro.com

interesting discussion. at least we're talking about networking now. :-)

wrt sean's comment, the only thing i can think he means by 'partition'
is that the networks (which may have power) may be in some routing table but
just not the routing table of any of renesys's (or routeviews or ripe)
peers. in that case, i guess i would agree. our use of 'outage' is a
special case of 'partition' where the whole internet is on one side
and it's possible that the networks in question are on the other.
they may route somewhere. just not to the internet.

quick question below...

There are some inconsistent terms used in computer
dependability research, but I prefer and use two
key definitions: failure (something is offline)
and outage (customer sees the service offline).

not sure i understand these definitions. i'm happy to use any
well-defined terms (vocabulary never being worth fighting over).
again, when i use 'outage' i mean: previously in global internet
tables of a consensus of a large peerset and now removed from those
tables. which is that, in your terms?

Looking at the routing tables you see failures.

not necessarily, if i'm understanding your definitions (which i guess
i'm not).

If a prefix goes away completely and utterly,
and is truly unreachable, then anyone trying to
see it is going to see an outage. But you can
have a lot of intermediate cases where routes are
mostly down but not completely, or where parts
of the net can see it but other parts can't
due to the vagaries of route propagation
and partial failures.

yes. we cover all of these by having a large peerset and integrating
our data across them. the outages that we report are not from a
particular point on the net. they are from a consensus of a large,
selected peerset.
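
to make that concrete, here's a rough sketch in python of the kind of
consensus calculation i'm describing. toy data, made-up thresholds, and
hypothetical names throughout -- emphatically not our production code:

def outaged_prefixes(history, current, seen_frac=0.9, lost_frac=0.9):
    """history: peer -> set of prefixes seen in that peer's full table
    over the time horizon; current: peer -> set of prefixes there now."""
    peers = list(history)
    all_prefixes = set().union(*history.values())
    outaged = set()
    for prefix in all_prefixes:
        was_seen = sum(prefix in history[p] for p in peers)
        still_has = sum(prefix in current.get(p, set()) for p in peers)
        # "used to be reachable": a consensus of the peerset carried it...
        if was_seen >= seen_frac * len(peers):
            # ...and "no longer is": enough of them have since withdrawn it.
            if was_seen - still_has >= lost_frac * was_seen:
                outaged.add(prefix)
    return outaged

# toy example: three peers all withdraw one prefix.
hist = {p: {"192.0.2.0/24", "198.51.100.0/24"} for p in ("a", "b", "c")}
curr = {p: {"198.51.100.0/24"} for p in ("a", "b", "c")}
print(outaged_prefixes(hist, curr))  # -> {'192.0.2.0/24'}

the real system obviously works from update streams and time windows
rather than static sets, but that's the shape of the calculation.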

And there are situations where the route is
down but the service is still up.

unless you use words differently, this is not true. by 'service' i
mean 'IP service'. if the route is down, no one can reach anything
associated with that route, obviously. do you mean 'service' as in
local loop service?

There are other network monitoring groups
that do end to end connectivity tests from
geographically distributed clients out to
sample systems around the net. Some for research
and some for hire for network monitoring.

I think what they do is much closer to
identifying true outages than your method.

yes, that may be. those are good ways of identifying certain kinds of
outages. the problem is that they only measure what they measure.
frequently these systems measure well-connected sites monitoring
well-connected sites. this creates a bias in the data, tending to
suggest that no big event ever really impacts the internet. this is
obviously a false conclusion.

for reference compare the analysis of the 2003 US blackouts from
keynote:

http://www.keynote.com/news_events/releases_2003/03august14.html

(summary: nothing to see here, move along)

with those from renesys:

http://www.renesys.com//resource_library/blackout_results.html

(summary: >4K prefixes disappeared from the global table impacting
connectivity to hospitals, schools, government and lots of
businesses).

i would agree that our method of routing table analysis has
significant limitations and needs to be combined with other data. but
it's a fantastic way of showing a lower bound on what was affected:
prefixes without entries in the global table almost certainly have no
service.

t.

wrt sean's comment, the only thing i can think he means by 'partition'
is that the networks (which may have power) may be in some routing table but
just not the routing table of any of renesys's (or routeviews or ripe)
peers. in that case, i guess i would agree. our use of 'outage' is a
special case of 'partition' where the whole internet is on one side
and it's possible that the networks in question are on the other.
they may route somewhere. just not to the internet.

The difference between a partitioning and a complete outage can depend a lot on what's on each side of the partition.

If my DSL line goes down, I suppose that's technically a partitioning. I can still get to the DNS server in the basement, or to my neighbors' computers on my wireless network, but not to anything else. Meanwhile, the rest of the Internet can't get to anything in my or my neighbors' houses, but is otherwise functional. Insisting that that was anything less than a complete outage would be at best extremely pedantic, since there's likely nobody on my home network who particularly cares about being able to get to other things on my home network.

However, the same sort of partitioning can happen on a much bigger scale. There are some countries or large regions that have several ISPs, an exchange point they use to connect to each other, locally hosted content, and a single path out to the rest of the world. In those areas, it's possible for the international link to fail but for connectivity to the nearby portions of the Internet to keep working fine. In those cases, it's far less clear-cut to say, "they don't have access to the Internet," and might be more accurate to say that their part of the Internet has been cut off from the rest of the Internet.

(I gave a talk on this at NANOG and a few other conferences last spring. The associated paper is at http://www.pch.net/resources/papers/Gibbard-mini-cores.pdf)
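
To make the distinction concrete, here's a toy reachability model in Python. The topology is hypothetical (a few local ISPs, one exchange point, one international transit link) and isn't taken from the paper; it just shows how a region stays internally connected when its link out fails:

from collections import deque

def reachable(links, start):
    """Breadth-first search over an undirected adjacency map."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in links.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen

links = {
    "isp1": {"ix", "transit"},  # isp1 holds the region's only external link
    "isp2": {"ix"},
    "isp3": {"ix"},
    "ix": {"isp1", "isp2", "isp3"},
    "transit": {"isp1", "rest-of-internet"},
    "rest-of-internet": {"transit"},
}

print(reachable(links, "isp2"))  # everything, including rest-of-internet

# the international link fails:
links["isp1"].discard("transit")
links["transit"].discard("isp1")
print(reachable(links, "isp2"))  # local ISPs and the IX are still fine;
                                 # rest-of-internet is on the other side

From the rest of the Internet's point of view that looks like an outage of the whole region; from inside the region, everything local still works.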

From what I understand of the Renesys methodology, the difference between a partitioning and a total outage wouldn't be visible. A router in a region that wasn't able to send data to Florida wouldn't be able to send data to your collector (which doesn't mean the Renesys system isn't really cool for answering all sorts of other questions -- it is).

That said, I haven't heard any reports of a large scale partitioning happening in New Orleans. It sounds like most of what was down was down due to local infrastructure being under water or without power, so my guess is that the Renesys view was pretty accurate in this case. Thanks for sharing it.

-Steve

love IPv6 more than you guys would ever give to a soul. Shoot I could run a big ISP on a single /48. God bless America.

Bring it on... Why are you so afraid?

Instead, you have small end sites getting /48s from tunnel providers,
and then running maybe two or three hosts on those.

And seriously, does the main assumption of v6, that every single
toaster out there is going to become a v6 host, really not scare
anyone? Giving IP connectivity to stuff that was just not designed
from a security point of view .. I'm sure people have seen all the
stories about network printers and electron microscopes running open
relay smtp daemons, so when do I get to see a botnet full of
compromised toasters that'll burn your toast to cinders if you try to
disinfect them?

Inability to run our networks because the design lacks essential elements.

But feel free to run your network on it. If it works, then the rest of us will know. If not... then the rest of us will know.

And seriously, does the main assumption of v6, that every single
toaster out there is going to become a v6 host, really not scare
anyone?

Nope. I guess people have other things that scare them... See subject.

Giving IP connectivity to stuff that was just not designed
from a security point of view .. I'm sure people have seen all the
stories about network printers and electron microscopes running open
relay smtp daemons, so when do I get to see a botnet full of
compromised toasters that'll burn your toast to cinders if you try to
disinfect them?

Well, because I want to NAT some stuff (i.e., the Windows XP box...) and not other stuff (the machines that I actually use), my wireless base station, which is also a print server, needs to accept print jobs from both "the outside" and "the inside". So far, I haven't found any spam printouts...

In other words: 0wning random appliances isn't all that interesting.

In fact, I would much rather allow access to pretty much anything else rather than a powerful general-purpose computer.

I don't think the point is that everything could be connected to the Internet, but that two things that should connect sometimes can't, that ISPs get to charge stupid fees for a static IP, and that some countries other than the US are severely starved for IP addresses. The reason IPv6 adoption is so slow is because of things like NAT, so the general public has no idea of any IP address shortage -- until they try to run any kind of server on the Internet. If my ISP can give me a dynamic IP address on DSL 100% of the time, regardless of whether it changes when I disconnect, there are enough to give me a static IP. I finally got one; it took years, but an upgrade to service includes it now. I think the broadband stuff like increased DSL, cable, and cellular is going to starve these darned hoarded IPs out of the US companies that hold them and finally get this thing done one day soon. The fact that Google is looking at it is, I think, a wakeup call to that.

Bellsouth.net isn't offering IPv6, which is crazy; they should talk to Google, I guess. So where is IPv6 being done? I heard in mobile/cellular data?

some countries other than the US are severely starved for IP addresses.

Please point me to the RIR policies that say that organizations in the US that don't have address space get it, while the same request from a non-US organization is denied.

Or, how a US organization that doesn't have address space can get it other than from their ISP or regional internet registry.

Bellsouth.net isn't offering IPv6, which is crazy; they should talk to Google, I guess. So where is IPv6 being done? I heard in mobile/cellular data?

If you want IPv6, it's generally easier to use a tunneling mechanism rather than wait for your ISP to deploy native IPv6. Two important reasons why large outfits like Bellsouth aren't doing IPv6 right now are that their customers can't use it anyway, because cheap residential gateways don't support it, and that in a large network even an insignificant change costs a lot of money.
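
For what it's worth, a 6in4 tunnel is nothing exotic: the IPv6 packet just rides inside IPv4 as protocol 41. A quick illustration with scapy (documentation addresses, not a real tunnel endpoint -- treat it as a sketch, not a config):

# Sketch of what a 6in4 tunnel packet looks like on the wire:
# an IPv6 packet encapsulated in IPv4 with protocol number 41.
# Addresses are from the documentation ranges (RFC 3849/5737).
from scapy.all import IP, IPv6, ICMPv6EchoRequest

pkt = (IP(src="192.0.2.10", dst="192.0.2.1")          # IPv4 hop to the tunnel broker
       / IPv6(src="2001:db8::10", dst="2001:db8::1")  # the encapsulated IPv6 packet
       / ICMPv6EchoRequest())

pkt.show()  # note the outer IPv4 header carries proto 41 (IPv6-in-IPv4)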

love IPv6 more than you guys would ever give to a soul. Shoot I could run a big ISP on a single /48. God bless America.

Instead, you have small end sites getting /48s from tunnel providers,
and then running maybe two or three hosts on those.

And seriously, does the main assumption of v6, that every single
toaster out there is going to become a v6 host, really not scare
anyone?

It doesn't scare us... ever try nmapping a /48?

Instead of toasters, though, which I think of as hyperbole (possibly because my toaster is ~40 years old and still works fine, thanks), think digital set-top boxes and TVs that need bi-directional communication to unwrap DRM. That's on the order of a billion or so devices in the US over the next 10 years.

In other words: 0wning random appliances isn't all that interesting.

Amazingly enough, the *single* biggest problem in trying to get Joe
Sixpack to secure their systems is "But I don't have anything they'd be
interested in..."

In fact, I would much rather allow access to pretty much anything
else rather than a powerful general-purpose computer.

On the other hand, if it's got enough smarts to do an IPv6 stack and have
enough left over to have something interesting to say, it's probably
"powerful enough" for miscreants to think of creative and interesting
uses for it, even if it *is* just a toaster....

Some small fraction of the population will network their toasters and
microwave ovens just Because They Can - but that's (a) just intellectual
masturbation and (b) those people have already *done* that. Everybody else
won't do it unless they discover the toasters and microwaves can carry on
a productive conversation. And for the miscreant, a device that can't do
much more than "I hear and obey" is often actually *more* useful than a device
that's likely to say "You want me to do *what*??"

My microwave has a bigger and faster processor than the one that the
Apollo lunar modules had.

In the timelines people are looking at for v6 - or faster, if you
believe Moore's law - you're going to see "general purpose computers"
in lots of stuff that doesn't necessarily have a monitor and keyboard
attached.

And think of the next-generation networker's mantra of "convergence" ..

--srs

It doesn't scare us... ever try nmapping a /48?

one host at a time? from a single point? nope - once v6 becomes common
enough someone will just write a nice little distributed botnet to
propagate around it.

who wants nmap when all you need is to throw enough common exploits
blindly at a series of hosts?

the era of carefully crafted exploits against a single large host is
almost dead, except for really high value hosts.

botnets are kind of an industrial revolution in this area

digital set-top boxes and tv's that need bi-directional communication to
unwrap drm, That's order of a billion or so devices in the US over the
next 10 years.

a TV botnet will probably leave your channel locked onto a 24x7 feed
of Barney the big purple dinosaur and ... AAAH THE TENTACLES

But seriously, computing power that people would use for moon landings
a few decades back is available on ubiquitous home devices that were
never intended to be connected to the internet.

Security is something that really must be taken into account now,
before it starts to become a problem

--srs

> It doesn't scare us... ever try nmapping a /48?

one host at a time? from a single point? nope - once v6 becomes common
enough someone will just write a nice little distributed botnet to
propagate around it.

  been there, seen that, and i want the green one.. :0

Security is something that really must be taken into account now,
before it starts to become a problem

  er, not to be a naif, but what do you mean by "security"
  in this context?

--bill

Well, something like coding the firmware for whatever apps get
networked so that there is at least some amount of defense against
crackers breaking into them? A lot of stuff out there with
significant computing power concentrates on providing cool new
features, basically on the assumption that nobody is going to be dumb
enough to plug the thing into a network.

OK so forget toasters - look at stuff like printers, HVAC gear, phones
etc that are / will soon be networked.

--srs

It doesn't scare us... ever try nmapping a /48?

one host at a time? from a single point? nope - once v6 becomes common
enough someone will just write a nice little distributed botnet to
propagate around it.

Drop me a line when your botnet finishes scanning 3FFE:0000::/16 and moves on to 2001:xxxx::

Probing for hosts isn't realistic. That doesn't rule out other resource discovery methods, obviously; in fact it ensures that they'll have to be used...

who wants nmap when all you need is to throw enough common exploits
blindly at a series of hosts?

the era of carefully crafted exploits against a single large host is
almost dead, except for really high value hosts.

botnets are kind of an industrial revolution in this area

For v4 space, spray and pray works well enough...

digital set-top boxes and tv's that need bi-directional communication to
unwrap drm, That's order of a billion or so devices in the US over the
next 10 years.

a TV botnet will probably leave your channel locked onto a 24x7 feed
of Barney the big purple dinosaur and ... AAAH THE TENTACLES

But seriously, computing power that people would use for moon landings

Uh... lunar module computer (1969): 5,000-transistor CPU, 74K ROM, 4K RAM.

a few decades back is available on ubiquitous home devices that were
never intended to be connected to the internet.

Even low-end Parallax BASIC Stamps have more horsepower than that.

Security is something that really must be taken into account now,
before it starts to become a problem

It's already too late to stop it before it's a problem.

> > Security is something that really must be taken into account now,
> > before it starts to become a problem
>
> er, not to be a naif, but what do you mean by "security"
> in this context?

Well, something like coding the firmware for whatever apps get
networked so that there is at least some amount of defense against
crackers breaking into them? A lot of stuff out there with
significant computing power concentrates on providing cool new
features, basically on the assumption that nobody is going to be dumb
enough to plug the thing into a network.

  so, not security per se, more authentication...

OK so forget toasters - look at stuff like printers, HVAC gear, phones
etc that are / will soon be networked.

  those things are networkable now... as are these:
  light switches, door locks, keys, skis, toilets,
  stuffed animals, cars, elevators, bras, eye glasses,
  and some currency.
  ... the list goes on and on...

        so, not security per se, more authentication...

Authentication, access control, basic remote and local vulnerabilities,
viruses .. the works

        those things are networkable now... as are these:
        light switches, door locks, keys, skis, toilets,
        stuffed animals, cars, elevators, bras, eye glasses,
        and some currency.
        ... the list goes on and on...

Scary, isn't it? Not to sound like a stone-age technophobe, but some
things just weren't made to be put on a network. I mean, a bra for
god's sake .. unless someone wants to make a Lindsay Lohan + Brittany
Murphy movie on networked bras.

It is a v6 botnet - so a correspondingly larger number of infected
hosts, and a larger botnet size.
If it is your argument that scanning just won't scale on a botnet:
anything can be made to scale if you throw sufficient resources that
aren't your own - botted toasters, like i said - at it.

A /48 is 80 bits of address. 1,208,925,819,614,629,174,706,176 addresses.
Even at a million packets/second (which even Joe Sixpack will quite likely
notice until such time as the Linksys router you get at Walmart does 1M pps),
that's still 38,334,786,263 years of scanning. Of course, that's about
20 billion years after the Sun runs out of hydrogen and goes red giant and
incinerates the planet....

Now how big a pile of toasters were you planning to use?
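
(For anyone who wants to check that arithmetic, a few lines of Python reproduce it; the 365-day year is an assumption chosen to match the figure quoted above:)

# back-of-the-envelope check of the /48 scan-time numbers above
addresses = 2 ** 80                   # a /48 leaves 128 - 48 = 80 host bits
print(f"{addresses:,}")               # 1,208,925,819,614,629,174,706,176

pps = 1_000_000                       # one probe per address at 1M pkts/sec
years = addresses // pps // (365 * 24 * 3600)
print(f"{years:,} years")             # 38,334,786,263 years, as quoted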