Vulnerabilities of Interconnection

You also have the problem of cascading failures. Just because there
are redundant paths and alternate peering locations does not mean
those facilities have the bandwidth to handle all the redirected
traffic. If A gets swamped you go to B; if the redirected traffic is
too much for B then you go to C, and so on. Each time the amount of
traffic increases and the available bandwidth decreases. According to
the analysis I've seen and run on the Baltimore incident, this is
the gist of how a few cut lines rippled across the Internet. I would
think Alex's scenario would have a bigger impact than that incident.


For some reason, I guess since Baltimore is near Washington DC, this
incident seems to have captured the imagination of folks in Washington DC.
Although some brand-name providers were impacted by this incident, it had
minimal impact on other providers. Essentially every major Internet
exchange point has failed at one time or another. In the past, there have
been simultaneous failures in at least three different locations.

The problem with your analysis is that's not what happens on the Internet.

One of the current issues of Internet traffic engineering is that traffic
doesn't roll over to alternate paths B or C when the primary path A
is congested. Overflow routing of that kind is a traditional design in
the switched telephone network, but it is not common in the Internet.
Internet traffic tends to follow the "best" available route, whether or
not that route is congested.
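
To make the distinction concrete, here is a toy sketch in Python. It is
not any router's actual decision process; the Path class, the metric
values, and the best_path helper are all made up for illustration. The
only point is that the selector looks at reachability and a static
preference, and never at load.

```python
# Hypothetical illustration: route choice by static preference, not by load.
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    metric: int        # lower is preferred; values here are assumed
    up: bool = True    # whether the path is reachable at all
    load: float = 0.0  # utilization; deliberately NOT consulted below


def best_path(paths):
    """Return the reachable path with the lowest metric, or None."""
    candidates = [p for p in paths if p.up]
    return min(candidates, key=lambda p: p.metric) if candidates else None


paths = [Path("A", 1), Path("B", 2), Path("C", 3)]

paths[0].load = 1.5                        # A is badly congested but still up
congested_choice = best_path(paths).name   # traffic stays on A

paths[0].up = False                        # A actually fails (a cut line, say)
failed_choice = best_path(paths).name      # only now does traffic move to B
```

In this sketch congestion on A changes nothing; traffic moves to B only
when A becomes unreachable. That is the opposite of the overflow model,
where a busy trunk pushes calls onto the next choice.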

Unlike phone calls, TCP traffic doesn't occur in fixed bandwidth
increments. TCP traffic, 90% of Internet traffic, is elastic. By design,
TCP adjusts the traffic rate to keep the bottleneck congested. As the
bottleneck moves, traffic reacts by increasing or decreasing the rate to
match the available capacity. This feedback occurs independently of what
is happening on nearby traffic paths. Even if there is available
capacity elsewhere, the current Internet design is not very good at
using it. Some people view this as an inefficient use of available
capacity; other people view it as a self-protective mechanism.
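
A rough way to picture the elasticity is the additive-increase,
multiplicative-decrease rule that TCP congestion avoidance is built on.
The sketch below is a deliberately simplified model, not real TCP: there
is no slow start, no round-trip time, and no loss detection. The
function name, the flow count, and the capacity numbers are assumptions
chosen for the example.

```python
# Toy AIMD model: several flows share one bottleneck and see only that
# bottleneck's congestion signal. All parameters are illustrative.

def aimd(capacity_by_round, n_flows=4, increase=1.0, decrease=0.5):
    """Return the aggregate sending rate after each round."""
    rates = [1.0] * n_flows
    totals = []
    for capacity in capacity_by_round:
        if sum(rates) > capacity:
            # Congestion signal: every flow backs off multiplicatively.
            rates = [r * decrease for r in rates]
        else:
            # No congestion: every flow probes for more, additively.
            rates = [r + increase for r in rates]
        totals.append(sum(rates))
    return totals


# The bottleneck shrinks from 100 units to 40 units halfway through,
# as if the flows had landed on a smaller link.
schedule = [100.0] * 200 + [40.0] * 200
totals = aimd(schedule)

avg_before = sum(totals[100:200]) / 100   # settled rate at capacity 100
avg_after = sum(totals[300:400]) / 100    # settled rate at capacity 40
```

The aggregate rate saw-tooths around whatever the bottleneck allows and
re-settles when the bottleneck changes. Nothing in the loop knows about
spare capacity on any other path, which is the behavior described above.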

In today's Internet, the type of cascading failure you postulated probably
won't happen. The design goal of the Internet is not to keep every part
of the network operating under every condition, but to ensure that
failures in one part of the network do not disrupt other parts.

That's why during the Baltimore train tunnel incident you saw some providers with
severe problems in parts of their network, but other providers didn't
experience any slowdowns in their networks. I wouldn't be surprised if
a few people even experienced an improvement in their traffic that day.

There are vendors trying to sell systems which will "steer" traffic
through alternate paths, seeking to avoid congestion. In addition, there
are efforts like IEPREP which seek to bypass the congestion feedback
controls for selected traffic. It is unclear to me what impact these
will have on Internet traffic during a crisis. It is possible these
improvements will in fact make the Internet more brittle.