link between Sprint and Level3 Networks is down in Chicago

We received confirmation from Time Warner. The link between Sprint and
Level3 Networks is down in Chicago. This has been an issue since 3:10 PM
EST. Time Warner has a ticket open to address the issue. Not sure what it
is yet.

-Dennis

Does someone know if this is a *single* link down?? It seems bizarre to me that there would only be a single link (geographically) between those two.

Whatever happened to redundancy?

Deepak

Dennis Dayman wrote:

Accounting. ;)

- billn

It’s not uncommon at all to have a single interconnect between providers in a geographical area. Multiple interconnects within a region tended to be more for load balancing, not redundancy.

There have been certain providers that, historically, could not or would not support anything larger than an OC48. In those cases you had multiple circuits to maintain traffic balance. When they finally migrate to 10GE you back off to a single circuit and use your backbone for city redundancy.

-Steve
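
For illustration, the circuit-count arithmetic Steve describes can be sketched in Python. The line rates are the standard nominal figures (OC-48 at 2488.32 Mbps, 10GE at 10000 Mbps); the 6000 Mbps peak load and the function name are made up for the example and are not figures reported in this thread.

```python
import math

# Nominal line rates in Mbps.
OC48_MBPS = 2488.32
TEN_GE_MBPS = 10000.0


def circuits_needed(peak_mbps: float, circuit_mbps: float) -> int:
    """Smallest number of parallel circuits that can carry peak_mbps."""
    return math.ceil(peak_mbps / circuit_mbps)


# Hypothetical peak load between two providers in one city (assumption).
peak = 6000.0
print(circuits_needed(peak, OC48_MBPS))    # parallel OC-48s needed for capacity
print(circuits_needed(peak, TEN_GE_MBPS))  # a single 10GE carries the same load
```

Under these assumed numbers, a provider capped at OC-48 needs several parallel circuits simply to carry the traffic, which is load balancing rather than redundancy; at 10GE one circuit suffices.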

Whatever happened to redundancy?

lost in the transition from reality to fantasy and conjecture? it's the
sharp curves.

randy

also perhaps in other regions of their network they have connectivity, so
it was expected to fail out of region properly?

Chris L. Morrow wrote:

Whatever happened to redundancy?

lost in the transition from reality to fantasy and conjecture? it's the
sharp curves.

also perhaps in other regions of their network they have connectivity, so
it was expected to fail out of region properly?

I think that is what I meant. If there is no other connect anywhere, I understand it being dead (though, except for accounting, I don't understand why there is only one). If it's not failing over to use one of the other connects, methinks some clever NOC fellows could make that work temporarily whilst the troublesome routers are addressed betwixt them.

Everyone has been using such clever figurative speech I decided to get a little flowery too.

Deepak

I think that is what I meant. If there is no other connect anywhere, I
understand it being dead (though, except for accounting, I don't
understand why there is only one). If it's not failing over to use one of
the other connects, methinks some clever NOC fellows could make that
work temporarily whilst the troublesome routers are addressed betwixt them.

So, I'm not sure how sprint/l3 are connected, I imagine they have more
than one interconnect, in more than one region... It's possible something
'bad' happened and the paths didn't get removed :( It's hard to speculate
from 2 as-hops away :)

Everyone has been using such clever figurative speech I decided to get a
little flowery too.

haha :) please continue to be verbose and flowery.

We started seeing problems as early as 2:05 PM EST, but saw substantial improvement at 3:30 PM. I would be interested in comparing notes with anybody else affected by the issue - or if anybody has heard an actual explanation from Sprint/L3.

Chris L. Morrow wrote:

Does someone know if this is a *single* link down?? It seems bizarre to
me that there would only be a single link (geographically) between those
two.

Whatever happened to redundancy?
Deepak

From the outside, this appeared to be more like a CEF
consistency sort of thing; routes were still carrying packets
to the interconnect, but the packets were not successfully
making it across the interconnect. I would hazard a guess
that had the link truly gone down in the classic sense, BGP
would have done the more proper thing, and found a different
path for the routes to propagate along.

Again, this is speculation from the outside, based on the
path packets were taking before dropping on the floor.

Matt
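
The distinction Matt draws can be sketched as a toy model in Python. This is not real BGP or CEF, only an illustration of the reasoning: path selection sees only whether a route is still advertised, so a link that silently stops forwarding keeps attracting traffic. The class, the field names, and the "other-region" interconnect are all hypothetical; nothing in this thread confirms a second Sprint/L3 interconnect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interconnect:
    name: str
    preference: int   # lower wins; a stand-in for real best-path selection
    advertised: bool  # control plane: is the route still in the table?
    forwards: bool    # data plane: do packets actually get across?


def best_path(paths: List[Interconnect]) -> Optional[Interconnect]:
    """Pick the preferred path among those still advertised."""
    candidates = [p for p in paths if p.advertised]
    return min(candidates, key=lambda p: p.preference) if candidates else None


def send_packet(paths: List[Interconnect]) -> str:
    chosen = best_path(paths)
    if chosen is None:
        return "no route"
    if chosen.forwards:
        return f"delivered via {chosen.name}"
    return f"dropped at {chosen.name}"


# Classic link-down: the route is withdrawn, so traffic moves elsewhere.
clean_failure = [
    Interconnect("Chicago", 1, advertised=False, forwards=False),
    Interconnect("other-region", 2, advertised=True, forwards=True),
]

# Silent failure: the route stays advertised but packets do not cross.
silent_failure = [
    Interconnect("Chicago", 1, advertised=True, forwards=False),
    Interconnect("other-region", 2, advertised=True, forwards=True),
]

print(send_packet(clean_failure))
print(send_packet(silent_failure))
```

In the first scenario the withdrawn route lets selection fall through to the alternate path; in the second the preferred route is never removed, so packets keep heading to the broken interconnect and are dropped there.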

Charlie said:

I would be interested in comparing notes with anybody else
affected by the issue - or if anybody has heard an actual
explanation from Sprint/L3.

Things started to clear up for us at around 1443 Central, so it wasn't
too long after I posted my original inquiry to the list. By 1500
Central it seemed to be fully functional. Over at
http://www.internetpulse.net our experience seemed to be reflected in
that the Network Availability metric for the Sprint/L3 intersection
started creeping back up from 80%.

This morning at 0830 I got a phone call from the BSAC folks at Sprint,
who said it was exclusively a routing issue within L3. When I pressed
for details all I got was the same answer. Make of that what you will.

-JFO