NANOG36-NOTES 2006.02.15 talk 4 Interdomain Routing Consistency

Access point movie goes whizzing past very quickly
as Bill Fenner narrates.
Lets you see where people are congregating, and
which talks are more interesting, and when people
migrate out of talks; could feed into the survey
to tell the program committee which talks are of
more interest.

netdisco, collects data from network elements,
plots them, put a front end on it;

If you opted in, by emailing him your MAC address,
it would render a map with your location on it.

has RSS feeds of your location as well.

fenner at

2006.02.15 An Inter-domain Consistency Management Layer
Nate Kushman, MIT

Steve Feldman, welcome back, Nate Kushman is up first
to talk about routing consistency.

Transient BGP loops
was with Akamai, now at MIT
Srikanth Kandula, Dina Katabi, John Wroclawski

Do loops matter?
can we do something about them?

what is a transient BGP loop?
slide showing loop forming.

How common are “transient BGP loops”?

Sprint study, IMC 2003, IMW 2002
looked at packet traces from the Sprint backbone
up to 90% of the observed packet loss was caused by
routing loops
60-100% could be attributed to BGP

Is it true on the Internet?

Routing loop damage

20 vantage points with BGP feeds
did pings, traceroutes, watch for loops.

correlated on BGP updates, and ttl exceeded
on ping, traceroute.

In fact, all loops were within 100 seconds of
BGP updates.
10-15% of all BGP updates caused routing loops!!
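
A rough sketch of the correlation step described above, not the
speaker's actual tooling: given timestamps of observed loops (TTL
exceeded on ping/traceroute) and timestamps of BGP updates, keep the
loops that fall within 100 seconds of some update. The function names
and the plain lists of epoch-second timestamps are assumptions made
for illustration.

```python
from bisect import bisect_left

WINDOW_SECONDS = 100  # the window quoted in the talk


def nearest_update_gap(loop_time, sorted_updates):
    """Seconds between loop_time and the closest BGP update.

    sorted_updates must be sorted ascending; returns None if it is empty.
    """
    if not sorted_updates:
        return None
    i = bisect_left(sorted_updates, loop_time)
    # Only the update just before and the one at/after can be closest.
    candidates = sorted_updates[max(i - 1, 0):i + 1]
    return min(abs(loop_time - t) for t in candidates)


def loops_near_updates(loop_times, update_times, window=WINDOW_SECONDS):
    """Return the loop timestamps that lie within `window` of any update."""
    updates = sorted(update_times)
    matched = []
    for t in loop_times:
        gap = nearest_update_gap(t, updates)
        if gap is not None and gap <= window:
            matched.append(t)
    return matched
```

With updates at 1000 and 5000, loops seen at 1050 and 4950 would be
matched and a loop at 3000 would not.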

Collateral damage.
loops congest the links that are part of the loop,
causing loss for traffic to networks that were not
rerouted, from source networks that were not rerouted either.

traceroute to see which links were part of the
loop, see which other traces shared a link in
common with the loop.
there is a marked increase in packet loss in
the 100-second window around the BGP loop.

Prefixes sharing a loopy link see 19% packet loss
in general.
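
The link-sharing check above could be sketched roughly as follows.
This is an assumption-laden illustration, not the method used in the
study: paths are plain lists of hop addresses, a loop is assumed
whenever a hop repeats, and all function names are hypothetical.

```python
def links_of(path):
    """Set of directed hop-to-hop links in a traceroute path."""
    return set(zip(path, path[1:]))


def loop_links(path):
    """Links on the looping portion of a path (empty set if no hop repeats)."""
    first_seen = {}
    for idx, hop in enumerate(path):
        if hop in first_seen:
            # Hops from the first visit to the repeat form the cycle.
            cycle = path[first_seen[hop]:idx + 1]
            return links_of(cycle)
        first_seen[hop] = idx
    return set()


def traces_sharing_loop(loop_path, other_traces):
    """Names of traces that use at least one link of the loop."""
    loopy = loop_links(loop_path)
    return [name for name, path in other_traces.items()
            if links_of(path) & loopy]
```

Packet loss for the returned traces could then be compared inside and
outside the window around the loop.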

What should be done?
We need to prevent forwarding loops.

A loop occurs because:
one AS pushes a route update to the data plane, but
other ASes are not yet aware of that route change.

What about telling everyone about the change before
the change actually happens?

continue to route traffic
tell control system not to propagate the route
FIB stays same for now, RIB doesn’t send route.

downstream networks only update forwarding tables
once upstreams have acknowledged the path change.
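
A toy sketch of the idea above, under loud assumptions: this is not
the protocol from the talk, only an illustration of "keep forwarding
on the old path until every upstream has acknowledged the change".
The Node class, its fields, and the AS names are all hypothetical.

```python
class Node:
    """One network that defers its FIB change until upstreams acknowledge."""

    def __init__(self, name, next_hop):
        self.name = name
        self.fib_next_hop = next_hop    # what the data plane uses right now
        self.pending_next_hop = None    # change announced but not installed
        self.awaiting_acks = set()      # upstreams that have not acked yet

    def propose_change(self, new_next_hop, upstream_names):
        """Announce a path change; the FIB is left untouched for now."""
        self.pending_next_hop = new_next_hop
        self.awaiting_acks = set(upstream_names)
        if not self.awaiting_acks:
            self._install()

    def receive_ack(self, upstream_name):
        """Record an acknowledgement; install once all have arrived."""
        self.awaiting_acks.discard(upstream_name)
        if self.pending_next_hop is not None and not self.awaiting_acks:
            self._install()

    def _install(self):
        self.fib_next_hop = self.pending_next_hop
        self.pending_next_hop = None
```

Traffic keeps following fib_next_hop during the whole exchange, so no
upstream is surprised by a forwarding change it has not heard about.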

More generally:
we have proven:
loops are prevented in general case
convergence properties similar to normal BGP
incrementally deployable.


works well for planned maintenance. We can delay move
to backup path during those events, at least.
20% of update events caused by planned maintenance
Link up events also cause loops, no way to plan for
them smoothly now.
What about:
unplanned link down events
trade-off between loss on current path and collateral damage

Are we willing to do this in general, to avoid impacting
stable prefixes from unstable prefixes.

In short: routing loops are a significant performance

Bill Norton–hidden question: what is the time domain
during which these traffic impacts are seen? Will
the propagation path take 10, 20, 30 seconds?
A. one event causes many, many loops rippling out,
so one update may cause packet loss for many seconds,
up to tens of seconds total.
Q. you’re talking about adding MORE state information
into the network. Also adding latency to update

Jared notes that router software bugs tend to
exacerbate routing loop issues. You can tune configs
to try to minimize the number of loops seen, as well
as upgrading to “fixed” code to get better results
without more state.

Patrick Gilmore asks Jared, does tuning help internal
sessions or external sessions? Both, it really controls
when the updates are sent out (immediately vs batched,
etc.). Jared notes the internet is being used

Someone (Bill?) asks if convergence times are similar to
current model, as the slide claims; is that within
a few seconds? convergence in the lab is similar, yes.

Matt Petach asks about details of convergence; it
basically puts you at mercy of the slowest, farthest
away router on the network, since it has to get the
message, realize it has nobody to send to, and then
acknowledge back before anyone else can update FIB;
yes, true, so you’d want to put timers in to limit
how long you wait; basically, like “wait 5 seconds,
and either hear an ACK, or go ahead and update FIB”
type timeout, so you don’t wait forever for a
non-conformant device on the other side of the world.
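
The timeout from that answer might look something like this; a
minimal sketch only, with the 5-second figure taken from the answer
and everything else (function name, arguments, timestamps as float
seconds) assumed for illustration.

```python
ACK_TIMEOUT_SECONDS = 5.0  # the "wait 5 seconds" figure from the answer


def should_update_fib(awaiting_acks, started_at, now,
                      timeout=ACK_TIMEOUT_SECONDS):
    """Decide whether the FIB may be updated yet.

    awaiting_acks: set of neighbors that have not acknowledged the change.
    started_at, now: timestamps in seconds on the same clock.
    """
    if not awaiting_acks:
        return True                       # everyone has acknowledged
    return (now - started_at) >= timeout  # stop waiting on stragglers
```

The trade-off is that a short timer bounds the delay but brings back
the loop risk for any neighbor that never answered.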

Riverdomain question–with suspension, you’re basically
in passive mode, listening but not updating, is that
correct? Yes, with respect to the links/prefixes in