it was damp in bellevue

the network at the bellevue meeting was fantastic, thanks to the host,
xkl, and the usual suspects (merit, tony, ...).

but there was one outage. as best i know, that outage was caused by one
of the two upstream transit isps bouncing (at least) the nanog prefix on
one of their routers far over the water in seattle.

of course, that is why we had two upstreams. but, flap amplification
and dear old route flap damping caused our prefix to be damped by a
number of global isps. yes, our prefix, not just the path.
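
a rough sketch of the penalty arithmetic behind this, in the style of
rfc 2439: every flap a router sees for a prefix adds a fixed penalty, the
penalty decays exponentially, and the prefix is suppressed above one
threshold and only reused once it decays below another. the numbers
below are assumptions (commonly cited vendor defaults), not anything
measured during the outage, and the function names are hypothetical. the
sketch also assumes every update costs the same penalty, which real
implementations do not.

```python
import math

# Assumed, illustrative parameters (not taken from the outage itself).
HALF_LIFE_S = 15 * 60        # penalty halves every 15 minutes
PENALTY_PER_FLAP = 1000.0    # added for each flap the router sees
SUPPRESS_LIMIT = 2000.0      # suppress the prefix above this penalty
REUSE_LIMIT = 750.0          # re-advertise once the penalty decays below this


def decayed(penalty, elapsed_s):
    """Exponentially decay a penalty over elapsed_s seconds."""
    return penalty * 0.5 ** (elapsed_s / HALF_LIFE_S)


def simulate(flap_times_s):
    """Replay flaps (timestamps in seconds, ascending).

    Returns (suppressed, penalty) as of the last flap.
    """
    penalty = 0.0
    suppressed = False
    last = None
    for t in flap_times_s:
        if last is not None:
            penalty = decayed(penalty, t - last)
            # A quiet period may have released the prefix before this flap.
            if suppressed and penalty < REUSE_LIMIT:
                suppressed = False
        penalty += PENALTY_PER_FLAP
        if penalty > SUPPRESS_LIMIT:
            suppressed = True
        last = t
    return suppressed, penalty


def seconds_until_reuse(penalty):
    """Time for a penalty to decay to the reuse limit."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_S * math.log2(penalty / REUSE_LIMIT)


if __name__ == "__main__":
    # One upstream bounce, seen remotely as three updates within a minute.
    suppressed, penalty = simulate([0, 30, 60])
    print(suppressed, round(penalty), round(seconds_until_reuse(penalty) / 60, 1))
```

with these assumed values, three updates inside a minute are enough to
suppress the prefix, and it stays suppressed for roughly half an hour
after the last one, even though the second upstream was fine the whole
time.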

for those who want to read up on route flap damping and why this caused
problems, please see <http://www.nanog.org/mtg-0210/flap.html> "Route
Flap Damping: Harmful?" from the 2002 nanog eugene meeting.

for those wishing historical perspective on route flap damping, document
ripe-378 (may 2006) says

    1.1 Background

    In the early 1990s the accelerating growth in the number of
    prefixes being announced to the Internet (often due to inadequate
    prefix-aggregation), the denser meshing through multiple
    inter-provider paths, and increased instabilities started to cause
    significant impact on the performance and efficiency of the
    Internet backbone routers. Every time a routing prefix became
    unreachable because of a single line-flap, the withdrawal was
    advertised to the whole core Internet and handled by every single
    router that carried the full Internet routing table.

    It was soon realized that the increasing routing churn created
    significant processing load on routing engines, sometimes
    sufficiently high load to cause router crashes.

    To overcome this situation RFD was developed in 1993 and has since
    been integrated into most router BGP software implementations. RFD
    is described in detail in RFC 2439. RFD is now used in many service
    provider networks in the Internet.

for reasons described in the 2002 preso cited above, and demonstrated by
the network outage in bellevue, ripe-378 goes on to conclude

    4.0 Recommendation

    This Routing Working Group document proposes that with the current
    implementations of BGP flap damping, the application of flap
    damping in ISP networks is NOT recommended. The recommendations
    given in ripe-229 and previous documents [2] are considered
    obsolete henceforth.

i.e., it's time to turn it off. you are damaging your customers and
others' customers.

randy

Randy Bush wrote:

> i.e., it's time to turn it off. you are damaging your customers and
> others' customers.

There is a growing number of "Tier 1" NSPs who do not dampen anymore (or at least they don't dampen their customers).

NTT is one of them. Who are the others?

-David

>> i.e., it's time to turn it off. you are damaging your customers and
>> others' customers.
>
> There is a growing number of "Tier 1" NSPs who do not dampen anymore (or
> at least they don't dampen their customers).

damping one's customers has never been very sane. they pay us to put up
with their <bleep>. damping a customer is direct death to them. i wish
all my competitors did that.

damping one's peers has been another matter. this is what caused the
nanog meeting prefix to be widely damped, and this is the issue i am
addressing.

and, if you tell us that you need to damp in order to save your routers
from drowning from churn, then you had best stand up and cry "bs!" when
dave and john tell us two million prefixes is just fine. and your
rir attendees had best be on the very prefix-count-conservative side in
rir pi space allocation discussions.

randy

I've always thought that damping is a good idea, but the implementation
is horribly wrong. I want my customers to get the best of the stable
paths, i.e. I'd like to see a method to dynamically worsen unstable routes
in path selection; local-pref would be the obvious choice for me.
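
A minimal sketch of that idea, assuming a per-path instability penalty is already being tracked: instead of suppressing a flapping path, lower its local-pref in steps so that a stable alternative wins, while the unstable path remains usable if it is the only one. All names and constants here are hypothetical; no BGP implementation is being described.

```python
from dataclasses import dataclass

# Hypothetical tuning knobs for this sketch.
BASE_LOCAL_PREF = 100
PENALTY_STEP = 1000.0       # instability penalty per local-pref step
PREF_DROP_PER_STEP = 10     # local-pref lost for each step of penalty
MIN_LOCAL_PREF = 50         # floor: a flappy path is worsened, never removed


@dataclass
class Path:
    next_hop: str
    penalty: float          # current instability penalty for this path


def effective_local_pref(path):
    """Lower local-pref as the penalty grows, down to a fixed floor."""
    steps = int(path.penalty // PENALTY_STEP)
    return max(MIN_LOCAL_PREF, BASE_LOCAL_PREF - steps * PREF_DROP_PER_STEP)


def best_path(paths):
    """Prefer the highest effective local-pref; the first path wins ties."""
    return max(paths, key=effective_local_pref)


if __name__ == "__main__":
    flappy = Path("upstream-a", penalty=2500.0)
    stable = Path("upstream-b", penalty=0.0)
    print(best_path([flappy, stable]).next_hop)
    print(best_path([flappy]).next_hop)
```

The design point is the floor: with two upstreams the stable one is preferred, but a prefix whose only path has flapped is still reachable rather than damped out of the table.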