From: Craig Labovitz <email@example.com>
* there are an awful lot of withdraw/announcements out there
* we don't know were they are comming from, and no, it is not Cisco's fault
* some valid, and standards-compliant vendor implementation decisions are
contributing a small fraction of extra withdraws (yes, Cisco is one of
these vendors, but there are probably others)
You know, Craig, sometimes you are just too nice.
If "we don't know were they are comming from" [sic], you don't know
whether it is or is not Cisco's fault. Why bend over backwards to give
such public deference?
An implementation that propagates _extra_ withdrawals shouldn't _hide_
behind "standards compliant". In fact, I don't think it _is_ either
"valid" or "standards compliant". There is no standard that says "send
extra BGP withdrawals for routes that you are not currently announcing".
It was just a bug in the implementation.
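To be concrete about what counts as an "extra" withdrawal, here is a minimal sketch in Python. The (peer, kind, prefix) record format and the function name are made up for illustration; they are not taken from Craig's measurement tools or from any router code. A withdrawal is counted as extra when the peer sending it has no outstanding announcement for that prefix.

```python
from collections import defaultdict

def count_withdrawals(updates):
    """Split withdrawals into needed vs. extra.

    `updates` is an iterable of (peer, kind, prefix) tuples, where kind is
    "A" (announce) or "W" (withdraw). This record format is hypothetical.
    """
    announced = defaultdict(set)  # peer -> prefixes it is currently announcing
    needed = extra = 0
    for peer, kind, prefix in updates:
        if kind == "A":
            announced[peer].add(prefix)
        elif kind == "W":
            if prefix in announced[peer]:
                # The peer really was announcing this route.
                announced[peer].remove(prefix)
                needed += 1
            else:
                # Withdrawal for a route the peer was not announcing.
                extra += 1
    return needed, extra


sample = [
    ("peer1", "A", "192.0.2.0/24"),
    ("peer1", "W", "192.0.2.0/24"),  # needed
    ("peer1", "W", "192.0.2.0/24"),  # extra: already withdrawn
    ("peer2", "W", "192.0.2.0/24"),  # extra: peer2 never announced it
]
print(count_withdrawals(sample))  # (1, 2)
```

Run against a real update log, the ratio of the two counts would put a number on the "small fraction" under discussion.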
We _know_ (based on your research and Cisco's admission) that Cisco
routers have/had this bug. Therefore, it is at least Cisco's "fault".
We don't _know_ (no research, no admissions) that anybody else has this
problem; so the value of "probably" is more like "possibly". In any
case, it "probably" doesn't matter if anybody else has the problem,
since Cisco controls 85+% of the backbone market.
* there are differing opinions on how much of a problem all of these extra
withdraws pose for the Internet
There are always differing opinions about the Internet. Heck, there are
differing opinions about whether the Internet is even useful. We are
not looking at or talking about opinions, we are looking at operational
facts. And the facts indicate that the duplicate withdrawals comprise a
large fraction of the BGP traffic, which in turn comprises a large
fraction of the backbone router CPU usage.
* even though it is not clear there was/is a significant problem, vendors
(Cisco) have created fixes to limit some of the extraneous routing
Have these fixes been universally deployed yet? What release? And how
did that affect the withdrawals seen in the overall backbone?
* a recently noticed 30-second periodicity to updates tends to suggest a
systematic problem in the infrastructure.
What you mean is a systematic problem in the routing code, or maybe even
in the BGP specification. Occam's Razor.
The "infrastructure" is something else entirely. The number of wires in
the routing mesh is not likely to have any effect on the periodicity of
the traffic sent over them.
This 30 second periodicity occurs with both withdraws and announcements
(and seems independent of the extra withdraw problem).
Yes, so why are we talking about it in the same breath (paragraph)?
Wouldn't it be better to acknowledge it as a separate issue?
And how is the research on the source of the problem progressing?
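On the 30-second periodicity, one way to check for it in an update log is to fold the arrival times modulo the suspected period and see whether they pile up at a few phases. The sketch below assumes timestamps in seconds; the function name and bin count are arbitrary choices for illustration, not anything from the original measurements.

```python
from collections import Counter

def phase_histogram(timestamps, period=30, bins=6):
    """Count update arrivals by phase within `period` seconds."""
    width = period / bins
    hist = Counter(
        # Clamp guards against floating-point rounding at the top edge.
        min(int((t % period) // width), bins - 1)
        for t in timestamps
    )
    return [hist.get(i, 0) for i in range(bins)]


# Hypothetical arrival times (seconds): most fall just after a 30 s boundary.
arrivals = [0, 1, 30, 31, 60, 62, 90, 17]
print(phase_histogram(arrivals))  # [7, 0, 0, 1, 0, 0]
```

A roughly flat histogram would argue against a fixed period; a sharp spike in one or two bins would be consistent with one, though by itself it says nothing about whether the cause is in the routing code or the specification.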