RFC 1771, further thoughts

Marshall Eubanks wrote:

>In an attempt to return to an argument, rather than simple contradiction
>(ok, ok, it's far more polite and reasonable so far than that would imply,
>but I couldn't miss the cheap shot; apologies hereby tendered), perhaps we
>should consider *what* the RFC should say, if it should be changed? Going
>to the WG with a proposal in hand and a rationale to support it would seem
>to be the best path.
>So, a summary of my view on it at the moment:
>Assumption #1) Resetting a BGP session is 'costly'. Both in terms of the
>time it takes, the stability it removes, and the fact that it flaps all
>of your *outgoing* announcements as well as incoming ones.
>Assumption #2) A router that sends a malformed route is clearly doing
>something which it Should Not Be Doing (tm) (ok, this might be axiomatic,
>but should still be laid out)
>Assumption #3) The current practice has been shown to demonstrably
>increase the brittleness of the Internet, by causing severe flapping when
>someone only partially follows the RFC (in particular, propagating bad
>route data, whether or not the original source session is reset).
>Assumption #4) Routing errors which are bad data, but *not* malformed
>routes, will not generally be caught by normal means in normal operation,
>until a case of human intervention to cross-check the data.
>Assumption #5) Any router which breaks so badly as to start spewing large
>amounts of validly formed but erroneous data, and is *also* spewing badly
>formed data, will spew noticeable amounts of said badly formed data. (This
>one is key, and is only a conjecture; field evidence would be of great use
>in validating it).

Can "badly formed data" be reasonably clearly defined?
What tests are there for "validly formed but erroneous data"?

To clarify, I will use a non-BGP example.

"Sky blue is the" is a badly formed English sentence. It violates the standard
rules of syntax, and is patently obvious to anyone familiar with said syntax
to be invalid on its surface, even if one could infer what it probably was
trying to say and derive the information.

"The sky is purple with green polka dots" is a correctly *formed* sentence,
but the data in it is erroneous/corrupt (well, assuming we haven't altered
the laws of physics, etc.).

In BGP, badly formed data does not meet the requirements as laid out in the
RFC for properly formatted messages, while erroneous data is simply any data
which does not accurately represent what it should (a typoed routing
announcement would be a clear example). The former can be caught by very
simple automated checking on input (validity testing), while the latter
can only be detected by some level of human intervention (noticing the
bad route, or filtering based on human-defined policy, even if the filtering
is done automatically).
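
As a rough illustration of the kind of validity testing meant here, the
sketch below checks only the fixed 19-byte BGP message header from RFC 1771
(marker, length, type). The function name is made up for this example, and
it assumes a session without authentication, where the marker is all ones.
A mistyped but correctly encoded announcement would pass this check
untouched, which is exactly the distinction drawn above.

```python
import struct

HEADER_LEN = 19             # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096      # largest BGP message allowed by RFC 1771
VALID_TYPES = {1, 2, 3, 4}  # OPEN, UPDATE, NOTIFICATION, KEEPALIVE


def header_is_well_formed(message: bytes) -> bool:
    """Syntactic check of one BGP message header (hypothetical helper).

    Assumption: no authentication is in use, so the marker must be all ones.
    This says nothing about whether the routes carried are *correct*.
    """
    if len(message) < HEADER_LEN:
        return False
    marker, length, msg_type = struct.unpack("!16sHB", message[:HEADER_LEN])
    if marker != b"\xff" * 16:
        return False
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        return False
    if length != len(message):
        # The length field must describe the whole message, header included.
        return False
    return msg_type in VALID_TYPES
```

A KEEPALIVE is just the header with length 19 and type 4, so it makes a
convenient smallest well-formed input when trying the function out.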

There are several monitoring efforts (including the one done here) that compare
sets of (M)BGP routing tables over time. It seems to me that such (M)BGP
pollution should be detectable by a monitoring project.

This is one way for humans to cross-check for erroneous but well-formed data.
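
A minimal sketch of such a comparison, assuming each status run has already
been reduced to a mapping from prefix to origin AS. That representation and
the function name are assumptions for illustration, not a description of
any particular monitoring project. The output is only a list of candidates
for a human to look at; it does not decide what is erroneous.

```python
def diff_tables(old: dict[str, int], new: dict[str, int]) -> dict[str, list[str]]:
    """Compare two routing-table snapshots (prefix -> origin AS).

    Returns the prefixes that disappeared, the prefixes that appeared, and
    the prefixes present in both runs whose origin AS changed.
    """
    withdrawn = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    origin_changed = sorted(p for p in old if p in new and old[p] != new[p])
    return {
        "withdrawn": withdrawn,
        "added": added,
        "origin_changed": origin_changed,
    }


# Made-up snapshots using documentation prefixes and private-use AS numbers.
previous_run = {"192.0.2.0/24": 64512, "198.51.100.0/24": 64513}
current_run = {"198.51.100.0/24": 64514, "203.0.113.0/24": 64515}
report = diff_tables(previous_run, current_run)
```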

BTW, the clearest sign here of the recent flap seems to be the dropping of
43 Autonomous Systems by UU.net in the Sat Jun 23 16:37:41 2001 status run.
This is not a good enough metric to reliably detect such problems. There do
seem to be a lot of weird changes in the routing table in that dump, but a
simple test for them is not apparent to me at present.

The only simple test that I was implying could be added is "if we have
received X badly formed routes from a peer, assume it is insane and can't
be trusted", where we permit X to be >1 and do something short of a full
session drop/admin-down/etc for N bad routes where X > N > 0.
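
To make the X and N in that test concrete, here is a small sketch of the
counting logic. The action names, the PeerState class and the choice of
"discard the route and carry on" as the softer response are all assumptions
made for illustration; nothing here is taken from the RFC or from any
router implementation.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """Per-peer counter of badly formed routes seen so far (hypothetical)."""
    bad_routes: int = 0


def on_malformed_route(peer: PeerState, limit_x: int) -> str:
    """Decide what to do when a peer sends one more badly formed route.

    While the running count N satisfies 0 < N < X, take a mild action.
    Once the count reaches X, treat the peer as insane and untrustworthy.
    """
    if limit_x <= 1:
        raise ValueError("X must be greater than 1 for a softer tier to exist")
    peer.bad_routes += 1
    if peer.bad_routes >= limit_x:
        return "distrust_peer"   # assumed name for the drastic response
    return "discard_route"       # assumed name for the softer response


# With X = 3, the first two bad routes are dropped quietly and the third
# triggers the drastic response.
peer = PeerState()
actions = [on_malformed_route(peer, limit_x=3) for _ in range(3)]
```

What value of X is sensible is the open question; Assumption #5 above is
what would need field evidence before picking one.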