routing meltdown

Paul A Vixie previously wrote:

Precisely my point. I think it's a neat solution.

So do I, in the absence of ...

>route servers and a unified/recursive/realtime RADB. [...]

We're discussing this precisely because you're not sure, as you say,
what course of action is best.

"I'm not sure" in this context is a euphemism for "that's a really bad idea."

I figured. :-)

Nope. As you're about to explain yourself, the RS architecture makes it
possible to spout different truths out of each and every orifice. It is
very definitely _not_ nec'y for every RS peer to have the same view, or
that the Internet have a single great and context-insensitive "Truth".

Yes, indeed. But a single RS will have to be far larger than each of the
collocated RSes, and its software would be far more complicated.

Mind you, I like the idea of a central RS configured from a routing DB
that is up to date and populated with correct information.

Right now, this is the thing that makes the RS unusable. At the RPS WG in
Stockholm I heard Daniel talk about a way to improve RADB update times, and
Bill Manning and I proposed a DNS-based RADB whose rollups could be done by
anyone needing the information rather than by a central body. While I admit
that the RS doesn't have realtime updates right now, I know that it's coming.

Well, the RADB can be updated very quickly nowadays via e-mail; that's
not the point. The point is that the RS itself needs to reflect RADB
changes very quickly.

Knowing Sean for who he is, I'm fairly sure that no RADB or RS will ever be
suitable to him. In particular...


and should we really trust such a route server? its implementation?
its administrators?

...while I would trust those things if given sufficient reason to, I know of
at least one network/routing engineer who wouldn't no matter how sufficient
the reasons seem to the rest of us. So your point is valid on that score.

An RS that implements every AS' policy and responds to changes to its
routing DB quickly is quite an undertaking, and its software will be
fairly complicated. That's why some people will not trust it for a long
time; others won't trust the RS because it is not them running it.
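
To make "implements every AS' policy" concrete, here is a minimal sketch of a route server building a separate view for each peer from a routing database. The record layout, the `build_views` helper, and the AS numbers and prefixes (taken from the documentation ranges) are all hypothetical; this is not how the RADB or the RA route server is actually structured.

```python
# Hypothetical routing-DB records: (prefix, origin AS, export policy).
# The policy is either None ("announce to every RS peer") or the set of
# ASes that the origin is willing to announce the prefix to.
ROUTING_DB = [
    ("192.0.2.0/24", 64496, None),
    ("198.51.100.0/24", 64497, {64496}),
    ("203.0.113.0/24", 64498, {64496, 64497}),
]


def build_views(db, peers):
    """Return {peer AS: [prefixes the RS would announce to that peer]}."""
    views = {peer: [] for peer in peers}
    for prefix, origin, allowed in db:
        for peer in peers:
            if peer == origin:
                continue  # never reflect a route back to its origin
            if allowed is None or peer in allowed:
                views[peer].append(prefix)
    return views


views = build_views(ROUTING_DB, [64496, 64497, 64498])
for peer, prefixes in views.items():
    print(peer, prefixes)
```

Each peer ends up with a different list, and any change to a record means recomputing every affected view, which is the part that has to happen quickly.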

In legal terms, we cannot add contract terms now that a lot of peering points
are in use. There is, quite literally, "no way to require" colo'd workstations
at peering points. It doesn't matter how little rack space it takes, or that
the BGP4 traffic would be on different media like Ethernet, or whether it is
(I'm not going to take a position) a wonderful idea or not. Legally, we cannot
require it. Practically, the peering points are "open" to the extent that
folks are expected to use their GIGAhose "as they see fit."

Actually, the only ASes that would benefit immediately from this
collocated RS configuration would be those that are seeing 100k or more
paths now (i.e. folk who are located at multiple NAPs, mostly do
non-transit peering, peer with others like themselves and buy little
transit from others). The rest could continue peering at the XP as they
already do since they only add small numbers of paths and tend to buy
transit from others to the routes that they don't get via non-transit
peering.

Anyway, no one has to be forced to use collocated RSes (if they can't be
asked to use a collocated RS, how can they be asked to use the RA RS?); if
it's the only acceptable way to prevent routers from falling over, then
many will choose to implement this. Besides, where peering is arranged
on a one-by-one basis rather than multilaterally (as at the MAEs) any
carrier of sufficient size can refuse to peer with anyone not willing to
use collocated RSes.

I think this is a reasonable architecture and if I am asked to recommend an
architecture to someone connected to a MAE or NAP, I will mention this one.
I recommend that you do the same. Perhaps you can live to see your ideal
done up in practice. But banish all expectations that either a peering point
administrator will help enforce your ideology, or that you will ever get full
voluntary buy-in from every peer.

See above.

N**2 BGP4 sessions are bad for likely values of N (100, maybe.) That won't
change just because we've got a 1GB-RAM DEC Alpha with a 300MHz processor
instead of a Cisco to do our route processing. N**2 BGP4 sessions is a bad
design no matter what you're implementing it with. In that sense, your idea
is not "viable" since it doesn't solve some of the real problems coming up.
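
As a rough illustration of the session-count argument, the sketch below compares a full mesh with a single central route server. The function names are illustrative only, and "N**2" is read here as the order of growth; the exact number of distinct sessions in a full mesh of N peers is N*(N-1)/2.

```python
def full_mesh_sessions(n: int) -> int:
    """Distinct BGP sessions when each of n peers peers with every other."""
    return n * (n - 1) // 2


def route_server_sessions(n: int) -> int:
    """Sessions when each of n peers holds one session to a central RS."""
    return n


for n in (10, 20, 100):
    print(
        f"N={n}: full mesh {full_mesh_sessions(n)} sessions "
        f"({n - 1} per router), route server {route_server_sessions(n)}"
    )
```

At N = 100 the full mesh needs 4950 sessions, 99 of them terminating on each router, whatever hardware does the route processing.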

Oh, of course, and I did mention that collocated route servers do not
solve any scalability problem, they just get around the Cisco memory
capacity problem. Apart from this, it is also worth noting that few
routers at each XP announce a full set of routes to some or all
neighbors: most announce what they originate, transit ASes announce what
they originate as well as what their customers originate; those who do
not get enough routes from non-transit peering buy transit for what they
cannot hear from bigger carriers. The result is that with 20 neighbors a
given router likely won't hear 20x full-Internet paths. :-)
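
A back-of-the-envelope sketch of that last claim follows. Every figure in it is a made-up placeholder (the table size, the per-peer announcement size, and the split of 2 full-table neighbors versus 18 partial ones), chosen only to show the shape of the arithmetic.

```python
# All figures below are made-up placeholders for illustration, not measurements.
FULL_TABLE = 40_000  # hypothetical size of a full Internet table, in paths
PARTIAL = 500        # hypothetical size of an own-plus-customers announcement


def paths_heard(neighbors):
    """Sum the paths announced by each neighbor (one integer per neighbor)."""
    return sum(neighbors)


# 20 neighbors: 2 announce a full table, 18 announce only what they and
# their customers originate.
neighbors = [FULL_TABLE] * 2 + [PARTIAL] * 18
heard = paths_heard(neighbors)
worst_case = len(neighbors) * FULL_TABLE
print(f"heard {heard} paths, versus {worst_case} if all 20 sent full tables")
```

Under these assumed numbers the router holds 89,000 paths rather than 800,000; the real ratio depends entirely on who announces what at a given XP.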

Does a central RS per-NAP have better scalability than collocated per-AS
route servers do? This is an honest question; I would venture that it
doesn't, but I have not studied the question enough. The whole point of
my suggesting collocated per-AS route servers was that Sprint refuses to
go with the RA RS idea and Sprint is large enough that there is now
going to be a bit of the Pittsburgh NANOG conference dedicated to this
problem; my hope was that collocated route servers would be acceptable
to Sprint, thus buying us much time before even that scheme became
unusable due to scalability problems.

[A fast PC with 1GB of RAM should be able to handle any NAP for the next
year to two years, by which time IPv6 could be hitting the scene]

I can't even begin to comment on that, I'm sure that I'd offend you.

That's ok. If I'm ignorant of something, I'd like to learn; it won't
offend me to be shown I don't know something (at worst I will feel
embarrassed). What bothers you about that comment? I know how quickly the
Internet is growing and I'm betting that collocated route servers will
scale for the next year at least, probably longer. As for IPv6, I freely
admit that I'm not up to date on what is happening with it, but I
understand it uses CIDR from the beginning. That IPv4 used that awful
class scheme for so long is the main reason we're talking about 100k
paths and routers falling over; if IPv6 can start with CIDR from the
beginning along with no IP address portability across carriers, then
we'll be ok. IPv6 could benefit from better host, router and name server
configuration protocols so as to simplify renumbering; so could IPv4.