RE: multi-homing fixes

From: Adam Rothschild [mailto:asr@latency.net]
Sent: Thursday, August 23, 2001 10:36 PM

> At $99US for 512MB of PC133 RAM (the point is, RAM is disgustingly
> cheap and getting cheaper), more RAM in the routers is a quick
> answer. Router clusters are another answer, and faster CPUs are yet
> another.

Throwing more RAM and CPU into our routers (assuming for a moment that
they're most certainly all Linux PCs running Zebra) is not the solution
you're looking for; the problem of RIB processing still remains.

Getting a forwarding table requires extracting data from the RIB, and
this is the problem, because RIBs are very large and active, and are
being accessed by lots of reading and writing processes. RIB
processing is substantial, and is only getting worse.
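
(To make the contention concrete, here is a toy sketch in Python; the
data structures and the single table-wide lock are my own simplification
for illustration, not a claim about how Zebra or any real BGP
implementation actually structures its RIB.)

    import threading

    rib_lock = threading.Lock()
    # RIB: prefix -> list of candidate paths, each with a preference value.
    rib = {
        "192.0.2.0/24":    [{"nexthop": "10.0.0.1", "pref": 100},
                            {"nexthop": "10.0.0.2", "pref": 90}],
        "198.51.100.0/24": [{"nexthop": "10.0.0.2", "pref": 100}],
    }

    def bgp_update(prefix, path):
        # Writer: every UPDATE from every peer contends for the same table.
        with rib_lock:
            rib.setdefault(prefix, []).append(path)

    def build_fib():
        # Reader: walk the entire RIB and keep only the best path per prefix.
        # While this runs, writers are blocked; with a full table and busy
        # peers, this walk is exactly the "RIB processing" cost that grows.
        with rib_lock:
            return {prefix: max(paths, key=lambda p: p["pref"])
                    for prefix, paths in rib.items()}

    print(build_fib())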

SMP systems and multi-ported RAM are a good enough stop-gap. If I didn't
dislike non-deterministic systems, I might suggest Echelon technologies
(hardware-based neural nets).

> If the IETF is being at all effective, that should start now and
> finish sometime next year, so that we can start the 5-year
> technology roll-out cycle.

Roeland, the IETF is eagerly awaiting your solution. Send code. See
Tony Li's presentation at the Atlanta NANOG on why jamming more RAM and
CPU into boxes is not a viable long-term answer:

  <http://www.nanog.org/mtg-0102/witt.html>

I've read that and largely agree. The hardware approach was only meant to
buy time while the geniuses at the IETF find a better approach. What I
don't agree with, and am amazed to see, is the admission that they don't
know at what point the convergence problem becomes intractable, or even
whether it does... That sounds more like a fundamental lack of
understanding of the algorithm itself.

In short, state growth at each level must be constrained and must not
outstrip Moore's law, and to be viable in an economic sense, it must
lag behind Moore's law.
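
(A back-of-the-envelope illustration of that constraint, in Python. The
40%-per-year and 70%-per-year table growth rates and the 18-month
capacity doubling are made-up numbers for the sake of the arithmetic,
not measurements.)

    # Routing state that grows slower than hardware capacity is survivable;
    # state that grows faster is eventually outgrown no matter how much RAM
    # and CPU you throw at it today.
    def years_until_outgrown(table_growth_per_year,
                             capacity_doubling_months=18, horizon=20):
        capacity_growth = 2 ** (12.0 / capacity_doubling_months)  # ~1.59x/year
        table = capacity = 1.0
        for year in range(1, horizon + 1):
            table *= 1.0 + table_growth_per_year
            capacity *= capacity_growth
            if table > capacity:
                return year
        return None  # never outgrown within the horizon

    print(years_until_outgrown(0.40))  # 40%/yr lags Moore's law -> None
    print(years_until_outgrown(0.70))  # 70%/yr outpaces it -> outgrown in year 1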

In the mid-80s, I worked on an OCR problem involving an add-on 80186
processor card. We used a brute-force solution. It was too slow on the 8 MHz
CPU. Years later, with the advent of faster hardware, the product was
released. It's funny that the market timing was just about perfect. It gave
that company a huge head start when the market turned hot. It is all right
to target performance/capacity levels expected to be available at the time
of product release (about five years from now). In fact, that's about the
only way I see the problem getting solved.

> SMP systems and multi-ported RAM are a good enough stop-gap. If I didn't
> dislike non-deterministic systems, I might suggest Echelon technologies
> (hardware-based neural nets).

  Nothing is a good enough stop-gap while things continue to grow at
this rate. Encouraging people to throw an extra hamster onto the wheel
does not solve the problem for long at all, and encourages more waste of
existing resources.

> I've read that and largely agree. The hardware approach was only meant to
> buy time while the geniuses at the IETF find a better approach. What I
> don't agree with, and am amazed to see, is the admission that they don't
> know at what point the convergence problem becomes intractable, or even
> whether it does... That sounds more like a fundamental lack of
> understanding of the algorithm itself.

  CIDR was only meant to buy time. We've bought our time, and we
still don't have a solution. If it were that easy, people would be doing
it. Buying more time in small increments is not necessarily in our
interests.

  Continuing to pile announcements onto this mess, looking for the
prefix that breaks the camel's back, is not a good idea until you -have-
a solution.

> In the mid-80s, I worked on an OCR problem involving an add-on 80186
> processor card. We used a brute-force solution. It was too slow on the 8 MHz
> CPU. Years later, with the advent of faster hardware, the product was
> released. It's funny that the market timing was just about perfect. It gave
> that company a huge head start when the market turned hot. It is all right
> to target performance/capacity levels expected to be available at the time
> of product release (about five years from now). In fact, that's about the
> only way I see the problem getting solved.

  Precisely -- and this /doesn't work/ if you make the existing problem
worse during those intervening 5 years.

  --msa

Roeland Meyer <rmeyer@mhsc.com> writes:

>> From: Adam Rothschild [mailto:asr@latency.net]

>> being accessed by lots of reading and writing processes. RIB
>> processing is substantial, and is only getting worse.

> SMP systems and multi-ported RAM are a good enough stop-gap. If I didn't
> dislike non-deterministic systems, I might suggest Echelon technologies
> (hardware-based neural nets).

Roeland, what I believe Dr. Rothschild was alluding to is that the
underlying problem here is more of a fundamental database management
problem. Key issues are the concurrency controls involved in operating on
data that must be accessed by multiple readers and writers, among other
things.
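
(A minimal illustration of the readers/writers issue, in Python. This
readers-writer lock is the textbook construction, not anything a
particular routing implementation uses, and it happily lets writers
starve; it is only here to show what "concurrency control" means for a
table that everyone is reading and updating at once.)

    import threading

    class RWLock:
        # Many concurrent readers, or one writer, but never both.
        def __init__(self):
            self._readers = 0
            self._readers_lock = threading.Lock()  # guards the reader count
            self._writer_lock = threading.Lock()   # held while readers or a writer are active

        def acquire_read(self):
            with self._readers_lock:
                self._readers += 1
                if self._readers == 1:      # first reader locks writers out
                    self._writer_lock.acquire()

        def release_read(self):
            with self._readers_lock:
                self._readers -= 1
                if self._readers == 0:      # last reader lets writers back in
                    self._writer_lock.release()

        def acquire_write(self):
            self._writer_lock.acquire()

        def release_write(self):
            self._writer_lock.release()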

> don't agree with, and am amazed to see, is the admission that they don't
> know at what point the convergence problem becomes intractable, or even
> whether it does... That sounds more like a fundamental lack of
> understanding of the algorithm itself.

Large, distributed database systems are annoying and hard to deal
with; there is no silver bullet yet. This is something the DB folks
have been working on for years, and since ASes can be viewed as
distributed multiversion databases, no doubt there is much to be
learned from the research in the DB field.
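
(For what it's worth, a crude sketch of the multiversion idea in Python:
each write publishes a new immutable snapshot, so readers keep a
consistent view without holding locks. This is my own illustration of
the database technique, not a description of any router's internals.)

    class VersionedTable:
        def __init__(self):
            self._snapshot = {}   # current version; treated as immutable
            self._version = 0

        def read(self):
            # Readers grab the current snapshot; later writes never disturb it.
            return self._version, self._snapshot

        def write(self, prefix, route):
            # Writers copy, modify, and swap in a new version (copy-on-write).
            updated = dict(self._snapshot)
            updated[prefix] = route
            self._snapshot = updated      # one reference assignment publishes it
            self._version += 1

    table = VersionedTable()
    version, snap = table.read()
    table.write("192.0.2.0/24", {"nexthop": "10.0.0.1"})
    print(version, snap)   # the old, still-consistent snapshot
    print(table.read())    # the new version containing the route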

/vijay

Roeland Meyer writes:

> I've read that and largely agree. The hardware approach was only meant to
> buy time while the geniuses at the IETF find a better approach. What I
> don't agree with, and am amazed to see, is the admission that they don't
> know at what point the convergence problem becomes intractable, or even
> whether it does... That sounds more like a fundamental lack of
> understanding of the algorithm itself.

This is not a particularly tractable problem. In my experience, large
distributed systems usually give little to no warning before they melt
down, and once they do, it's not necessarily obvious how you get them
back to a stable state.

When you have an algorithmic model of multiple connected autonomous
systems, each configured in unspecified ways, which you can analyse in
some manner that is not NP-complete and does not require the collection
of uncollectable data, please post here, as I and a lot of others here
would be interested to see the presentation.

Sarcasm aside: routing theory is 'easy'; applicability to the
real-world Internet is hard.