What is the limit? (was RE: multi-homing fixes)

The Internet *FAILED* in 1994. The number of prefixes carried globally
exceeded the ability of the global routing system to carry it, and
as a result, some parts of the world deliberately summarized away
whole large-scale ISPs merely to survive.

The Internet *FAILED* again in 1996, when the dynamism in
the global routing system exceeded what many border
routers could handle, and as a result, some networks (not just core ones)
deliberately made the Internet less quick to adjust to changes in
topology.

Thus, the routing system FAILED twice: once because of memory, once
because of processor power.

If the size or the dynamism of the global routing system grows
for a sustained period faster than the price/performance curve of
EITHER memory OR processing power, the Internet will FAIL again.

I do mean the Internet, and not just some pieces of it.
The old saw, "the Internet detects damage and routes around it," simply
isn't true when your routing system isn't working.

It was unpleasant both times. Three continents were effectively
isolated from one another for a couple of days while organizing
a response to the memory crisis. Three of the largest ISPs at
the time were crippled on and off for days during the processing
power crisis, and even when mechanisms were brought into place,
relatively unimportant bugs destabilized the entire Internet
from time to time.

Note that the processor power issue is the one that has been
scariest, since it has had small-scale failures fairly regularly.
Things like selective packet drop *exist* because of the
price/performance curve and engineering gap for deploying and USING
more processing power.

So, Moore's Law, or more specifically the underlying curve
which tracks the growth of useful computational power, is
exactly what we should compare with the global routing system's
growth curve.

Note that when Moore is doing better than the Internet,
it allows either for cheaper supply of dynamic connectivity
or for the deployment of more complex handling
of the global NLRI.

The major problem, as you have pointed out, is that the processing
requirement is often bursty, such as when everyone is trying
to do a dynamic recovery from a router crash or major line fault.
We could still use 68030s in our core routers; it's just that it would
take a lot longer than it used to take to perform a global partition
repair, which means your TCP sessions or your patience would probably
time out a lot more frequently.

I think that what we need to do is have a fourth group, call them Internet
Engineers for lack of a better word, come in and determine what the sign
should read.

Structures built according to best known engineering practices
still fall down from time to time. That's the problem in anticipating
unforeseen failures. Consider yourself lucky that you haven't had
to experience a multi-day degradation (or complete failure!) of
service due to resource constraints. And that you haven't
run into a sign that says: "please note: if you try to have an
automatic partition repair in the event this path towards {AS SET}
fails, your local routing system will destabilize".

Finally, we have a sixth group, call them the IETF, come in
and invent a flying car that doesn't need the bridge at all.

As Randy (with his IETF hat) says: "send code".

  Sean.

This is great FUD. First rate. Have you considered working on a
political campaign? This statement seems so true, but is so false.
We are all being held hostage by vendors here, and I hope the rest
of the people on here are letting them know as loudly as I do at
every opportunity.

Routers, in terms of route processing ability, are about as far from
state of the art as computers get these days. I can buy a < $1000
PC with 10x the MIPS and twice the memory of major vendors' largest
routers. Intel and IBM have built supercomputers that can model
millions of atoms in a nuclear weapon. IBM has a machine that can
play chess. Oracle can set TPC records well over the average rate
of change of the BGP table on data sets 1000 times as large.

Once when I had hardware designers in the room from a major router
vendor I asked them 'why isn't the CPU on your route processor
socketed'? They looked at me completely puzzled, looked at each
other, and then asked in a timid voice "why would you want to do
that"? I looked at them and said 'so you can upgrade it when
a faster CPU comes out'. They replied with "we don't want people
field upgrading CPUs". I just shook my head and said it would
be nice if when a faster one came out they could spin a new rev
of the board with a new CPU faster, I didn't want to upgrade in
the field. They then started scribbling notes furiously.

Don't even get me started on the discussion of why they were custom
designing a board for the route processor, when there are off the
shelf motherboards, or if it must fit in a form factor, motherboard
designs that would be less costly for them, use all off the shelf
parts, and would allow them to bring things to market quicker.

I bet at least half the people reading this e-mail have more CPU
and memory in the box they are using for that task than the largest
core router in their network has for processing routes. And that's
without exploring the things router vendors could do to really
speed things up, like true multi-processor designs, or real amounts
of memory. I don't build a server with less than 1G of ECC RAM
these days, because it's $200. But if I buy a $1M router I'm
lucky to get 64M for processing routes.

Look back. Routers have always lagged _WAY_ behind most other
computer technology in terms of processor power, RAM, and general
purpose I/O. In fact, I would go so far as to venture that they
are falling further behind, that is, not keeping up with Moore's
Law even when it holds.

There was no reason for those past failures. It was a combination
of bean counters, cluelessness, and laziness. The routing table
is growing slower than the number of lines of code in Windows, and
God help us, if we can make Windows "work" we should be able to do
some simple routing.

<snip rant>

There is certainly some truth here re: planned obsolescence of very
expensive routers, but every time I hear this argument, it always seems to
overlook the three most important factors in big router hardware
performance: I/O, I/O and I/O.

Nonetheless, it's still annoying as hell that Cisco can't just allow a GB
or more of RAM in a 5-6 figure router and just be done with that aspect of
it...

James Smallacombe PlantageNet, Inc. CEO and Janitor
up@3.am http://3.am

To the extent that this is true, and to the extent that router
vendors can and are willing to fix the problem, and to the extent
that ISPs can and are willing to deploy the new gear, and to the
extent that the overall problem is fixable by making the nodes that
make up the entire system faster, this is only a one-shot fix.

Sean's point is still true - to the extent that the system grows
faster than Moore's law, we are still sunk.

If the problem is growing at Y/year and the elements that make up
the system can grow at Z/year, then if Y>Z, you will have problems
at some point.

You have to figure out how to make the work that the elements of
the system are handling grow no faster (and preferably slower) than
the elements themselves can grow.
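The Y-versus-Z point can be made concrete with a small back-of-envelope
sketch (my addition, not from the thread; the numbers are illustrative,
not measurements): given some initial capacity headroom, compounding
growth rates tell you roughly when the workload overtakes the elements.

```python
import math

# Illustrative sketch: if the routing workload grows at y per year and
# element capacity grows at z per year, any initial headroom is
# eventually consumed whenever y > z.

def years_until_overload(headroom, y, z):
    """Years until workload growth (y, fraction/yr) consumes an initial
    capacity headroom multiple, given capacity growth z (fraction/yr).
    Returns infinity if capacity keeps pace (y <= z)."""
    if y <= z:
        return math.inf
    # headroom * (1+z)^t = (1+y)^t  =>  t = ln(headroom) / ln((1+y)/(1+z))
    return math.log(headroom) / math.log((1 + y) / (1 + z))

# e.g. 4x headroom today, workload growing 60%/yr, hardware 40%/yr:
print(round(years_until_overload(4.0, 0.60, 0.40), 1))  # -> 10.4
```

The takeaway matches the text: the absolute speed of today's routers only
sets the headroom term; the sign of Y minus Z decides whether you fail.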


  --asp@partan.com (Andrew Partan)

Maybe. For a single CPU, getting up to the state of the art, yes,
it is a one-shot fix. Mind you, it's a one shot that could add _years_
of service to the existing solution (a 10x speed-up in CPU could
give us 3+ years right there, if you believe doubling every year).

I think there is real promise in SMP though. There are many SMP
applications that scale near linearly, and I think properly designed
routing can be one of them. If a linear SMP solution can be found
then there is at least one way to scale the routing infrastructure
to near infinite size simply for $$$'s.

Even if a modest 4-processor design using the latest technology
could only yield a one-time 50x speed-up, that would give us 5+ years
(again, doubling every year) of not worrying about that end of it
to work on new protocols and the like. Heck, if you believe the
predictions, in 5 years we'll be out of address space and ASNs
anyway, so a 'one shot' fix might take us to the end of the IPv4
days.
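The "3+ years for 10x, 5+ years for 50x" arithmetic above is just a base-2
logarithm; a tiny sketch (my framing, not the poster's code) shows the
conversion under the same doubling-every-year assumption:

```python
import math

# A one-shot speed-up of S buys log2(S) doubling periods of headroom.

def years_bought(speedup, doubling_period_years=1.0):
    """Years of headroom a one-shot speed-up buys, assuming required
    capacity doubles every `doubling_period_years` (the poster's premise)."""
    return math.log2(speedup) * doubling_period_years

print(round(years_bought(10), 1))  # the single-CPU "10x" case -> 3.3
print(round(years_bought(50), 1))  # the 4-way SMP "50x" case -> 5.6
```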

The Internet *FAILED* in 1994.
The Internet *FAILED* again in 1996

indeed. and the lessons you point out are horrifyingly germane.

As Randy (with his IETF hat) says: "send code".

i try to credit vince perriello for that one, though few here would
know him.

randy

If the size or the dynamism of the global routing system grows
for a sustained period faster than the price/performance curve of
EITHER memory OR processing power, the Internet will FAIL again.

This is great FUD.

no. this is the wisdom of folk who lived through it and actually did
it, not those who blow clueless smoke on mailing lists about it.

<plonk>!

randy

Look back. Routers have always lagged _WAY_ behind most other
computer technology in terms of processor power, RAM, and general
purpose I/O. In fact, I would go so far as to venture that they
are falling further behind, that is, not keeping up with Moore's
Law even when it holds.

Gawd. This is so true. Back in the moldy old days when DECnet was
far more common, Digital brought out a dedicated router. VAXen
were the common systems and host based routing of DECnet was very
common. What hardware platform did they use? PDP 11/24. Early
ciscos (pre AGS/AGS+/MGS/CGS/Trouter/IGS) were based on SUN hardware.
Once cisco "customized" their Motorola-based platform the state-of-
the-art slowed dramatically. SUN went the way of Sparc and cisco
hung out with CISC/680x0 chips until the life cycle of the 680x0
was nearly dead.

Fast forward 15 years. The ability of router manufacturers to
keep up with general purpose hardware is not any better.

The manufacturers appear to posture this as "we want to use
mature, stable technology." The market has changed, though,
and many of the components are a commodity. If it weren't for
the vast amount of custom code required to support their custom
hardware (ASICs and slave processors), we may be in a different situation.

The time is ripe for a hardware-abstraction software router. The
Linux Router project seems to be the closest and best thing going
in those terms. Cisco (and others) resemble Apple in how they
control the hardware platform. A company with the Microsoft
approach of "we make only the OS" could have thrived, but would
have been dependent upon manufacturers. Now the Linux Router
project comes along and we don't have to concede to a proprietary
O.S. but we are limited to PCI bus (and slower) support. For now.

> Look back. Routers have always lagged _WAY_ behind most other
> computer technology in terms of processor power, RAM, and general

[...]

Gawd. This is so true. Back in the moldy old days when DECnet was
far more common, Digital brought out a dedicated router. VAXen
were the common systems and host based routing of DECnet was very
common. What hardware platform did they use? PDP 11/24. Early
ciscos (pre AGS/AGS+/MGS/CGS/Trouter/IGS) were based on SUN hardware.
Once cisco "customized" their Motorola-based platform the state-of-
the-art slowed dramatically. SUN went the way of Sparc and cisco
hung out with CISC/680x0 chips until the life cycle of the 680x0
was nearly dead.

It's even worse than that: as far as I know, they never used the 68060 or
even 68040 CPUs. These puppies are a LOT faster than a 68030.

[ On Wednesday, August 29, 2001 at 10:06:48 (-0400), Leo Bicknell wrote: ]

Subject: Re: What is the limit? (was RE: multi-homing fixes)

Don't even get me started on the discussion of why they were custom
designing a board for the route processor, when there are off the
shelf motherboards, or if it must fit in a form factor, motherboard
designs that would be less costly for them, use all off the shelf
parts, and would allow them to bring things to market quicker.

Take a look inside a Juniper router -- you should be pleasantly
surprised by the very standard CompactPCI processor board you'll find
inside of it that, among a few other things, does the route
processing.... and it's running mostly stock FreeBSD no less.... even
with a root prompt you can get at!

There was no reason for those past failures. It was a combination
of bean counters, cluelessness, and laziness.

Interestingly some of the Juniper engineers are ex-Cisco from what I know....

The routing table
is growing slower than the number of lines of code in Windows, and
God help us, if we can make Windows "work" we should be able to do
some simple routing.

Ain't that the truth!

Iljitsch van Beijnum wrote:

It's even worse than that: as far as I know, they never used the 68060 or
even 68040 CPUs. These puppies are a LOT faster than a 68030.

They did provide a path using the 68040. "CSC/4" in cisco parlance.
After that they piddled around with the MIPS chip. The 4500 was
the first unit to have it and then new CPUs started showing up for
the 7000 series.

You're right though: the 68060 would have prolonged the working life
of a number of these units. I have some still in storage. They would
be good for the Smithsonian in a couple years.

-John

Since several people have brought it up, I work with Junipers on
a daily basis. They have done some good things, and seem to have
an upper hand in routing performance when compared to Cisco's.
That said, I don't think they are immune to my complaints, and in
fact since they are using a lot of standard parts you'd think they
might be doing lots more than they are...

CSC4 for the AGS used a 68040, didn't it?

-Dan

CSC/4 was a 25MHz 68040/16M
CSC/3 was a 30MHz 68030/2M? or 4M? -- whatever it was, it
could only hold a very small BGP table :-)
CSC/2 was a 33MHz 68020
CSC/1 I have no idea
IGS was a 16MHz 68020
STS-10x was a 68010... :-) You had to be careful not to flood its
memory with too large a RIP table...

The 2000, 3000, 4000 were also 68030 and the 7000 a 68040.

And many people still have numerous 25xx's still in service, those
are 68030's too... (and the 2511 still makes a really good console
server :-)).

The real revolution for the AGS+ was the flash cards. After requesting
IOS upgrades for half a dozen or more AGS+'s every month under
maintenance, Cisco sent us a stack of flash cards for free to stop us
requesting IOS updates.

That's the "definitely not compact" flash boards...

David.

Thus, the routing system FAILED twice: once because of memory, once
because of processor power.

I think this says a lot more about what we expect now versus what we
expected then.

1994: You could still run a usenet server with less than 10 gigs of
storage... 14.4 modems still in use... MCI and Sprint were still
wondering if this expensive toy would go anywhere. In short, the first
failure was a catastrophe for a small number of people.

1996: 28.8 modem, and I recall UUNet sucking and sucking and sucking in
the little market I call home, NYC.

In short, we expect better performance now, and if either of these
problems were to resurface tomorrow, most of the people running "real"
networks would find a way to work around both problems. Too many
prefixes? I know Randy knows a way to fix that. You're running 7000's in
your core? Shame on you. The internet is different now, there's no
comparison to what it was in '94 or '96. Those episodes were more
embarrassments than anything else. Would UUNet or ATT or Genuity think
twice about upgrading routers if there was impending danger of a lack of
memory or cpu?

Can a small ISP in Florida still clobber the whole internet? Learning
experiences, that's all...

C

I think this says a lot more about what we expect now versus what we
expected then.

but that's what we thought then. such is life. it's just like the next
generation's music: the old generation thinks it sucks.

ask non-filtering peers of 3561 how well they performed when hit with the
15k route flap on 18 jan.

randy

I'm betting yes. Cluelessness appears to be
growing faster than the routing table.

Given about once a year we have a small
(or not so small) ISP somewhere do it, I don't see
why the situation should change. Of course, they'll
have to pick a different and more original flavour
of misconfiguration / stupidity / vendor bug this
time.

Speaking of filtering:

AS64602 at Mae-West from Digex (AS2548) ASPATH=2548 64602 64602 64602
64602 64602 64602 IGP

AS65515 at Mae-West from Verio (AS2914) ASPATH=2914 10910 10910 10910
10910 10910 10910 11908 10530 5593 65515 IGP

AS65515 at AADS from Verio (AS2914) ASPATH=2914 10910 10910 10910 10910
10910 10910 11908 10530 5593 65515 IGP

AS65515 at Paix from Verio (AS2914) ASPATH=2914 10910 10910 10910 10910
10910 10910 11908 10530 5593 65515 IGP

Which part of (64512 - 65535 != valid ASN) do providers not understand?

Filter! Filter! Filter!
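The check being urged here is mechanical. A minimal sketch (my addition,
not from the thread; function names are illustrative) of flagging private
ASNs in a received AS path:

```python
# Private ASNs (64512-65535) should never appear in an AS path learned
# over eBGP; flag any announcement that carries one.

PRIVATE_ASNS = range(64512, 65536)

def private_asns_in_path(as_path):
    """Return the private ASNs found in a space-separated AS path string."""
    return [int(tok) for tok in as_path.split()
            if tok.isdigit() and int(tok) in PRIVATE_ASNS]

# One of the leaked paths quoted above:
path = "2914 10910 10910 10910 10910 10910 10910 11908 10530 5593 65515"
print(private_asns_in_path(path))  # -> [65515]
```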

And your favourite router vendors have a useful command/knob to remove private ASes from announcements too. It's not hard to put that on all eBGP peerings...

philip

I remember an amusing incident with another major provider who
leaked my employer 50k prefixes sometime last year. Our routers,
with lots of ram shugged, all the AS paths were longer, it didn't
even really affect routing. Meanwhile the leaking peer crashed
and burned on 7513's with 128M RAM, we watched their links bounce,
and traffic drop from large levels to nearly 0 before they got it
under control, 4-6 hours later.

At the time we had just completed upgrading most of the routers
from 128M to 256M, which is what saved us...and we had the proof
that a few thousand dollars worth of RAM could save hours of downtime
for hundreds (if not thousands) of customers.