Plethora of UUnet outages and instabilities

Has anyone else been seeing the almost daily UUnet outages recently?
Does anyone know what the true causes of these have been?

Perhaps this has something to do with it:

Glitch slows UUNET to crawl

June 23, 1997
Network World

If the Internet felt slower than usual last Monday, you were
probably a victim of the brownout on UUNET Technologies' network
that slowed services in the Northeast.

No one is disputing that there was a brownout. But UUNET and Cisco
Systems, Inc. - which sold the Internet service provider its routers
- are at odds over how it happened.

The problem, both sides agree, was a router memory failure. After
that, the stories diverge. According to UUNET, the ISP's hubs in
Washington, D.C., and Newark, N.J., dropped packets because of a
software glitch on its Cisco routers. The hubs are like
points-of-presence where UUNET users' traffic is routed to and from
the Internet.

The routers suffered what the ISP is calling a ``memory leakage.''
Typically, after a router uses a block of memory, it puts the block
back into a pool of usable memory. But that is not what happened on
UUNET's routers last week. Instead of the memory being usable, it
was fragmented, said Alan Taffel, vice president of marketing at
UUNET.

Large clusters of data that came through the routers could not be
handled by the fragmented memory blocks. Once the routers ran out of
memory, they automatically reset.
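
As a rough illustration of the fragmentation described above, here is a toy sketch in Python. It is not Cisco's allocator: the 16-cell pool, the two-cell allocations, and the helper names free_runs and can_allocate are all invented for the example. It only shows how a pool can have plenty of free memory in total and still be unable to satisfy one large request.

```python
# Toy model of a fixed-size memory pool. Purely illustrative: the pool size,
# block sizes, and allocation pattern are assumptions, not UUNET/Cisco figures.

def free_runs(pool):
    """Return the length of each contiguous run of free (None) cells."""
    runs, current = [], 0
    for cell in pool:
        if cell is None:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def can_allocate(pool, size):
    """A request succeeds only if a single contiguous free run is big enough."""
    return any(run >= size for run in free_runs(pool))


# Fill a 16-cell pool with eight 2-cell allocations, tagged 0..7.
pool = [i // 2 for i in range(16)]

# Release every other allocation. Half the pool is now free,
# but the free cells are scattered between blocks still in use.
pool = [None if tag % 2 == 0 else tag for tag in pool]

total_free = sum(free_runs(pool))    # cells free overall
largest_run = max(free_runs(pool))   # biggest contiguous free block

print("total free cells:", total_free)
print("largest contiguous free run:", largest_run)
print("can allocate 2 cells:", can_allocate(pool, 2))
print("can allocate 4 cells:", can_allocate(pool, 4))
```

In this sketch half the pool is free, yet a 4-cell request fails because no single free run is longer than 2 cells. That is the same shape of problem the article describes for large clusters of data.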

UUNET notified Cisco, which sent over a software patch. The software
Band-Aid freed up more memory in the routers' cache so it would not
overload, said Bob Michelet, director of corporate relations at
Cisco. The patch is a temporary measure meant to be used until UUNET
ups the memory on its routers, he said.

UUNET started upgrading the memory on the routers, which Cisco
contends was the real culprit. Cisco recommends that its ISP
customers use 128M bytes of total memory on their route switch
processor boards, Michelet said. UUNET uses 64M-byte memory boards
on most, if not all, of its 7,500 routers.

UUNET said it never received a recommendation from Cisco concerning
memory for its 7,500 routers. ``But clearly, once this event
occurred, we discussed the memory issue with Cisco, and we agreed
the right course of action would be to upgrade the routing memory to
128M bytes,'' said Jim McManus, vice president of systems
engineering at UUNET.

Almost every large to midsize ISP uses some Cisco equipment, so
should users be concerned? That depends on who answers the question.
Cisco claimed UUNET was the only ISP using 64M-byte route switch
processor boards.

One ISP agrees. Genuity, Inc., Bechtel Enterprises' ISP subsidiary,
said it would not run a router with less than 128M bytes of memory.
Genuity uses Cisco 7513 routers with fully loaded memory because the
routers have to store the routing table for the entire Internet,
said Rodney Joffe, chief technology officer at Genuity.

Copyright 1997, Network World

At 09:49 AM 7/1/97 +0200, Hank Nussbacher forwarded from Network World:

UUNET started upgrading the memory on the routers, which Cisco
contends was the real culprit. Cisco recommends that its ISP
customers use 128M bytes of total memory on their route switch
processor boards, Michelet said. UUNET uses 64M-byte memory boards
on most, if not all, of its 7,500 routers.

UUNET said it never received a recommendation from Cisco concerning
memory for its 7,500 routers. ``But clearly, once this event
occurred, we discussed the memory issue with Cisco, and we agreed
the right course of action would be to upgrade the routing memory to
128M bytes,'' said Jim McManus, vice president of systems
engineering at UUNET.

Ouch! A couple of questions:

1) Is the "7500" the actual number of routers they'll have to upgrade, or
are they referring to the Cisco 7500 product line? That's an awful lot of
routers to upgrade, so UUnet could have problems for a while.

2) What could have caused the memory requirements to jump so dramatically?
And if it's due to the routing table "for the whole Internet", why weren't
others affected?

  Brian