RE: GSR, 7600, Juniper M?, oh my!

From: Alex Rubenstein [mailto:alex@nac.net]

On the 7500, you have RSPs and VIPs; the former performing routing
protocol work, vty's, RIB's, etc., the latter doing the actual packet
forwarding.

While this sounds great on paper, our experience has us shying away from
dCEF and looking for something bigger and better... dCEF pushed the RSP
processor down to about 5%, but pushed up the VIP processors to about
90-95%...

VIP-Slot0>sh proc c
CPU utilization for five seconds: 13%/12%; one minute: 14%; five minutes: 15%

I wish we could get our routers to do this...

Obviously, we run dCEF, which puts the VIP's in the position of forwarding
everything on their own, as evidenced by the CPU measurements.

But each VIP is responsible for its own traffic, so if a particular VIP
carries most of the traffic, it has much higher CPU usage... In our case,
we have a router loaded with VIP 4-50's and Enhanced ATM OC-3
adapters... Originally, we had a single OC-3 running about 120-130 Mbps
constant and the VIP CPU was at 90-95%.... To combat this, we had to
put in additional OC-3 cards with additional VIPs and distribute the
load... Still, high CPU is a problem... For instance:

CPU utilization for five seconds: 63%/63%; one minute: 63%; five
minutes: 65%
  30 second input rate 78227000 bits/sec, 17858 packets/sec
  30 second output rate 47944000 bits/sec, 12778 packets/sec

It seems to me that we should be able to do sooo much better... *sigh*
OC-12 adapters are an option, but they are rather expensive ...
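
For a rough sense of scale, the average packet size can be worked out from
the interface rates quoted above. This is only a back-of-the-envelope
sketch: the function name is made up for illustration, and it assumes the
bit and packet counters cover the same 30 second window.

```python
def avg_packet_bytes(bits_per_sec, packets_per_sec):
    """Average packet size in bytes implied by a bit rate and a packet rate."""
    return bits_per_sec / packets_per_sec / 8


# Counters copied from the 30 second input/output rates quoted above.
inbound = avg_packet_bytes(78227000, 17858)
outbound = avg_packet_bytes(47944000, 12778)
total_pps = 17858 + 12778

print(f"in:  {inbound:.0f} bytes/packet")
print(f"out: {outbound:.0f} bytes/packet")
print(f"combined: {total_pps} packets/sec")
```

Software forwarding cost generally tracks packets per second more than bits
per second, so the packet rate is probably the more useful number to compare
against the CPU figure.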

However, to answer your question, even a modestly configured 7507 with
RSP4, and VIP2-50's will be substantially more capable than a 7206-NPE300.

Things may change on the NPE-400 or G1, but I have no direct experience
with that.

The G1 processors, so far, have proven to be wonderful... We only have
experience with them running in the 7200 uBR chassis, but they've shown
a huge reduction in CPU utilization...

PS. Regarding stability; we have seen SUBSTANTIAL improvements in IOS
stability, especially in 12.3.5a mainline.

Heh.. *old* Cisco code scares me enough... Bleeding edge is simply
terrifying... *sigh*

-- Alex Rubenstein, AR97, K2AHR, alex@nac.net, latency, Al Reuben --
-- Net Access Corporation, 800-NET-ME-36, http://www.nac.net --

Jason Frisvold
Backbone Engineering Supervisor
Penteledata

hello!

The G1 processors, so far, have proven to be wonderful... We only have
experience with them running in the 7200 uBR chassis, but they've shown
a huge reduction in CPU utilization...

What is a huge reduction for you? We upgraded from NPE-400 to NPE-G1 on a uBR7200 and processor usage decreased 20-30%. We are pushing about 100 Mbps of traffic from GigE to cable and about 20-30 Mbps from cable to GigE.

Tarko,

What was your CPU utilization prior to the upgrade?

hello!

What was your CPU utilization prior to the upgrade?

Like always before the upgrade - 95% :)

Currently on the NPE-G1 it's 80% at peak times, with the traffic numbers I mentioned before and 4500 online modems, 3000 CPEs.

If only dCEF wasn't phuqed in so many versions of the IOS..... life would
be wonderful. We had to turn dCEF off and just run plain old ip cef on
our 7513 under 12.2.19a.... The RSP4 CPU spikes up to 80% then back down
then UP and down... weird.

Jason Frisvold wrote:

CPU utilization on a software-based router is not linear. Said a
different way: even when the CPU hits 100%, the router can still forward
significantly more traffic.

/Jesper

That would depend what is causing the CPU usage. If it is software based
IP header lookups, you're not going to get any more performance out of it
by trying to do more lookups than your CPU can handle. If on the other
hand the CPU usage is interrupt load, then yes forwarding rates could
continue to go up even after the CPU hits 100% (assuming the priorities
aren't such that you kill the rest of the box, routing protocols etc), at
the expense of latency.

Surprisingly, that's usually not true. The cost of the IP header lookup is
generally much less than the cost of a task switch. So when the system is at
100%, it's probably doing this:

IP header lookup, wait for work / task switch, IP header lookup, wait for
work / task switch, repeat

  As the load increases, it will start doing two IP header lookups before
each task switch or yield. Then three. Then four.

  DS
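
To make the batching argument concrete, here is a toy cost model, not a
measurement of any real router. It assumes the CPU is always busy, that each
header lookup costs one time unit, and that each task switch costs ten; both
costs and the function name are invented purely for illustration.

```python
def packets_per_time_unit(batch_size, lookup_cost=1.0, switch_cost=10.0):
    """Throughput of a fully busy CPU doing `batch_size` lookups per task switch.

    The costs are hypothetical units, chosen only so that a task switch
    is much more expensive than a single lookup.
    """
    busy_time = batch_size * lookup_cost + switch_cost
    return batch_size / busy_time


# The CPU is 100% busy in every row; only the work done per switch changes.
for batch in (1, 2, 3, 4, 16):
    print(batch, round(packets_per_time_unit(batch), 3))
```

Under these assumptions throughput keeps rising with the batch size even
though utilization is pinned at 100% the whole time, because the fixed
task-switch cost is spread over more lookups.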