NPE-G2 vs. Sup720-3BXL

We're stuck in an engineering pickle, so some experience from this
crew would be useful in tie-breaking...

We operate a business-grade FTTx ISP with ~75 customers and 800Mbps of
Internet traffic, currently using 6509/Sup2s for core routing and port
aggregation. The MSFC2s are under stress from 3x full route feeds,
pared down to ~85% of a full table to fit the TCAM. One system has a
FlexWAN with an OC3 card and it's crushing the CPU on the MSFC2.
System tuning (stable IOS and esp. disabling SPD) helped a lot, but
the boxes still don't have the power to pull through. Hardware
upgrades are needed...
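
(For the curious, the paring is just inbound filtering on the transit
sessions -- roughly this kind of thing; the list name, ASN, neighbor
address, and /21 cutoff are all illustrative:)

  ! drop /22-and-longer routes to shrink the table
  ip prefix-list SHRINK-V4 seq 10 permit 0.0.0.0/0 le 21
  !
  router bgp 64496
   neighbor 192.0.2.1 prefix-list SHRINK-V4 in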

We need true full routes and more CPU horsepower for crunching BGP
(+12 smaller peers + ISIS). OC3 interfaces are going to be mandatory,
one each at two locations. Oh yeah, we're still a larger startup
without endless pockets. Power, rack space, and SmartNet are not
concerns at any location (on-site cold spares). We may need an
upstream OC12 in the future but that's a ways out and not a concern
here.

Our engineering team has settled on three $20k/node options:
- Sup720-3BXLs with PS and fan upgrades
- Sup2s as switches + ISIS + statics and no BGP, push BGP edge routing
  off to NPE-G2s across a 2-3Gbps port-channel (split sketched after
  this list)
- Sup2s as switches + ISIS + statics and no BGP, push BGP edge routing
  off to a 12008 with E3 engines across a 2-3Gbps port-channel.
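
(For options 2/3, the split would look roughly like this -- addresses,
ASNs, and the ISIS NET are illustrative; the Sup2s keep the IGP plus a
static default toward the BGP box, and the full tables live on the
NPE-G2/GSR:)

  ! Sup2/MSFC2 side: IGP only, default toward the edge router
  router isis
   net 49.0001.0000.0000.0001.00
  ! (plus "ip router isis" on the core-facing interfaces)
  ip route 0.0.0.0 0.0.0.0 192.0.2.2
  !
  ! NPE-G2 / 12008 side: BGP with the transit feeds
  router bgp 64496
   neighbor 192.0.2.254 remote-as 64511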

Ideas and constructive opinions welcome, especially software and
stability-related.

Many thanks,
-Dave

dstorandt@teljet.com (David Storandt) wrote:

Our engineering team has settled on three $20k/node options:
- Sup720-3BXLs with PS and fan upgrades

Still quite slow CPU-wise. The RSPs are supposed to be a lot faster
and actually usable.

- Sup2s as switches + ISIS + statics and no BGP, push BGP edge routing
off to NPE-G2s across a 2-3Gbps port-channel

The NPE-G2 - even an NPE-G1 - will do all that BGP stuff easily;
the CPU is fast enough. But... you might be in for a bad surprise
concerning the port-channel.

Remember - it's done in software. So, depending on your packet
sizes, you might experience a throughput _drop_ once you bundle.
My experiments were done with very small packets though (DNS
queries and responses, avg. packet size around 140 bytes).
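
(The bundle itself is just an ordinary Port-channel config, nothing
platform-specific -- interface numbering illustrative:)

  interface Port-channel1
   ip address 192.0.2.9 255.255.255.252
  !
  interface GigabitEthernet0/1
   channel-group 1
  !
  interface GigabitEthernet0/2
   channel-group 1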

The devices I tested were the 1RU models (7301 for NPE-G1 and
7201 for NPE-G2). In "unbundled mode" they pushed around 940 kpps
(G1) and 1320 kpps (G2) with CPU loads between 85% and 100%.

Channel bundling took a lot out of the boxes. The 7301 keeled over
at 470 kpps, the 7201 at 660 kpps.

If you're only pushing big packets, though...

Yours,
  Elmar.

We've never pushed an NPE-G2 to 800Mb/s before, but I would think it
would topple over... hopefully someone on here can confirm my
suspicion?

Moving the BGP to the 12008s with PRP-2 processors would be my choice
if the budget fits... we're faced with a similar upgrade next year,
possibly moving BGP from a pair of 7606s (Sup720-3BXL) over to a pair
of GSRs running PRP-2s, I think - the BGP processing etc. is pushing
the CPUs too high on the 7600s...

Someone else might suggest the RSP720s, but I haven't had them in a
production environment yet... we had PRP-2s running in a 12012 for a
while and found them rock solid even with older line cards etc...

Hope this helps a bit... ;)

Paul

We're running several 65xx Sup720-3BXLs with 3 full transit views and
some 40-odd peers. We use two NPE-G1s as route reflectors and for some
policy manipulation. We also run MPLS in the core to allow for traffic
engineering and EoMPLS between certain services hosted at different
locations.

We're pushing between 800M and 1G at peak times (mostly outbound) with
this setup, and peak CPU on the 3BXLs runs at maybe 30% -- the
average, though, is around 8 to 10%.

Hope this helps....

side-note:

I'm actually more worried about the distribution layer at the moment,
as it relies heavily on STP and HSRP/GLBP for the various vlans and
access layer gunk. Currently these are 720-3Bs (non-XL), but I'm
looking eventually to build a business case to upgrade them to VSS1440
to simplify the architecture, as well as to provide better resilience
and eliminate the STP/HSRP/GLBP limitations between the dist and
access layers. Problem is the budget side of that... blargh!
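
(The per-vlan gunk being the usual first-hop dance on every SVI,
roughly like this -- addresses and group numbers illustrative -- which
is exactly what VSS would collapse into a single logical box:)

  interface Vlan100
   ip address 192.0.2.2 255.255.255.0
   standby 100 ip 192.0.2.1
   standby 100 priority 110
   standby 100 preempt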

Ideally I'd like to go more for the Nexus platform for this part of the
network, given that we're doing a lot of virtualisation etc., but the
downsides with that are primarily the COST of the bloody things, and
secondly the fact that they don't currently support MPLS (planned for
later this year, apparently).

Leland

We ran into a similar quandary and have about the same amount of traffic as your network. When purchasing gear a year ago we decided against 7200s with an NPE-G2 as insufficient for the load. Have you looked at the 7304?

The Cisco 7304 with an NSE-150 processing engine on it offloads a lot of the packet processing to dedicated hardware, and doesn't have TCAM limitations for routes. You can hold several full feeds and do the amount of traffic you're talking about without breaking a sweat.

http://www.cisco.com/en/US/prod/collateral/routers/ps352/prod_bulletin0900aecd8060aac5.html

It is capable of supporting both legacy port adapters (from your FlexWAN or 7200 routers) and SPA cards with the right add-in modules, which IIRC are only a few hundred dollars.

I'd be glad to answer any questions you have about our implementation.

--am

David Storandt wrote:

The Cisco 7304 may not be adequate for a service provider.
Its CPU and I/O controller are tied together, which doesn't provide
much benefit.

The Cisco 7200/7300 is pretty much an enterprise solution, and doesn't
support distributed CEF.

If you are considering the Sup720-3BXL, why not consider the
RSP720-3CXL?

Alex

Aaron Millisor wrote:

I would love to use the RSP720-3CXL, but cost and the PA OC3 are the
difficulties.

If the RSP720s will run in a 6500 chassis, great! We wouldn't have to
purchase new chassis or take the increased downtime for the swap-out.

The RSP720 doesn't support the older bus-only FlexWAN we're using with
the OC3 PA either, so we'd have to figure out a solution for that -
SIPs, Enhanced FlexWANs, or external routers. Bah.

...the RSP720s + chassis + OC3 solution more than doubles our
$20k/node budget, so that's a much tougher sell internally.

-Dave

I have used the 7304 in the past and was happy with it. In fact I still have a 6-port DS3 module for a 7304 which I need to find a home for, if anyone has the need.

The 7304 originally had its own specific modules that went into it, but they also sell carrier cards for it so you can use standard PAs, as well as SPAs, which is nice. The overall footprint is rather compact, and I used to use those 6-port DS3 cards, which allowed for hefty DS3 termination.

Brian

David,

My first advice would be to also look at the other
features/capabilities you require, and not just at "feeds and speeds".

Some examples of functionality could be:
- QoS
- NetFlow
- DDoS resistance (see the CoPP sketch after this list)
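
On the DDoS point, what hardware CoPP buys you is roughly this kind of
MQC policy, enforced before traffic ever reaches the CPU (class names,
the referenced ACL, and rates are purely illustrative):

  class-map match-any COPP-MGMT
  ! (matching an ACL for e.g. SSH/SNMP from your NOC, not shown)
   match access-group name MGMT-TRAFFIC
  !
  policy-map COPP
   class COPP-MGMT
    police 512000 conform-action transmit exceed-action drop
   class class-default
    police 1000000 conform-action transmit exceed-action drop
  !
  control-plane
   service-policy input COPP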

In general the 6500 and the 12000 are hardware-based platforms, with
the 12000 being more distributed in nature, using linecard resources
for the data plane (the 6500 does this too if you have DFCs
installed). The 7200 is a CPU/software-based platform, so the same
processor does packet forwarding and control-plane processing.

The 6500 (depending on specific module selection) is more restricted
in QoS and NetFlow functionality, as it is designed to do very fast
forwarding at a relatively cheap price.
The 12000 has everything implemented in hardware, and depending on the
engine type (don't use anything other than Eng 3 or 5) has all the
support you may dream of for things like QoS and other features.
The 7200 is a software-based router, which means it supports any
feature you may ever dream of, but scalability decreases as you turn
features on.

Another option you should seriously consider is the ASR1000 router,
which is a newer platform with a new architecture. All its features
are hardware-supported, and it could actually prove the best choice
for what you need.
The ASR1002 comes with 4 integrated 1GE ports, which could be all that
you would ever need (and it still has quite a few expansion slots
left).

Arie

You may also take a look at the Cisco ASR1000 line... Supposedly a
middle step between 7200 and 7600 router sizing...

Yeah, as long as you're using the NSE-150 and are using features supported by the PXF such that it's not punting to the RP, the performance is really good.

--am

Brian Feeny wrote:

David Storandt wrote:

[original post snipped]

Have a look at the ASR1002 + ESP5/10G

Stable for BGP+ISIS as far as our experience goes.

adam.

ASR1002 + ESP5 was great for OSPF + BGP. 450M+ of traffic for me at
peak (proc at 1-2%).

Steve Dalberg wrote:

Some things to remember about the MSFC2s when designing a deterministic
network:

Without the Switch Fabric Module, the 6509 only has a 32 Gbps
contention-based bus as a backplane. Also, I believe only "classic"
line cards work without the Switch Fabric Module. "Classic" line cards
share hardware port buffers between adjacent ports (groups of 8 ports
for copper cards), such that when one port runs at sustained wire
speed, the shared buffer capacity can be exhausted, causing drops on
the other ports. The Sup720 has the 720 Gbps switch fabric integrated
on the supervisor engine, freeing up slots 5/6, where the Switch
Fabric Modules go when you're running an MSFC2.

Julio Arruda wrote:

Steve Dalberg wrote:

David Storandt wrote:

[original post snipped]

Have a look at the ASR1002 + ESP5/10G

Stable for BGP+ISIS as far as our experience goes.

adam.

ASR1002 + ESP5 was great for OSPF + BGP. 450M+ of traffic for me at
peak (proc at 1-2%).

Any experience with how much more resilient the ASR is compared with
the 7600/6500, DDoS-wise? :)
And compared with the NPE-G2?
And in terms of CoPP etc.?

The ASR's QuantumFlow processors scale quite unpredictably depending
upon the features in use, apparently, so it's difficult to say.

I'm expecting 5-7Gbps on the ESP10 with my usage (no complex features
in use, just forwarding and NetFlow), though I've little data to base
that on. (The ESP on one device currently reports 2-3% usage at
~200Mbit.) It'll handle a DDoS much, much, much better than a
7201/NPE-G1, but much, much, much worse than a 6500/7600 (even without
DFCs).

We use several ASRs, one at each entry point to the network (each
transit provider / peering exchange), to spread a potential DDoS
across a lot of processors; that approach is working well for us at
the moment.

Our only real issue is that the Netflow implementation on the ASRs seems to be a little 'sensitive' to configuration changes and sometimes just stops exporting flows.
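
(For anyone comparing notes, the monitors are bog-standard Flexible
NetFlow, roughly as below -- names, collector address, and port are
illustrative:)

  flow exporter EXP1
   destination 192.0.2.50
   transport udp 2055
  !
  flow monitor MON1
   exporter EXP1
   record netflow ipv4 original-input
  !
  interface GigabitEthernet0/0/0
   ip flow monitor MON1 input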

adam.