Cisco ASR9010 vs Juniper MX960

I would like opinions of the differences between these two platforms if
possible.

I was going to buy a used Juniper MX960 Router MX960-PREMIUM2-AC-ECM with
2 x RE-S-1800X4-16G and 3 x SCBE-MX-S. Then I was going to load this up
with a couple of older DPCE-R-4XGE-XFP 4x10GE DPC Enhanced cards.

Now Cisco has offered me a new ASR9010 with dual ASR9K Route Switch
Processor with 440G/slot Fabric and 6GB, and two 4X10GE / 16X1G Combo
Linecard, Packet Transport Optimized for about the same price as the used
Juniper. The only catch is that Cisco's support and warranty look
very expensive per year, but that's hard to compare since a used Juniper
has zero support and warranty included.

If these were both brand new with support and warranty which would you
choose? If it were the used Juniper vs new Cisco which would you choose?

I know Juniper makes newer MIC cards that probably better compete with
these Cisco cards, but that is not an option due to price.

New, Juniper wants to sell me a MX104 for the same price that I can get
this Cisco ASR9010. I think that is a no brainer to go with the ASR at that
point. I asked for new pricing on a MX240/480/960, but that was not even
close to the ASR9010 numbers.

I can also buy two ASR 9001's for the same price as the single ASR9010.

We have both and they’re both great boxes; however, it’s sort of embarrassing that the ASR9k still can’t do virtualized routing, i.e. logical-systems. Not sure if that’s a deal breaker for you, but just thought you’d like to be aware. We also find OS configuration on the Juniper much easier than the cumbersome XR OS that the Cisco runs. The 9k does however get a huge win with the ability to apply a ‘pie’ or software patch while staying in service vs requiring a reload. Either way, I don’t think you’ll go wrong.

J~

Jason Bothe, Manager of Networking
Rice University
                               o +1 713 348 5500
                               m +1 713 703 3552
              jason@rice.edu

With GRES, can't you simply set the master RE as backup, apply firmware,
then switch back to master and upgrade the backup RE?

We have run into issues with GRES, and I think it’s an issue with the RE we have. I don’t actually perform the tasks, so it may or may not be as big of an issue as I initially stated.

Jason Bothe, Manager of Networking
Rice University

Yeah, you might look into that. We're about to put 3 x MX960s in service
and with GRES and NSR we are not dropping traffic when taking the master RE
down.

Hey,

I would like opinions of the differences between these two platforms if
possible.

Summary, I think MX is better HW and SW right now.

Warning, rant incoming.

I liked the ASR9k a lot more before I needed to run it. On paper IOS-XR is
superior to JunOS: JunOS is old-fashioned, non-pre-emptive,
run-to-completion. In theory this is the most efficient way to run code,
but in practice it means the programmer needs to be hyper-aware of how long
any bit of code they are writing may execute. If they get it wrong
and don't yield manually, simple things like parsing a community list
while doing a commit may cause an IGP flap.
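
A minimal sketch of that failure mode, assuming a toy round-robin
scheduler with a virtual clock. Nothing here is JunOS code; hello_task,
commit_task and the tick costs are made up for illustration. A task runs
until it yields, so a commit that parses its whole list in one slice
starves the hello sender for that entire time.

```python
from collections import deque

def hello_task(log, clock):
    """Hypothetical IGP hello sender: records when it runs, costs 1 tick."""
    while True:
        log.append(clock[0])
        yield 1

def commit_task(items, chunk):
    """Hypothetical commit: parses `items` entries at 1 tick each,
    yielding to the scheduler only after every `chunk` entries."""
    done = 0
    while done < items:
        step = min(chunk, items - done)
        done += step
        yield step

def max_hello_gap(items, chunk):
    """Run both tasks cooperatively and return the longest gap (in ticks)
    between two consecutive hello runs while the commit is in progress."""
    clock = [0]
    log = []
    ready = deque([hello_task(log, clock), commit_task(items, chunk)])
    while len(ready) > 1:          # stop once the commit task has finished
        task = ready.popleft()
        try:
            cost = next(task)      # task runs to its next yield, no pre-emption
        except StopIteration:
            continue
        clock[0] += cost
        ready.append(task)
    return max(b - a for a, b in zip(log, log[1:]))

# A commit that never yields mid-way vs one that yields every 10 entries.
print(max_hello_gap(items=1000, chunk=1000))
print(max_hello_gap(items=1000, chunk=10))
```

If the adjacency dead interval were, say, 40 ticks, the first case would
flap the IGP and the second would not.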

IOS-XR, otoh, has multiple processes scheduled either by QNX or Linux,
which means the programmer does not need to be so careful; the OS can
pre-empt the process and run something more important.
However, with this distribution comes the problem of IPC, sharing state in
a fast and economical manner, and I believe IOS-XR has dropped the ball
here. I don't know if it's even possible to solve today; it is
probably a very hard problem. This is just speculation, but I feel
like Cisco underestimated the problem, and instead of rethinking the
infrastructure, they are duplicating state in an effort to keep
performance acceptable, as IPC cannot be made fast enough. All this
adds complexity, which adds bugs.

So in practice, I believe JunOS to be currently the better system. But
IOS-XR 6 may show some light at the end of the tunnel, unsure yet. (Isn't
this always the case: in two years' time, everything will be great.)

For hardware, ASR9k has the Trident and Typhoon generations, which are
Israeli EZchip (since acquired) NPUs, and now Tomahawk, which is a
completely different NPU. Juniper MX has DPCE and Trio; from a microcode
POV both have two generations, but you can't buy DPCE anymore, it's
very old, so all MX systems really are Trio only, which means JNPR
only needs to develop features once for a single NPU generation. Cisco
needs to do it twice, the operator needs to learn two platforms to
troubleshoot, and there is feature disparity in troubleshooting
commands.
I also believe that the Trio NPU is a better NPU than EZchip or the one in
Typhoon; they have, atypically, succeeded in doing all lookups (FIB and
ACL) in RLDRAM instead of TCAM, which is easier to pull off but more
expensive. Trio can do more in HW, like fragmentation, and can look deeper
into the packet. A lot of flexibility is exposed to the operator, like the
ability to write arbitrary firewall filters by checking specific bit positions.
For multicast ASR9k is better, as it can replicate in the fabric, whereas
in MX replication is done by the linecard, either binary or unary. But
this really isn't relevant unless you actually have a large volume of
multicast replicated to many ports.
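
To make the bit-position matching mentioned above concrete, here is a
rough sketch of what such a filter term boils down to: take a bit offset
and a bit length into the packet and compare the extracted value. This is
plain Python for illustration, not Trio microcode or JunOS filter syntax;
extract_bits and matches are hypothetical helper names.

```python
def extract_bits(packet: bytes, bit_offset: int, bit_length: int) -> int:
    """Return `bit_length` bits starting `bit_offset` bits into `packet`,
    counting from the most significant bit of the first byte."""
    total_bits = len(packet) * 8
    if bit_offset < 0 or bit_length <= 0 or bit_offset + bit_length > total_bits:
        raise ValueError("bit range outside packet")
    value = int.from_bytes(packet, "big")
    shift = total_bits - bit_offset - bit_length
    return (value >> shift) & ((1 << bit_length) - 1)

def matches(packet: bytes, bit_offset: int, bit_length: int, expected: int) -> bool:
    """True if the selected bit field equals `expected`."""
    return extract_bits(packet, bit_offset, bit_length) == expected

# A 20-byte IPv4 header: version/IHL 0x45, TTL 64, protocol 6 (TCP),
# source 10.0.0.1, destination 10.0.0.2. Checksum left at zero.
header = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 6, 0, 0,
                10, 0, 0, 1, 10, 0, 0, 2])

print(matches(header, bit_offset=0, bit_length=4, expected=4))   # IP version field
print(matches(header, bit_offset=72, bit_length=8, expected=6))  # protocol field
```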

For troubleshooting/instrumentation, for some things MX is better;
packet-via-dmem capture for all transit packets is a godsend. But
ASR9k has far more NPU counters for various drop/punt/limit
conditions, most of which can be captured (at the cost of stopping
forwarding for a moment). Most of the stuff in ASR9k is very new or just
coming, while MX has had sufficient instrumentation for years. The ASR9k
team is focusing on this and a lot of good stuff is in the pipeline, which
may make ASR9k instrumentation better in the long run.

IOS-XR does not have any guaranteed machine-parseable presentation of
data; in JunOS every command can be output as high-quality XML. In
IOS-XR this is rarely possible, and even when it is, there is no
strong relation to the CLI, and often the actual output is just a single
string blob, so using it is no better than screen-scraping. JunOS
inherently will have this XML, much like TimOS would inherently have
an SNMP presentation of data.
I don't imagine this being solved any time soon, because it's a very
fundamental infrastructure issue. What is our truth source? The truth
source should be a single presentation, out of which CLI, XML and YANG
are all extracted, so that there simply is no possibility of de-sync.
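
A small sketch of the single-truth-source idea, assuming one in-memory
record from which both the CLI text and the XML are rendered. The record
and element names are invented for illustration and are not any vendor's
schema. The point is only that both views derive from the same structure,
so a script can consume the XML instead of scraping the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; field names are illustrative only.
NEIGHBOR = {
    "peer-address": "192.0.2.1",
    "peer-as": "64500",
    "peer-state": "Established",
}

def render_xml(record):
    """Render the record as XML, one child element per field."""
    root = ET.Element("bgp-peer")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")

def render_cli(record):
    """Render the same record as human-readable text."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())

def parse_xml(text):
    """What an operator script does: parse fields by name, no regexes."""
    return {child.tag: child.text for child in ET.fromstring(text)}

print(render_cli(NEIGHBOR))
print(parse_xml(render_xml(NEIGHBOR))["peer-state"])
```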

A lot of the stuff Cisco wanted to solve from Classic IOS is actually
worse in IOS-XR. Software management is worse; yeah, you have SMUs, but
managing them is a nightmare and most of them require a reload or a routing
flap anyhow, so it does not really help you. I actually prefer
managing Classic IOS software to XR. Most of the time we need to
upgrade, we need to do it because HW isn't supported. JunOS has
figured this out correctly as well: by having a hardware abstraction
layer they can add 'JAM', or support for new hardware, in service
without changing the software.

For control-plane protection IOS-XR has a pretty solid idea in 'LPTS':
the platform should know what is to be punted and what not, so why not
automatically program ACLs and policers for that stuff. It works
somewhat well, better than JunOS out of the box. But for an operator who
knows what they are doing, JunOS can be protected much, much better.
'LPTS' only has a single policer for a specific traffic class, like
'BGP-known'; if this is violated, all BGP sessions suffer. Whereas JunOS has
multiple levels of policers: an aggregate policer, which is the same as
IOS-XR, but there are also 'subscriber' level (L4 keys), 'ifl' level
and 'ifd' level. So even if a single BGP neighbour floods you with tons of
frames, you can still have all other BGP sessions protected by having
the misbehaving BGP neighbour limited at its IFD, IFL or sub level.
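
The difference between a single aggregate policer and a hierarchy can be
sketched roughly as below. This is a toy model of one policing interval
with simple packet budgets, not how LPTS or the JunOS policers are
actually implemented; the police function, the limits and the source
names are assumptions for illustration.

```python
from collections import Counter

def police(packets, aggregate_limit, per_source_limit=None):
    """Return a Counter of accepted packets per source for one interval.

    packets: source identifiers in arrival order.
    per_source_limit: if set, each source is capped first (standing in for
    a lower-level policer), then survivors compete for the aggregate budget.
    """
    accepted = Counter()
    seen = Counter()
    total = 0
    for src in packets:
        seen[src] += 1
        if per_source_limit is not None and seen[src] > per_source_limit:
            continue  # dropped by the lower-level policer
        if total >= aggregate_limit:
            continue  # dropped by the aggregate policer
        accepted[src] += 1
        total += 1
    return accepted

# One misbehaving neighbour floods before a well-behaved one gets a turn.
arrivals = ["bad-peer"] * 1000 + ["good-peer"] * 10

aggregate_only = police(arrivals, aggregate_limit=100)
hierarchical = police(arrivals, aggregate_limit=100, per_source_limit=20)

print(aggregate_only["good-peer"])  # aggregate budget already eaten by the flood
print(hierarchical["good-peer"])    # flood capped early, good peer gets through
```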

If I could get Classic IOS with commit and RPL, I'd run that rather
than XR right now.

For MX you might want to ping your account team about the MX2008, which will
(IMHO) replace the MX960 RSN. The main advantage, on top of supporting newer
MPCs, is that you don't have a mid-plane; fabrics are connected to the LCs
directly, so you never need to upgrade the chassis to support higher-rate
SERDES in future.

Jason Bothe wrote:

The 9k does however get a huge win with the ability to apply a ‘pie’
or software patch while staying in service vs requiring a reload.

SMUs are often "hitless", which is to say, "hitless" with scare quotes.
What this means in practice is that the SMU itself might be hitless but
it will depend on 47 other SMUs, thereby almost guaranteeing some form
of reload. Also, restarting processes (e.g. bgpd, ospfd, etc.) or
shutting down interfaces counts as "hitless".

E.g.:

CSCuo47663: "Hitless/Optional SMU,aigp metric different in RIB & BGP
table". This will restart the bgp process.

CSCus26923: "traffic from SIP700 to 9000v is dropped when a link to
9000v flaps". Release notes state that the issue is not service
impacting, then "After the SMU installation , we need to apply
shut/noshut of the problematic interface to trigger the hardware
programming." Wuh??

In other words, "hitless" does not mean "not service impacting".
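
The dependency problem can be sketched as follows: the real impact of
installing a SMU is the worst impact across it and everything it
transitively depends on. The SMU names, the dependency map and the three
impact labels below are hypothetical, chosen only to illustrate the
reasoning; they are not taken from any Cisco tooling.

```python
def install_impact(smu, depends_on, impact):
    """Worst impact over `smu` and all of its transitive prerequisites."""
    order = {"hitless": 0, "process-restart": 1, "reload": 2}
    seen = set()
    stack = [smu]
    worst = "hitless"
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if order[impact[current]] > order[worst]:
            worst = impact[current]
        stack.extend(depends_on.get(current, ()))
    return worst

# Hypothetical SMU set: "smu-a" is labelled hitless but pulls in a
# prerequisite chain that ends in a reload SMU.
depends_on = {"smu-a": ["smu-b"], "smu-b": ["smu-c"], "smu-d": []}
impact = {
    "smu-a": "hitless",
    "smu-b": "process-restart",
    "smu-c": "reload",
    "smu-d": "hitless",
}

print(install_impact("smu-a", depends_on, impact))
print(install_impact("smu-d", depends_on, impact))
```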

Nick

I would assume any SMU impacts traffic and requires a reboot or a line card reset. There are types of SMUs that touch low level parts and require a reboot, in which case I’ve often told Cisco they should just rev the release number.

Solving SMU dependencies is sometimes impossible. Right now the 5.3.3 SMU set posted on CCO can’t be installed with any of their automation/tools. We are waiting for Cisco to provide a fix. I’m not holding my breath.

- Jared

I don't think I'd trust any vendor's "ISSU" to be completely without impact... it's been more of a marketing term in my experience.

If DPCs are your only option on the MX960, then the ASR9010 will be a
better option with those line cards.

If you are able to run the newer MIC/MPC line cards on the Juniper,
capabilities will not vary much; but IMHO, the MX960 will have a slight
edge over the ASR9010.

Mark.

It can be hit & miss with the PIEs and SMUs.

We've been in situations where in-service updates (not ISSU) were
documented as being hitless, but ooops...

Mark.

Not always.

Multicast, for example, tends to not survive upgrades in GRES conditions
as a matter of protocol.

Mark.

+1.

Mark.