Arista Routing Solutions

NANOG,

I know Arista is typically a switch manufacturer, but with their recently
announced Arista 7500R Series and soon to be announced but already shipping
7280R Series Arista is officially getting into the routing game. The fixed
1U 7280R Series looks quite impressive. The 7500R series is your
traditional chassis and line card based solution.

Both of these products have the ability to hold the full internet routing
table, and Arista is working on MPLS features. Both of these new products
use the latest Broadcom Jericho chipsets.

I would like to know how viable of a product NANOG thinks these Arista
routers are compared to service provider grade routers from Cisco, Juniper,
ALU, and Brocade?

Cost wise, Arista seems to be much, much less per port. For example, the
1U Arista 7280R with 48x10GbE (SFP+) & 6x100GbE QSFP cost about the same as
what Juniper sells a MX104 with only four 10G ports for (Under 20K).
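
To put the per-port comparison in numbers, here is a rough Python sketch. The $20,000 figure for both boxes is an assumption taken from the "under 20K" ballpark above, not a quoted price, and `cost_per_gbps` is just an illustrative helper.

```python
# Assumption: both boxes cost roughly $20,000 (the post only says "under 20K").
PRICE_USD = 20_000


def cost_per_gbps(price_usd, ports):
    """Return price per Gbps of front-panel capacity.

    ports is a list of (port_count, speed_in_gbps) tuples.
    """
    total_gbps = sum(count * speed for count, speed in ports)
    return price_usd / total_gbps


# 7280R as described in the post: 48x10GbE plus 6x100GbE = 1080 Gbps.
arista_7280r = cost_per_gbps(PRICE_USD, [(48, 10), (6, 100)])
# MX104 as described in the post: four 10G ports = 40 Gbps.
juniper_mx104 = cost_per_gbps(PRICE_USD, [(4, 10)])
ratio = juniper_mx104 / arista_7280r

print(f"7280R: ${arista_7280r:.2f} per Gbps")
print(f"MX104: ${juniper_mx104:.2f} per Gbps")
print(f"MX104 costs about {ratio:.0f}x more per Gbps")
```

This only counts raw port capacity. It says nothing about features per port, which is the other half of the comparison.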

Can the Arista EOS software combined with their hardware based on the
Broadcom Jericho chipset truly compete with the custom chipsets and
accompanying software from the big guys?

Colton Conor wrote:

I know Arista is typically a switch manufacturer, but with their recently
announced Arista 7500R Series and soon to be announced but already shipping
7280R Series Arista is officially getting into the routing game. The fixed
1U 7280R Series looks quite impressive. The 7500R series is your
traditional chassis and line card based solution.

Both of these products have the ability to hold the full internet routing
table, and Arista is working on MPLS features. Both of these new products
use the latest Broadcom Jericho chipsets.

We (Netflix) have been deploying the previous gen (7500E) as edge routers
for about two years in high traffic, low route count applications in our
CDN, and have been working with Arista for almost as long to improve route
scale so that we could turn off all our traditional routers.

The features that enable full routes on Jericho are running in our
production network today and we also have the 7500R and 7280R working with
full tables.

I can't speak to MPLS, but for our use case (all L3, very high-density
10/40/100G, BGP, IS-IS and light QoS), it's working well.

So, yes, I'd say those two products are quite viable and competitive
options in the edge router space.

Hey Colton,

Comments inline:

NANOG,

I know Arista is typically a switch manufacturer, but with their recently
announced Arista 7500R Series and soon to be announced but already shipping
7280R Series Arista is officially getting into the routing game. The fixed
1U 7280R Series looks quite impressive. The 7500R series is your
traditional chassis and line card based solution.

I must admit, I'm not usually excited by new hardware, but this
announcement did catch my eye!

Both of these products have the ability to hold the full internet routing
table, and Arista is working on MPLS features. Both of these new products
use the latest Broadcom Jericho chipsets.

I would like to know how viable of a product NANOG thinks these Arista
routers are compared to service provider grade routers from Cisco, Juniper,
ALU, and Brocade?

Honestly? I think you need to look at what you actually need out of a box.
At the end of the day, it's a 1U switch. If you want to terminate a GRT at
the edge of your network and do some basic path selection then it sounds
like it would be an amazing and cheap fit. On the other hand, I don't think
we can start throwing away core routers yet :wink:

Cost wise, Arista seems to be much, much less per port. For example, the
1U Arista 7280R with 48x10GbE (SFP+) & 6x100GbE QSFP cost about the same as
what Juniper sells a MX104 with only four 10G ports for (Under 20K).

I'm consistently amazed at the density they are achieving for the $$ and I
think it all comes down to what the actual application is here. Most basic
BGP networks do not need all the bells and whistles of the MX104 and will
really benefit from the extra port density. That being said, I wouldn't be
replacing core PoPs in large ISPs with 1U switches!

Can the Arista EOS software combined with their hardware based on the
Broadcom Jericho chipset truly compete with the custom chipsets and
accompanying software from the big guys?

I've used Arista for a while now (Moving from Cisco / Extreme) and I truly
believe that their software is excellent. They just seem to be doing it
'the right way'. If you've not watched it, this video is worth a bit of
your time! https://www.youtube.com/watch?v=VdJZq4dRjf4

That's my $0.02 anyway

Tom

In broad strokes: for your money you're either getting port density, or
more features per port. The only difference here is that there's
suddenly more TCAM on the device, and I still don't see the above
changing too drastically.

If it works* for you, use it. :slight_smile:

* Assuming that you've done your due diligence before purchasing, and
not just skim-read the vendor PDF.

Yeah, OP is comparing a high touch chip (MX104) to a low touch chip
(Jericho), which is not a fair comparison. And cost is what the customer is
willing to pay, regardless of the sticker on the box. No one will pay a
significant mark-up for another sticker; I've never seen significant
differences between comparable products in an RFP.

A fairer comparison would be the QFX10k instead of the MX104. The QFX10k is
AFAIK the only product in this segment which is not using Jericho. Whether
this is a competitive advantage or a risk, the jury is still out; I lean
towards competitive advantage, mainly due to its memory design.

Saku,

Jericho is in no sense a low end chip. While there are some scale limitations (what can be done with SuperFEC, some bridging related stuff), from a functionality perspective it is very capable silicon.

One has to:
Understand how to program it properly (recursiveness, ECMP’s, etc)
Know how to enhance SDK
Have a rather rich control plane, which can be translated into rich forwarding functionality :slight_smile:

I’m not familiar with Arista’s feature set.
NCS with XR would be a good proof.

Watch for Jericho updates from DNX

Cheers,
Jeff

High Touch / Low Touch

Is this a measure of the amount of fiddle diddling required to get the chip to work as documented, or is it some other kind of code?

For example a "High Touch" chip needs lots of fiddle farting because it was designed by a moron and every possible thing that can be programmed incorrectly is programmed incorrectly, whereas in a "Low Touch" chip all the defaults are already set to the most useful and rational setting so that it can be used without touching it to fix all the defects?

Perhaps it is a measure of the babysitting required while the chip is running. "High Touch" chips require constant attention, nappy changes, positive re-inforcement of the settings, etc., while operating because they are inherently unreliable and badly designed whereas "Low Touch" chips once set up just work and require little ongoing supervision unless you want to change something?

Or is it just a strange translation for functionality (as in High End / Low End)?

High touch means very general purpose NPU, with off-chip memory. Low
touch means usually ASIC or otherwise simplified pipeline and on-chip
memory. Granted Jericho can support off-chip memory too.

L3 switches are canonical example of low touch. EZchip, Trio, Solar,
FP3 etc are examples of canonical high touch NPUs. What low touch can
do, it can do fast and economically.

But like few terms, it's not exact, and borders are hazy and even subjective.

Got it, thanks for the explanation!

Saku,

I guess you are right the QFX10002-36Q is probably a better comparison. But
let's be honest, Juniper is not going to sell a QFX10002-36Q for less than
$20k like Arista will do for a semi-similar box. Even with a high discount
(like 90 percent off list), the Juniper QFX10002-36Q at $360k list price
comes nowhere close on the price point. Cisco, Juniper, ALU, etc are all
not going to sell a low cost high density fixed switch because that would
cannibalize their sales of the larger platforms. I really think Arista
is kind of unique here as they don't have another routing platform to
cannibalize, so they are competitively pricing their platform.

So I guess the question becomes, what features are missing that Arista does
not currently have? They seem to be adding more and more features, and
taking more market share. Here is a list of features supported:
https://www.arista.com/en/support/product-documentation/supported-features
I have not personally used Arista myself, but I like what I am seeing as
far as price point, company culture, and reputation in the marketplace.
I know their switching is solid, but I am not sure about their routing.

Arista claims to have much, much faster BGP convergence time than all the
other vendors.

Hey,

I guess you are right the QFX10002-36Q is probably a better comparison. But
let's be honest, Juniper is not going to sell a QFX10002-36Q for less than
$20k like Arista will do for a semi-similar box. Even with a high discount
(like 90 percent off list), the Juniper QFX10002-36Q at $360k list price
comes nowhere close on the price point. Cisco, Juniper, ALU, etc are all not
going to sell a low cost high density fixed switch because that would
cannibalize their sales of the larger platforms. I really think Arista is
kind of unique here as they don't have another routing platform to
cannibalize, so they are competitively pricing their platform.

20k seems a stretch, that's like a 94.4% discount, but it's not unheard of.
If you have volume, I would imagine it being doable.
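
For anyone who wants to check the discount arithmetic, a quick Python sketch using the $360k list price and the $20k target mentioned above. The helper name `discount_needed` is made up for illustration.

```python
LIST_PRICE = 360_000   # QFX10002-36Q list price quoted in the thread
TARGET_PRICE = 20_000  # price point being compared against


def discount_needed(list_price, target_price):
    """Fraction off list required to reach target_price."""
    return 1 - target_price / list_price


# Price after the "90 percent off list" discount mentioned above.
price_at_90_off = LIST_PRICE * (1 - 0.90)
needed = discount_needed(LIST_PRICE, TARGET_PRICE)

print(f"90% off list: ${price_at_90_off:,.0f}")
print(f"discount needed for $20k: {needed:.1%}")
```

So 90% off still leaves the box at roughly $36k, and getting to $20k needs a bit over 94% off list.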

So I guess the question becomes, what features are missing that Arista does
not currently have? They seem to be adding more and more features, and
taking more market share. Here is a list of features supported:
https://www.arista.com/en/support/product-documentation/supported-features
I have not personally used Arista myself, but I like what I am seeing as far
as price point, company culture, and reputation in the marketplace. I
know their switching is solid, but I am not sure about their routing.

Yeah they are certainly much behind in features, but if you don't
need those features, it's probably actually an advantage. For my
use-cases Arista's MPLS stack is not there.

Arista claims to have much, much faster BGP convergence time than all the
other vendors.

I wouldn't be surprised, but honestly the competition does not set the
bar high there.

> High Touch / Low Touch

High touch means very general purpose NPU, with off-chip memory. Low
touch means usually ASIC or otherwise simplified pipeline and on-chip
memory. Granted Jericho can support off-chip memory too.

L3 switches are canonical example of low touch. EZchip, Trio, Solar,
FP3 etc are examples of canonical high touch NPUs. What low touch can
do, it can do fast and economically.

Your analogy makes some sense, but what you classify as high-touch /
low-touch is just one dimension and could do with a more modern update.

I'd suggest a more modern analogy would be that historically the difference
between a L3 switch and a router is the former has a fixed processing
pipeline, limited buffering (most are just on-chip buffer) and limited
table sizes.
But more modern packet processors with fixed pipelines often have blocks or
sections that are programmable or flexible. e.g. with a flexible packet
parser, it's possible to support new overlay or tunnel mechanisms, flexible
key generation makes it possible to reuse different table resources in
different ways, flexible rewrite engine means egress encap or tunnels or
logic can be done.
There's also often more capacity for recirc or additional stages as
required.

Specific to Jericho, the underlying silicon has all these characteristics.
We [*] used the flexibility in all of the stages both now and in previous
iterations (Arad) to add new features/functionality that wasn't natively
there to start with. And it uses a combination of on-chip & off-chip
buffering with VoQ.

It's also not only Arista that calls it a router; Cisco does too (NCS5K5).

Sure, using an NPU for packet processing essentially provides a 100%
programmable packet forwarding pipeline, and maybe even a "run to
completion" kind of packet pipeline where the pipeline could have a long
tail of processing. However, engineering is a zero sum game, and to do that
means you sacrifice power or density, or most often, both.

I agree the lines have been blurred as to the characteristics, and we'd
openly state that it's not going to be useful in every use case of where a
router is deployed, but for specific use cases, it fits the bill and has
compelling density, performance and cost dynamics.

To the OP's question, there are people running with this in EFT and others
in production.
My suggestion would be that if you think it's of interest, reach out to your
friendly Arista person [*] and try it out or talk through what it is you're
after. We are generally a friendly bunch and often we can be quite creative
in enabling things in different ways to old.

Yeah they are certainly much behind in features, but if you don't
need those features, it's probably actually an advantage. For my
use-cases Arista's MPLS stack is not there.

We've historically had the data-plane but not the control-plane. That's a
work in progress.
Again, often there are creative solutions to ways of doing things that
aren't necessarily the same as old ways but achieve the same end result.

cheers,

lincoln.
[*] disclosure: I work on said products described. ltd@arista.com

Ryan,

What routing platform were you coming from before? What features does
Arista not have that you find limiting that the old platform did have?

How does Arista's sFlow-only support compare to having Cisco NetFlow or Juniper
J-Flow for traffic monitoring, which I assume Netflix does a lot of?

IOS-XR on ASR 9k and Junos on MX.

For our use case, there's no longer anything limiting as compared to those
platforms. BGP policy is perhaps not as rich as you might be used to if
your experience is with the sort of routers traditionally marketed to
service providers, but I'm sure that will get better, and it's probably
irrelevant if your policy is fairly static.

You are correct that we do collect a lot of flow data, both via sflow and
Netflow. We've been able to do everything we need with Arista's sflow
implementation.

While the QFX in general is similar to Jericho-based platforms, I think the
QFX10002 is perhaps not an ideal comparison. At 100G, there is a
significant density penalty on that platform, as you can use all 36 ports
at 40G, but only 12 ports at 100G.
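
The density penalty is easy to see with the port counts above. A small Python sketch, where the 36 and 12 port figures come from this message and the helper name is just for illustration:

```python
def aggregate_gbps(port_count, speed_gbps):
    """Total front-panel capacity for a given port count and speed."""
    return port_count * speed_gbps


at_40g = aggregate_gbps(36, 40)    # all 36 ports usable at 40G
at_100g = aggregate_gbps(12, 100)  # only 12 ports usable at 100G

print(f"40G mode:  {at_40g} Gbps")
print(f"100G mode: {at_100g} Gbps ({at_100g / at_40g:.0%} of 40G mode)")
```

So running the box at 100G means using a third of the ports and giving up some aggregate capacity as well.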

BGP convergence in the newer EOS releases is indeed very, very fast.

Just wanted to interject, the port density of the Arista switches is quite
impressive, especially considering the price point they're at.

Ryan,
  Curious if you have any thoughts on the longevity of the 7500R and 7280R with full IPv4 tables? How full are you seeing the TCAM getting today (I'm assuming they are doing some form of selective download)? And if we are currently adding 100k routes a year, how much longer will it last?

-Peter

I can't speak for Ryan or Netflix, but we (Arista) are stating our
technique is good for 1M+ prefixes of IPv4+v6 combined. The Internet right
now is at between 575K and 635K IPv4 and between 28K and 35K IPv6, and
it's taken many, many years to get there, so it's foreseeable there are many
years of growth left.
Note that we don't do static partitioning between IPv4 and IPv6, and how
we do it has more headroom in it than we state, so we're confident. We're
also not doing "selective download"; this is every prefix in the current table.
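
As a back-of-the-envelope check on the "how much longer will it last" question, here is a Python sketch. It assumes the 100k routes per year growth figure from the question, treats the stated 1M as if it were a hard ceiling (the message above says there is headroom beyond it), and uses the table-size ranges quoted above. The function name is hypothetical.

```python
STATED_CAPACITY = 1_000_000  # "1M+" combined IPv4+v6, treated as a hard cap here
GROWTH_PER_YEAR = 100_000    # growth rate assumed in the question, not measured


def years_of_headroom(current_routes, capacity, growth_per_year):
    """Years until capacity is reached, assuming linear growth."""
    return max(capacity - current_routes, 0) / growth_per_year


low_estimate = 575_000 + 28_000   # low end of the IPv4 and IPv6 ranges
high_estimate = 635_000 + 35_000  # high end of the IPv4 and IPv6 ranges

worst_case = years_of_headroom(high_estimate, STATED_CAPACITY, GROWTH_PER_YEAR)
best_case = years_of_headroom(low_estimate, STATED_CAPACITY, GROWTH_PER_YEAR)

print(f"{worst_case:.1f} to {best_case:.1f} years before hitting 1M")
```

Under those assumptions the answer is somewhere between three and four years to reach the stated figure; slower growth or the extra headroom mentioned above would stretch that.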

Ryan's case might be different to others, but what I can share is two
different scenarios deployed today:

1. a traditional internet edge router with multiple transit/peer providers,
carrying the Internet table as of right now
2. a cloud customer that also has hundreds of thousands of prefixes
internally

The former is at 575K IPv4 and 35K IPv6 of 'internet' as of a few weeks ago:

7500R# show ip route summary | grep Total
Total Routes 575127
7500R# show ipv6 route summary | grep Total
Total Routes 35511
7500R# show hardware capacity | grep Routing
Forwarding Resources Usage

Table    Feature    Chip  Used     Used  Free     Committed  Best Case    High
                          Entries  (%)   Entries  Entries    Max Entries  Watermark
-------- ---------- ----- -------- ----- -------- ---------- ------------ ---------
Routing  Resource1        815      39%   1233     0          2048         817
Routing  Resource2        469      45%   555      0          1024         471
Routing  Resource3        14074    42%   18694    0          32768        14098
Routing  V4Routes         696364   88%   89753    0          786432       697110
Routing  V6Routes         0        0%    89753    0          786432       0

The latter is at 854K IPv4 + 45K IPv6:

7500R# show ip route summary | grep Total
Total Routes 854393
7500R# show ipv6 route summary | grep Total
Total Routes 45678
7500R# show hardware capacity | grep Routing
Forwarding Resources Usage

Table    Feature    Chip  Used     Used  Free     Committed  Best Case    High
                          Entries  (%)   Entries  Entries    Max Entries  Watermark
-------- ---------- ----- -------- ----- -------- ---------- ------------ ---------
Routing  Resource1        1319     64%   729      0          2048         1320
Routing  Resource2        809      79%   215      0          1024         814
Routing  Resource3        24102    73%   8666     0          32768        24104
Routing  V4Routes         644336   83%   124302   0          786432       644364
Routing  V6Routes         17792    12%   124302   0          786432       17795

One could ask Geoff Huston when he thinks combined IPv4+v6 will exceed 1M
entries, but I would expect it to be many years away based on
http://bgp.potaroo.net/ and we'd welcome a discussion if you want
to know our opinion [*] on how what we're doing will scale. What we're doing
doesn't explode at 1M; there's headroom in it, hence why we say "1M+". Again
we're happy to talk about it, just ask your friendly Arista person and if
you don't know who to ask, ask me and I'll put you in touch with the right
folks.

cheers,

lincoln. [*] ltd@arista.com

Well,

    Once you eliminate the ~160k superfluous prefixes (last time I
checked)... This is a non-issue.

    Some work on some sort of summary function would keep those devices
alive... but we all know there is more money to be made the faster the
devices become obsolete :frowning:

Can you explain how this works? How can a router determine which prefix is superfluous? How does it cope when a suppressed prefix is withdrawn or a more specific prefix is added? Is this just one of those 'it works some of the time' solutions or is this something that can be done safely with an appropriate algorithm?

Thanks,
Laszlo
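
One way to make the question concrete: for forwarding purposes, a more-specific prefix is redundant when the nearest covering prefix in the same table points at the same next hop. The Python sketch below illustrates that idea only. The function name `compress_fib`, the toy prefixes and the next-hop labels are all hypothetical, and this is not a description of what Arista or any other vendor actually does.

```python
import ipaddress


def compress_fib(routes):
    """Drop more-specifics whose nearest covering prefix has the same next hop.

    routes maps prefix strings to next-hop labels. Returns a reduced dict
    that forwards every destination to the same next hop as the input.
    """
    nets = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
    kept = {}
    for net, next_hop in nets.items():
        # Find the longest less-specific prefix that covers this one.
        parent = None
        for cand in nets:
            if (cand.version == net.version
                    and cand.prefixlen < net.prefixlen
                    and net.subnet_of(cand)):
                if parent is None or cand.prefixlen > parent.prefixlen:
                    parent = cand
        if parent is not None and nets[parent] == next_hop:
            continue  # redundant: the covering route already forwards this
        kept[str(net)] = next_hop
    return kept


# Toy table: next hops "A" and "B" are placeholders, not real addresses.
full_table = {
    "10.0.0.0/8": "A",
    "10.1.0.0/16": "A",   # same next hop as the /8, so redundant
    "10.1.1.0/24": "A",   # same next hop as the /16, so redundant
    "10.2.0.0/16": "B",   # different next hop, must stay
    "10.2.3.0/24": "B",   # same next hop as 10.2.0.0/16, so redundant
}
compressed = compress_fib(full_table)
print(compressed)

# Simulate a withdrawal of the covering /8 and recompute from the full table.
after_withdraw = {p: nh for p, nh in full_table.items() if p != "10.0.0.0/8"}
recomputed = compress_fib(after_withdraw)
print(recomputed)
```

In this sketch the reduced table is always recomputed from the full set of routes, so when the covering 10.0.0.0/8 is withdrawn, 10.1.0.0/16 simply reappears on the next run. A real implementation would have to do that incrementally on every update and withdrawal, which is where the difficulty in the question lies.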