Scaling Linux-based router hardware recommendations

Hi,

I know that specially programmed ASICs on dedicated hardware like Cisco,
Juniper, etc. are always going to outperform a general-purpose server
running GNU/Linux, *BSD... but I find the idea of trying to use
proprietary, NSA-backdoored devices difficult to accept, especially when
I don't have the budget for them.

I've noticed that even with a relatively modern system (a Supermicro
with a 4-core 1265LV2 CPU with 9MB cache, Intel E1G44HTBLK server
adapters, and 16GB of RAM), you still tend to get a high percentage of
time spent on softirqs on all the CPUs when pps reaches somewhere
around 60-70k and traffic approaches 600-900 Mbit/sec (during a DDoS,
such hardware typically cannot cope).

It seems like finding hardware better optimized for very high
packet-per-second counts would be a good thing to do. I just have no
idea what is out there that could meet these goals. I'm unsure whether
faster CPUs or more CPUs are really the answer, or networking cards, or
just plain old-fashioned tuning.
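
For what it's worth, the one tuning step I do know is spreading the
NIC's RX-queue interrupts across cores instead of letting them pile up
on one CPU. A sketch; the IRQ numbers 40-43 are placeholders you'd
read out of /proc/interrupts, not values from this box:

    #include <stdio.h>

    /* pin one IRQ to one CPU by writing a one-hot hex mask to
       /proc/irq/<n>/smp_affinity */
    static int pin_irq(int irq, int cpu)
    {
        char path[64];
        FILE *f;

        snprintf(path, sizeof path, "/proc/irq/%d/smp_affinity", irq);
        f = fopen(path, "w");
        if (!f)
            return -1;
        fprintf(f, "%x\n", 1u << cpu);
        return fclose(f);
    }

    int main(void)
    {
        /* example: spread four RX-queue IRQs across CPUs 0-3 */
        int i;

        for (i = 0; i < 4; i++)
            if (pin_irq(40 + i, i) != 0)
                perror("pin_irq");
        return 0;
    }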

Any ideas or suggestions would be welcome!
micah

Hi Micah,

There is a segment on the hardware side of the industry that produces "network appliances"
(folks such as Axiomtek, Lanner Electronics, Caswell Networks, Portwell, etc.).

These appliances are commonly used as a commercial (OEM) platform for a variety of uses:
routers, firewalls, specialized network applications, etc.

Our internal (informal) testing matches up with the PPS handling commonly quoted by the different product vendors who incorporate these appliances in their network product offerings.

i3/i5/i7 (x86) based network appliances will forward traffic as long as pps does not exceed 1.4 million.
               (In our testing we found pps to be the limiting factor, not the amount of traffic being moved.)
               (Will easily handle 6G to 10G of traffic.)

Core2Duo (x86) based network appliances will forward traffic as long as pps does not exceed 600,000.
               (Will easily handle 1.5G to 2G of traffic.)

Atom-based (x86) network appliances will forward traffic as long as pps does not exceed 250,000.
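
To put those pps ceilings in bandwidth terms, the back-of-envelope
wire math looks like this (a sketch; the 1.4 Mpps figure is from our
testing above, the frame sizes are just illustrative, and the extra
20 bytes are per-frame preamble + SFD + inter-frame gap):

    #include <stdio.h>

    int main(void)
    {
        const double pps = 1.4e6;              /* pps ceiling above */
        const int size[] = { 64, 512, 1500 };  /* frame sizes, FCS incl. */
        int i;

        for (i = 0; i < 3; i++)
            printf("%4d-byte frames: %6.2f Gbit/s\n", size[i],
                   pps * (size[i] + 20) * 8 / 1e9);
        return 0;
    }

That works out to roughly 0.94 Gbit/s at 64-byte frames and about
17 Gbit/s at 1500-byte frames, which is why the same box can look like
a "1G" router under a small-packet flood and a ">10G" router on
full-size frames.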

Has anyone tested these setups with something beefier, like dual Xeons of Sandy Bridge or later vintage? I'm waiting to hear back from one NIC vendor (HotLava) about what they think can be done on larger hardware setups. Put in two big Xeons and you're looking at 24 cores to work with, as opposed to the <8 on the desktop versions. The newer ones would also have PCIe 3.0, which would overcome bus speed limitations in PCIe 2.0.

Is it realistic to put 6x-12x 10GigE into a server with that much beef and expect it to perform well? What vintage of Core iX do you run, Faisal?

DPDK is your friend here.
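
The core idea, as a rough C sketch loosely modeled on DPDK's l2fwd
sample (single bound port and queue assumed, initialization checks and
NIC offload configuration trimmed): take the NIC away from the kernel
and busy-poll it from userspace, so the softirq path never runs at
all.

    #include <stdlib.h>
    #include <rte_eal.h>
    #include <rte_debug.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST 32

    int main(int argc, char **argv)
    {
        struct rte_eth_conf conf = { 0 };
        struct rte_mempool *pool;
        struct rte_mbuf *bufs[BURST];
        uint16_t port = 0, n, sent;

        if (rte_eal_init(argc, argv) < 0)
            rte_exit(EXIT_FAILURE, "EAL init failed\n");

        /* buffer pool shared by the RX and TX rings */
        pool = rte_pktmbuf_pool_create("mbufs", 8191, 256, 0,
                                       RTE_MBUF_DEFAULT_BUF_SIZE,
                                       rte_socket_id());
        rte_eth_dev_configure(port, 1, 1, &conf);
        rte_eth_rx_queue_setup(port, 0, 512,
                               rte_eth_dev_socket_id(port), NULL, pool);
        rte_eth_tx_queue_setup(port, 0, 512,
                               rte_eth_dev_socket_id(port), NULL);
        rte_eth_dev_start(port);

        for (;;) {
            /* busy-poll a burst and bounce it straight back out;
               a router would do its route lookup per packet here */
            n = rte_eth_rx_burst(port, 0, bufs, BURST);
            sent = rte_eth_tx_burst(port, 0, bufs, n);
            while (sent < n)
                rte_pktmbuf_free(bufs[sent++]);
        }
    }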

-Scott

Cumulus Networks has some stuff,

http://www.bigswitch.com/sites/default/files/presentations/onug-baremetal-2014-final.pdf

Pretty decent presentation with more of the details you might like.

Mehmet

10-15 years ago, we were seeing early Pentium 4 boxes capable of moving
100Kpps+ on FreeBSD. See for example
http://info.iet.unipi.it/~luigi/polling/

Luigi moved on to Netmap, which looks promising for this sort of
thing.

I was under the impression that some people have been using this for
10G routing.
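
The netmap receive path looks something like this. A sketch using the
netmap_user.h helpers; "netmap:ix0" is a placeholder interface name,
and a real router would make a forwarding decision instead of just
counting. The NIC rings are mapped straight into userspace, so the
kernel stack and its softirq work are bypassed:

    #define NETMAP_WITH_LIBS
    #include <net/netmap_user.h>
    #include <poll.h>
    #include <stdio.h>

    int main(void)
    {
        struct nm_desc *d = nm_open("netmap:ix0", NULL, 0, NULL);
        struct nm_pkthdr h;
        struct pollfd pfd;
        unsigned char *buf;
        unsigned long pkts = 0;

        if (d == NULL) {
            perror("nm_open");
            return 1;
        }
        pfd.fd = d->fd;
        pfd.events = POLLIN;

        for (;;) {
            poll(&pfd, 1, -1);      /* block until the NIC has frames */
            /* drain the mapped RX rings, no kernel copy involved */
            while ((buf = nm_nextpkt(d, &h)) != NULL)
                pkts++;             /* forwarding decision goes here */
        }
        nm_close(d);                /* not reached; for completeness */
        return 0;
    }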

Also I'll note that Ubiquiti has some remarkable low-power gear capable
of 1Mpps+.

... JG

One thing to note about Ubiquiti's EdgeMax products is that they are not
Intel based. They use Cavium Octeons (at least that's what my EdgeRouter
Lite has in it).

Oliver

Kind of unsurprisingly, the traditional network vendors are somewhat at
the forefront of pushing what an x86 server can do as well. Brocade
(Vyatta), Juniper, and Alcatel-Lucent all have virtualized routers using
Intel's DPDK pushing 5M+ PPS at this point. They are all also tweaking
what Intel is providing, and they are the ones with lots of software
developers with a lot of hardware and network programming experience.

ALU claims to be able to get 160Gbps full duplex through a 2RU server
with 16x10G interfaces and two 10-core latest-gen Xeon processors. Of
course that's probably at 9000-byte packet sizes, but with IMIX-type
traffic it's probably still pushing 60-70Gbps. They have a demo of lots
of them in a single rack, managed as a single router, pushing Tbps.

With a commercial offering you are going to pay for that kind of
performance and the control-plane software. Over time, though, you'll
see the DPDK-type enhancements make it into standard OS stacks. Other
options include servers with integrated network processors or NPs on a
PCI card; there is a whole rash of those types of devices out there now
and coming out.

Phil

It really depends on the application you are interested in beyond
forwarding, but not knowing that: to scale forwarding "at a reasonable
price", things have to come off the CPU and become more customized for
forwarding, especially for low-latency forwarding. The optimization
comes in minimizing packet tuple copies, offload to co-processors and
network coprocessors (some of which can be in NICs), and parallel
processing with some semblance of shared memory across it all, all of
which takes customization beyond the CPU and kernel, which itself needs
to be stripped down bare and embedded. Ultimately that's what appliance
vendors do, with different levels of hardware/firmware customization
depending on the ROI of features, speeds, and price. A generic
open-source-compatible OEM product with multi-gig ports will generally
be half to a fifth the price of a high-end latest-architecture server
product able to support 10-gig interfaces in the same forwarding
performance range (such servers target a different scale problem in
compute and network I/O, and exist at a price point that makes them
exorbitant for solving forwarding speed alone).

Cheers,

Sudeep Khuraijam

I'm also in the research stage of building our own router. I'm interested in reading more if you can post links to some of this research and/or testing.

David

Aren't most of the new whitebox/open-source platforms based on switching and not routing? I'd assume that the "cloud-scale" data centers deploying this stuff still have more traditional big iron at their cores.

The small/medium-sized ISP usually is left behind. They're not big enough to afford the big new hardware, but all of their users' Netflix and porn and whatever else they do is chewing up bandwidth. For example, the small/medium ISPs are at the Nx10GigE stage now. The new hardware is expensive; the old hardware (besides being old) is likely in a huge chassis, if you can get any sort of port density at all.

48-port GigE switches with a couple of 10GigE ports can be had for $100. A minimum of a 24-port 10GigE switch (except for the occasional IBM switch) is 30x to 40x that. Routers (BGP, MPLS, etc.) with more than just a couple of 10GigEs are even more money, I'd assume.

I thought vMX was going to save the day, but its pricing for 10 gigs of traffic (licensed by throughput and standard/advanced licenses) is really about 5x-10x what I'd be willing to pay for it.

Haven't gotten a quote from AlcaLu yet.

Vyatta (last I checked, which was admittedly some time ago) doesn't have MPLS.

The FreeBSD world can bring zero software cost and a stable platform, but no MPLS.

Mikrotik brings most (though not all) of the features one would want... a good-enough feature set, let's say... but is a non-stop flow of bugs. I don't think a week or two goes by without one of my friends submitting some sort of reproducible bug to Mikrotik. They've also been "looking into" DPDK for 2.5 years now; it hasn't shown up yet. I've used MT for 10 years and I'm always left wanting just a little more, but it may be the best balance between the features and performance I want and the ability to pay for it.

And the solution to this issue is http://routerboard.com/ or http://www.mikrotik.com/software# on x86 hardware, plus any basic layer-2 switch. Don't scoff until you have tried it; the price/performance is pretty staggering if you are in the sub-20-gig space.

Must not have read my whole e-mail. ;-)

There aren't very many people outside of my group who know more about Mikrotik. Trainers, MUM presenters, direct-line-to-Janis guys, etc.

Still can't make those Latvians produce what we want.

Like Mike mentioned, the feature list in RouterOS is nothing short of impressive -- the problem is that pretty much everything in there is inherently buggy.

That and one hell of a painful syntax-schema to work with too.

Different (configuration) strokes for different folks. I look at a Cisco interface now and say, "Who the hell would use this?" despite my decade-old Cisco training.

I was corrected offlist that Vyatta does do MPLS now... but I can't find anything on it doing VPLS, so I guess that's still out.

The 5600's license (according to their SDNCentral performance report) appears to be near $7k, whereas with MT you can get a license for $80.

> Aren't most of the new whitebox/open-source platforms based on
> switching and not routing? I'd assume that the "cloud-scale" data
> centers deploying this stuff still have more traditional big iron
> at their cores.

An L3 Ethernet switch and a "router" are effectively indistinguishable.
The actual feature set you need drives what platforms are appropriate.

A significant push for DCs, particularly those with Clos architectures,
is away from modular chassis-based switches and towards dense but
fixed-configuration switches. This drives the complexity, and a
significant chunk of the cost, out of these switches.

> The small/medium-sized ISP usually is left behind. They're not big
> enough to afford the big new hardware, but all of their users'
> Netflix and porn and whatever else they do is chewing up bandwidth.

Everyone in the industry is under margin pressure. Done well, every
subsequent generation of your infrastructure is less costly per bit
delivered while also being faster.

> For example, the small/medium ISPs are at the Nx10GigE stage now.
> The new hardware is expensive; the old hardware (besides being old)
> is likely in a huge chassis, if you can get any sort of port
> density at all.

If you're a small consumer-based ISP, how many routers do you actually
need to have a full table? (The customer access network doesn't need it.)

> 48-port GigE switches with a couple of 10GigE ports can be had for $100.

I'm not aware of that being the case. With respect to merchant silicon,
there are a limited number of common L3 switch ASIC building blocks
which all switch/router vendors can avail themselves of: Broadcom
Trident+/Trident 2 and Arad, Intel FM6000, Marvell Prestera, etc.

> A minimum of a 24-port 10GigE switch (except for the occasional
> IBM switch) is 30x to 40x that. Routers (BGP, MPLS, etc.) with
> more than just a couple of 10GigEs are even more money, I'd assume.

A 64-port 10 or mixed 10/40Gb/s switch can forward more than half a
Tb/s worth of 64-byte packets, do so with cut-through forwarding, and
do it in a thermal envelope of 150 watts. Devices like that retail for
~$20k, and in reality you need more than one. The equivalent gigabit
product is 15 or 20% of the price.

You mention MPLS support, so that dictates a feature set which is
available only in some platforms and ASICs.

> I thought vMX was going to save the day, but its pricing for 10
> gigs of traffic (licensed by throughput and standard/advanced
> licenses) is really about 5x-10x what I'd be willing to pay for it.

The servers capable of relatively high-end forwarding feats aren't
free either.

> Haven't gotten a quote from AlcaLu yet.
>
> Vyatta (last I checked, which was admittedly some time ago) doesn't
> have MPLS.
>
> The FreeBSD world can bring zero software cost and a stable
> platform, but no MPLS.

MPLS implementations carry abundant IPR, which among other things
prevents practical merging into the Linux kernel.

How's convergence time on these Mikrotik/Ubiquiti/etc. units for a full table?

/kc

Depends on the hardware. 30-45 seconds for the higher-end stuff? I'm not sure how long it is on an RB750 (list price of like $40). ;-)

Latvian grammar is... somewhat unusual.

Just be glad the development team wasn't Finnish. :-)

(Sorry, I couldn't resist. :-))

A Maxxwave Routermxx MW-RM1300-i7 (x86 Mikrotik router) pulls full tables
from two peers and converges in about 40 seconds.