10G CPE w/VXLAN - vendors?

Hello, all.

I’m having difficulty finding vendors, never mind products, that fit my need.

We have a small but growing number of L2 (bridged) customers that have diverse fiber paths available, and, naturally, want to make use of them.

We have a solution for this: we extend the edge of our EVPN VXLAN fabric right to the customer premise. The customer-prem device needs 4x10G SFP+ cages (2 redundant paths, plus LAG to customer), and the switches we currently use, Arista 7020Rs, are quite expensive if I’m deploying one per customer. (Nice switches, but overkill here – I don’t need 40/100G, and I don’t need 24 SFP+ ports. And they still take forever to ship.)

We use RFC 7432 §6.3 “vlan-aware-bundle” mode, not §6.1 “vlan-based” mode, which limits our choices somewhat. I might be willing to entertain spinning up a separate VXLAN mesh using RFC 7432 §6.1 (“vlan-based”) and static VTEPs if it saves me a lot of pain.
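For readers less familiar with the two service-interface modes, here is a minimal sketch of the difference as it shows up in a MAC advertisement. It assumes the usual reading of the EVPN RFC's §6.1 and §6.3: in vlan-based mode each VLAN gets its own EVPN instance and the Ethernet Tag ID is 0, while in vlan-aware-bundle mode one instance carries several VLANs and the Ethernet Tag ID identifies the bridge domain (the VLAN ID in this sketch). The class, function, and EVI names are hypothetical and not taken from any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacRoute:
    """Simplified stand-in for an EVPN MAC/IP advertisement (illustrative only)."""
    evi: str           # EVPN instance the route belongs to (hypothetical name)
    ethernet_tag: int  # Ethernet Tag ID carried in the route
    mac: str


def vlan_based(vlan_to_evi, vlan, mac):
    # vlan-based: one EVI per VLAN, so the Ethernet Tag ID is always 0.
    return MacRoute(evi=vlan_to_evi[vlan], ethernet_tag=0, mac=mac)


def vlan_aware_bundle(evi, bundle_vlans, vlan, mac):
    # vlan-aware-bundle: one EVI for many VLANs; the Ethernet Tag ID
    # tells the receiver which bridge domain inside the EVI is meant.
    if vlan not in bundle_vlans:
        raise ValueError(f"VLAN {vlan} is not part of bundle {evi}")
    return MacRoute(evi=evi, ethernet_tag=vlan, mac=mac)
```

The practical consequence for CPE shopping is that a box which only understands "one VLAN, one EVI, tag 0" cannot interoperate with a fabric that expects non-zero Ethernet Tag IDs inside a shared instance.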

However, I’m having trouble finding small & cheaper 1U (or even desktop/wallmount) devices that have 4 SFP+ cages, and can do VXLAN, in the first place.

Who even makes CPE gear with SFP+ ports? (Other than Mikrotik CRS309-1G-8S+IN / CRS317-1G-16S+RM, which are nice, but our policy requires vendor support contracts, so… no-go.)

Vendors? Model#s, if you happen to know any?

Reply here or privately, whatever floats your boat – any pointers appreciated!

Adam Thompson

Consultant, Infrastructure Services

100 - 135 Innovation Drive

Winnipeg, MB R3T 6A8

(204) 977-6824 or 1-800-430-6404 (MB only)

https://www.merlin.mb.ca

I think you’re probably overthinking this a bit.

Why do you need to extend your VXLAN/EVPN to the customer premise? There are a number of 1G/10G, even 100G, CPE demarc devices out there that push/pop tags, even Q-in-Q or 802.1ad. Assuming you have some type of aggregation node you bring these back to, tie those tags to the appropriate EVPN instance at the aggregation point. Don’t extend anything but a management tag and, essentially, an S-tag to the device at the customer premise.
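To make the push/pop idea concrete, here is a rough sketch of what the demarc device does to a frame, and what the aggregation node does with the tag. The TPID values (0x88A8 for an 802.1ad S-tag, 0x8100 for an 802.1Q C-tag) are the standard ones; the function names are made up for illustration, and real devices do this in hardware.

```python
import struct

TPID_S_TAG = 0x88A8  # 802.1ad service tag
TPID_C_TAG = 0x8100  # 802.1Q customer tag


def push_tag(frame, vid, tpid=TPID_S_TAG, pcp=0):
    """Insert a VLAN tag right after the destination and source MACs."""
    if not 1 <= vid <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must be in 0..7")
    tci = (pcp << 13) | vid  # 3-bit PCP, DEI bit left at 0, 12-bit VID
    return frame[:12] + struct.pack("!HH", tpid, tci) + frame[12:]


def pop_tag(frame):
    """Remove the outermost tag; return (vid, frame_without_tag)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid not in (TPID_S_TAG, TPID_C_TAG):
        raise ValueError("frame is not VLAN-tagged")
    return tci & 0x0FFF, frame[:12] + frame[16:]
```

At the aggregation point, the popped S-tag is just a lookup key: something like a table mapping (port, S-VID) to an EVPN instance, with the customer's own C-tags carried through untouched.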

You can even put that management-tagged VLAN in its own L3 segment, or a larger L3 network, and impose security. This way you’re not exposing your whole service infrastructure to a bad actor that might unplug your CPE device and plug into your network directly.

Putting the smart devices on the edge allows for a much-simplified core topology.

Either way, I was doing research on FPGA-based hardware a couple of weeks ago and came across this which may tick all the boxes. https://ethernitynet.com/products/enet-network-appliances/uep-60/ I do not know the vendor personally and have not worked on their hardware, so your mileage may vary.

Ryan

The redundant links to the customer site that traverse independent underlay carriers, and in some cases, equal-cost paths that we want to load-balance across, are the hard part. I’m not going to trust STP for that, and we aim for <3sec failover where we do have redundant paths. ERPS can handle the failover, but not the load-balancing. Any L2-over-L3 encapsulation protocol can handle the failover + ECMP features, but I need to do it at ~10G (~20G if ECMP) wire speed.
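As an illustration of why an L2-over-L3 encapsulation gets ECMP more or less for free, here is a minimal sketch of the VXLAN case: the encapsulating device derives the outer UDP source port from a hash of the inner frame, so the underlay's ordinary 5-tuple ECMP hashing spreads different inner flows across the equal-cost paths. The header layout and destination port 4789 are standard VXLAN; the CRC32-over-inner-Ethernet-header hash is only a stand-in assumption, since real hardware uses its own hash over more inner fields.

```python
import struct
import zlib

VXLAN_UDP_PORT = 4789            # IANA-assigned VXLAN destination port
VXLAN_OVERHEAD_IPV4 = 14 + 20 + 8 + 8  # outer Ethernet + IPv4 + UDP + VXLAN = 50 bytes


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Byte 0: flags with the I bit set (0x08); bytes 1-3 reserved;
    # bytes 4-6: VNI; byte 7 reserved.
    return struct.pack("!II", 0x08 << 24, vni << 8)


def entropy_source_port(inner_frame):
    """Map the inner Ethernet header to a port in the dynamic range 49152-65535."""
    flow_hash = zlib.crc32(inner_frame[:14])  # illustrative hash, not a vendor algorithm
    return 49152 + (flow_hash % 16384)
```

Note the cost side as well: with an untagged IPv4 underlay the encapsulation adds 50 bytes per frame, so a customer frame carrying a 1500-byte payload needs an underlay IP MTU of at least 1550 on every carrier path.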

We provide IaaS services to our customers, which is why we’re stretching VLANs to them in the first place. Viewed from the IaaS perspective, this is a bunch of DC-DC connections… but relative to the overall network, the customer-prem devices fall into the traditional “CPE” category. (Most customers either just plug in bare fiber, or they connect to an intermediate carrier’s CPE.)

hey,

equal-cost paths that we want to load-balance across, are the hard part. I’m not going to trust STP for that, and we aim for <3sec failover where we do have redundant paths. ERPS can handle the failover, but not the load-balancing.

You have EVPN already, so perhaps just use an active-active multihoming LAG over those two paths? It'll give you load balancing in both directions.

Not sure how much of a “CPE” it needs to be, but for example the whole Cisco Catalyst 9K product line (including the smaller C9300 switches) supports the whole EVPN/VXLAN stack.
A similar set of products exists on the Arista side (e.g. 7xx switches), as well as Juniper EX4400 products…

The Juniper EX4100-F-12T is pretty nice. Fanless, 1RU, 4x SFP+, 2x 10G copper which can also be used to power up the switch, and 12x 1G copper ports. EVPN/VXLAN requires an additional license. They don’t break the bank, and our use case is CPE as well.

Brandon

The problem with these switch suggestions is the lack of RFC2544 testing, and jitter + latency monitoring required for meeting SLA. That is why I mentioned the FPGA solution.
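For a sense of what an RFC 2544 throughput test has to push at 10G, here is a small sketch that computes the theoretical maximum frame rate for each of the standard Ethernet test frame sizes. The only inputs are well-known constants: each frame costs an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The function name is made up for illustration.

```python
RFC2544_FRAME_SIZES = (64, 128, 256, 512, 1024, 1280, 1518)  # bytes, including FCS
PER_FRAME_OVERHEAD = 20  # preamble (7) + SFD (1) + inter-frame gap (12), in bytes


def max_frame_rate(line_rate_bps, frame_size):
    """Theoretical maximum frames per second at a given line rate."""
    return line_rate_bps / ((frame_size + PER_FRAME_OVERHEAD) * 8)


if __name__ == "__main__":
    for size in RFC2544_FRAME_SIZES:
        print(f"{size:5d} bytes: {max_frame_rate(10e9, size):,.0f} frames/s")
```

At 64 bytes that works out to roughly 14.88 million frames per second on a 10G port, which is the number a built-in test head (or the device under test) has to sustain for the result to mean anything.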

Ryan Hamel

There may be a few more places to go searching. I am not saying you will find anything, but worth looking into, assuming Mikrotik won’t help. :-)

Check out what various SD-WAN vendors have to offer. Now, SD-WAN has about 46 definitions, as many as vendors (surviving vendors that is), but underneath all of them, it is some sort of box with a CPU, a semi-smart NIC with a bunch of ports and routing stack that happens to support L2 transport and can overlay it on top of any WAN transport, including regular IP underlay that can run on these fiber paths. The one of note is Versa. Besides BGP and overlaying, you may even get a useful multi-layer control plane out of it, which under the hood of all marketing definitions is all the things you are familiar with. And data plane that can actually do 10G.

Check out some of the Broadcom Qumran half-ru switches. Something like that:

https://www.etb-tech.com/dell-networking-s4112f-on-switch-12-x-10gb-sfp-3-x-qsfp28-ports-sw00237.html

There are a few other vendors besides Dell, and Dell’s OS does have your basic P2P VXLAN, with EVPN as the VXLAN control plane. There are a few other options, including open source ones, but in every case you are using these small half-RU Broadcom Qumran and Trident reference designs.

And finally as you go on that search, you can always build your own. All you need is $100-200 mini-pc, Linux on it, some form of optimized forwarder and open source routing stack.
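As a rough idea of how little is needed for the build-your-own route, here is a sketch that generates the iproute2 commands for a static point-to-point VXLAN tunnel bridged to a customer-facing port on a Linux box. It only builds the command strings (it does not run them); the interface names, addresses, and function name are placeholders, and this is the plain kernel data path with a static remote VTEP and no EVPN control plane.

```python
def static_vxlan_commands(vni, local_ip, remote_ip, access_if, dstport=4789):
    """Return iproute2 commands for a static P2P VXLAN bridged to access_if."""
    vx = f"vxlan{vni}"  # VXLAN tunnel interface (placeholder naming scheme)
    br = f"br{vni}"     # Linux bridge joining the tunnel and the access port
    return [
        f"ip link add {vx} type vxlan id {vni} "
        f"local {local_ip} remote {remote_ip} dstport {dstport}",
        f"ip link add {br} type bridge",
        f"ip link set {vx} master {br}",
        f"ip link set {access_if} master {br}",
        f"ip link set {vx} up",
        f"ip link set {br} up",
        f"ip link set {access_if} up",
    ]


if __name__ == "__main__":
    for cmd in static_vxlan_commands(100, "192.0.2.1", "192.0.2.2", "eth1"):
        print(cmd)
```

This covers the functional side only; forwarding through the stock kernel bridge is unlikely to reach 10G wire speed at small packet sizes, which is where the "optimized forwarder" comes in.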

There are people out there who supposedly did that with Raspberry Pis and used Linksys routers. Not that you should do it, but it shows that there are options. Just don’t count on 10G!

Yan

You will have trouble finding such a device at the price you need because it is atypical to have your customer’s CPE as part of your Metro-E backbone.

Our sales people have asked for this more times than I can remember. We have continued to refuse for a reason. They’ve angled their query to extend our u-PE devices into the customer site, to which the customer can attach their CPE. We have refused that too, because most customers do not allow 3rd party fibre x-connects into their site (for example, some country’s embassy, a stock exchange building, a bank, etc.), never mind the fact that most customer sites are not fitted with 24/7/365 availability and security. And we continue to refuse.

My advice - don’t do it. But it sounds like you want to, so…

Mark.

Huawei NE8000-M1C

Putting the smart devices on the edge allows for a much-simplified core topology.

Putting smart devices in the edge does simplify the network, yes. What doesn't is making the customer's site part of your edge.

We've been running MPLS all the way into the access since 2009 (Cisco ME3600X/3800X). It is simpler than running an 802.1Q or Q-in-Q Metro-E backbone, and scales very well. Just leave your customers out of it.

Either way, I was doing research on FPGA-based hardware a couple of weeks ago and came across this which may tick all the boxes. ENET UEP-60 - Ethernity Networks I do not know the vendor personally and have not worked on their hardware, so your mileage may vary.

There aren't a great deal of options in this space, unfortunately. What is making it worse is that most traditional vendors are relegating devices designed for this to Broadcom chips. That is a problem because the closer you get to the customer, the more you need to "touch" their packets, and Broadcom chips, while fast and cheap, aren't terribly good at working with packets in the way that the customers these devices serve would like.

Cisco's ASR920 is still, by far, the best option here. Unfortunately, it has a very small FIB, does not do 10Gbps at any scale, and certainly does not do 100Gbps. But, because most customers tend to run only p2p EoMPLS services on it (which doesn't require a large FIB), the box is still actively sold by Cisco even though, in Internet years, it is older than my grandfather's tobacco pipe.

Juniper are pushing their ACX7024, which we are looking at as a viable option for replacing the ASR920. However, it's Broadcom... and while Nokia's Broadcom option for the Metro-E network uses the same chip as the Juniper one, Nokia seem less willing than Juniper to be creative with how they can touch customer packets.

Cisco's recommended upgrade path is the NCS540, also a Broadcom box; the heaviness that is IOS XR in a large scale deployment area like the Metro-E backbone notwithstanding. The rumour is that Cisco want to optimize Silicon One for their entire routing & switching range, small and large. I'll believe it when I see it. Until then, I wouldn't touch the NCS540.

Vendors are trying to do the least in the Metro-E space, knowing full well how high the margins are. It's a bit disingenuous, considering they will be shipping more Metro-E routers to customers than core or edge routers. But, it is what it is.

Mark.

The reason customers ask that their site be part of the operator’s Metro-E backbone is so that they can enjoy link redundancy without paying for it.

Operators will generally have east and west links coming out of a Metro-E site. Customers who single-home into this device only have their last mile as the risk. But if the operator drops a Metro-E node into the customer’s site, and cables it per standard, the customer has the benefit of last mile redundancy, because the internal fibre/copper patch to the operator’s Metro-E switch does not really count as a (risky) last mile.

Sales people like to do this to endear themselves to the customer. Customers like to do this to get a free meal. Don’t do it, because customers always assume that the Metro-E node that is in their building “belongs to them”.

Mark.

I envy folk who aren't mobile operators that are brave enough to run Huawei for their IP/MPLS network deliberately, i.e., without influence from "management" because they got a good deal :-).

Not for us.

Mark.

We use MPLS for this. We can have as many as 6 paths coming out of a single Metro-E node. MPLS will handle it just fine.

No Layer 2 option will work the way you want it to… they are simply not built for that level of redundancy or load balancing.

We have not tried to do this with VXLAN, and don’t intend to.

Mark.

Putting the smart devices on the edge allows for a much-simplified core topology.

Putting smart devices in the edge does simplify the network, yes. What doesn’t is making the customer’s site part of your edge.

If the customer’s site goes offline, that is their problem. A CPE device is still a CPE device, no matter how smart it is. Set up IS-IS, BGP to route servers, LDP + MPLS if you don’t go the VXLAN route, and that’s it. I know Ciena can do that on their more expensive 39xx models.

We’ve been running MPLS all the way into the access since 2009 (Cisco ME3600X/3800X). It is simpler than running an 802.1Q or Q-in-Q Metro-E backbone, and scales very well. Just leave your customers out of it.

There are a few tier 1’s that have delivered Ethernet transport circuits on those exact boxes in the field as I speak. It works very well.

I also agree with your stance on Broadcom, it’s hard to come up with alternatives that are not ADVA/Ciena/Cisco/RAD.

Ryan Hamel

I fully agree here too. That’s why I proposed a “smarter” CPE to replace the standard appliances deployed on site, where the only thing changing is the configuration on the device itself, not the product being handed off.

So you have two issues here:

Unless things changed, my understanding is Ciena’s implementation is MPLS-TP. Does anybody know if they now have full support for IP/MPLS in the way we have it with real router vendors?

Don’t know what “tier 1’s” means :-).

Well, the ME3600X/3800X has been EoL for quite some time now. But yes, it would work, especially if you don’t run BGP on it.

So the optical OEMs are not generally good options for routers of any kind. That knocks Adva, Ciena, Infinera, Xtera, Tejas, etc., off the list.

Nokia do have a decent IP/MPLS platform, thanks to ALU. But the Metro-E boxes they position for that segment - the 7250 IXR-e, IXR-s and IXR-x - are also using Broadcom.

Not interested in Huawei.

I like Mikrotik, but only as a self-managed CPE, and not for a service provider backbone.

Arrcus are currently focusing on the data centre.

Arista aren’t interested in the Metro-E space.

HP/3Com, Dell, Extreme - very unknown quantities that I’m not motivated to look into.

At the moment, the battle is really between Cisco’s NCS540 and Juniper’s ACX7100/7200 platforms. Both are Broadcom-based, but I think Juniper have the slightly better idea in terms of how much they can squeeze out of Broadcom re: how much one can touch a customer’s packets.

Mark.

I’m just always concerned about having my IP/MPLS core inside a customer site. But, YMMV. Personally, I wouldn’t.

We always ensure anything inside a customer site is its own broadcast, IP and MPLS domain, separate from ours, regardless of whether it is managed by us or the customer.

Mark.

I would never let the customer manage the CPE device, unless it was through some customer portal where automation can do checks and balances, nor have the device participate in a ring topology – home runs or bust. If the device fails or has an issue requiring a field dispatch, that is on the customer to help arrange that time and provide on-site contact info, otherwise the SLA clock stops ticking.

Now if the customer refuses to allow the vendor to pick up the CPE (regardless of make/model) and/or building aggregation/demarc + UPS hardware, the police can get called for theft of equipment depending on its value, or the customer/landlord is sued, depending on what the contract states.

As for Ciena’s SAOS feature set, I was only going by the RFCs and protocols listed on some of the higher-end CPE equipment. I do not have first-hand experience with them.

Tier 1’s as in Cogent, Level3/Lumen, Zayo, etc.

Juniper’s ACX7024 does look interesting as a building demarc/agg device, but overkill for a single client CPE. It can’t hold full tables for transit handoffs, but the customer can establish multi-hop BGP sessions upstream for that.