BIRD / BGP-ORR experiences?

Thinking about setting up BGP-ORR on some BIRD VMs (https://bird.network.cz). For lab purposes, I’m sure it's more than sufficient.

Does anyone use these in production? Any thoughts, experiences, caveats?

Do we even like BGP ORR?

Thanks in advance,

Deepak

Yes, but it needs to be planned carefully.

Nick

Hey,

Do we even like BGP ORR?

I like it. I think ADD-PATH and ORR are mandatory features in modern
RR infra. However, proper interaction between them may not exist in
every implementation. Basically you want to

a) send all ECMPable paths
b) send one backup path

This will be superior to full-mesh by every reasonable metric:
- smaller rib (no routes that you don't need, but all the routes you care about)
- redundancy (one iBGP down, is not customer outage)
- less state changes, less code stress
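
To make the "all ECMPable paths plus one backup" idea concrete, here is a
minimal Python sketch of the selection an RR could make before advertising
with ADD-PATH. The Path fields, the simplified preference ordering and the
function names are all hypothetical and only for illustration; real
implementations run the full BGP decision process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int
    igp_cost: int  # assumed input: IGP cost to the next hop


def preference_key(path):
    # Simplified ordering (an assumption, not the full BGP decision process):
    # higher local-pref, then shorter AS path, then lower IGP cost.
    return (-path.local_pref, path.as_path_len, path.igp_cost)


def paths_to_advertise(paths):
    """Return every path tied for best (ECMP-able) plus at most one backup."""
    if not paths:
        return []
    ranked = sorted(paths, key=preference_key)
    best_key = preference_key(ranked[0])
    ecmp = [p for p in ranked if preference_key(p) == best_key]
    # The backup is the next-best path after the ECMP set, if there is one.
    backup = ranked[len(ecmp):len(ecmp) + 1]
    return ecmp + backup
```

With two equal-cost exits and two worse ones, this advertises three paths
instead of four: the client gets its ECMP set and one backup, nothing else.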

ORR is not an RFC and there are some open questions. What to reflect,
when next-hop is not in IGP? Do we hope that receiver would recurse to
the same IGP next-hop? Juniper makes this assumption, which to me is
decidedly the common case. Cisco makes no assumption and doesn't
reflect if next-hop is not in IGP, but as I understand they will fix
to the same assumption as Juniper.
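
A rough Python sketch of the question above, under stated assumptions: the
RR runs SPF rooted at the client rather than at itself, and a flag picks
between the two behaviours described. The graph layout, the recursion map
(BGP next-hop to the IGP node it recurses to) and all function names are
hypothetical; this is not how any vendor actually implements it.

```python
import heapq


def spf(graph, root):
    """Dijkstra: IGP cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neigh, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                heapq.heappush(heap, (nd, neigh))
    return dist


def orr_best_next_hop(graph, client, next_hops, recursion=None,
                      assume_same_recursion=True):
    """Pick the next hop closest to the client, as seen from the client."""
    recursion = recursion or {}
    dist = spf(graph, client)
    candidates = []
    for nh in next_hops:
        if nh in dist:
            igp_node = nh  # next hop is directly in the IGP
        elif assume_same_recursion and nh in recursion:
            # Assume the client would recurse to the same IGP node we do.
            igp_node = recursion[nh]
        else:
            continue  # no assumption made: do not reflect this path
        if igp_node in dist:
            candidates.append((dist[igp_node], nh))
    return min(candidates)[1] if candidates else None
```

With assume_same_recursion=False, a path whose next-hop is only known via
iBGP is simply skipped, which matches the "doesn't reflect" behaviour
described above.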

When we wanted this bad, it wasn't ready (2014), so we ended up
deploying an RR in every major PoP, since that wasn't too costly (there
was a time when a network I knew of used a Juniper M120 as an RR).

Nice to hear ORR has come a long way that it's somewhat usable.

Mark.

hey,

Nice to hear ORR has come a long way that it's somewhat usable.

It is usable, we have taken it even a step forward:

- virtualized RR
- add-path
- ORR
- IGP topology to RR via BGP-LS so we don't have to extend ISIS to VMs (there are some issues with SR-IOV)

Awesome!

Never been a fan of Add-Path or Diverse-Path, but good to know it's
working well for you and others.

Were you previously running IS-IS on a UNIX/Linux system running in a VM?

Mark.

hey,

Were you previously running IS-IS on a UNIX/Linux system running in a VM?

No, we had RR function inline on ASBRs.

To be clear, our RRs are not BIRD but Nokia VSRs.

I was asking in relation to your IS-IS + SR-IOV issues.

Mark.

hey,

I was asking in relation to your IS-IS + SR-IOV issues.

Well, ISIS works with a bridge, but we like to keep our virtualized NFs simple, so the KVM hosts have a dedicated 10G port for NFs (which connects directly to a metro node) and we run SR-IOV.

Got you.

Mark.

When we wanted this bad, it wasn't ready (2014), so we ended up deploying an RR in every major PoP, since that wasn't too costly (there was a time when a network I knew of used a Juniper M120 as an RR).

Do we even like BGP ORR?

I like it, I think ADD-PATH and ORR are mandatory features in modern RR infra. However proper interaction between them may not exist in every implementation. Basically you want

a) send all ECMPable paths
b) send one backup path

This will be superior to full-mesh by every reasonable metric:
- smaller rib (no routes that you don't need, but all the routes you care about)
- redundancy (one iBGP down, is not customer outage)
- less state changes, less code stress

ORR is not an RFC and there are some open questions. What to reflect, when next-hop is not in IGP? Do we hope that receiver would recurse to the same IGP next-hop? Juniper makes this assumption, which to me is decidedly the common case. Cisco makes no assumption and doesn't reflect if next-hop is not in IGP, but as I understand they will fix to the same assumption as Juniper.

Don't run Cisco ORR RR or have IGP next-hops :confused:

How is this approach working for you?

It's working out beautifully, since 2014.

We wanted ORR at the time, but it was immature, so this was our only option.

Yes, it's an old school approach, but it's simple, so we don't have to
enable any trickery.

We were considering BGP ORR on bare metal (maybe a VM if we can get OSPF into the VM) and putting a primary RR at each eBGP node (physical site) and at least one link to a backup RR at a different site, perhaps enabling BGP ORR for that to minimize suboptimal paths.

Adding a VM or a server node for this function is hardly the technical challenge it used to be with so many linux-based white box switches and things running around nowadays.

Well, our RRs run on ESXi (Cisco CSR1000v), but I'm sure your favorite
hypervisor will work just fine. Your main concern is whether you want to
run an established code base (Cisco, Juniper, Nokia, etc.), or if you
want to run something more open source. We chose the former considering
our iBGP is a critical piece of infrastructure.

Mark.

I should imagine NEXT_HOP=self still works in an ORR world, non :-)?

Mark.

Does it break NEXT_HOP=self in Cisco-land?

Mark.

We’re testing ORR at the moment as part of core upgrades (XRv on ESXi), and next-hop self not only works, it’s required for ORR to work properly.

I’ve not noticed any major issues with it yet, but it’s still early days in terms of our deployment.

Yes, that would be my simple 1+1, as it's all about optimizing for the
best IGP exit for far-away nodes.

Mark.

That would be in IGP, so that'll work. The other way that some people
do this, is that next-hop is CE, which is in iBGP, but recurses to
loop0. There are some TE reasons why people might do this, and it
would not work with Cisco ORR.