RFC 5549 - IPv4 Routes with IPv6 next-hop - Does it really exist?

Let’s just skip all the arguing about the lack of IPv4, the need for IPv6, etc.

I must confess that I don’t know all the RFCs.
I would like to, but I don’t!

And today I came across https://tools.ietf.org/html/rfc5549

I knew that it was possible to carry v4 routes over v6 BGP sessions, or v6 routes over v4 BGP sessions.
But I was surprised when I saw this YouTube video of the AMS-IX guys considering a v6-only LAN, with v6 next-hops for v4 routes.

https://www.youtube.com/watch?v=uJOtfiHDCMw

Well… I guess that idea didn’t go to production.

But the questions are:
Is there any network that really implements RFC 5549?
Can anyone share some information about it?

Hello,

This is implemented in FRR and will also be available in BIRD 2.0.8.
Linux accepts IPv6 next-hop for IPv4 natively since 5.3 (no tunnels).
This is the solution Cumulus is advocating to its users, so I suppose
they have some real users behind that. Juniper also supports RFC 5549
but, from the documentation, the forwarding part is done using
lightweight tunnels.

Maybe David Ahern is reading this list and could comment more. I don't
use this solution myself as the vendor support is still quite limited
but if I were to start a network from scratch, I would definitely go
for it.

I'm not sure if you claim otherwise, but no real 'tunneling' takes
place. As far as I know, having an IPv6 next-hop for IPv4 is an
internal implementation detail. I don't think there are any additional
headers, or any additional lookup or cost.
Cisco supports extended nexthop encoding too, so it is fairly well
supported by shipping products.

❦ 29 July 2020 12:13 +03, Saku Ytti:

This is the solution Cumulus is advocating to its users, so I suppose
they have some real users behind that. Juniper also supports RFC 5549
but, from the documentation, the forwarding part is done using
lightweight tunnels.

I'm not sure if you claim otherwise, but no real 'tunneling' takes
place. As far as I know, having an IPv6 next-hop for IPv4 is an
internal implementation detail. I don't think there are any additional
headers, or any additional lookup or cost.

I didn't test, but the documentation states:

Starting in Release 17.3R1, Junos OS devices can forward IPv4 traffic
over an IPv6-only network, which generally cannot forward IPv4
traffic. As described in RFC 5549, IPv4 traffic is tunneled from CPE
devices to IPv4-over-IPv6 gateways. These gateways are announced to
CPE devices through anycast addresses. The gateway devices then create
dynamic IPv4-over-IPv6 tunnels to remote customer premises equipment
and advertise IPv4 aggregate routes to steer traffic. Route reflectors
with programmable interfaces inject the tunnel information into the
network. The route reflectors are connected through IBGP to gateway
routers, which advertise the IPv4 addresses of host routes with IPv6
addresses as the next hop.

https://www.juniper.net/documentation/en_US/junos/topics/topic-map/multiprotocol-bgp.html#id-configuring-bgp-to-redistribute-ipv4-routes-with-ipv6-next-hop-addresses

If you have a pointer around the subject on Juniper, I would be quite
interested!

Thanks.

I think the only disconnect here is the definition of tunnel. There are
no additional headers; I don't think the document implies any, and the
RFC it refers to does not. I've not tried it myself, but my
expectation is that internally the next-hop is represented as IPv6,
with the IPv4 resolution copied for L2. So I anticipate the magic to be
local here, and when they talk about a tunnel, I suspect they refer to
that adjacency as a tunnel.

A long time ago I tried it out:

https://blog.acostasite.com/2013/02/publicar-prefijos-ipv4-sobre-una-sesion.html

https://blog.acostasite.com/2013/02/publicando-prefijos-ipv6-sobre-sesiones.html

I did not like it: troubleshooting is difficult in case something goes wrong (however, I can understand it’s a nice feature to have and it might be useful in some scenarios).

But you are right, I do not know of many networks doing it. I would also like to hear about it.

Alejandro,

Hey,

https://blog.acostasite.com/2013/02/publicar-prefijos-ipv4-sobre-una-sesion.html
https://blog.acostasite.com/2013/02/publicando-prefijos-ipv6-sobre-sesiones.html

I did not like it: troubleshooting is difficult in case something goes wrong (however, I can understand it's a nice feature to have and it might be useful in some scenarios).

Your experiment predates extended nexthop encoding, but otherwise it
is indeed the very same thing. Just less operational overhead now.

Of course everyone has done 6PE and 6VPE for the longest time, because
obviously you can fit an IPv4 next-hop in the IPv6 encoding, so nothing
was needed. This extended nexthop encoding only exists to fix the
problem that the wire format didn't support signalling an IPv6 next-hop
for IPv4 NLRI.
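
To make the wire-format point concrete, here is a rough Python sketch of the two pieces involved, as I read RFC 5549: the capability that says "I accept IPv4 unicast NLRI with an IPv6 next-hop", and the value of an MP_REACH_NLRI attribute carrying an IPv4 prefix with a 16-octet next-hop. The function names, the prefix and the next-hop address are made up for illustration; the attribute header and the rest of the UPDATE are left out.

```python
import ipaddress
import struct

AFI_IPV4, AFI_IPV6 = 1, 2
SAFI_UNICAST = 1
CAP_EXTENDED_NEXT_HOP = 5  # capability code assigned by RFC 5549


def extended_nexthop_capability() -> bytes:
    """Capability TLV announcing IPv4 unicast NLRI with an IPv6 next-hop."""
    # The value is a list of (NLRI AFI, NLRI SAFI, next-hop AFI) triples,
    # two octets per field. Only one triple is sent here.
    value = struct.pack("!HHH", AFI_IPV4, SAFI_UNICAST, AFI_IPV6)
    return struct.pack("!BB", CAP_EXTENDED_NEXT_HOP, len(value)) + value


def mp_reach_ipv4_via_ipv6(prefix: str, next_hop: str) -> bytes:
    """Value of an MP_REACH_NLRI attribute (attribute header omitted)."""
    net = ipaddress.IPv4Network(prefix)
    nh = ipaddress.IPv6Address(next_hop).packed  # always 16 octets
    # NLRI: prefix length in bits, then only the significant prefix octets.
    significant = (net.prefixlen + 7) // 8
    nlri = bytes([net.prefixlen]) + net.network_address.packed[:significant]
    return (
        struct.pack("!HBB", AFI_IPV4, SAFI_UNICAST, len(nh))
        + nh
        + b"\x00"  # reserved octet
        + nlri
    )


# Illustrative values only (documentation prefixes).
cap = extended_nexthop_capability()
attr = mp_reach_ipv4_via_ipv6("192.0.2.0/24", "2001:db8::1")
print(cap.hex())
print(attr.hex())
```

The AFI/SAFI in the attribute still say IPv4 unicast; only the next-hop length (16 instead of 4) gives away that the next-hop is IPv6, which is why both sides have to agree on it via the capability first.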

Douglas Fischer writes:

And today I came across https://tools.ietf.org/html/rfc5549

[...]

But the questions are:
Is there any network that really implements RFC 5549?

We've been using it for more than two years in our data center networks.
We use the Cumulus/FRR implementation on switches and FRR on Ubuntu on
servers.

Can anyone share some information about it?

Sure. We found the FRR/Cumulus implementation very easy to set up. We
have leaf/spine networks interconnecting hundreds of servers (IPv4+IPv6)
with very minimalistic configuration. In particular, you generally
don't have to configure neighbor addresses or AS numbers, because those
are autodiscovered. I think we're basically following the
recommendations in the "BGP in the Data Center" book including the "BGP
on the Host" part (though our installation predates the book, so there
might be some differences).

The network has been working very reliably for us, so we never really
had anything to debug. If you're coming from a world where you used
separate BGP sessions to exchange IPv4 and IPv6 reachability
information, then the operational commands take a little getting used
to, but in the end I find it very intuitive.

For example, here's one of the "show bgp ... summary" commands on a leaf
switch:

    leinen@sw-f:mgmt-vrf:~$ net show bgp ipv6 uni sum
    BGP router identifier 10.1.1.46, local AS number 65111 vrf-id 0
    BGP table version 96883
    RIB entries 1528, using 227 KiB of memory
    Peers 54, using 1041 KiB of memory
    Peer groups 2, using 128 bytes of memory

    Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
    sw-o(swp16) 4 65108 953559 938348 0 0 0 03w5d00h 688
    sw-m(swp18) 4 65108 885442 938348 0 0 0 03w5d00h 688
    s0001(swp1s0.3) 4 65300 748971 748977 0 0 0 03w5d00h 1
    s0002(swp1s1.3) 4 65300 661787 661794 0 0 0 03w1d23h 1
    s0003(swp1s2.3) 4 65300 748970 748977 0 0 0 03w5d00h 1
    s0004(swp1s3.3) 4 65300 661868 661875 0 0 0 03w1d23h 1
    s0005(swp2s0.3) 4 65300 748970 748976 0 0 0 03w5d00h 1
    [...]

Note the host names/interface names - this is how you generally refer to
neighbors, rather than using literal (IPv6) addresses.

Otherwise it should look very familiar if you have used vendor C's
"industry-standard CLI" before.

(In case you're wondering, the first two neighbors in the output are
spine switches, the others are servers.)

Cheers,

Are the names based on DNS look-ups, or is there some kind of protocol
association between the device underlay and its hostname, as it pertains
to neighbors?

Mark.

afaik, this is an implementation of draft-walton-bgp-hostname-capability.

Nick

Nice.

I'm curious to know if this is after-the-fact, as I can't think of a way
that BGP would find hostnames to set up sessions with, outside of some
kind of upper layer name resolution capability.

The draft isn't clear on how this happens, if it is, indeed,
before-the-fact.

Mark.

it's a capability negotiation, so is handled on session setup.

Nick

I'm not sure I understand what the option space is. This is like IS-IS
TLV 137: the protocol will populate some trash there and you'll politely
access it. It won't allow you to refer to the peer by any name prior to
having the session up. Much like you won't see an IS-IS neighbour's name
while the session is establishing, until it has actually loaded and
processed TLV 137.
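
A small Python sketch of that ordering, for what it's worth. It assumes the capability uses code 73 and carries a length-prefixed hostname followed by a length-prefixed domain name, which is my reading of draft-walton-bgp-hostname-capability and should be checked against the draft. The `Peer` class and its method names are hypothetical, not anyone's implementation.

```python
CAP_FQDN = 73  # assumed code point for the hostname/FQDN capability


def encode_fqdn_capability(hostname: str, domain: str = "") -> bytes:
    """Build the capability TLV: code, length, then two length-prefixed strings."""
    h, d = hostname.encode(), domain.encode()
    value = bytes([len(h)]) + h + bytes([len(d)]) + d
    return bytes([CAP_FQDN, len(value)]) + value


def decode_fqdn_capability(tlv: bytes) -> tuple[str, str]:
    """Return (hostname, domain) from a capability TLV built as above."""
    code, length = tlv[0], tlv[1]
    if code != CAP_FQDN:
        raise ValueError(f"unexpected capability code {code}")
    value = tlv[2:2 + length]
    hlen = value[0]
    hostname = value[1:1 + hlen].decode()
    dlen = value[1 + hlen]
    domain = value[2 + hlen:2 + hlen + dlen].decode()
    return hostname, domain


class Peer:
    """Hypothetical neighbor entry, keyed by what was configured locally."""

    def __init__(self, configured_as: str):
        self.configured_as = configured_as  # literal address or interface
        self.hostname = None                # unknown until OPEN is processed

    def on_open(self, capability: bytes) -> None:
        self.hostname, _domain = decode_fqdn_capability(capability)

    def label(self) -> str:
        if self.hostname is None:
            return self.configured_as
        return f"{self.hostname}({self.configured_as})"


peer = Peer("swp16")
before = peer.label()                          # only the configured identifier
peer.on_open(encode_fqdn_capability("sw-o"))   # name arrives in the peer's OPEN
after = peer.label()
print(before, "->", after)
```

The point is only that the name is learned from the peer's OPEN, so it can decorate the neighbor in show commands but cannot be what you configure the session with.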

In reality, next hop isn’t really a layer 3 address. The layer 3 address is a stand-in that is resolved to
a layer 2 address for forwarding. The layer 3 next-hop address never makes it into the packet.
As such, the relationship between the destination address family and the next-hop address
family is mostly to avoid breaking the brains of humans. Software to handle mixed-address-families
in next hop vs. destination should be a relatively trivial difference from software that requires the
address families to match.
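
A toy illustration of that point in Python: the forwarding table maps an IPv4 prefix to an IPv6 next-hop, the neighbor table resolves that next-hop to a MAC address, and the only things that reach the frame are the MAC and the IPv4 ethertype. All prefixes, addresses, interface names and MACs below are invented, and a real FIB obviously does not do a linear scan.

```python
import ipaddress

ETHERTYPE_IPV4 = 0x0800

# Hypothetical FIB: IPv4 prefix -> (IPv6 next-hop, outgoing interface)
fib = {
    ipaddress.ip_network("198.51.100.0/24"): (ipaddress.ip_address("fe80::1"), "swp1"),
    ipaddress.ip_network("0.0.0.0/0"): (ipaddress.ip_address("fe80::2"), "swp2"),
}

# Hypothetical neighbor table, as IPv6 neighbor discovery would populate it
neighbors = {
    (ipaddress.ip_address("fe80::1"), "swp1"): "02:00:00:00:00:01",
    (ipaddress.ip_address("fe80::2"), "swp2"): "02:00:00:00:00:02",
}


def forward(dst: str) -> dict:
    """Longest-prefix match, then resolve the next-hop to a layer 2 address."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in fib if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    next_hop, ifname = fib[best]
    mac = neighbors[(next_hop, ifname)]
    # The IPv6 next-hop is used only for this lookup. The frame gets the
    # resolved MAC and the IPv4 ethertype; the payload stays plain IPv4.
    return {"out_if": ifname, "dst_mac": mac, "ethertype": ETHERTYPE_IPV4}


print(forward("198.51.100.7"))
print(forward("203.0.113.9"))
```

Swapping the next-hop for an IPv4 address would only change which neighbor table (ARP instead of ND) supplies the MAC; the frame on the wire would look the same.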

Owen

I wish you had shared in the draft process so they could have
benefitted from your insight into the proper verbiage.

Meaning the initial setup would still require the use of literal IP
addresses?

Mark.

The IS-IS comparison came to mind as well, yes. But IS-IS is different
in that LSPs are dynamically flooded upon activation on an interface,
and those LSPs carry router information, including hostname.

BGP is not dynamically set up, so in my mind, you still need literal IP
addresses to set sessions up, and then this BGP hostname capability
would translate those IP addresses to remote hostnames after the
sessions have been established.

I just wanted to clarify if this is the practical implementation and
operation of the same, as the draft isn't specific on it, as I'm sure
others may consider the same thought process.

Mark.

Unless your equipment (e.g. DC equipment) is set up for automatic BGP
neighbour discovery using IPv6 ND+RA [2]. Then yes.

[0]: https://github.com/FRRouting/frr/commit/04b6bdc0ee6275442464edec1d14b3f4d3eaa246
[1]: https://github.com/FRRouting/frr/search?p=3&q=hostname&type=Commits
[2]: https://docs.cumulusnetworks.com/cumulus-linux/Layer-3/Border-Gateway-Protocol-BGP/#configure-bgp-unnumbered-interfaces

You can't use hostnames, if that's what you're asking. FRR will also do
unnumbered BGP with auto-config.

Nick

You can't use hostnames, if that's what you're asking.

Yes, couldn't fathom how.

So really it's convenience of troubleshooting, not convenience of setup
:-). I can live with that.

FRR will also do unnumbered BGP with auto-config.

Interesting...

Mark.