Best practice for BGP sessions / full routes for customers

Hello everyone!

I have a quick question: how do you provide a full BGP table to
downstream customers?

Most large networks have a few border routers ("Internet gateways") which
get a full table feed, and then "access routers" on which customers are
terminated. It doesn't make sense to push the full routing table onto the
access routers, so they simply point a default at the border routers.

In this scenario, what is the best practice for giving a full table to a downstream customer?

   1. Having a multi-hop BGP session to a loopback on the "border router"
   for injecting the full table into the customer router, and another BGP
   session with the access router for receiving the customer's routes?
   (messy!)

   2. Injecting the full table into all access routers so that it can be
   provided whenever needed?

   3. Any other?

Thanks in advance!

1. You already know that multihop is very ugly. If it's for a one-off, it's probably fine. But building a product around multi-hop wouldn't be my first choice.
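
Roughly, option 1 looks like this from the customer's side (IOS-style syntax; all ASNs and addresses here are made up, and this is a sketch of the arrangement, not a recommendation):

  ! Customer router (AS 65001); provider is AS 64500.
  router bgp 65001
   ! Multi-hop session to the border router's loopback, which
   ! injects the full table.
   neighbor 192.0.2.1 remote-as 64500
   neighbor 192.0.2.1 ebgp-multihop 2
   ! Directly-connected session to the access router, over which
   ! the customer announces its own prefixes.
   neighbor 198.51.100.1 remote-as 64500
  ! The border loopback must be reachable, e.g. via a static
  ! route through the access router.
  ip route 192.0.2.1 255.255.255.255 198.51.100.1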

2. Most of the router/switch platforms that can support a full table are pretty expensive per port. Your best bet here might be to look into some way of transparently dragging customer traffic from the PE to the BGP speaker, which leads me to:

3. If your network is MPLS enabled, you can do a routed pseudowire from a BGP-speaking router with a full table to the access router (PE). Other tunnelling technologies can probably do the same thing; GRE and L2TPv3, and even a plain ol' VLAN can do it too, depending on your network topology. Run some sort of OAM over the top of any of those (if your platform supports it) and it looks just like a wire to the end customer.
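
To make that concrete, a routed pseudowire on Cisco gear might look something like the sketch below. All addresses and IDs are made up, the PE side uses ME3600X-style EVC syntax, and the exact commands vary by platform and release:

  ! On the access router (PE): drag the customer-facing port to
  ! the full-table BGP speaker at 192.0.2.1 over an MPLS
  ! pseudowire.
  interface GigabitEthernet0/1
   service instance 10 ethernet
    encapsulation dot1q 100
    xconnect 192.0.2.1 10 encapsulation mpls

  ! On the BGP speaker: terminate the pseudowire on a routed SVI
  ! and run eBGP with the customer across it.
  interface Vlan100
   ip address 203.0.113.1 255.255.255.252
   xconnect 192.0.2.2 10 encapsulation mpls

From the customer's point of view, the eBGP session is single-hop to 203.0.113.1, even though the full-table router is several hops away.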

In our case, we have three types of edge routers: the
Juniper MX480, the Cisco ASR1006, and the Cisco ME3600X.

The MX480 and ASR1006 have no problems supporting a full
table, so customers peer with those natively.

The ME3600X is a small switch that supports only up to
24,000 IPv4 and 5,000 IPv6 FIB entries. However, Cisco have
a feature called BGP Selective Download:

  http://tinyurl.com/nodnmct

Using BGP-SD, we can send a full BGP table from our route
reflectors to our ME3600X switches, without worrying about
them entering the FIB, i.e., they are held only in memory.
The beauty - you can advertise these routes to customers
natively, without clunky eBGP Multi-Hop sessions running
rampant.

Of course, with BGP-SD, you still need a 0/0 + ::/0 route in
the FIB for traffic to flow from your customers upstream,
but that is fine as it's only two entries :-).
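
The configuration for this is small. A minimal sketch (the route-map name and next-hops are made up, the iBGP sessions to the route reflectors are omitted, and you should check BGP-SD support per address family on your release):

  ! A deny-all route-map: with the "filter" keyword, the
  ! table-map stops matching routes from being installed in the
  ! RIB/FIB, while BGP still holds them in memory.
  route-map BGP-SD deny 10

  router bgp 64500
   address-family ipv4
    table-map BGP-SD filter
   address-family ipv6
    table-map BGP-SD filter

  ! The 0/0 + ::/0 mentioned above, so customer traffic can
  ! still flow upstream.
  ip route 0.0.0.0 0.0.0.0 192.0.2.1
  ipv6 route ::/0 2001:DB8::1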

If your system supports a BGP-SD-type implementation, I'd
recommend it, provided you have sufficient control plane
memory.

Cheers,

Mark.

We prefer Layer 2 bundling technologies like 802.1AX, POS
bundles or ML-PPP.

However, some customers just can't support this, but have
multiple links to us and need load sharing. In this case,
eBGP Multi-Hop is a reasonable use-case.
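
On the provider side that looks roughly like the sketch below (IOS-style syntax; all addresses and ASNs are made up): the session runs between loopbacks, and two equal-cost statics to the customer's loopback give per-link load sharing.

  router bgp 64500
   neighbor 203.0.113.1 remote-as 65001
   neighbor 203.0.113.1 ebgp-multihop 2
   neighbor 203.0.113.1 update-source Loopback0

  ! One static per physical link, so traffic to the customer's
  ! loopback load-shares across both circuits.
  ip route 203.0.113.1 255.255.255.255 198.51.100.2
  ip route 203.0.113.1 255.255.255.255 198.51.100.6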

Mark.

Nasty, as I generally walk away from centralization.

However, if that's your only option...

Mark.

Mark,

BGP-to-RIB filtering (in any vendor's implementation) targets RRs
which are not in the forwarding path, so there's no forwarding towards
any destination filtered out of the RIB.
Using it selectively on a forwarding node is error-prone and, in case
of incorrect configuration, would result in blackholing.

Cheers,
Jeff

There are other drawbacks too: the difference in convergence time
between < 24k prefixes and a full DFZ table is usually going to be
large, although I haven't tested this on an ME3600X yet. Also, these
boxes only have 1G of memory, which might be a bit tight as the DFZ
grows. For sure, it's already not enough on a bunch of other vanilla
IOS platforms.

Nick

As with every feature on a router, you need to know what
you're doing to make it work.

Don't blame the cows if you turn on knobs you have no
business using, or don't care to learn the risks of.

We use this feature in our network successfully, because we
know what we're doing, and care to understand the risks.

If I use it in a manner other than previously directed
(while I know that's a use-case, I've never heard any
vendor say it ONLY targeted out-of-path route reflectors,
but then again, I don't generally walk vendor corridors
for the scoop), well, welcome to the Internet; where core
routers can either be behemoths that move air, are the
size of a football field and could be mistaken for seismic
detection machines, or last generation's x86 home desktop
running Quagga and grandma's health app :-).

Mark.

Thanks everyone for insightful answers!

There are other drawbacks too: the difference in
convergence time between < 24k prefixes and a full DFZ
table is usually going to be large, although I haven't
tested this on an ME3600X yet.

Not having to install the routes into FIB (even on software-
based platforms) makes a ton of difference.
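
A quick way to see this on the box (the prefix here is just an example, and exact output varies by release):

  show ip bgp 198.51.100.0       ! route is in the BGP table
  show ip route 198.51.100.1     ! but it is not in the RIB
  show ip cef 198.51.100.1       ! forwarding falls back to 0/0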

Our testing when using this feature on the ME3600X has
shown:

  1. The switch will download a full copy of the IPv6
     table of 18,282 entries in 1 second. This is from
     2x local route reflectors, so no latency.

  2. The switch will download a full copy of the IPv4
     table of 499,437 entries in 3 minutes, 10
     seconds. This is from 2x local route reflectors,
     so no latency.

IPv4 convergence consumed between 12% and 30% CPU
utilization during the table download, which is expected
given the size of the IPv4 table. IPv6 didn't bother the
switch in any way.

The CPU on the ME3600X is a little slow; we've seen far
better IPv4 BGP table download times on meatier CPUs, and
on the CSR1000v, which runs on servers that kick typical
router CPUs into the stone age.
  

Also, these boxes only have 1G of memory, which might be
a bit tight as the DFZ grows. For sure, it's already not
enough on a bunch of other vanilla IOS platforms.

Total memory utilized (for 2x full BGPv4 and BGPv6 feeds,
and after IOS deducts system memory for itself) came to
370MB.

That left 424MB of memory free.

Code is 15.4(2)S.

Cheers,

Mark.