BGP prefix filter list

If you’re going whitebox, I would check out Netgate’s new product called TNSR. It uses VPP for the data plane, which does all its processing in user space, thus avoiding the inefficiencies of the kernel network stack. That’s particularly important at higher speeds like 40G or 100G.

Disclaimer: I have not tried it myself, but I've heard only good things.

Hello

What is the most common platform people are using with such limitations? How long ago was it deprecated?

We are a small network with approximately 10k customers and two core routers. The routers are advertised as supporting 2 million FIB entries and 10 million RIB entries.

This morning at about 2 AM CET, the iBGP session between our two core routers started flapping every 5 minutes, which is how long it takes to exchange the full table between the routers. The eBGP sessions to our transits were stable and never went down.

The iBGP session is an MPLS multiprotocol BGP session that exchanges IPv4, IPv6, and VRF routes in a single session.

We work closely with another ISP that has the same routers; their network went down as well.

Nothing helped until I culled the majority of the IPv6 routes by installing a default IPv6 route together with a filter that drops every IPv6 route received from our transits. After that I could not experiment any further; that will have to wait for a maintenance window during the night.
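
(Purely for illustration, not actual router policy: the logic of that workaround amounts to something like the Python sketch below. The route/session shapes are invented.)

    # Toy sketch of the emergency workaround: a static ::/0 default keeps
    # IPv6 reachable while every IPv6 route learned from transit is
    # rejected, emptying the shared table. Names and shapes are hypothetical.

    def accept_route(route, session):
        # route is e.g. {"afi": "ipv6", "prefix": "2001:db8::/32"}
        if session == "transit" and route["afi"] == "ipv6":
            return False        # drop all transit-learned IPv6 routes
        return True             # IPv4 and iBGP routes are untouched

    static_routes = ["::/0 via transit-a"]  # default preserves reachability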

These routers have a shared memory space for IPv4 and IPv6. My theory is that the combined prefix count is causing the problem. But it could also be an IPv6 prefix, first seen tonight, that triggers a bug. Or something else.

Regards,

Baldur

Hello Baldur,

What routers are you running?

-Mike

My purpose is not to shame the vendor, but anyway: these are ZTE M6000. We are currently planning to implement Juniper MX204s instead, though not because of this incident; we simply ran out of bandwidth, and brand-new MX204s are cheaper than 100G-capable shelves for the old platform.

Regards,

Baldur

We’re an eyeball network. We accept default routes from our transit
providers, so in theory there should be no impact on reachability.

I’m pretty concerned about the problems I can’t see that are caused by
inefficient routing, e.g. customers hitting a public anycast DNS server
in the wrong location, resulting in geolocation issues.

Ah! Understood. The default route(s) was the bit I missed. Makes a lot of
sense if you can’t justify buying new routers.

Have you seen issues with Anycast routing thus far? One would assume that
routing would still be fairly efficient unless you’re picking up transit
from non-local providers over extended L2 links.

We’ve had no issues so far but this was a recent change. There was no
noticeable change to outbound traffic levels.

+1, there is no issue with this approach.

I have been taking “provider routes” + default for a long time; it works great.

This makes sure you use each provider’s “customer cone” and SLA to the max while reducing your route load / churn.

IMHO, you should only take full routes if your core business is providing full BGP feeds to downstream transit customers.
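
(A minimal sketch of the idea, assuming your provider tags its customer-cone routes with a BGP community; 64500:100 is a made-up value, so check your provider's community guide. Router policy would do this for real; Python is just to show the logic.)

    # Keep the provider's customer-cone routes plus the default; drop the
    # rest. Community value and prefixes below are illustrative only.

    CUSTOMER_COMMUNITY = "64500:100"     # hypothetical, provider-specific

    def keep(route):
        if route["prefix"] in ("0.0.0.0/0", "::/0"):
            return True                  # always keep the default
        return CUSTOMER_COMMUNITY in route["communities"]

    rib_in = [
        {"prefix": "0.0.0.0/0", "communities": []},
        {"prefix": "198.51.100.0/24", "communities": ["64500:100"]},
        {"prefix": "203.0.113.0/24", "communities": ["64500:200"]},  # peer
    ]
    accepted = [r for r in rib_in if keep(r)]   # default + customer route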

I wouldn’t call it shaming the vendor. There are a ton of platforms out there by nearly every vendor that can’t accommodate modern table sizes.

You can’t do uRPF if you’re not taking full routes.

You also have a more limited set of information for analytics if you don’t have full routes.

> You can’t do uRPF if you’re not taking full routes.

I would never do uRPF; I am not a transit shop, so no problem there. BCP38 is as sexy as I get.

> You also have a more limited set of information for analytics if you don’t have full routes.

Yep, I don’t run a sophisticated internet CDN either. Just pumping packets from eyeballs to clouds and back, mostly.

As an eyeball network myself, you’ll probably want to look at those things. You don’t need to run a CDN to know where your bits are going.

At a previous company, about 10-ish years ago, we had the same problem due to equipment limitations and weren’t able to get dollars to upgrade anything.

The most effective thing for me at the time was to start dumping any prefix with an as-path length longer than 10. For our business then, if you were that ‘far away’, there wasn’t any good reason for us to keep your route; following default was going to be good enough.

It’s still a reasonable solution in a lot of cases, I think, to filter out a lot of the unnecessary prepend messes out there today.
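
(A sketch of that cutoff in Python, if it helps; the threshold of 10 is the one mentioned above, and the example paths are invented. On a real router this would be an as-path policy, of course.)

    # AS_PATH length here counts every hop including prepends, which is
    # exactly why heavily prepended "far away" prefixes get dropped; the
    # default route covers reachability for them.

    MAX_AS_PATH_LEN = 10                 # the cutoff from the post above

    def keep(as_path):
        return len(as_path) <= MAX_AS_PATH_LEN

    print(keep([64500, 64501, 64502]))       # True: nearby route, kept
    print(keep([64500] + [64510] * 12))      # False: prepend mess, dropped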

It is a quagmire, isn't it?

The revenue from capacity (Ethernet, IP, DWDM, SDH) is falling every
year, to the point where it stops being a primary revenue source for
any telecoms provider. However, the cost of equipment is not following
suit, be it on the IP, transport, or mobile side; terrestrial, marine,
or wireless.

Work that is going on in the open space around all of this, for hardware
and software, needs to pick up its pace; otherwise this disconnect
between the loss of revenue and the cost of capex will remain.

Mark.

Hi Baldur,

Have you tried disabling storage of received updates from your upstreams on your edge/PE or border routers? Just remove soft-reconfiguration inbound from the eBGP peerings with your upstream(s). This will resolve your issue.

If you have multiple links to different upstream providers and want to simplify your network operations, you might want to introduce a pair of route reflectors to handle all your IP and MPLS VPN routes…

Cheers,
Ahad

Ca, taking a self-originated default route (with or without an additional partial view of the global routing table) from your transit provider’s edge router seems to make the assumption that your transit provider’s edge router either has a full table or a working default route itself. In the case of transit provider outages (planned or unplanned), the transit provider’s edge router that you peer with may be up and reachable (and generating a default route to your routers), but may not have connectivity to the greater internet. Put another way, if your own routers don’t have a full routing table then they don’t have enough information to make intelligent routing decisions and are offloading that responsibility onto the transit provider. IMHO, what’s the point of being multi-homed if you can’t make intelligent routing decisions and provide routing redundancy in the case of a transit provider outage?

Speaking of "intelligent routing", this is why targeting what you filter by some criterion other than prefix length or as-path length is a good idea: either manually every once in a while (just make sure you at least review the situation every few weeks), or in an automated manner (better). You just need more data (usually *flow/IPFIX based) in order to make good decisions.

You can use traffic levels (or better, the lack of traffic), traffic criticality, and prefix-count savings as criteria.
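
(A rough sketch of the flow-driven variant, assuming you already export flow data somewhere; the collector query is omitted and every number and prefix below is invented. Rank prefixes by observed traffic, keep the heavy hitters as specifics, and let everything else follow the default.)

    # Keep only the top-N prefixes by traffic within the FIB budget; the
    # rest follow the default route. Data would come from NetFlow/IPFIX.

    FIB_BUDGET = 500_000        # what your hardware can actually hold

    traffic_by_prefix = {       # bytes per day, from your flow collector
        "198.51.100.0/24": 9_200_000_000,   # busy anycast DNS node
        "192.0.2.0/24": 1_400_000_000,
        "203.0.113.0/24": 3_500,            # a few packets a day
    }

    ranked = sorted(traffic_by_prefix, key=traffic_by_prefix.get,
                    reverse=True)
    fib_routes = set(ranked[:FIB_BUDGET])   # heavy hitters stay specific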

The intelligence of the routing decisions one can make is directly correlated to the data they have available (their routing protocol and table). For example, with static default routes one can only make the simplest of routing decisions; with dynamic default routes one can make more informed decisions; with a partial view of the internet one can make even better decisions; with a full view of the internet one can make very good decisions; and with a routing protocol that takes into account bandwidth, latency, loss, or other metrics one can make the very best decisions.

Determining how intelligent one wants his or her decisions to be, and how much he or she is willing to spend to get there, is an exercise for the reader. Not all routers need a full view of the internet, but some do. Routers that hold a full routing table in FIB generally cost more than those that do not, but overall they are not cost-prohibitive (in my opinion) for the folks who are already paying to be multihomed. Single-homed networks (or those with a single transit provider and additional peers) probably won't benefit from holding more than a default route to their transit provider and therefore may be able to get by with a less capable router. Each network is different, and the choices driven by the needs for redundancy, availability, performance, and cost will come out differently as well.

I wanted to mention one additional important point in all this monitoring discussion.
Right now, Google services have stopped working for one of my subnets.
Why? Because it seems someone in Russia did a BGP hijack, but exclusively toward Google services (most likely via some kind of peering).
Quite by chance, I noticed that the traceroute from Google Cloud to this subnet goes through Russia, although my country has nothing to do with Russia at all, not even transit traffic.
Sure, I mailed noc@google, but reaching someone at a big company is not the easiest job; you need to hunt for a contact that answers. And good luck with real-time communication.
Also, all large CDNs have their own "internet": they run BGP, but they often interpret it in their own way, which no one but them can monitor or keep history of. No looking glass either, for sure.
If your network is announced by a malicious party in another country, you will not even know about it, but your requests (actually, the answers from the service) will go through that party.

For me, the routing table and the available routing protocols are not the only things needed for intelligent routing, and the router is not the only component involved in "intelligent routing". Not these days; not anymore.

One thing that can help immensely in an internet environment is knowing where the data goes and where it comes from. Knowing your "important" traffic sources and destinations is part of it.

You can say "I can no longer keep all the routes in FIB, so I'll drop the /24s", then come to the conclusion that you have loads of traffic towards an anycast node located in a /24, or that you exchange voice with a VoIP provider that announces a /24; you have just lost the ability to do something proper with your important destinations. On the other hand, you may easily send via default (in extreme cases, even drop) traffic to several /16s from Mulgazanzar Telecom, with which you barely exchange a few packets per day except the quarterly wave of DDoS/spam/scans/[name your favorite abuse]. Or you may just drop a few hundred more-specific routes for a destination that you do care about but can do little for, because network-wise it is too far away.

Of course, such an approach involves human intervention, either for selecting the important and non-important destinations or for writing the code that does it automagically. Or both. There is no magic potion. (As a Friday afternoon remark: there used to be such a potion in France, the "green powder", but they permanently ran out of stock in 2004 - see http://poudreverte.org/ - site in fr_FR.)
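
(For what it's worth, the "drop the /24s, except the ones that matter" idea sketches out like this in Python; the whitelist is the human intervention mentioned above, and all prefixes are examples only.)

    # Anything shorter than /24 stays; /24s survive only if whitelisted.
    import ipaddress

    IMPORTANT = {                        # anycast DNS node, VoIP provider...
        ipaddress.ip_network("198.51.100.0/24"),
        ipaddress.ip_network("192.0.2.0/24"),
    }

    def keep(prefix):
        net = ipaddress.ip_network(prefix)
        if net.prefixlen < 24:
            return True                  # /8../23 are always kept
        return net in IMPORTANT          # whitelisted /24s are kept

    print(keep("203.0.113.0/24"))        # False -> follows the default
    print(keep("198.51.100.0/24"))       # True  -> important anycast /24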

Did this get resolved? If not, please email me directly.

Radu, you're absolutely correct that BGP does not include the metrics often needed to make the best routing decisions. I mentioned metrics like bandwidth, delay, and loss (which some other routing protocols do consider); you mentioned metrics like importance (I assume for business continuity or happy eyeballs) and the amount or frequency of data exchanged with a given remote AS/IP network. BGP addresses some problems (namely routing redundancy), but it has intentional shortcomings around choosing the cheapest path, choosing the best-performing path, and load balancing (not to mention its security shortcomings). Some folks choose to improve upon BGP by using BGP "optimizers", manual local-pref adjustments, or similar configurations. And as this discussion has shown, other folks choose to introduce their own additional shortcomings by ignoring part of what BGP does have to offer. Perhaps in the future we will be able to agree on a replacement for (or improvements upon) BGP that addresses some of these shortcomings; we may also find that technology removes the limitations that currently force some folks to discard potentially valuable routing information.

Can you check the actual FIB usage? With 2M FIB entries shared between IPv4 and IPv6, and Fast ReRoute multiplying entries for backup paths, you could hit the limit.
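
(Back-of-the-envelope, with every number an assumption; check your platform's actual slot accounting. If v4 and v6 share one 2M-entry FIB, v6 routes often consume several slots each, and FRR roughly doubles entries with precomputed backups, the headroom disappears fast.)

    # Illustrative arithmetic only; all figures are assumptions.
    FIB_SLOTS      = 2_000_000
    V4_ROUTES      = 800_000    # rough full-table size
    V6_ROUTES      = 80_000
    V6_SLOT_COST   = 4          # assumption: one v6 route = several slots
    FRR_MULTIPLIER = 2          # primary + precomputed backup entry

    used = (V4_ROUTES + V6_ROUTES * V6_SLOT_COST) * FRR_MULTIPLIER
    print(used, "of", FIB_SLOTS, "slots")   # 2,240,000 -> over the limit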