Setting sensible max-prefix limits

As I understand it by now, it is highly recommended to set a max-prefix limit for peering sessions. Yet I can hardly find any recommendations on how to arrive at a sensible limit.

I guess for long standing peers one could just eyeball it, e.g., current prefix count + some safety margin. How does that work for new peers? Do you negotiate/exchange sensible values whenever you establish a new session? Do you rely on PeeringDB (if available)? Do you apply default values to everyone except the big fish?

Apart from your peers, do you also apply a limit to your transit sessions?

Best regards,

Lars

We always use PeeringDB data and refuse to peer with networks not in PeeringDB (I think there are only 2 exceptions). Automation keeps the max_prefix numbers up to date.

For our transits we use data from the weekly routing table reports and allow some expansion room.
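
For illustration, pulling a network's published recommendation out of the PeeringDB API might look roughly like this (a sketch, not our actual tooling; it assumes the requests library and the info_prefixes4/info_prefixes6 fields on the "net" object):

    import requests

    def peeringdb_limits(asn: int) -> tuple[int, int]:
        """Return the (IPv4, IPv6) max-prefix recommendations AS <asn> publishes."""
        resp = requests.get("https://www.peeringdb.com/api/net",
                            params={"asn": asn}, timeout=10)
        resp.raise_for_status()
        nets = resp.json()["data"]
        if not nets:
            raise LookupError(f"AS{asn} has no PeeringDB net record")
        # 0 / missing means the network never filled the field in
        return nets[0].get("info_prefixes4") or 0, nets[0].get("info_prefixes6") or 0

    # e.g. v4_limit, v6_limit = peeringdb_limits(64496)   # documentation ASN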

So far this works for us

Regards

Steve

- review max prefix suggestions from the peer itself, either from the
email or peeringdb
- check actual current prefix count (bgp.he.net et al.)
- check whether the disparity between the two matches your expectation
of a safety margin, based on your own operational experience and
context
- defaults for low prefix count peers
- actually monitor warning/critical levels of max-prefix counts

Don't use too small a safety margin; you don't want to spend your days
adjusting max-prefix levels all the time.
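
As a purely illustrative sketch of that comparison (the default, the thresholds and the input numbers are made up; the real inputs would come from PeeringDB, bgp.he.net and your own routers):

    DEFAULT_LIMIT = 100            # default for low-prefix-count peers
    WARN, CRIT = 0.80, 0.90        # monitoring thresholds against the limit

    def pick_limit(observed: int, suggested: int) -> int:
        """observed: current prefix count; suggested: the peer's own recommendation."""
        if suggested <= 0:                       # peer published nothing useful
            suggested = DEFAULT_LIMIT
        if observed > suggested:
            # disparity outside any sane safety margin: review this peer by hand
            raise ValueError(f"observed {observed} exceeds suggested {suggested}")
        return max(suggested, DEFAULT_LIMIT)

    def limit_state(observed: int, limit: int) -> str:
        if observed >= CRIT * limit:
            return "critical"
        if observed >= WARN * limit:
            return "warning"
        return "ok"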

I don't have strict rules for the safety margin itself; it depends
very much on the network (size, growing rate, trust, history).

lukas

If you have automation in place, another approach is to count the
received prefixes and store the counted value in a database. Based on
the average prefix count over X (some time period), add e.g. 10-25%
headroom over the average prefix count and use the calculated value as
the max-prefix limit?
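
A minimal sketch of that calculation, assuming the per-peer counts are already being sampled into your database (the sample values below are made up):

    from statistics import mean

    def limit_from_history(samples: list[int], headroom: float = 0.20) -> int:
        """Average received-prefix count over the period, plus e.g. 10-25% headroom."""
        if not samples:
            raise ValueError("no samples recorded for this peer yet")
        return int(mean(samples) * (1 + headroom))

    # e.g. limit_from_history([812, 815, 820, 825]) -> 981 with 20% headroom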

PeeringDB data can (sometimes or often?) be somewhat misleading (in
contrast to the actual average prefix count) in the sense that 'some'
networks will input a value with headroom percentages already included.

That's what it's for.

There is no point in periodically copying the actual prefix count into
PeeringDB records; that would just be redundant data which would be
wrong more often than not.

PeeringDB tooltips:
Recommended maximum number of IPv4 routes/prefixes to be configured on
peering sessions for this ASN
Recommended maximum number of IPv6 routes/prefixes to be configured on
peering sessions for this ASN

Lukas

You are missing two important questions:
a) should I apply it before or after policy?
b) what should I do when it triggers: should I reset the session or stop accepting new prefixes?

When I read through [1] earlier today, I had the feeling that these questions rather strictly translate to:

a) Do I keep rejected routes around?

b) Can I traffic-wise afford dropping the session to send a strong signal to my peer?

Hence, I didn't dig deeper.

Best regards,

Lars

[1] BGP Maximum Prefix Limits

Okay, so some automated PeeringDB-based approach seems to be the preferred road.

~30% and ~40% of IPv4 and IPv6 PeeringDB prefix count recommendations are 0. How do you treat those cases? Does it also boil down to a simple "we don't peer with them"?

Best regards,

Lars

That works but all too often people forget to update it. Set a quarterly reminder in your calendar to check max-prefix settings.

-Hank

While there are good solutions in this thread, some of them have scaling issues with operator overhead.

We recently implemented a strategy that I proposed a couple years ago that uses a bucket system.

We created 5 or 6 different buckets of limit values (for v4 and v6 of course.) Depending on what you have published in PeeringDB (or told us directly what to expect), you’re placed in a bucket that gives you a decent amount of headroom to that bucket’s max. If your ASN reaches 90% of your limit, our ops folks just move you up to the next bucket. If you start to get up there in the last bucket, then we’ll take a manual look and decide what is appropriate. This covers well over 95% of our non-transit sessions, and has dramatically reduced the volume of tickets and changes our ops team has had to sort through.
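
Roughly, the logic looks like this (illustrative only; the bucket boundaries below are invented, not our production values):

    BUCKETS = [100, 500, 2_000, 10_000, 50_000, 250_000]   # v4 example buckets

    def bucket_for(published: int) -> int:
        """Smallest bucket that still leaves the peer decent headroom."""
        for limit in BUCKETS:
            if published <= 0.9 * limit:
                return limit
        return BUCKETS[-1]          # in the last bucket we take a manual look

    def needs_bump(received: int, current_bucket: int) -> bool:
        """Ops moves the ASN up one bucket once it reaches 90% of its limit."""
        return received >= 0.9 * current_bucket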

Of course, we can also afford to be a little looser in limits based on the capability of the equipment that these sessions land on, other environments may require tighter restrictions.

Maybe because there isn't a simple, universal approach to setting it.
Probably like a lot of people, historically I'd set it to
some % over the current stable count and then manually adjust when the
limits were about to be breached, or, as was often the case, when they
were breached and I wasn't ready for it. Not ideal.

I've never felt the automation of this setting however was worth the
effort. Of course I am not usually responsible for hundreds of routers
and thousands of peering sessions.

At the risk of advocating for more junk in BGP or the RPKI, a max prefix
setting might be something that could be set by the announcing peer in
a BGP message, or possibly as an RPKI object with an associated ASN.
I'll let the masses debate how that would work and all the reasons that
isn't ideal, but I'm not sure there is a one-size-fits-all solution for
this in the near term.

John

Our semi-automated process...
Check the peering routers for any peers that have a prefix limit set (we don't set limits on transit or iBGP, so we skip those groups)

Record what the current limit is.

Check PeeringDB for what the network says the limit should be.

If the configured max prefix < PeeringDB, flag that a config change is needed;
if the configured max prefix > PeeringDB, the network isn't keeping its record up to date. No need for a change.

I've thought about adding additional headroom to what is advertised in PeeringDB, but we haven't had the limits triggered in so long that it may not be worth it.
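
A sketch of that comparison step (the two dicts stand in for whatever the router poll and the PeeringDB lookup return; the names are made up):

    def audit(configured: dict[int, int], published: dict[int, int]) -> list[str]:
        """Compare configured max-prefix limits against PeeringDB, per ASN."""
        notes = []
        for asn, limit in configured.items():
            suggested = published.get(asn)
            if not suggested:
                continue                          # no PeeringDB value to compare against
            if limit < suggested:
                notes.append(f"AS{asn}: raise configured limit {limit} -> {suggested}")
            # limit > suggested: the network's record is stale, no change needed
        return notes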

We did a variant of this at NTT, with certain baseline settings. Sometimes networks would advertise more routes because they onboarded a large customer and it would cause manual updates to be necessary.

Polling daily and snapshotting these values is important to understand what is changing. The reason I just posted a message about Akamai max-prefix is we have been giving some general guidance that is out of line with the norm compared to what we perhaps want. This won't cause a service outage per se but will cause suboptimal routing as we continue to make improvements and upgrades to our network.
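
As a sketch of what the daily snapshot comparison can look like (the data shapes and the 25% threshold here are assumptions for illustration, not anyone's actual tooling):

    def flag_changes(yesterday: dict[int, int], today: dict[int, int],
                     rel_threshold: float = 0.25) -> list[str]:
        """Flag peers whose received-prefix count moved noticeably since the last snapshot."""
        flags = []
        for asn, count in today.items():
            prev = yesterday.get(asn)
            if prev and abs(count - prev) > rel_threshold * prev:
                flags.append(f"AS{asn}: {prev} -> {count} received prefixes")
        return flags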

- Jared

Depending on what failure cases you actually see from your peers in the wild, I can see (at least as a thought experiment), a two-bucket solution - "transit" and "everyone else". (Excluding downstream customers, who you obviously hold some responsibility for the hygiene of.)

How often do folks see a failure case that's "deaggregated something and announced you 1000 /24s, rather than the expected/configured 100 max", vs "fat-fingered being a transit provider, and announced you the global table"?

My gut says it's the latter case that breaks things and you need to make damn sure doesn't happen. Curious to hear others' experience.

Thanks,
Tim.

Thus spake Chriztoffer Hansen (ch@ntrv.dk) on Wed, Aug 18, 2021 at 12:03:51PM +0200:

> I guess for long standing peers one could just eyeball it, e.g., current
> prefix count + some safety margin. How does that work for new peers?

sadly, this is the state of the art.

> If you have automation in place, another approach is to count the
> received prefixes and store the counted value in a database. Based on
> the average prefix count over X (some time period), add e.g. 10-25%
> headroom over the average prefix count and use the calculated value as
> the max-prefix limit?
>
> PeeringDB data can (sometimes or often?) be somewhat misleading (in
> contrast to the actual average prefix count) in the sense that 'some'
> networks will input a value with headroom percentages already included.

Our code tries all 3:

a) using the max values in peeringdb
b) expand all the routes in the IRR record from peeringdb
b.1) if no object is specified, try to guess if it's named ASnnnnn
c) count the currently received prefixes

Many times the values in peeringdb can be off, or increasingly this is a good
warning not to peer with a negligent operator. For some peers 'b' can expand
to a huge, unrealistic set (not always their fault), so if it's substantially
larger than 'a' we throw it out. (c) has proven the most reliable.

The count chosen is then fit into the appropriately sized bin and given 30% headroom.
The code compares all this and gives the user a warning that in practice gets
ignored in favor of option 'c'. (For example, we can override 'b' with a more
appropriate object record in our provisioning db.)
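
Very loosely, the reconciliation could be sketched like this (not Dale's actual code; the bin sizes, the 5x sanity factor and the warning handling are invented for illustration):

    BINS = [50, 200, 1_000, 5_000, 25_000, 100_000]   # invented bin sizes

    def reconcile(pdb: int, irr: int, observed: int) -> int:
        """pdb / irr / observed correspond to (a) / (b) / (c) above; 0 = unavailable."""
        if pdb and irr and irr > 5 * pdb:
            irr = 0                           # (b) expanded to something unrealistic: throw it out
        if pdb and observed > pdb:
            print(f"warning: observed {observed} > PeeringDB value {pdb}")
        chosen = observed                     # (c) has proven the most reliable
        with_headroom = int(chosen * 1.3)     # 30% headroom
        for size in BINS:                     # fit into the appropriately sized bin
            if with_headroom <= size:
                return size
        return with_headroom                  # larger than any bin: handle manually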

Dale

> Depending on what failure cases you actually see from your peers in the wild, I can see (at least as a thought experiment), a two-bucket solution - “transit” and “everyone else”. (Excluding downstream customers, who you obviously hold some responsibility for the hygiene of.)

Although I didn’t say it clearly, that’s exactly what we do. The described ‘bucket’ logic is only applied to the ‘everyone else’ pile; our transit stuff gets its own special care and feeding.

> How often do folks see a failure case that’s “deaggregated something and announced you 1000 /24s, rather than the expected/configured 100 max”, vs “fat-fingered being a transit provider, and announced you the global table”?

I can count on one hand the number of times I can remember that a peer has gone on a deagg party and run over limits. Maybe twice in the last 8 years? It’s possible it’s happened more that I’m not aware of.

We have additional protections in place for that second scenario. If a generic peer tries to send us a route with a transit provider in the as-path, we just toss the route on the floor. That protection has been much more useful than prefix limits IMO.
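
Conceptually that filter is just an as-path check (on real gear it is an as-path policy, not Python; the ASNs below are merely examples of large transit networks):

    TRANSIT_ASNS = {174, 1299, 2914, 3356}    # example transit-provider ASNs

    def accept_from_peer(as_path: list[int]) -> bool:
        """Toss any peer route that carries a transit provider in its as-path."""
        return not any(asn in TRANSIT_ASNS for asn in as_path)

    # accept_from_peer([64500, 64501])        -> True
    # accept_from_peer([64500, 3356, 64501])  -> False (dropped on the floor)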

Hi,

> We always use PeeringDB data and refuse to peer with networks not in PeeringDB

You are aware that PeeringDB refuses to register certain networks, right? It is most certainly not a single source of truth.

Thanks,

Sabri

Hi,

>> We always use PeeringDB data and refuse to peer with networks not in PeeringDB
>
> You are aware that PeeringDB refuses to register certain networks, right? It is most certainly not a single source of truth.

Would you care to expand on this?

Matthew Walster

I am extremely interested in hearing about this as well.

Specific examples would be useful.

>> As I understand it by now, it is highly recommended to set a max-prefix
>> limit for peering sessions. Yet I can hardly find any recommendations
>> on how to arrive at a sensible limit.
>
> Maybe because there isn't a simple, universal approach to setting it.
> Probably like a lot of people, historically I'd set it to
> some % over the current stable count and then manually adjust when the
> limits were about to be breached, or, as was often the case, when they
> were breached and I wasn't ready for it. Not ideal.

We tackled this problem at $work recently after I wrote some code to audit configured prefix-limits and found how inconsistent we were. My guess was this was due to a combination of each engineer "doing their own thing" with regard to how to set prefix-limits based on what was published in PeeringDB, and growth (peers increasing the suggested limits over time, after we'd configured [some of] their sessions).

The solution I implemented was:

In the script that builds peering config, fetch the peer's suggested limits from PeeringDB via the API (I still miss the open MySQL access).

Multiply those values by 2.

If that's too close to the "full table size", try suggested limits * 1.5.

If that's still too close to the "full table size", just use the suggested limits.
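
In code the scaling rule is something like this (a sketch; the full-table figure and the "too close" threshold are placeholders, not the exact values we use):

    FULL_TABLE_V4 = 950_000                   # rough placeholder, tracked separately

    def scaled_limit(suggested: int, full_table: int = FULL_TABLE_V4) -> int:
        """Suggested limit x2, falling back to x1.5, then x1 if too close to a full table."""
        for factor in (2.0, 1.5, 1.0):
            candidate = int(suggested * factor)
            if candidate < 0.8 * full_table:  # "too close" is a judgment call
                return candidate
        return suggested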

> I've never felt the automation of this setting however was worth the
> effort. Of course I am not usually responsible for hundreds of routers
> and thousands of peering sessions.

Yeah...that changes things when you have thousands of peering sessions to maintain.

> At the risk of advocating for more junk in BGP or the RPKI, a max prefix
> setting might be something that could be set by the announcing peer in
> a BGP message, or possibly as an RPKI object with an associated ASN.

It actually sounds like a cool feature, and could be implemented entirely on the sender's side, i.e., you configure a peer with a self-imposed limit of 1000 routes. If you screw up your routing policy facing that peer and leak the full table, then once you hit 1001 advertised routes, your router's BGP process terminates the session.

Who hasn't had a new peer leak full routes to them at least once?

Who hasn't configured a new peer, only to have them immediately trip your prefix-limit because they haven't updated peeringdb for "some time" and advertise more routes than their suggested limits?