Max Prefixes Configured on Customer BGP (WAS Re: ALGX problems?)

As nice as it would be if everyone used prefix-lists on their customer BGP
sessions, sometimes this is not possible, or is cumbersome. I know from
past experience as a transit customer that I have personally shied away
from ISPs that have restricted me to having their NOC update my ACL.

However, I don't really see a reason why ISPs shouldn't implement
max-prefixes on their customer sessions. This would not protect against
very small prefix leaks, but would prevent partial and whole routing table
leaks that impact many networks.

How many of you that currently do not filter your customer BGP sessions
have max-prefixes configured?

How many of you that currently do not filter your customer BGP sessions
and do NOT have max-prefixes configured would be willing to configure the
sessions to support this?

Joe

* joew@accretive-networks.net (Joe Wood) [Fri 16 Aug 2002, 02:16 CEST]:

As nice as it would be if everyone used prefix-lists on their customer BGP
sessions, sometimes this is not possible, or is cumbersome. I know from
past experience as a transit customer that I have personally shied away
from ISPs that have restricted me to having their NOC update my ACL.

But instead you prefer a "lazy" NOC, where you need manual intervention in
case you screw up a filter list on your end to re-enable the BGP session?

  -- Niels.

No, instead I prefer to do all route filtering on my (cust) side, and have
the ISP do filtering based on AS PATH, be it ^CUST-AS_ or configured off
the RADB......

It's been my experience that a lot of the providers that do prefix
filtering on customer BGP sessions take great amounts of time before they
act on the prefix-filter update request. This is much fun when it's 5pm or
later and you really need to announce a new customer netblock.

Joe

If you're using a Cisco, and they leak, their session stays down until a
human clears it. It also does very little to prevent leaking of a single
route (like one of Phil Rosenthal's /24s), impacting someone else. As a
customer, I would always insist on being prefix-listed and not
prefix-limited.

I far prefer a prefix list automatically built from IRR entries, with a
NOC and even a website capable of triggering a manual update if you need
to get routes out now. It's all a bit of a hack, but it's workable. IMHO AS
Path filters are useless and redundant if you have proper prefix-lists.

* joew@accretive-networks.net (Joe Wood) [Fri 16 Aug 2002, 02:38 CEST]:

I know from past experience as a transit customer that I have
personally shied away from ISPs that have restricted me to having
their NOC update my ACL.

But instead you prefer a "lazy" NOC, where you need manual intervention in
case you screw up a filter list on your end to re-enable the BGP session?

No, instead I prefer to do all route filtering on my (cust) side, and have
the ISP do filtering based on AS PATH, be it ^CUST-AS_ or configured off
the RADB......

(Well, if a customer is accidentally leaking a full table then ^CUST-AS_
will still match everything they send you...)
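
To make that point concrete, here is a minimal Python sketch. It is only an approximation: Python's `re` module stands in for the IOS regular expression engine, the IOS `_` delimiter is reduced to "space or end of path", and AS 65001 and the other AS numbers are hypothetical private-use values picked for illustration.

```python
import re

# Hypothetical customer AS number (private-use range), chosen for illustration.
CUST_AS = 65001

# Rough Python equivalent of ^CUST-AS_ : the path must begin with the customer
# AS, followed by a space or the end of the path. (IOS "_" matches more
# delimiters than this; a space is enough for plain AS sequences.)
FIRST_HOP = re.compile(rf"^{CUST_AS}(?: |$)")


def accepted(as_path: str) -> bool:
    """Return True if the AS path passes the first-hop-only filter."""
    return FIRST_HOP.search(as_path) is not None


if __name__ == "__main__":
    # The third path looks like a leaked transit route, yet it still passes.
    for path in ("65001", "65001 64512", "65001 64513 64514 64515", "64513 65001"):
        print(f"{path!r:32} accepted={accepted(path)}")
```

Every route the customer re-announces has the customer AS first in the path, so a first-hop filter alone cannot tell a leaked full table from legitimate announcements.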

Filtering from RADB has its own problems. It's much better now than
it was a few years ago, with RPSL, PGP authentication and not the
free-for-all it used to be. :)

It's been my experience that a lot of the providers that do prefix
filtering on customer BGP sessions take great amounts of time before they
act on the prefix-filter update request. This is much fun when it's 5pm or
later and you really need to announce a new customer netblock.

My only experience in this regard is with UUNet, and they're pretty
quick. Conceded that this was during a Europe-wide outage and the
slightly too strict filter was on a transit connection in the US.

Configuring off an IRR is a Good Thing. Doing it in an automated
fashion without some sort of supervision can at best be called risky.

Take care,

  -- Niels.

[...]

It's been my experience that a lot of the providers that do prefix
filtering on customer BGP sessions take great amounts of time before they
act on the prefix-filter update request. This is much fun when it's 5pm or
later and you really need to announce a new customer netblock.

  How often do you run into that situation?
  I understand that you may not like it when you do, but a little
planning on someone's part somewhere down the line eases the pain for
everyone.
  It depends upon experience. You'd never have formed that opinion
if you were my customer.

Peter E. Fry

No, I never thought much of emoticons.

If you're using a Cisco, and they leak, their session stays down until a
human clears it. It also does very little to prevent leaking of a single
route (like one of Phil Rosenthal's /24s), impacting someone else. As a
customer, I would always insist on being prefix-listed and not
prefix-limited.

The intent of this discussion isn't whether prefix-filtering is
appropriate or not. It is up to the individual ISP to determine what
degree of filtering is appropriate for their BGP customers.

However, for ISPs that do NOT use any sort of prefix filters, wouldn't
you prefer that your BGP session was limited to a number of prefixes, in
case of a routing leak?

While leaking a /24 may be impacting, it (in most circumstances, don't
beat me up over this one) is not nearly as impacting as leaking a whole
routing table.

I far prefer a prefix list automatically built from IRR entries, with a
NOC and even a website capable of triggering a manual update if you need
to get routes out now. It's all a bit of a hack, but it's workable. IMHO AS
Path filters are useless and redundant if you have proper prefix-lists.

I would also prefer prefix lists that were built automatically from an
IRR, with a manual update feature.... If you find a provider who can
claim to do this, let me know :) The best I've found is providers who can
manually add entries into the filters, and let them update off the IRR
once you've added the proper route object. Most providers that I've dealt
with (that configure off an IRR) won't even touch their filters, and will
only allow the once a day update.

Joe

I can say all but one of my past transit providers have taken an
inappropriate amount of time to do filter updates, especially after-hours.

Proper planning is fine, when you have all the correct information. I've
run into situations more than once where a customer will not provide us
with the correct information until the last minute.

Joe

Joe Wood <joew@accretive-networks.net> typed:

However, for ISPs that do NOT use any sort of prefix filters, wouldn't
you prefer that your BGP session was limited to a number of prefixes, in
case of a routing leak?

We'd prefer that such ISPs identify themselves here so we can
straighten them out. Wasn't that your intention when you asked this
question:

    How many of you that currently do not filter your customer BGP
    sessions have max-prefixes configured?

That seemed to me to be a small trick to get unsuspecting ISPs to
wave their hands "Over here!", so that we could whack'em.

-mark

True, but my point is that if an ISP is doing filtering based on ^CUST-AS_
they should be implementing _some_ sort of protection against full table
leaks.

Regards,

Joe

* ras@e-gerbil.net (Richard A Steenbergen) [Fri 16 Aug 2002, 03:01 CEST]:
[..]

IMHO AS Path filters are useless and redundant if you have proper
prefix-lists.

Did you ever run into that bug in IOS where if you had `ip as-path
access-list 1 permit ^(1|2|3)+_$' (where 2 and 3 would be customers of
AS1 who is one of your peers), and the third number is higher than the
second number? IOS seemed to optimise that case out and reject AS paths
of ^1_2$ if the filter was written as ^(1|3|2)+_$.

This was several years ago though.

  -- Niels.

* joew@accretive-networks.net (Joe Wood) [Fri 16 Aug 2002, 03:30 CEST]:

True, but my point is that if an ISP is doing filtering based on ^CUST-AS_
they should be implementing _some_ sort of protection against full table
leaks.

Yes, and I'm in violent agreement with that. :) Apologies if that
wasn't clear from my wording.

Cheers,

  -- Niels.

;>

Actually, my intent was to get more of a picture of who actually cares
enough to try to prevent these leaks that I frequently see from my peers.
Those that don't really do anything to prevent these leaks are the peers
whose sessions I would begin filtering aggressively.

I would think that all the ISPs represented on this list get whacked
enough on other list topics; they probably don't need any more beatings
about this....

Take care,

Joe

warning: operational content

  in 12.0(22)S there was a new max-prefix feature added that
people running this software (or similar) can enable to shut down
your customers who leak routes.

  Most customers don't advertise 8k prefixes, so a simple
setup like this:

(config-router)#nei 1.2.3.4 maximum-prefix 8000 restart ?
  <1-65535> Restart interval in minutes

  and configure some reasonable number of minutes (let's say 15)
and the session will come back up for them and flap again until they
fix it.

  - Jared

(follow-ups should probably go to cisco-nsp@puck.nether.net or a similar
cisco specific related list)

warning: operational content

Thank you Jebus!

  in 12.0(22)S there was a new max-prefix feature added that
people running this software (or similar) can enable to shut down
your customers who leak routes.

  Most customers don't advertise 8k prefixes, so a simple
setup like this:

(config-router)#nei 1.2.3.4 maximum-prefix 8000 restart ?
  <1-65535> Restart interval in minutes

  and configure some reasonable number of minutes (let's say 15)
and the session will come back up for them and flap again until they
fix it.

  - Jared

(follow-ups should probably go to cisco-nsp@puck.nether.net or a similar
cisco specific related list)

This isn't a terribly cisco-specific reply so I'll keep it here.

The problem with restart systems (btw thank you cisco for finally adding
this) is, think about how much damage can be done by announcing 8k routes
for the 30 seconds (or 5-10 minutes if there is a Foundry in the mix :P)
before you get to the limit and kill the session. Now add in the damage
caused by this happening every 15 minutes, and the dampening. Or even
worse, someone who turns up more routes and happens to hit right around
the exact number or close to it. Imagine a session which goes over by 1
route, trips, stays down for 15 minutes, comes back up and this time has 1
fewer route, and no one notices the prefix limit needs to be raised. You
should make sure that the restart time exceeds the number/length of flaps
necessary to trigger dampening, which on a connect you transit is pretty
darn hard to accurately guess.
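
A back-of-the-envelope Python sketch of that flap cycle follows. The model is an assumption made for illustration, not a description of any router's behaviour: the leak persists, the session trips `detect_s` seconds after each time it comes up, and it then stays down for exactly `restart_s` seconds. The function names are hypothetical.

```python
def trips_in_window(detect_s: float, restart_s: float, window_s: float) -> int:
    """Count how often the prefix limit trips inside the observation window."""
    trips = 0
    t = detect_s  # first trip: the limit is reached detect_s after session-up
    while t <= window_s:
        trips += 1
        # Session stays down for restart_s, comes back, and leaks again.
        t += restart_s + detect_s
    return trips


def exposure_seconds(detect_s: float, restart_s: float, window_s: float) -> float:
    """Total time leaked routes are being announced before each trip."""
    return trips_in_window(detect_s, restart_s, window_s) * detect_s


if __name__ == "__main__":
    hour = 3600
    # 30 s to reach the limit vs. a slow 5 minute case, both with a 15 min restart.
    for detect in (30, 300):
        print(detect, trips_in_window(detect, 900, hour),
              exposure_seconds(detect, 900, hour))
```

Under these assumptions a 15 minute restart timer gives roughly three to four flaps per hour for as long as the leak lasts, which is the repeated damage described above.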

IMHO, using only prefix limits on a customer is actually doing them (and
the rest of the internet that listens to your announcements) a disservice.

A better system might be where the session is kept up (or periodically
polled, if you want to make it obvious to the other party that there is a
problem) without installing the routes, and kept in a "quarantine" state
for X amount of time to make sure that things stay below a configured
number. This would be at least a slightly better way of recovering quickly
once the "problem" has passed, without mucking things up every 15 minutes
in the process.

Couldn't you do this with route-dampening?

So the first leak will of course be propagated before the max-prefix
takes effect. But once these routes are withdrawn, this should
create entries in the history table for these prefixes.

Depending on your dampening parameters, you should be able to configure
selective ASes to have very low tolerance for dampening, if you don't
already have a low tolerance for dampening.... Once the BGP session is
activated and if the offending prefixes reappear and trigger the
max-prefix threshold and are then withdrawn again, BGP dampening should
dampen the routes for 45 minutes or X, depending on your maximum
suppression value........

That X minutes should hopefully be enough time for customer to solve
problem, or for the ISP NOC to get on the phone with the customer.

While this still propagates the leaked routes at least twice, it does
prevent the routes from being constantly propagated every 15 minutes....
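
Whether this works depends heavily on the dampening parameters, which a small Python sketch can illustrate. Everything numeric here is an assumption: a penalty of 1000 per withdrawal, a 15 minute half-life and a suppress limit of 2000 are the values commonly documented as Cisco defaults, and 15.5 minutes between withdrawals models a 15 minute restart plus about 30 seconds to trip the limit. The function name is hypothetical.

```python
from typing import Optional


def first_suppressed_flap(interval_min: float,
                          half_life_min: float = 15.0,
                          suppress: float = 2000.0,
                          penalty_per_flap: float = 1000.0,
                          max_flaps: int = 50) -> Optional[int]:
    """Return the number of the withdrawal that first reaches the suppress
    limit, or None if the limit is not reached within max_flaps."""
    penalty = 0.0
    for flap in range(1, max_flaps + 1):
        if flap > 1:
            # Exponential decay between consecutive withdrawals.
            penalty *= 0.5 ** (interval_min / half_life_min)
        penalty += penalty_per_flap
        if penalty >= suppress:
            return flap
    return None


if __name__ == "__main__":
    print(first_suppressed_flap(15.5))                 # assumed default limit
    print(first_suppressed_flap(15.5, suppress=1200))  # stricter per-peer limit
```

With the assumed defaults the penalty decays by about half between flaps and levels off just under 2000, so the routes are never suppressed. A lower suppress limit for the peers in question, as suggested above, is what makes the second leak trigger dampening in this model.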

Please correct me if I'm wrong......The BGP Dampening route-map feature is
new to me. ;>

Regards,

Joe

I believe you are correct as long as you intelligently apply
this restart-timer on max-prefix along with your dampening policy.

  YMMV depending on what your defaults are set for.

  - Jared

Can anyone recommend some links/mailing lists/resources
for HP Openview and/or Ciscoworks?

Thanks,

In a message written on Thu, Aug 15, 2002 at 11:41:17PM -0400, Richard A Steenbergen wrote:

IMHO, using only prefix limits on a customer is actually doing them (and
the rest of the internet that listens to your announcements) a disservice.

I think you might be missing a highly useful case of using max-prefix
with customers. Many customers will want to deaggregate their
blocks, and/or leak more specifics. While I don't want to argue
if that is good or not, the end result is most ISPs allow this in
some form. Consider the difference between:

Case 1: a.b.0.0/16 exact match prefix filter

        Customer calls in, asks for change.

        a.b.0.0/17 + a.b.128.0/17 exact match prefix filter.

Case 2: a.b.0.0/16 le 19, max prefix 6

The second case allows customers to make changes with no delays,
and reduces the amount of work for the ISP. It still enforces some
level of aggregation automatically to protect the system, but also
gives the customer some flexibility.
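
As a rough illustration of Case 2, here is a Python sketch using the standard `ipaddress` module. The block 10.1.0.0/16 stands in for a.b.0.0/16 and is purely hypothetical, as are the function names. Counting only the accepted prefixes against the limit is an assumption of this sketch, not a statement about how any particular router counts.

```python
import ipaddress
from typing import Iterable, List, Tuple


def permitted(prefix: str, base: str, le: int) -> bool:
    """True if prefix lies inside base and is no longer than /le."""
    net = ipaddress.ip_network(prefix)
    covering = ipaddress.ip_network(base)
    # subnet_of() already guarantees net is at least as long as covering.
    return net.subnet_of(covering) and net.prefixlen <= le


def check_session(announced: Iterable[str], base: str, le: int,
                  max_prefix: int) -> Tuple[List[str], bool]:
    """Return the accepted prefixes and whether they exceed max_prefix."""
    accepted = [p for p in announced if permitted(p, base, le)]
    return accepted, len(accepted) > max_prefix


if __name__ == "__main__":
    routes = ["10.1.0.0/17", "10.1.128.0/17", "10.1.4.0/24", "10.2.0.0/16"]
    print(check_session(routes, "10.1.0.0/16", le=19, max_prefix=6))
```

The customer can split the /16 into two /17s without calling anyone, while the /24 and the unrelated /16 are still dropped.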

Generally I'd recommend something around twice the number of prefixes,
with some sort of floor. So, if you registered 200 prefixes, you
could announce 400 routes from them, with a maximum length as set
by your ISP.
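
The rule of thumb above is simple enough to write down as a Python one-liner. The floor of 10 is a made-up placeholder, since no floor value is given here, and the function name is hypothetical.

```python
def max_prefix_limit(registered: int, floor: int = 10, factor: int = 2) -> int:
    """Twice the registered prefix count, but never below the floor.

    floor=10 is an illustrative placeholder, not a recommended value.
    """
    return max(registered * factor, floor)


if __name__ == "__main__":
    for count in (1, 3, 200):
        print(count, max_prefix_limit(count))
```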

That's why you make sure that any incident where max-prefix is tripped is
caught by a syslog watcher and brought to the immediate attention of whoever's
sitting in your NOC. Honestly, if all you're dealing with is customer BGP
sessions, I would propose that 90% of them don't advertise more than 10 prefixes,
so a max-prefix number higher than, say, 100 should do for most cases. And for
that last 10%, max-prefix is a per-session configuration, so that number can always
be set higher. IMO, advertising 100 routes for 30 seconds is far less damaging
than 8000 routes.

Also, don't forget about the warn option - if a customer's organic growth puts
them close to the prefix limit, you should get a heads-up in most cases.
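
A syslog watcher along those lines could start from something like the Python sketch below. The two message formats are assumptions modelled on typical IOS output (a warning-level %BGP-4-MAXPFX and an error-level %BGP-3-MAXPFXEXCEED); the exact wording varies by platform and release, so the patterns would need checking against real logs. The peer address and the `Event` and `classify` names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Event(NamedTuple):
    severity: str  # "warn" = approaching the limit, "page" = limit exceeded
    peer: str
    count: int
    limit: int


# Assumed message shapes; adjust to match what your routers actually log.
WARN = re.compile(r"%BGP-4-MAXPFX: .*?from (\S+) .*?reaches (\d+), max (\d+)")
EXCEED = re.compile(
    r"%BGP-3-MAXPFXEXCEED: .*?from (\S+) .*?: (\d+) exceeds? limit (\d+)")


def classify(line: str) -> Optional[Event]:
    """Turn a syslog line into an Event, or None if it is unrelated."""
    for severity, pattern in (("page", EXCEED), ("warn", WARN)):
        m = pattern.search(line)
        if m:
            return Event(severity, m.group(1), int(m.group(2)), int(m.group(3)))
    return None


if __name__ == "__main__":
    samples = [
        "%BGP-4-MAXPFX: No. of prefix received from 192.0.2.1 (afi 0) reaches 80, max 100",
        "%BGP-3-MAXPFXEXCEED: No. of prefix received from 192.0.2.1 (afi 0): 101 exceed limit 100",
        "%LINK-3-UPDOWN: Interface Serial0, changed state to up",
    ]
    for s in samples:
        print(classify(s))
```

The "warn" events cover the organic-growth case, and the "page" events are the ones that should reach whoever is on shift.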

I recall an incident where we brought up a customer advertising around 600
routes, and sent the prefix list to our upstream, who dutifully added all
600 routes to the prefix list, but neglected to raise their maximum-prefix limit
from 300. This, of course, had predictable results. Doh.

-C