intra-AS messaging for route leak prevention

I am a co-author on a route-leak detection/mitigation/prevention draft
in the IDR WG in the IETF:
https://tools.ietf.org/html/draft-ietf-idr-route-leak-detection-mitigation-03

Based on private conversations with a few major ISPs, the following
common practice for intra-AS messaging (using Community tagging in iBGP)
for prevention of route leaks is described in Section 3.2 of the draft:

<begin quote>
“Routes are tagged on ingress to an AS with communities for origin,
   including the type of eBGP peer it was learned from (customer,
   transit-provider or peer), geographic location, etc. The community
   attributes are carried across the AS with the routes. Routes that
   the AS originates directly are tagged with similar origin communities
   when they are redistributed into BGP from static, IGP, etc. These
   communities are used along with additional logic in route policies to
   determine which routes are to be announced to which eBGP peers and
   which are to be dropped. Route policy is applied to eBGP sessions
   based on what set of routes they should receive (transit, full
   routes, internal-only, default-only, etc.). In this process, the
   ISP's AS also ensures that routes learned from a transit-provider or
   a lateral peer (i.e. non-transit) at an ingress router are not leaked
   at an egress router to another transit-provider or peer.

   Additionally, in many cases, ISP network operators' outbound policies
   require explicit matches for expected communities before passing
   routes. This helps ensure that if an update has made it into
   the routing table (i.e. RIB) but has missed its ingress community
   tagging (due to a missing/misapplied ingress policy), it will not be
   inadvertently leaked.”
<end quote>
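
The practice quoted above can be modelled in a few lines of Python. This
is only a sketch for illustration: the community values (64500:100 and
so on), the Route class and the function names are assumptions, not
taken from the draft or from any operator's configuration.

```python
from dataclasses import dataclass, field

# Hypothetical origin communities; real operators define their own schemes.
LEARNED_FROM = {
    "customer": "64500:100",
    "peer": "64500:200",
    "transit": "64500:300",
}

# Origin communities that each type of eBGP neighbor may receive.
# Customers get everything; peers and transit providers get customer routes only.
EXPORT_ALLOWED = {
    "customer": {"64500:100", "64500:200", "64500:300"},
    "peer": {"64500:100"},
    "transit": {"64500:100"},
}


@dataclass
class Route:
    prefix: str
    communities: set = field(default_factory=set)


def tag_on_ingress(route, neighbor_type):
    """Attach the origin community for the neighbor type the route came from."""
    route.communities.add(LEARNED_FROM[neighbor_type])
    return route


def may_export(route, neighbor_type):
    """Require an explicit community match; untagged routes are never announced."""
    return bool(route.communities & EXPORT_ALLOWED[neighbor_type])
```

Because may_export() needs a positive match, a route that missed its
ingress tagging matches nothing and is dropped at egress, which is the
fail-safe the second quoted paragraph describes.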

Question: Are there other means of conveying this information
in common use today (i.e. for prevention of route leaks)?

Also, the following publicly available references can be
possibly cited in support of the above:
https://www.nanog.org/meetings/nanog40/presentations/BGPcommunities.pdf
http://showipbgp.com/bgp-tools/bgp-community-list/91-level3-as3356.html
Pointers to any other relevant references would be very welcome as well.
Thank you.

Sriram

There is the "human network" approach, where operators share information
with each other, which can be used to generate config to help block
"unlikely" announcements from eBGP neighbors.

For instance AT&T and NTT agreed (through email) that there should be no
intermediate networks between 2914 & 7018, therefore NTT blocks
announcements that match as-path-regexp '_7018_' on any and all eBGP
sessions, except the direct sessions with 7018. NTT calls this concept
"peerlocking".

I'll cover this approach at the upcoming NANOG meeting in Chicago:
https://www.nanog.org/meetings/abstract?id=2860
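
The peerlocking rule described above reduces to a small check. The
following Python sketch models only the logic, not NTT's actual router
configuration; PROTECTED_ASNS and accept_on_session are hypothetical
names, and the AS path is assumed to be a list of integers.

```python
# ASNs that should only ever be seen on a direct session (7018 is from the text).
PROTECTED_ASNS = {7018}


def accept_on_session(as_path, neighbor_asn):
    """Reject a path containing a protected ASN unless the session is with that ASN."""
    for asn in PROTECTED_ASNS:
        if asn in as_path and neighbor_asn != asn:
            return False
    return True
```

For example, a path [64511, 7018, 64496] received from neighbor 64511 is
rejected, while [7018, 64496] received directly from 7018 is accepted.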

Kind regards,

Job

> I am a co-author on a route-leak detection/mitigation/prevention draft
> in the IDR WG in the IETF:
> draft-ietf-idr-route-leak-detection-mitigation-03
>
> Question: Are there other means of conveying this information
> in common use today (i.e. for prevention of route leaks)?

[snip]

> For instance AT&T and NTT agreed (through email) that there should be no
> intermediate networks between 2914 & 7018, therefore NTT blocks
> announcements that match as-path-regexp '_7018_' on any and all eBGP
> sessions, except the direct sessions with 7018. NTT calls this concept
> "peerlocking".
>
> I'll cover this approach at the upcoming NANOG meeting in Chicago:
> https://www.nanog.org/meetings/abstract?id=2860

Dropping unexpected AS vectors was frequently used in the 1990s
by folks, especially in the context of seeking to ensure traffic
intended for direct/private interconnections stayed on them. I
know some folks would also just filter "big networks" (to avoid
that marketing term) from other peers to sidestep the impact of
leaks.

It doesn't fit for all peers/networks (eg content which will
seek alternate paths around congestion), but if you can fold
it into your automation it is helpful.

Cheers!

Joe

I suppose if one is performing prefix- and ASN-based filtering, then you
"should" not learn peer paths via customers. If you augment that with
AS_PATH-based filtering, you "will not" learn peer paths via customers.

One thing we do to reduce opportunistically hazardous vectors is to not
learn customer paths via peers.

Mark.

Thanks for the inputs about the inter-AS messaging and route-leak prevention
techniques between neighboring ASes. Certainly helpful information and also useful
for the draft (draft-ietf-idr-route-leak-detection-mitigation).

However, my question was focused on "intra-AS" messaging.
About conveying from ingress to egress router (within your AS),
the info regarding the type of peer from which the route was received at ingress.
This info is used at the egress router to avoid leaking a route.

Question: Is the "common practice" described in the original message
http://mailman.nanog.org/pipermail/nanog/2016-June/086242.html (see the stuff in quotes)
sufficient or are there other ways in common use in which network operators
convey the said information from ingress to egress router?

Sriram

"There are more routing policies in heaven and earth, Sriram
Than are dreamt of in your draft."

But in my experience, community tagging is by far the widest
deployment due to the broad support and extent of information
which can be carried. It is useful to note that AS_PATH is
often also involved on egress decisions.

The sadness is that some platforms' processing of prefixes
and policies coupled with certain operational practices mean
we still see leaks beyond intended scope during maintenance
windows.

cheers!

Joe

Agree.

We use BGP communities extensively on all eBGP sessions to identify
upstreams, peers, customers, special partners, etc., on the inbound
routing policy.

Outbound routing policies will depend on the type of neighbor, i.e.,
upstream, peer, customer, special partner, etc. At any rate, we use
communities to determine what routes will be announced to what eBGP
neighbor. Those communities will need to match the intended source of
the route at some other point in the network.

The only time we look at prefix lists is to ensure we send (or accept)
nothing longer than a /24 (IPv4) or a /48 (IPv6) to (and from) an eBGP
neighbor of any kind. That said, further granularity in the outbound
routing policy toward upstreams will allow for transmission of
longest-match (/32 IPv4 and /128 IPv6) to support RTBH requirements.
This is a co-ordinated routing policy, so it is not harmful to us, our
upstreams or the wider Internet. We'd also accept these prefix lengths
from BGP customers as part of their standard RTBH capability they get
when they buy IP Transit from us; again, highly controlled and
co-ordinated to never cause any harm to us, the customer or the wider
Internet, while still being 100% functional for the customer.
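
A rough Python sketch of the prefix-length rule above, using the
standard ipaddress module. How an RTBH request is signalled is not
stated in this message; the sketch assumes a community, and uses
65535:666 (the BLACKHOLE value from RFC 7999) purely for illustration.
The function and constant names are hypothetical.

```python
import ipaddress

MAX_LEN = {4: 24, 6: 48}     # general limit: nothing longer than /24 or /48
HOST_LEN = {4: 32, 6: 128}   # host routes, permitted only for RTBH

# Assumed RTBH signal; an operator may well use a different community.
RTBH_COMMUNITY = "65535:666"


def length_ok(prefix, communities=frozenset()):
    """Accept normal prefixes up to the limit, plus RTBH-tagged host routes."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen <= MAX_LEN[net.version]:
        return True
    is_host_route = net.prefixlen == HOST_LEN[net.version]
    return is_host_route and RTBH_COMMUNITY in communities
```

Anything between the general limit and a host route (a /25 or a /64,
say) is rejected whether or not it carries the RTBH community.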

Ultimately, once a routing policy is in place on a specific router, we
are never touching that router again as the edge moves around, i.e.,
customers, peers, special partners, etc., come on-board, move around,
etc. This creates natural safeguards against cock-ups, although the
goal is always to eliminate cock-ups from the get-go (automation of
repetitive provisioning tasks makes this goal easier to attain).

Coupled with our insistence on creating matching prefix and AS_PATH
filters for all customers (after being checked against the relevant RIR
WHOIS database to avoid hijack), we've been fortunate to never be in a
position where our network is leaking routes, unintentionally or
otherwise. Work continues to further harden this so that it never happens.

Mark.

This is great...the kind of inputs/insights I was hoping for.
Thank you :)

Sriram

Hi All,

> Thanks for the inputs about the inter-AS messaging and route-leak
> prevention techniques between neighboring ASes. Certainly helpful
> information and also useful for the draft
> (draft-ietf-idr-route-leak-detection-mitigation).
>
> However, my question was focused on "intra-AS" messaging. About
> conveying from ingress to egress router (within your AS), the info
> regarding the type of peer from which the route was received at
> ingress. This info is used at the egress router to avoid leaking a
> route.
>
> Question: Is the "common practice" described in the original message
> http://mailman.nanog.org/pipermail/nanog/2016-June/086242.html (see
> the stuff in quotes) sufficient or are there other ways in common
> use in which network operators convey the said information from
> ingress to egress router?

> But in my experience, community tagging is by far the widest
> deployment due to the broad support and extent of information which
> can be carried.

I second this. One of NTT's design principles is to be very strict in
what we accept (e.g. "postel was wrong") at the ingress point. At the
ingress point the route announcement is weighted, judged, categorized &
tagged. This decides 99% of what happens next: the egress points are
merely executing what was "decided" at ingress (but exceptions are
possible).

> It is useful to note that AS_PATH is often also involved on egress
> decisions.

You say 'often', but I don't recognise that design pattern from my own
experience. A weakness with the egress point (in context of route leak
prevention) is that if you are filtering there, it's already too late. If
you are trying to prevent route leaks on egress, you have already
accepted the leaked routes somewhere, and those leaked routes are best
path somewhere in your network, which means you've lost.

Having said that: as a conscious effort to mitigate (known) fragility in
one's 'ingress policy deployment' an egress AS_PATH filter might be a
good second layer of defense. It doesn't protect your own network but it
helps block further spreading of garbage.

Kind regards,

Job

so I can't be a customer of you and a network you peer with?
(I'm sure I got your meaning wrong)

> I second this. One of NTT's design principles is to be very strict in
> what we accept (e.g. "postel was wrong") at the ingress point. At the
> ingress point the route announcement is weighted, judged, categorized &
> tagged. This decides 99% of what happens next: the egress points are
> merely executing what was "decided" at ingress (but exceptions are
> possible).

Agree. We do the same.

> You say 'often', but I don't recognise that design pattern from my own
> experience. A weakness with the egress point (in context of route leak
> prevention) is that if you are filtering there, it's already too late. If
> you are trying to prevent route leaks on egress, you have already
> accepted the leaked routes somewhere, and those leaked routes are best
> path somewhere in your network, which means you've lost.

Agree.

We don't do any AS_PATH filtering on egress.

The only AS_PATH-anything we do on border routers is signal
customer-initiated prepends via BGP communities. Those prepends are done
at the border routers carrying the interested transit network.
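
As an illustration of community-signalled prepends, here is a minimal
Python sketch. The local ASN (64500), the action community values and
the function name are all assumptions made for the example; the message
does not give the actual scheme, and a real one would also encode which
transit network the prepend applies to.

```python
LOCAL_ASN = 64500  # hypothetical local AS number

# Hypothetical action communities: community -> number of extra prepends.
PREPEND_COMMUNITIES = {"64500:3001": 1, "64500:3002": 2, "64500:3003": 3}


def export_as_path(as_path, communities):
    """Build the AS path announced at the border, honoring a prepend request."""
    extra = max((PREPEND_COMMUNITIES.get(c, 0) for c in communities), default=0)
    # The local ASN is always added once; 'extra' adds the requested prepends.
    return [LOCAL_ASN] * (1 + extra) + list(as_path)
```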

Otherwise, all egress filtering is based on BGP communities + general
"no longer than /24, /48" rule as a fail-safe.

Mark.

You can, but we won't learn your paths via the peering session we would
have with your other ISP.

Mark.

oh, so I didn't misunderstand.. that makes 'backup isp' less useful, no?

With regard to reaching our network, not true. You would still be able
to reach our network if your primary service with us failed, but not via
a local peer.

Mark.

> so I can't be a customer of you and a network you peer with?
>
> You can, but we won't learn your paths via the peering session we would
> have with your other ISP.

Wouldn't "learn but depref" be preferred and more common? E.g. customer routes get tagged with "customer route" community and local-pref'd to 150 or something; peer routes get tagged with "peer route" community and local pref'd somewhere below that.

Else any of your other customers that are single-homed to you can't reach your dual-homed customer A in cases where customer A's link to you is down, but customer A has other transits with whom you peer?

Unless it's mitigated by you accepting customer A's prefixes from any transits you have, which at the least seems sub-optimal (now reaching them via transit rather than peering if customer A's circuit is down) or possibly also up-ended if you also similarly apply "don't accept customer prefixes from transits".

No?
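
A small Python sketch of the "learn but depref" idea asked about above.
The customer value of 150 is from this message; the peer value of 100
and the function name are assumptions. Only local-pref is modelled, not
the rest of BGP best-path selection.

```python
# Local-pref by route source: 150 for customers (from the text), peers lower.
LOCAL_PREF = {"customer": 150, "peer": 100}


def best_path(paths):
    """paths: list of (learned_from, next_hop) for one prefix; highest local-pref wins."""
    if not paths:
        return None
    return max(paths, key=lambda p: LOCAL_PREF[p[0]])
```

While the customer link is up, the customer path wins; if it goes away,
the peer-learned path is still in the table and takes over.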

I'm clearly misunderstanding something.
I suppose if it works for your customers it must be ok.

In a message written on Fri, Jun 10, 2016 at 10:50:17AM +0200, Job Snijders wrote:

> You say 'often', but I don't recognise that design pattern from my own
> experience. A weakness with the egress point (in context of route leak
> prevention) is that if you are filtering there, it's already too late. If
> you are trying to prevent route leaks on egress, you have already
> accepted the leaked routes somewhere, and those leaked routes are best
> path somewhere in your network, which means you've lost.

It does mean the provider creating the leak has already lost, but
that doesn't mean it still isn't vital to protecting the larger
internet. A good example of this is fire code. Most fire codes
do not do much to prevent you from starting a fire in your own
house/condo/apartment, but rather prevent it from spreading to your
neighbors.

For instance, if you filter Customer A to A's Prefix list on ingress,
B to B's, C to C's, it may also be prudent to filter outbound to
your peers based on A+B+C's prefix list. When the ingress filter
to A fails (typo, bug, bad engineer), your own network is hosed by
whatever junk A ingested, but at least you won't pass it on to peers
and spoil the rest of the Internet.
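
The A+B+C idea above can be sketched in Python as follows. The prefixes
are documentation ranges and the names are hypothetical; matching is
exact, which is a simplification of real prefix-list semantics.

```python
import ipaddress

# Hypothetical per-customer ingress prefix lists (documentation prefixes).
CUSTOMER_PREFIXES = {
    "A": ["192.0.2.0/24"],
    "B": ["198.51.100.0/24"],
    "C": ["203.0.113.0/24"],
}


def ingress_ok(customer, prefix):
    """Ingress filter: a customer may only announce its own registered prefixes."""
    allowed = {ipaddress.ip_network(p) for p in CUSTOMER_PREFIXES[customer]}
    return ipaddress.ip_network(prefix) in allowed


# Egress filter toward peers: the union of all customer prefix lists.
EGRESS_TO_PEERS = {
    ipaddress.ip_network(p) for plist in CUSTOMER_PREFIXES.values() for p in plist
}


def egress_ok(prefix):
    """Second layer: even if an ingress filter failed, only known prefixes leave."""
    return ipaddress.ip_network(prefix) in EGRESS_TO_PEERS
```

If the ingress filter for A is accidentally missing, a stray 10.0.0.0/8
from A would enter the network but egress_ok() would still stop it from
being announced to peers.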

Basically both ingress and egress filtering have weaknesses, and
in some cases doing both can provide some mitigation. It's the old
adage "belt and suspenders".

This.

Mark.

> It does mean the provider creating the leak has already lost, but
> that doesn't mean it still isn't vital to protecting the larger
> internet. A good example of this is fire code. Most fire codes
> do not do much to prevent you from starting a fire in your own
> house/condo/apartment, but rather prevent it from spreading to your
> neighbors.

I've found community-based filtering to be robust and very effective.

I have heard of software issues that may cause filters to stop working,
but I have not yet encountered any such issues myself that had nothing
to do with a mis-configuration or lack of understanding about how
policies are evaluated by the router.

> For instance, if you filter Customer A to A's Prefix list on ingress,
> B to B's, C to C's, it may also be prudent to filter outbound to
> your peers based on A+B+C's prefix list. When the ingress filter
> to A fails (typo, bug, bad engineer), your own network is hosed by
> whatever junk A ingested, but at least you won't pass it on to peers
> and spoil the rest of the Internet.

That does not scale, and was probably one of the primary reasons
communities were developed.

> Basically both ingress and egress filtering have weaknesses, and
> in some cases doing both can provide some mitigation. It's the old
> adage "belt and suspenders".

We've been operating purely community-based filtering on border and
peering routers for years. I've never run into an issue with the
software that broke that.

The folk I know who have suffered this either mis-configured their
policies, did not understand BGP, or did not get a good handle on how
their router OS implements filtering and filter evaluation.

Mark.

> One thing we do to reduce opportunistically hazardous vectors is to not
> learn customer paths via peers.

> so I can't be a customer of you and a network you peer with?
> (I'm sure I got your meaning wrong)

sure you can. just don't expect packets from job's cone when your link
to him is down.

didn't we go through all these lessons a couple of decades ago?