Managing CE eBGP details & common/accepted CE-facing BGP practices

Justin_Shore · December 21, 2008, 1:22am

Does anyone have any preferred ways to manage their customer-facing BGP details? I'm thinking about the customer's ASN (SP-assigned private ASN or RIR-assigned ASN), permitted prefixes, etc. While I'm sure this could be easily stored in a spreadsheet, I'm not sure if there is any merit to storing some of these details outside of the configuration on the PE (assuming of course that the PE's config is regularly archived). Now if the PE's BGP config was auto-generated via a script then it would make sense for all the details to be stored off in a DB in the NOC. Beyond that, is there a good reason to archive it in a textual format off of the PE, and if there is a sound reason to do it, is there a good or preferred way to accomplish this?

We're moving beyond our typical residential and very small SMB service to larger customers over the next few months. These areas have larger, more advanced customers and I'm sure we'll run into multi-homed environments and customers who will expect BGP peering options. I would like to be prepared with sound practices before we get our first customer that wants to get a default route via BGP, wants full tables, or has their own ASN and is bringing their own PI space with them. Some of this of course implies multiple processes to confirm that the ASN belongs to the customer in question, that the PI space belongs to the customer in question, notifying our upstreams to accept the customer's PI space, etc. It's hammering out the scalable and best practice config details that I'm concerned with at the moment.

When assigning private ASNs to customers, are there any gotchas to be aware of? Is it possible to use the same private ASN for more than one customer on the same PE?

What are common and accepted CE-facing BGP practices? MD5 AUTH, GTSM, max prefix limits? Which is preferred, route-maps or prefix-lists for controlling advertised and/or received routes? Do any SPs utilize AS-Path ACLs to check that prefixes from a customer's ASN are claimed to originate from there? Are there any SPs out there offering BFD support for BGP or CE-facing peering sessions?

Should we have the customer announce their PA space to us or do we advertise it for them (redist a static)? Do SPs restrict access to tcp/179 on the CE from the Internet in the CE-facing ACL? Do SPs block access to the PE-CE subnet from the outside world like what was described in the Router Security Strategies book (pages 189-193)? What about dropping incoming traffic to everything but the CE IP?

While I don't predict our CE-facing BGP load to be terribly significant at this point, I would like to establish sound practices now rather than down the road once we're neck deep in temporary production workarounds.

Is there any consensus on what's best practice for CE-facing BGP? I imagine most SP engineers' BGP practices could be better equated to a religious holy war on par with Chevy vs Ford or Mac vs PC. I would be interested in hearing what they are though and learning from the group's expertise.

Thanks
Justin

Suresh_Ramasubramani · December 21, 2008, 2:11am

Heck, you could store all that in Rancid .. even cvs/svn

http://homepage.mac.com/duling/halfdozen/RANCID-Howto.htm

http://www.devx.com/enterprise/Article/21647

Or Alexi Roudnev used to post this to nanog often enough - nice tool
though a bit old - CCR, cisco config repository

http://sourceforge.net/projects/snmpstat

srs

Justin_M_Streiner · December 21, 2008, 2:11am

Does anyone have any preferred ways to manage their customer-facing BGP details? I'm thinking about the customer's ASN (SP-assigned private ASN or RIR-assigned ASN), permitted prefixes, etc. While I'm sure this could be easily stored in a spreadsheet, I'm not sure if there is any merit to storing some of these details outside of the configuration on the PE (assuming of course that the PE's config is regularly archived). Now if the PE's BGP config was auto-generated via a script then it would make sense for all the details to be stored off in a DB in the NOC. Beyond that, is there a good reason to archive it in a textual format off of the PE, and if there is a sound reason to do it, is there a good or preferred way to accomplish this?

You could certainly store all of the relevant config details in a database of some sort, and it certainly can't hurt to do so. Same goes for backing up your device configurations - always a good idea. As far as storing things like ASNs, allowed prefixes, etc., you may want to look at storing that information in an RPSL format. Many providers require their customers to register route/AS/policy objects either in one of the Internet Routing Registries (RADB, AltDB, etc.), or in a similar system that's operated by the provider. They then use this information for pushing out configuration changes such as the list of prefixes that a customer is allowed to announce, etc.
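To make the RPSL idea concrete, here is a minimal sketch of turning registered route objects into a per-origin list of allowed prefixes. The object text, the documentation prefixes and the ASNs are made up for illustration, and the parser only handles simple one-line `attribute: value` pairs, not the full RPSL grammar.

```python
from collections import defaultdict

# Hypothetical route objects in RPSL style (documentation prefixes, private ASNs).
RPSL_TEXT = """\
route:   192.0.2.0/24
descr:   Example customer A
origin:  AS64512

route:   198.51.100.0/24
descr:   Example customer A, second block
origin:  AS64512

route:   203.0.113.0/24
descr:   Example customer B
origin:  AS64513
"""

def parse_objects(text):
    """Split blank-line-separated objects into lists of (attribute, value) pairs."""
    objects = []
    for chunk in text.strip().split("\n\n"):
        attrs = []
        for line in chunk.splitlines():
            key, _, value = line.partition(":")
            attrs.append((key.strip(), value.strip()))
        objects.append(attrs)
    return objects

def prefixes_by_origin(objects):
    """Map each origin AS to the route prefixes registered for it."""
    allowed = defaultdict(list)
    for attrs in objects:
        fields = dict(attrs)
        if "route" in fields and "origin" in fields:
            allowed[fields["origin"]].append(fields["route"])
    return dict(allowed)

allowed = prefixes_by_origin(parse_objects(RPSL_TEXT))
for origin, prefixes in sorted(allowed.items()):
    print(origin, prefixes)
```

The resulting mapping is the kind of input a provisioning script could use to build each customer's inbound prefix filter.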

We're moving beyond our typical residential and very small SMB service to larger customers over the next few months. These areas have larger, more advanced customers and I'm sure we'll run into multi-homed environments and customers who will expect BGP peering options. I would like to be prepared with sound practices before we get our first customer that wants to get a default route via BGP, wants full tables, or has their own ASN and is bringing their own PI space with them. Some of this of course implies multiple processes to confirm that the ASN belongs to the customer in question, that the PI space belongs to the customer in question, notifying our upstreams to accept the customer's PI space, etc. It's hammering out the scalable and best practice config details that I'm concerned with at the moment.

It's good that you're thinking about this now, particularly the "is this customer legitimately allowed to announce prefix ABC, or source their traffic from AS XYZ?" question. Most larger providers put at least some limitations on what they will allow a customer to announce, though the level of investigation done to attempt to establish legitimacy for those announcements varies. Some providers require customers to provide some sort of Letter of Authorization/Agency for the prefixes they want to announce.

When assigning private ASNs to customers, are there any gotchas to be aware of? Is it possible to use the same private ASN for more than one customer on the same PE?

Yes. As long as the organizations that are using the private AS aren't 1. trying to advertise the same space, 2. possibly connecting directly to each other, or 3. expecting to be able to connect to multiple upstream providers, then you should be OK. VZB (former UUNET) did something similar, using AS7046 (not a private AS, but the principle is the same), and I believe other carriers have had similar arrangements for customers.
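The first of those conditions is easy to check mechanically. A minimal sketch, assuming a hypothetical `customers` table with documentation prefixes, and using the 16-bit private-use range 64512-65534 from RFC 6996:

```python
import ipaddress
from itertools import combinations

def is_private_asn(asn):
    """True for the 16-bit private-use range 64512-65534 (RFC 6996)."""
    return 64512 <= asn <= 65534

# Hypothetical customers sharing one private ASN on the same PE.
customers = {
    "cust-a": {"asn": 64512, "prefixes": ["192.0.2.0/25"]},
    "cust-b": {"asn": 64512, "prefixes": ["198.51.100.0/24"]},
    "cust-c": {"asn": 64512, "prefixes": ["192.0.2.0/24"]},
}

def overlapping_reuse(customers):
    """Return pairs of customers that share an ASN and announce overlapping space."""
    conflicts = []
    for (name1, c1), (name2, c2) in combinations(sorted(customers.items()), 2):
        if c1["asn"] != c2["asn"]:
            continue
        nets1 = [ipaddress.ip_network(p) for p in c1["prefixes"]]
        nets2 = [ipaddress.ip_network(p) for p in c2["prefixes"]]
        if any(a.overlaps(b) for a in nets1 for b in nets2):
            conflicts.append((name1, name2))
    return conflicts

print(overlapping_reuse(customers))
```

Here cust-a and cust-c would be flagged, because 192.0.2.0/25 sits inside 192.0.2.0/24. The other two conditions (direct interconnection, multiple upstreams) are business facts that a script like this cannot see.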

What are common and accepted CE-facing BGP practices? MD5 AUTH, GTSM, max prefix limits? Which is preferred, route-maps or prefix-lists for controlling advertised and/or received routes? Do any SPs utilize AS-Path ACLs to check that prefixes from a customer's ASN are claimed to originate from there? Are there any SPs out there offering BFD support for BGP or CE-facing peering sessions?

MD5 is good, but most providers I've seen make this an opt-in feature - they don't force their customers to use it. Setting a reasonable max-prefix limit and adjusting it as the number of prefixes a customer announces changes is always a good idea, and I'd consider it to be a best practice. Prefix lists and route maps can do different things, or accomplish the same task in different ways. It also depends on what functionality you want to offer your customers. Do you plan to publish and support a consistent set of customer-settable BGP communities for doing things like selective AS prepends? Do you plan to tag incoming advertisements with communities to identify them as customer routes, and pass those communities to your customers and peers? Some providers use AS path ACLs, and others avoid them at all costs.

Should we have the customer announce their PA space to us or do we advertise it for them (redist a static)? Do SPs restrict access to tcp/179 on the CE from the Internet in the CE-facing ACL? Do SPs block access to the PE-CE subnet from the outside world like what was described in the Router Security Strategies book (pages 189-193)? What about dropping incoming traffic to everything but the CE IP?

If the customers need to connect to multiple upstream providers, it's much less hassle to have them originate their prefixes from their own AS, and then you just need to propagate them. If they are single-homed, you can announce the prefix(es) for them and then just statically route them to the customer. Other things to consider are sound ingress/egress filtering policies, loose/strict RPF, etc. Implementing a Netflow-based monitoring solution, and applying flow caching to PE-CE interfaces, is also a good idea.

While I don't predict our CE-facing BGP load to be terribly significant at this point, I would like to establish sound practices now rather than down the road once we're neck deep in temporary production workarounds.

Again, it's great that you're being proactive about this. Setting up good policies now will make things much easier to maintain and provision in the future as your network grows. Don't forget to document those policies thoroughly, particularly if other people will be tasked with implementing them on a day-to-day basis, i.e. handling new customer turn-ups, making BGP prefix list modifications, etc.

Is there any consensus on what's best practice for CE-facing BGP? I imagine most SP engineers' BGP practices could be better equated to a religious holy war on par with Chevy vs Ford or Mac vs PC. I would be interested in hearing what they are though and learning from the group's expertise.

It seems like you have a pretty good handle on what you need to do. I don't know that there will be much of the holy war angle to the responses. Different people adopt different customer routing policies for different reasons - some technical, some business, some political. While there are different ways of accomplishing this task, most of the more scalable ways have already been a part of most large providers' policies for a while.
One-offs are bad, in my opinion, so the more you do to avoid them now, the fewer headaches you will have down the road.

jms

Chris_Owen · December 21, 2008, 2:21am

http://homepage.mac.com/duling/halfdozen/RANCID-Howto.html

Chris

- ------------------------------------------------------------------------------
Chris Owen - Garden City (620) 275-1900 - Lottery (noun):
President - Wichita (316) 858-3000 - A stupidity tax
Hubris Communications Inc www.hubris.net
- ------------------------------------------------------------------------------

Justin_Shore · December 21, 2008, 3:01am

Suresh Ramasubramanian wrote:

Heck, you could store all that in Rancid .. even cvs/svn

I should have said it earlier when I mentioned config backups. I'm already a heavy user of RANCID, archiving my configs hourly. Been using it since right around v2.0-2.1 which would be several years ago (feels like a lifetime). So my config backups are more than taken care of. What I'm interested in is if I should also document the PE-CE BGP config details elsewhere or if I should just leave them in the PE and let my backups cover me.

Part of what's driving this is my desire to create a book of templates for our assorted product offerings that covers both PEs and CEs. Eventually I won't be able to handle everything myself and staff will have to be added. Eventually we'll have to separate operations, engineering, security and maybe even install/turnup tasks. I'd like there to be solid practices established and documented in a solutions bible of sorts before that happens. My brain can only store so much info and I can only do so much in a day. Plus having all these details ironed out sooner rather than later, and documented, will help keep me honest (i.e., no band-aids that I plan on removing when I get time <g>). The other added benefit is that as I figure out how to do something rather fancy or in a simple and elegant manner I can document it for my own benefit and others.

So back to the original topics, does anyone have suggestions for CE-facing BGP config or the management and documentation of the CE details? I'm experimenting with peer-policy and peer-session templates right now. I'm sure with dozens or hundreds of peers their benefits would be more evident. So far they only seem to reduce my default-only test peers by 3 lines of config each. I'm sure this would be more saved lines of config if I was doing something more fancy.

Thanks for the input
Justin

Suresh_Ramasubramani · December 21, 2008, 3:46am

Sounds like a job for svn with a web based frontend to track configs.

--srs

Danny_Thomas · December 21, 2008, 9:47pm

Suresh Ramasubramanian wrote:

I should have said it earlier when I mentioned config backups. I'm already
a heavy user of RANCID, archiving my configs hourly. Been using it since
right around v2.0-2.1 which would be several years ago (feels like a
lifetime). So my config backups are more than taken care of. What I'm
interested in is if I should also document the PE-CE BGP config details
elsewhere or if I should just leave them in the PE and let my backups cover
me.

Sounds like a job for svn with a web based frontend to track configs.

we've been using rancid for years and the configs are used by many scripts
  * rancid modified to no longer remove the lines at the top of "sh run"
     to identify times config and NVRAM last updated and send email
     if the NVRAM is substantially older than running config
  * ip-helpers are compared against our dhcp config
  * subnets, routers, vlans, gateway and hsrp addresses are compared
     against our database of network information
  * the configs are analysed by a script to identify items not matching
     our config standard; ideally configs would be generated from templates
     but I'm not in a position to make that happen
  * routed address-space is compared against reverse dns
  * when we had border bogon filtering, those ACLs and null-routes
     were compared against a freshly-downloaded aggregated bogon list
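
As an illustration of the ip-helper check in the list above, here is a minimal sketch that pulls `ip helper-address` lines out of an archived config and compares them with the expected DHCP servers. The config text and the `DHCP_SERVERS` set are hypothetical, and only the simple `ip helper-address <address>` form is handled.

```python
# Hypothetical archived config fragment (documentation addresses).
CONFIG = """\
interface Vlan10
 ip address 192.0.2.1 255.255.255.0
 ip helper-address 198.51.100.10
 ip helper-address 198.51.100.99
!
interface Vlan20
 ip address 203.0.113.1 255.255.255.0
 ip helper-address 198.51.100.10
"""

# Hypothetical set of DHCP servers taken from the DHCP configuration.
DHCP_SERVERS = {"198.51.100.10", "198.51.100.11"}

def helper_addresses(config_text):
    """Collect the addresses from simple 'ip helper-address <address>' lines."""
    found = set()
    for line in config_text.splitlines():
        words = line.split()
        if len(words) >= 3 and words[:2] == ["ip", "helper-address"]:
            found.add(words[2])
    return found

helpers = helper_addresses(CONFIG)
unexpected = helpers - DHCP_SERVERS   # configured on the router, unknown to DHCP
unused = DHCP_SERVERS - helpers       # known DHCP server that no interface points at
print("unexpected:", sorted(unexpected))
print("unused:", sorted(unused))
```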

one of the problems with rancid is that the code makes it hard to do
other things, e.g. compare the active and standby configs in our 6500s

I'd also like to add a feature that recognizes the significant blocks in a config
and stores them in a database so you can do queries like "when was vlan777 modified"
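
One possible shape for that feature, as a rough sketch only: split an IOS-style config into top-level blocks by indentation, hash each block, and compare the hashes between two archived revisions. The function names and sample configs are hypothetical, and this is not something rancid does today.

```python
import hashlib

def split_blocks(config_text):
    """Group indented lines under the unindented line that precedes them."""
    blocks = {}
    current = None
    for line in config_text.splitlines():
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and '!' separators
        if not line.startswith(" "):
            current = line.strip()
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line.strip())
    return blocks

def block_digests(config_text):
    """Hash the body of each block so revisions can be compared cheaply."""
    return {
        header: hashlib.sha256("\n".join(body).encode()).hexdigest()
        for header, body in split_blocks(config_text).items()
    }

def changed_blocks(old_text, new_text):
    """Return headers of blocks that were added, removed or modified."""
    old, new = block_digests(old_text), block_digests(new_text)
    return sorted(h for h in old.keys() | new.keys() if old.get(h) != new.get(h))

OLD = "interface Vlan777\n description lab\n!\ninterface Vlan10\n description users\n"
NEW = "interface Vlan777\n description lab-moved\n!\ninterface Vlan10\n description users\n"
print(changed_blocks(OLD, NEW))
```

Storing each changed header together with the revision timestamp would be enough to answer a "when was vlan777 modified" query.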

Danny

Nathan_Ward · December 21, 2008, 10:41pm

This sounds like a job for captain SNMP(trap/inform) or RADIUS or TACACS+!

Justin_Shore · December 22, 2008, 12:19am

Evening, Justin. Thanks for the reply.

Justin M. Streiner wrote:

You could certainly store all of the relevant config details in a database of some sort, and it certainly can't hurt to do so. Same goes for backing up your device configurations - always a good idea. As far as storing things like ASNs, allowed prefixes, etc., you may want to look at storing that information in an RPSL format. Many providers require their customers to register route/AS/policy objects either in one of the Internet Routing Registries (RADB, AltDB, etc.), or in a similar system that's operated by the provider. They then use this information for pushing out configuration changes such as the list of prefixes that a customer is allowed to announce, etc.

RPSL could definitely be useful when we start reaching that class of customer. It's probably too grand for our users at this point (especially having them register anything on their own). I'll definitely do more research on this though.

It's good that you're thinking about this now, particularly the "is this customer legitimately allowed to announce prefix ABC, or source their traffic from AS XYZ?" question. Most larger providers put at least some limitations on what they will allow a customer to announce, though the level of investigation done to attempt to establish legitimacy for those announcements varies. Some providers require customers to provide some sort of Letter of Authorization/Agency for the prefixes they want to announce.

The process hasn't been established yet for validating a request to permit a prefix announcement. I expect it will be a manual verification process involving WHOIS lookups on the prefix, route-view queries to see if it's currently being advertised and probably a historical check on that prefix to see if it was previously advertised and from where. The best way to avoid network abuse issues is to not let them happen to begin with. I think we will also require some sort of written legal agreement stating that the prefix in question belongs to the customer and that we're authorized to permit its advertisement across our network. If anyone has any sample documents for use in this process I would be interested in seeing them.

Yes. As long as the organizations that are using the private AS aren't 1. trying to advertise the same space, 2. possibly connecting directly to each other, or 3. expecting to be able to connect to multiple upstream providers, then you should be OK. VZB (former UUNET) did something similar, using AS7046 (not a private AS, but the principle is the same), and I believe other carriers have had similar arrangements for customers.

Ah, yes, connecting to each other could be a problem. I would think that it would only be an issue if I carried the private ASN across my iBGP infrastructure, each customer received full routes from me and I let them see the private ASNs as well. I could mitigate that problem with remove-private-as, I believe. I'd need to think on that some more. If the customer wants to be multi-homed and expects reachability then they should get an ASN. Otherwise both SPs are advertising their prefixes and the customer won't have much or possibly any control over which inbound path is preferred.

MD5 is good, but most providers I've seen make this an opt-in feature - they don't force their customers to use it. Setting a reasonable max-prefix limit and adjusting it as the number of prefixes a customer announces changes is always a good idea, and I'd consider it to be a best practice. Prefix lists and route maps can do different things, or accomplish the same task in different ways. It also depends on what functionality you want to offer your customers. Do you plan to publish and support a consistent set of customer-settable BGP communities for doing things like selective AS prepends? Do you plan to tag incoming advertisements with communities to identify them as customer routes, and pass those communities to your customers and peers? Some providers use AS path ACLs, and others avoid them at all costs.

I think I'll make MD5 part of the default config and let the customer ask for it to be removed if they choose to not have it. Same for GTSM. I'm a fan of max-prefixes. I think double the routes I expect to receive, 75% warning and a restart interval of 5m would be a good place to start. That would let me catch things happening before they got out of hand (in normal circumstances) and give me a fail-safe in case they decide to get crazy.
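
As a worked example of that starting point (double the expected routes, 75% warning, 5 minute restart), here is a small sketch that derives the numbers and renders an IOS-style `maximum-prefix` line. The function names, the neighbor address and the route count are made up for illustration, and the rendered syntax should be checked against the platform in use.

```python
def max_prefix_policy(expected_routes, headroom=2, warn_pct=75, restart_min=5):
    """Derive a max-prefix limit from the number of routes a customer should send."""
    limit = expected_routes * headroom
    warn_at = limit * warn_pct // 100  # prefix count at which the warning fires
    return {
        "limit": limit,
        "warn_pct": warn_pct,
        "warn_at": warn_at,
        "restart_min": restart_min,
    }

def render_neighbor_line(neighbor_ip, policy):
    """Render an IOS-style maximum-prefix statement for one neighbor."""
    return (
        f"neighbor {neighbor_ip} maximum-prefix "
        f"{policy['limit']} {policy['warn_pct']} restart {policy['restart_min']}"
    )

policy = max_prefix_policy(expected_routes=4)
print(policy)
print(render_neighbor_line("192.0.2.2", policy))
```

For a customer expected to send 4 prefixes this gives a limit of 8 with the warning at 6, so a customer who starts leaking extra routes trips the warning before the session is torn down.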

I do plan on implementing a BGP community solution but for now I'm going to keep it simple. I have bigger fish to fry at the moment but I'll try to get it done before we get asked for it by a customer. I will tag transit and customer routes. The ISP Essentials book had some good insight on that if memory serves me correctly. How fancy it gets will depend on my time and customer demand. I've seen some extremely complex setups that I could not replicate if my life depended on it.

The AS-Path filtering should only come into effect when we get a customer with their own ASN, in which case we'll actually pass on their advertisements (whereas with private ASNs we're only carrying them internally and summarizing on the upstream edges). That way I can ensure 1) that they claim to source the prefixes from their ASN and not someone else's and 2) that they don't insert BS ASNs into the path.
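
The two checks can be expressed as one rule: every ASN in the received path must be the customer's own, which still allows prepending. A minimal sketch of that logic, assuming a stub customer with no downstream ASNs of their own (the documentation ASN 64496 is used as a stand-in):

```python
def path_is_acceptable(as_path, customer_asn):
    """Accept only non-empty paths made up solely of the customer's ASN.

    as_path is a list of ASNs as received from the customer, origin last.
    Repeated entries (prepends) are fine; any other ASN is rejected.
    """
    return len(as_path) > 0 and all(asn == customer_asn for asn in as_path)

print(path_is_acceptable([64496], 64496))                # plain announcement
print(path_is_acceptable([64496, 64496, 64496], 64496))  # customer prepending
print(path_is_acceptable([64496, 64497], 64496))         # someone else's origin
print(path_is_acceptable([64497, 64496], 64496))         # foreign ASN inserted
```

A customer that provides transit to its own downstreams would need a list of permitted ASNs rather than a single value.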

If the customers need to connect to multiple upstream providers, it's much less hassle to have them originate their prefixes from their own AS, and then you just need to propagate them. If they are single-homed, you can announce the prefix(es) for them and then just statically route them to the customer. Other things to consider are sound ingress/egress filtering policies, loose/strict RPF, etc. Implementing a Netflow-based monitoring solution, and applying flow caching to PE-CE interfaces, is also a good idea.

I have a good set of residential and tiny SMB customer ACLs. I'm trying to decide what I should use for these larger SMBs. I'm sure some will be custom but I need to come up with a sane default set of ACLs. Some things I refuse to not block. The customer can find a different provider if they want certain things to not be blocked.

Until we get a multi-homed customer that wants inbound reachability we'll go with strict uRPF; that's already in my interface templates.

I have NFSen set up but all it's really doing right now is filling my hard drives. I haven't had much time to do anything else with it I'm afraid. I'll point NF to the MARS box when it gets here though. I forget, does route-cache flow alter the packet switching capabilities of an interface these days or is that just to turn on NF on an interface? Seems like since CEF it only turns on NF.

Again, it's great that you're being proactive about this. Setting up good policies now will make things much easier to maintain and provision in the future as your network grows. Don't forget to document those policies thoroughly, particularly if other people will be tasked with implementing them on a day-to-day basis, i.e. handling new customer turn-ups, making BGP prefix list modifications, etc.

That's my goal. I prefer to be organized from the get go whenever possible. I don't like surprises in my work (unless it's a big, unexpected raise).

It seems like you have a pretty good handle on what you need to do. I don't know that there will be much of the holy war angle to the responses. Different people adopt different customer routing policies for different reasons - some technical, some business, some political. While there are different ways of accomplishing this task, most of the more scalable ways have already been a part of most large providers' policies for a while.

I have an idea of what all needs to be implemented but I'm short on time to hone my skills by trial and error.

One-offs are bad, in my opinion, so the more you do to avoid them now, the fewer headaches you will have down the road.

Definitely. One-offs are like using duct tape to repair a Porsche. It's just not right.

Thanks for the input. It's much appreciated.
Justin

Nathan_Ward · December 22, 2008, 10:27am

While I'm sure this could be easily stored in a spreadsheet

I think the best piece of advice I ever saw regarding network management is to teach your network ops people basic SQL. Spreadsheets work OK for one-off calculations; use SQL for any sort of data storage. This is a perfect example of where that would be useful.
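
To give a feel for how little SQL that takes, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, customer names, ASNs and prefixes are all hypothetical; a real schema would carry more (circuit IDs, LOA references, max-prefix values and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the example
conn.executescript("""
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    asn  INTEGER NOT NULL,
    pe   TEXT NOT NULL
);
CREATE TABLE prefix (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    prefix      TEXT NOT NULL,
    UNIQUE (customer_id, prefix)
);
""")

conn.executemany(
    "INSERT INTO customer (id, name, asn, pe) VALUES (?, ?, ?, ?)",
    [(1, "cust-a", 64512, "pe1"), (2, "cust-b", 64513, "pe1"), (3, "cust-c", 64514, "pe2")],
)
conn.executemany(
    "INSERT INTO prefix (customer_id, prefix) VALUES (?, ?)",
    [(1, "192.0.2.0/24"), (2, "198.51.100.0/24"), (3, "203.0.113.0/24")],
)

# Which prefixes should pe1 accept, and from which customer ASN?
rows = conn.execute(
    """
    SELECT c.name, c.asn, p.prefix
    FROM customer c JOIN prefix p ON p.customer_id = c.id
    WHERE c.pe = ?
    ORDER BY c.name, p.prefix
    """,
    ("pe1",),
).fetchall()
for row in rows:
    print(row)
```

The same query is awkward to answer reliably from a spreadsheet once there are a few hundred rows and more than one person editing it.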

What are common and accepted CE-facing BGP practices? MD5 AUTH, GTSM, max prefix limits? Which is preferred, route-maps or prefix-lists for controlling advertised and/or received routes? Do any SPs utilize AS-Path ACLs to check that prefixes from a customer's ASN are claimed to originate from there? Are there any SPs out there offering BFD support for BGP or CE-facing peering sessions?

My recommendation here, when talking about incoming advertisements from customers is to use prefix lists *and* route-maps.

Use prefix lists for an overall accept/reject.
Use route-maps to tag received prefixes with communities. Use these communities on your border/peering routers to control advertisements. This way, you make your border/peering routers low-touch - adding a new customer prefix only needs to be done on the customer edge, and that can be done by your provisioning guys leaving your more "important" routers to only be touched by people who *really* know what they're doing, in theory.
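
The division of labour described here can be modelled in a few lines. This is only a sketch of the logic, not router configuration: the community value `64496:100`, the `PREFIX_LIST` contents and the function names are all made up, and the prefix check is exact-match only.

```python
import ipaddress

CUSTOMER_COMMUNITY = "64496:100"  # hypothetical "learned from a customer" tag

# Hypothetical per-customer prefix lists, maintained on the customer edge only.
PREFIX_LIST = {"cust-a": ["192.0.2.0/24"]}

def accept_from_customer(customer, prefix):
    """Customer edge: prefix-list accept/reject, then tag with a community."""
    allowed = {ipaddress.ip_network(p) for p in PREFIX_LIST.get(customer, [])}
    net = ipaddress.ip_network(prefix)
    if net not in allowed:
        return None  # rejected by the prefix list
    return {"prefix": str(net), "communities": {CUSTOMER_COMMUNITY}}

def advertise_at_border(routes):
    """Border/peering edge: advertise only routes carrying the customer tag."""
    return [r["prefix"] for r in routes if CUSTOMER_COMMUNITY in r["communities"]]

good = accept_from_customer("cust-a", "192.0.2.0/24")
bad = accept_from_customer("cust-a", "203.0.113.0/24")
internal = {"prefix": "10.0.0.0/8", "communities": set()}  # untagged internal route

print(bad)
print(advertise_at_border([good, internal]))
```

Adding a new customer prefix only changes `PREFIX_LIST`; the border function never needs to be touched, which is the low-touch property being described.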

Having some well known communities to allow your customers to control their advertisements would be nice as well - to encourage you to prefer/not prefer, and also control how you advertise their prefixes outside your network.

Have a read after "Communities accepted from customers" in the RADB WHOIS for AS3356 for a fairly comprehensive example. Others might have better examples, but I've often used this one as being pretty good.
(whois -h whois.radb.net AS3356)

michael.dillon · December 22, 2008, 10:53am

Have a read after "Communities accepted from customers" in
the RADB WHOIS for AS3356 for a fairly comprehensive example.
Others might have better examples, but I've often used this
one as being pretty good.
(whois -h whois.radb.net AS3356)

You can also read this here:
<http://www.db.ripe.net/whois?searchtext=as3356>

--Michael Dillon