Dual Homed BGP for failover

Ahmed_Yousuf · January 18, 2011, 6:32pm

Hi,

I'm looking at a setup where we use BGP to announce PI space to two upstream
ISPs. ISP A provides a 30Mb/s connection and ISP B provides a 10Mb/s one.
Originally the plan was to use ISP B's link as a backup, using local pref to
send outbound traffic via ISP A and AS prepending to pull inbound traffic via ISP A. It
has now been requested to be able to distribute traffic across both links
rather than preference traffic to the higher speed link. We are going to be
using Juniper SRX210s to do this. I have some questions:

- Is this really a good idea, as the BGP process won't care what
the utilisation of the links are and you will see situations where the lower
speed link gets used even though the high speed link utilisation is 0?

- If we are doing this, I don't want to take a full routing table,
I would rather just take the ISPs routes and perhaps their connected
customers. One ISP has said they will only provide full routing table or
default. I really don't want to take a full table, is receiving default
only going to be a problem for my setup?

- Any advice on how to avoid situations where the low bandwidth
link is being used even though there is 0 utilisation on the high bandwidth
link?

Thanks

Ahmed

Jack_Carrozzo · January 18, 2011, 6:40pm

You can just accept directly-connected peers from each network (or within 2
ASes, etc.) then point a default at each one with different preferences. You
can do this with two edges if you like also: iBGP between the edges, and
push default into OSPF from both.

WRT dynamic load balancing... generally if your network is large enough for
two upstreams you'll have a pretty good distribution of flows, so once you
get the prefs and prepends set up the way you like, things won't shift that
rapidly. In my experience at least...

-Jack Carrozzo

Max_Pierson · January 18, 2011, 6:54pm

You really limit yourself when you just take a default from a provider. If
you take 2 defaults (one from each provider) for whatever reason, then once
you change the local pref on one of them, it's all your traffic outbound or
none.

I always request a full table + default, so you can filter to best suit your
needs. This way, you can just accept /8's and get some sort of balancing at
least (even if you just say all even /8's pref'd on one gateway and all odd
/8's from the other provider, etc). Of course this won't be symmetrical, but
that's the nature of eBGP on the internet. You'll have to watch it and adjust as
needed so that you won't saturate your slower link.
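
As a rough illustration of the odd/even idea in Cisco IOS terms (a sketch only: the local AS 64512, the upstream ASNs 64496 and 64497, the neighbor addresses, and the ACL and route-map names are all placeholders, not taken from this thread):

```
! Standard ACLs used as prefix matchers. Wildcard 254 in the first
! octet means "only the lowest bit must match", so:
!   ACL 10 matches networks whose first octet is even
!   ACL 11 matches networks whose first octet is odd
access-list 10 permit 0.0.0.0 254.255.255.255
access-list 11 permit 1.0.0.0 254.255.255.255
!
! Prefer even first octets via ISP A, everything else at default pref
route-map FROM-ISP-A permit 10
 match ip address 10
 set local-preference 200
route-map FROM-ISP-A permit 20
 set local-preference 100
!
! Prefer odd first octets via ISP B, everything else at default pref
route-map FROM-ISP-B permit 10
 match ip address 11
 set local-preference 200
route-map FROM-ISP-B permit 20
 set local-preference 100
!
router bgp 64512
 neighbor 192.0.2.1 remote-as 64496
 neighbor 192.0.2.1 route-map FROM-ISP-A in
 neighbor 198.51.100.1 remote-as 64497
 neighbor 198.51.100.1 route-map FROM-ISP-B in
```

A plain odd/even split aims at roughly half the address space per link, which is probably too much for the smaller of a 30/10 pair, so the match criteria would need adjusting after watching the actual traffic.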

Max

George_Bonser · January 18, 2011, 6:59pm

From: Ahmed Yousuf
Sent: Tuesday, January 18, 2011 10:32 AM
To: nanog@nanog.org
Subject: Dual Homed BGP for failover

- Is this really a good idea, as the BGP process won't care what
the utilisation of the links are and you will see situations where the lower
speed link gets used even though the high speed link utilisation is 0?

It is possible. But one thing, and I know it is a semantics nit but it
is really important. There is no difference in the "speed" of the
links. There is a difference in the capacity of the two but the traffic
flows at the same "speed" across both.

That said, have you actually tried seeing what the "natural" breakdown
of the traffic is? Without any AS prepend or local pref adjustment,
what is the natural ratio of traffic on the two links? Generally
different ISPs have different connectivity and some destinations will be
favored via one path and others via the other path. It might be useful
to determine how BGP naturally routes things first and then you can get
an idea of what needs adjusting.

- If we are doing this, I don't want to take a full routing table,
I would rather just take the ISPs routes and perhaps their connected
customers. One ISP has said they will only provide full routing table or
default. I really don't want to take a full table, is receiving default
only going to be a problem for my setup?

Interesting. Most ISPs offer "default", "full", or "customer routes".
You can take a full table but simply filter out any that aren't from
your ISPs ASN or within one hop of it and only install the routes that
meet those criteria. In addition to using AS prepending, your providers
might offer communities that allow you to control redistribution of your
routing information to their peers. You might want to tell the ISP on
the smaller link not to announce your routes to a major peer. That
major peer will now find its path to you via the larger pipe.

- Any advice on how to avoid situations where the low bandwidth
link is being used even though there is 0 utilisation on the high bandwidth
link?

If that happens, it would mean that the world does not see your path via
the high bandwidth pipe as being an attractive path. As mentioned
above, you might be able to append communities to your routes to the
lower bandwidth ISP that control how they redistribute your routes. One
example might be something like "don't redistribute my routes if you see
them coming from another source" in which case that ISP only
redistributes your routes when they don't see the announcement via the
high bandwidth provider and effectively acts as a backup outside of
their own AS but you would still receive traffic originated within their
AS over the low bandwidth connection.

G

William_Herrin · January 18, 2011, 7:00pm

It has now been requested to be able to distribute traffic across both links
rather than preference traffic to the higher speed link.

- Is this really a good idea, as the BGP process won't care what
the utilisation of the links are and you will see situations where the lower
speed link gets used even though the high speed link utilisation is 0?

Hi Ahmed,

This really isn't an either/or situation. You can prefer the higher
speed link without excluding the lower speed link. One common way to
do this (there are better ones but this one is easy) is to prepend the
AS path you send and receive on the lower speed link so that it's
longer.
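
A minimal IOS-style sketch of the prepend approach, assuming a local AS of 64512 and ISP B (the 10Mb/s link) at 198.51.100.1 in AS 64497. All of these values and the route-map names are made up for illustration:

```
! Outbound: make our announcements via ISP B look longer, so remote
! networks tend to choose the path through ISP A.
route-map PREPEND-TO-ISP-B permit 10
 set as-path prepend 64512 64512
!
! Inbound: lengthen what we learn from ISP B, so our own router tends
! to choose ISP A when both offer the same prefix.
route-map PREPEND-FROM-ISP-B permit 10
 set as-path prepend 64497 64497
!
router bgp 64512
 neighbor 198.51.100.1 remote-as 64497
 neighbor 198.51.100.1 route-map PREPEND-TO-ISP-B out
 neighbor 198.51.100.1 route-map PREPEND-FROM-ISP-B in
```

AS path length is only compared after local preference, and a more-specific prefix always wins regardless of path length, so prepending nudges traffic rather than forcing it. The number of prepends is something to tune by observation.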

- If we are doing this, I don't want to take a full routing table,
I would rather just take the ISPs routes and perhaps their connected
customers. One ISP has said they will only provide full routing table or
default. I really don't want to take a full table, is receiving default
only going to be a problem for my setup?

IMO, that would be a mistake. Taking significantly less than a full
table severely limits your options for balancing traffic between the
links.

- Any advice on how to avoid situations where the low bandwidth
link is being used even though there is 0 utilisation on the high bandwidth
link?

Any particular communication is either going to go through one link or
the other. I'm generalizing here, ignoring some subtleties, but if
packets between two particular hosts have picked the low speed link,
they will take that one instead of the high speed link. So in a sense
it isn't possible to prevent that situation. However, you can adjust
the preferences for one path versus the other so that you're not
leaving either circuit underused overall and the disparity between
your circuits (30 and 10) is not enough to cause major performance
issues in and of itself.

Regards,
Bill Herrin

Jack_Bates · January 18, 2011, 7:12pm

It should also be noted that taking a full table doesn't mean you have to use the full table. Apply filters to smaller routes or long ASPATHs that you don't want, and then assign preferences, communities, prepends, etc. as necessary for the routes you actually accept.

This means your sync time is longer and you'll have more updates, but it will still keep the local routing table much smaller.

Jack

Brandon_Kim · January 18, 2011, 7:57pm

Someone should advise him that if he wants to take in a full BGP routing table,
he should make sure his router can handle it! I would hate for him to open the floodgates
and have his production router shut down. LOL....

George_Bonser · January 18, 2011, 8:05pm

One can take a full feed but filter so only a subset of the routes are
actually installed. For example, filter all routes that are more than
one AS away from the immediate upstream.
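
In IOS terms, a filter like that might be sketched with an AS-path access list. This is an assumption-laden example: the local AS 64512, the neighbor 192.0.2.1, the list number, and the route-map name are placeholders:

```
! Permit paths that are exactly one AS long (the upstream itself) ...
ip as-path access-list 20 permit ^[0-9]+$
! ... or exactly two ASes long (the upstream plus one AS behind it)
ip as-path access-list 20 permit ^[0-9]+_[0-9]+$
!
! Anything not matched falls off the end of the route-map and is dropped
route-map NEARBY-ONLY permit 10
 match as-path 20
!
router bgp 64512
 neighbor 192.0.2.1 route-map NEARBY-ONLY in
```

Note that a customer of the upstream who prepends its own AS shows up with a longer path and would be filtered by these two patterns, and that a default route originated by the upstream itself would normally still match the first line.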

Jack_Bates · January 18, 2011, 8:57pm

You should still be careful, as most processors keep a copy of filtered routes as well, so while your forwarding table may not increase, your route processor memory most likely will.

I haven't checked, but I presume IOS and Junos have a knob to disable this feature?

Jack

Jack_Carrozzo · January 18, 2011, 9:03pm

I don't think this is the case, on IOS at least. Some years ago I was
rocking some 7500s with $not_enough ram for multiple full tables, but with a
prefix list to accept le 23 they worked fine.
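
For reference, a prefix list along those lines could look like the following sketch (the list name, local AS 64512, and neighbor address are placeholders):

```
! Accept any prefix with a mask length of /23 or shorter,
! which also lets a default route (0.0.0.0/0) through
ip prefix-list MAX-23 seq 5 permit 0.0.0.0/0 le 23
!
router bgp 64512
 neighbor 192.0.2.1 prefix-list MAX-23 in
```

Since the longer prefixes are dropped, a default route (or other covering routes) is still needed to reach the destinations they describe.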

-Jack Carrozzo

Jack_Bates · January 18, 2011, 9:19pm

On JunOS, I know I can view pre and post filtered bgp updates ingress and egress. I seem to recall seeing similar functionality introduced into IOS, though I'm less certain. It's still always advisable to be careful.

Jack

Jack_Carrozzo · January 18, 2011, 9:21pm

Yep, the great thing about IOS without 'commit confirmed' is when you remove
a bgp filter, it runs out of memory, reboots, brings up peers, runs out of
memory, reboots... meanwhile if you're trying to get in over a public
interface you're cursing John Chambers' very existence. Not that that's ever
happened to me of course...

-Jack Carrozzo

Randy_Carpenter · January 18, 2011, 9:28pm

I would be hesitant to do full tables on an SRX210, particularly if you only have an SRX210B with 512MB of RAM. I'm not sure what filtering would do in terms of memory usage, because I have not tried it. I generally put a separate edge device in to handle the upstream and BGP, and use the SRX purely for firewall. You can even have completely redundant edge routers and redundant firewalls, and mesh them with iBGP. This is the setup we are using in our office (2 Cisco 2821 routers on the edge, and 2 Juniper SRX240H firewalls right behind them). Since both of our uplinks are Ethernet, I have both routers connected to both providers. This gives us ultimate redundancy at very low cost.

-Randy

Max_Pierson · January 18, 2011, 9:29pm

Me <3's "commit confirmed" ... maybe someone from Cisco should be watching

Michel_de_Nostredame · January 18, 2011, 11:14pm

I remember in IOS the BGP config should not have "soft-reconfiguration
inbound" for this uplink session, otherwise routing-engine will still
keep one copy of full table in memory.

Ahmed_Yousuf · January 19, 2011, 10:23am

Thanks to all for the responses, certainly illuminating. I'm now more aware
of what I can do and what tools are available. The following makes sense to
me:

- Take full routing tables and default from both ISPs and decide
how I filter the routes that get installed in my routers.

- Initially apply the same filters on both and monitor the links
to see what the natural distribution is when we let the BGP process decide
how the traffic is routed. Need to think more about which filters to apply
here; the SRX210s are quoted as having capacity for 16k routes.

- Once we have a better idea of the traffic profiles, start changing
the filters to prefer certain traffic over the higher capacity link. One
way this might be done is to filter based on RIPE or ARIN addresses. We
are most concerned about maintaining capacity for European traffic, so
install RIPE routes on the higher capacity link and ARIN routes on the lower
capacity link.

- Accept that we are never going to get an ideal distribution of
traffic and continue monitoring and adjusting local pref/prepends etc. as
and when we need to change the distribution of traffic. Hopefully we don't
need to do this that often.

Thoughts?

Ahmed

Randy_McAnally · January 19, 2011, 2:00pm

^ This. You're fighting a losing battle with such slow links. Given the
limited route capacity of your router you might as well set up statics aimed
at each link and forget about BGP shaping. Just keep a floating default
pointed at each peer.

-Randy

Ahmed_Yousuf · January 19, 2011, 2:26pm

We're doing BGP to announce our PI space and make sure that our PI space is
reachable through both ISPs in case one link goes down. This is the primary
need to do the BGP here. Unfortunately my boss has requested that we make
use of the capacity of both links, rather than pref traffic out of the
higher capacity link.

Randy_McAnally · January 19, 2011, 2:55pm

Understood! You would _still_ take default BGP routes. I was thinking more
along these lines (in Cisco speak):

! Tweak as necessary to get a good balance
ip route 0.0.0.0 128.0.0.0 <peer1>
ip route 128.0.0.0 128.0.0.0 <peer2>

Set up SLA tracking on the peer IPs to retract the routes if either peer goes
down.

Either that or get more RAM on your router and go the BGP-only method.
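
The SLA tracking mentioned above might be sketched as follows. The peer addresses, probe numbers, and timers are placeholders, and the exact keywords differ between IOS releases (older trains use `ip sla monitor` and `track 1 rtr 1 reachability`), so treat this as an outline rather than a tested config:

```
! One ICMP probe per upstream peer
ip sla 1
 icmp-echo 192.0.2.1
 frequency 10
ip sla schedule 1 life forever start-time now
ip sla 2
 icmp-echo 198.51.100.1
 frequency 10
ip sla schedule 2 life forever start-time now
!
! Track objects go down when the matching probe fails
track 1 ip sla 1 reachability
track 2 ip sla 2 reachability
!
! Each half-default is withdrawn when its peer stops answering,
! leaving the BGP-learned default to carry that half of the traffic
ip route 0.0.0.0 128.0.0.0 192.0.2.1 track 1
ip route 128.0.0.0 128.0.0.0 198.51.100.1 track 2
```

Because each static is more specific than 0.0.0.0/0, it wins while its peer is reachable, and the BGP default only takes over for that half when the tracked route disappears.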

-Randy