RE: Resilience - How many BGP providers

I suppose I could take the whole resilience thing further and further and further. One of the replies used a phrase which I think captured the problem quite nicely: "diminishing returns".
Basically I could spend lots and lots of money to try and eliminate all single points of failure. Clearly I don't have the money to do this and what I'm really trying to establish is at what
point do the returns start to diminish with regard to obtaining multiple transit providers. The answer appears to be "it depends". So if getting a third BGP peering with divergent paths,
separate last mile, separate facility and separate router will increase costs by 5x but only increase resilience by 0.001%, is it really worth it? I'm trying to quantify the resilience of my
Internet connectivity and quantify the effects of adding more providers. Now to run through my case:

- I have one facility to locate BGP routers at. That's not changing for the moment.
- I can afford two BGP routers.
- The facility I'm located at tells me they have divergent fibre paths and multiple entries into the facility. (Still need to verify this by getting them to walk the routes with me.)
- I am going to take transit from two upstreams.
- I could ask the question as to whether I can peer with separate routers on each of the upstreams. i.e. to protect against router failures on their side.
- I will make sure that neither upstream peers with the other directly. (Does this give me some AS path redundancy?)

So from the above:

- I have no resilience with regards to datacentre location. i.e. if a plane fell out of the sky etc., I'm done.
- I can afford some BGP router resilience on my side, so I should be able to continue working if a failure takes out only one of my routers.
- I have some resilience in terms of actual fibre paths to the facilities where I will be picking up the BGP feeds from. (to be verified)
- I have some "AS resilience" if this is the right term. So if the AS of one of my upstreams drops off the face of the Internet, I can still get to the Internet through the AS of my other
provider.
- Peering with separate routers may give me some resilience for router failure on the side of my upstreams? (not totally sure on this)

In this situation, if I add another peering with another upstream, am I really getting much return in terms of resilience? Or should I spend this money examining the many other SPOFs in
my architecture? I'm perfectly sure there is absolutely no point me peering with 6 providers, but maybe some gains in peering with 3? I'm trying to figure out at what point is adding
another peering in my case a waste of money.
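
To put rough numbers on the diminishing-returns question, here is a minimal sketch. It assumes every transit link fails independently with the same availability, and that everything I can't duplicate (the single facility, for instance) sits in series in front of them. The function names and the availability figures are made up for illustration and are not measurements of any real provider.

```python
def combined_availability(transit_availability, n_providers, shared_availability=1.0):
    """Availability of n independent transit links behind one shared component.

    Assumption: link failures are independent, which real outages often are not.
    """
    all_links_down = (1.0 - transit_availability) ** n_providers
    return shared_availability * (1.0 - all_links_down)


def marginal_gains(transit_availability, max_providers, shared_availability=1.0):
    """Availability gained by adding the k-th provider, for k = 2..max_providers."""
    gains = {}
    for k in range(2, max_providers + 1):
        before = combined_availability(transit_availability, k - 1, shared_availability)
        after = combined_availability(transit_availability, k, shared_availability)
        gains[k] = after - before
    return gains


if __name__ == "__main__":
    # Hypothetical inputs: 99% per transit link, 99.9% for the shared facility.
    for k, gain in marginal_gains(0.99, 4, shared_availability=0.999).items():
        print(f"provider {k}: +{gain:.6%} availability")
```

Under these assumptions each extra provider adds roughly a hundredth of what the previous one added, and the total can never exceed the availability of the shared facility, which suggests the money for a third upstream may be better spent on the other SPOFs.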

I haven't gone into switch and power redundancy, because I "think" I understand it. I wanted to concentrate on the multiple upstreams question. Head's starting to whirl right about now.

Adel

On Wed 5:27 PM , "Dylan Ebner" dylan.ebner@crlmed.com sent:

It is wise to stack the deck in your favor, but you'll never really
know how much real redundancy you've purchased:

http://www.atis.org/ndai/ATIS_NDAI_Final_Report_2006.pdf

David

* adel@baklawasecrets.com

- I could ask the question as to whether I can peer with separate
routers on each of the upstreams. i.e. to protect against router
failures on their side.

If you're getting transit from two different upstreams, you're pretty
much guaranteed to be connected to two different routers. Unless you're
thinking about establishing redundant connections to each provider, that is.

What you should ensure, though, is that the PoPs of the two upstreams
are not found in the same physical building (or neighbourhood for that
matter), and that the fibres that connect you to those PoPs never
cross - it doesn't really help that much with two trenches on each side
of your building if the paths converge 1km away from it. You might also
want to consider getting the fibres from two different providers to
guard against contract-related disputes, unexpected bankruptcies, or
similar that would cause the fibre provider to terminate or suspend
your service.

- I will make sure that neither upstream peers with the other
directly.

This does not make any sense, if you're talking about peering. Peering
is a good thing for reliability and performance. I see from the rest of
your e-mail that you're mixing up the terms peering and transit, though,
so if you're talking about your provider A purchasing transit from
provider B, it makes perfect sense - at least if provider A is _only_
getting transit from B. If on the other hand provider A is getting
transit from C, D, and E in addition to B, it's not really a problem.

It might also be the case that A and B both get transit from C only,
which would make C a single point of failure for you.
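
The check described above can be sketched roughly as follows. It assumes you have somehow worked out who each provider buys transit from (the map below is invented for illustration), and it treats any AS with no listed upstreams as transit-free. Peering is ignored entirely, so this is a simplification rather than a real model of reachability.

```python
def reaches_top(start, upstreams, removed):
    """True if `start` can climb transit links to a transit-free AS without using `removed`."""
    stack, seen = [start], {start}
    while stack:
        asn = stack.pop()
        providers = upstreams.get(asn, set())
        if not providers:
            # No upstreams listed: assumed to be transit-free.
            return True
        for provider in providers:
            if provider != removed and provider not in seen:
                seen.add(provider)
                stack.append(provider)
    return False


def single_points_of_failure(me, upstreams):
    """ASes whose removal leaves `me` with no transit path to a transit-free AS."""
    candidates = set(upstreams) | {p for ps in upstreams.values() for p in ps}
    candidates.discard(me)
    return sorted(x for x in candidates if not reaches_top(me, upstreams, x))


if __name__ == "__main__":
    # Hypothetical transit map: you buy from A and B, who both buy only from C.
    transit = {"me": {"A", "B"}, "A": {"C"}, "B": {"C"}}
    print(single_points_of_failure("me", transit))
```

With the map shown, C is reported as the only single point of failure. If A instead bought transit from B, C, D, and E, nothing would be reported.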

Best regards,