RE: multi-homing fixes

<quote>
Multihoming is popular because the cost of transmission circuits is
plummeting, making it less expensive to buy Internet access services from
two or more ISPs. At the same time, companies are more concerned about the
reliability of their networks and less willing to trust one service
provider.
</quote>

This gal only has it half right. It isn't the reduced cost of circuits, it's
the uncertainty that the circuit's provider will still be in business next
month, or that a change in that provider's business plans, or M&A activity,
will make them abandon that circuit altogether. Couple that with 2+ month
provisioning schedules: a dropped circuit results in a 2+ month business
outage, and a 2-week business outage will put most companies out of business
for good. How many of Northpoint's 100K customers survived Northpoint's
business failure? How many of those that did were multi-homed? How many of
them are multi-homed now?

<quote>
"Half of the companies that are multihomed should have gotten better service
from their providers," says Patrik Faltstrom, a Cisco engineer and co-chair
of the IETF's Applications Area. "ISPs haven't done a good enough job
explaining to their customers that they don't need to multihome."
</quote>

Is Patrik Faltstrom still an IETF co-chair? Is he still helping the
[failing] credibility of the IETF? Maybe that's why? How can any ISP, or
anyone else, credibly guarantee that they'll still be in business next year?
Or that they won't sell out to the very rich bad guys? Or that circuit
provisioning will drop to under 5 calendar days? Because that is the *only*
way you will convince business customers that they don't need to multi-home.

At $99US for 512MB of PC133 RAM (the point is, RAM is disgustingly cheap and
getting cheaper), more RAM in the routers is a quick answer. Router clusters
are another answer, and faster CPUs are yet another. All of the above should
get us by until we get a better router architecture. If the IETF is being at
all effective, that should start now and finish sometime next year, so that
we can start the 5-year technology roll-out cycle.

The next time that PF goes out in public, he should either have his lips
perma-bonded together, or have the lower part of his face covered in duct
tape. Maybe then he can resist the urge to chew on his feet.

You almost make some good arguments. I pick up on this one for
two reasons:

1) You clearly haven't priced Cisco RAM lately. :-)

2) You've missed the issue completely. You dance around ISPs
   providing more reliable service (e.g., by adding RAM to their
   routers), and then dismiss that because, with poor service and
   cheap prices, people will buy multiple links anyway.

Much like your $99 RAM argument, customers today can get two or
three T1s for the same price as one just a year ago. More bandwidth,
more reliability, often less cost. Who would say no?

Clearly ISPs should offer better service, but at current bandwidth
prices, even with an ISP that took every precaution, I, as a
customer, would always buy from two people. The price really is
that cheap. Even if ISPs (from a backbone perspective) delivered
real 100% uptime, many people would buy two circuits (to different
COs) to avoid localized fiber/cable cuts.

Multi-homing is here to stay, in a big way. It will only become
more popular, no matter how good the ISPs become, for a number of
reasons. Any future protocol or policy discussions should take
this as a given.

> <quote>
> "Half of the companies that are multihomed should have gotten better service
> from their providers," says Patrik Faltstrom, a Cisco engineer and co-chair
> of the IETF's Applications Area. "ISPs haven't done a good enough job
> explaining to their customers that they don't need to multihome."
> </quote>
>
> Is Patrik Faltstrom still an IETF co-chair? Is he still helping the
> [failing] credibility of the IETF? Maybe that's why? How can any ISP, or
> anyone else, credibly guarantee that they'll still be in business next year?
> Or that they won't sell out to the very rich bad guys? Or that circuit
> provisioning will drop to under 5 calendar days? Because that is the *only*
> way you will convince business customers that they don't need to multi-home.

        Rather than just bash the IETF (which is easy), it might be
just slightly more productive to wander over, subscribe to the
relevant list(s), and inject some operational perspective and/or clue.

        Browsing http://www.ietf.org will yield information on current
drafts, WG charters, and how to join any lists of interest.

> At $99US for 512MB of PC133 RAM (the point is, RAM is disgustingly cheap and
> getting cheaper), more RAM in the routers is a quick answer. Router clusters
> are another answer, and faster CPUs are yet another. All of the above should
> get us by until we get a better router architecture. If the IETF is being at
> all effective, that should start now and finish sometime next year, so that
> we can start the 5-year technology roll-out cycle.

        One belief (right or wrong) is that the end-to-end path convergence
algorithm in BGP is close to hitting its scaling limits. That's
a problem that infinite RAM could not solve. A proof that the
algorithm is not a danger here would be most welcome in many circles.
If you've got such a proof, please do share.
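
        For what it's worth, there is a published result pointing the
other way: Labovitz et al., "Delayed Internet Routing Convergence"
(SIGCOMM 2000), showed that on a withdrawal, unconstrained BGP path
exploration in a full mesh of n default-free ASes can, in the worst
case, wander through a factorial number of alternate AS paths,

    O(n!) \text{ paths explored before convergence,}

and no amount of RAM changes that; the factorial lives in the topology,
not in the table size.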

Ran
rja@inet.org

> Multi-homing is here to stay, in a big way. It will only become
> more popular, no matter how good the ISPs become, for a number of
> reasons. Any future protocol or policy discussions should take
> this as a given.

    Please don't confuse "I need more than one pipe into the internet"
with "my organization must place its prefixes into the default free
zone".

    It is possible to effectively use multiple pipes into many ISPs
without the cost to all of us that introducing your prefix into the
DFZ has.

Daniel Hagerty <hag@linnaean.org> writes:

>     Please don't confuse "I need more than one pipe into the internet"
> with "my organization must place its prefixes into the default free
> zone".
>
>     It is possible to effectively use multiple pipes into many ISPs
> without the cost to all of us that introducing your prefix into the
> DFZ has.

I am all ears as to how you propose I achieve provider independence
without introducing my prefix(es) into the DFZ.

E.g., say I am connected to two upstreams. I would like to retain
reachability to the global net in case of a failure of one of the
upstreams (an AS-wide failure), or a failure of an upstream POP
(which happens to be the only one in my LATA) from one of the
providers.

/vijay

From: Vijay Gill <vgill@vijaygill.com>

> Date: 23 Aug 2001 23:20:48 +0000
>
> E.g., say I am connected to two upstreams. I would like to retain
> reachability to the global net in case of a failure of one of the
> upstreams (an AS-wide failure), or a failure of an upstream POP
> (which happens to be the only one in my LATA) from one of the
> providers.

    Take prefixes from both providers and use them. Route your egress
traffic appropriately.

    My point wasn't that "there is no need to BGP multihome", but that
many seem to see this as the only way of achieving use of multiple
providers' worth of pipe. There are other alternatives, depending on
your application.
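
    For concreteness, a minimal host-side sketch of "route your egress
traffic appropriately" under that scheme -- assuming the site holds one
address out of each upstream's aggregate. All addresses below are
placeholders, and error handling is trimmed:

/* Pick the egress provider per-connection by binding the source
 * address that provider assigned before calling connect(). The border
 * routers still need policy routing that sends packets sourced from
 * each prefix out the matching pipe. Placeholder addresses throughout.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int connect_via(const char *src_ip, const char *dst_ip, int port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in src;
    memset(&src, 0, sizeof src);
    src.sin_family = AF_INET;
    src.sin_port = 0;                      /* any local port */
    inet_pton(AF_INET, src_ip, &src.sin_addr);

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    inet_pton(AF_INET, dst_ip, &dst.sin_addr);

    /* Binding the source address chooses which provider's prefix
     * this connection uses. */
    if (bind(s, (struct sockaddr *)&src, sizeof src) < 0 ||
        connect(s, (struct sockaddr *)&dst, sizeof dst) < 0) {
        close(s);
        return -1;
    }
    return s;
}

int main(void)
{
    /* Try the address from provider A; fall back to provider B. */
    int s = connect_via("192.0.2.10", "198.51.100.80", 80);
    if (s < 0)
        s = connect_via("203.0.113.10", "198.51.100.80", 80);
    if (s < 0) {
        perror("both providers failed");
        return 1;
    }
    puts("connected");
    close(s);
    return 0;
}

The failover logic lives in the application (or a local resolver or
proxy), which is exactly the added burden complained about below.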

Daniel Hagerty <hag@linnaean.org> writes:

>     Take prefixes from both providers and use them. Route your egress
> traffic appropriately.
>
>     My point wasn't that "there is no need to BGP multihome", but that
> many seem to see this as the only way of achieving use of multiple
> providers' worth of pipe. There are other alternatives, depending on
> your application.

This is a possible solution (and similar ideas have been brought up in
the v6 arena as well). This runs into two things:

1) It is hard to maintain and manage (you've doubled the counting,
storing, and allocation burden on the end user), as well as to debug.
Having been on the enterprise side of things, I believe that these are
non-trivial problems to solve for a large number of people.

2) Proper end unit (host) source address selection.

The way around #2 is to use some sort of a NAT scheme: number
internally out of, say, net10, and NAT appropriately at the autonomous
system edge, with the "servers" (mail, HTTP, FTP, et al.) configured
to listen on public addresses or special ports, etc.

These impose a significant added burden upon the end user.

It costs them significantly less to "graze upon the commons", as Randy
so eloquently put it at the IETF plenary in London.

/vijay

Daniel Hagerty <hag@linnaean.org> writes:

> Take prefixes from both providers and use them. Route your egress
> traffic appropriately.
>
> My point wasn't that "there is no need to BGP multihome", but that
> many seem to see this as the only way of achieving use of multiple
> providers' worth of pipe. There are other alternatives, depending on
> your application.

> This is a possible solution (and similar ideas have been brought up in
> the v6 arena as well). This runs into two things:
>
> [ snip #1 ]
>
> 2) Proper end unit (host) source address selection.
>
> The way around #2 is to use some sort of a NAT scheme: number
> internally out of, say, net10, and NAT appropriately at the autonomous
> system edge, with the "servers" (mail, HTTP, FTP, et al.) configured
> to listen on public addresses or special ports, etc.

Or by SCTP - an elegant, protocol-level solution to most of the basic
reasons for small businesses (i.e., those not likely to qualify for a
provider-independent block) to want multihoming.

Unfortunately, it appears to be even more poorly adopted than IPv6 so far,
and until it's present and turned on by default in end-user systems (hello,
Redmond), it won't be terribly useful, since it requires the ability to run
end-to-end. Though arguably it would help matters greatly if it were just
supported by the proxies most ISPs force dialup users through these days.
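
To make the idea concrete, a sketch of the protocol-level version --
assuming a kernel SCTP stack and the lksctp-tools sctp_bindx() call,
with placeholder addresses (link with -lsctp on Linux):

/* One SCTP endpoint bound to an address from each upstream. The
 * association can fail over between the two paths without either
 * prefix ever appearing in anyone's routing table.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <string.h>
#include <sys/socket.h>

int sctp_multihomed_listener(int port)
{
    int s = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (s < 0)
        return -1;

    struct sockaddr_in addrs[2];
    memset(addrs, 0, sizeof addrs);
    addrs[0].sin_family = AF_INET;
    addrs[0].sin_port = htons(port);
    inet_pton(AF_INET, "192.0.2.10", &addrs[0].sin_addr);   /* provider A */
    addrs[1].sin_family = AF_INET;
    addrs[1].sin_port = htons(port);
    inet_pton(AF_INET, "203.0.113.10", &addrs[1].sin_addr); /* provider B */

    /* Advertise both provider-assigned addresses for this endpoint;
     * the peer learns them at association setup and can switch paths. */
    if (sctp_bindx(s, (struct sockaddr *)addrs, 2, SCTP_BINDX_ADD_ADDR) < 0)
        return -1;

    if (listen(s, 8) < 0)
        return -1;
    return s;
}

Of course, both ends need an SCTP stack for this to buy you anything,
which is the adoption problem above.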

"Joel Baker" <lucifer@lightbearer.com> writes:

> Or by SCTP - an elegant, protocol-level solution to most of the basic
> reasons for small businesses (i.e., those not likely to qualify for a
> provider-independent block) to want multihoming.
>
> Unfortunately, it appears to be even more poorly adopted than IPv6 so far,

The fatal sentence.

To forestall ratholing on this list, I urge people to take this to the
rathole-by-charter: multi6

http://www.ietf.org/html.charters/multi6-charter.html

/vijay "bring the noise" gill

Having the "Cisco" name on it and it working in Cisco equipment is a
different story. Altough, we found out the hard way that RSP4 memory does
NOT work in an RSP8. It'll boot on it but DON'T try to make it do
anything after-the-fact.

That said, you can get RAM for "cisco" products for a small fraction of
what Cisco wants to charge for it.

Yes, and this non-Cisco RAM can make Cisco's TAC have a cow or two...

Brian "Sonic" Whalen
Success = Preparation + Opportunity

Cisco have certified a number of RAM vendors; you can happily buy
RAM from them (at non-inflated prices) without voiding your warranty
or causing abnormal palpitations in the TAC.

For example:

  http://www.kingston.com/memory/routerzone/default.asp

Last time I priced RAM upgrades for RSP4s (which was a few years
ago now) cisco list price was around five times greater than
Kingston's price for cisco-approved SIMMs. Hooray, etc.

Joe

Well, I remember seeing a message on groupstudy where someone from the TAC
came on and made a statement to the effect that a substantial fraction
of the cases they get are a result of non-Cisco memory.

Brian "Sonic" Whalen
Success = Preparation + Opportunity

> This gal only has it half right. It isn't the reduced cost of
> circuits, it's the uncertainty that the circuit's provider
> will still be in business next month, or that a change in that
> provider's business plans, or M&A activity, will make them abandon
> that circuit altogether [...]

I don't think anyone here is concerned that vZn, SBC, WCOM, Q, etc.
will go out of business by this time next year, or undergo drastic
business changes prohibiting them from continuing to provide us with
the TDM, xWDM, dark fibre, etc. services they do now.
But yes, customers multihoming is a good thing(tm) for the other reasons
outlined. And putting all your eggs in one basket -- be it a large
and stable telco, or a small DSL aggregator of questionable clue and
financial stability -- is never wise, even if it will save you some
coin. And at the end of the day, either our IP providers' racks have
power, or they don't; either their cross-connects are live, or they're
cut with a razor blade...

> At $99US for 512MB of PC133 RAM (the point is, RAM is disgustingly
> cheap and getting cheaper), more RAM in the routers is a quick
> answer. Router clusters are another answer, and faster CPUs are yet
> another.

Throwing more RAM and CPU into our routers (assuming for a moment that
they're all Linux PCs running Zebra) is not the solution you're
looking for; the problem of RIB processing still remains.

Getting a forwarding table requires extracting data from the RIB, and
this is the problem, because RIBs are very large and active, and are
being accessed by lots of reading and writing processes. RIB
processing is substantial, and is only getting worse.
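
A toy model of even the cheap-sounding part alone (illustrative
structures, nobody's actual implementation):

/* One full-table pass: copy each route's current best path into a
 * flat FIB. The walk itself is easy; the pain is that a real RIB is
 * mutated by peers *while* this runs, so you re-walk or incrementally
 * patch it continuously, and memory bandwidth, not raw CPU, dominates.
 */
#include <stddef.h>
#include <stdint.h>

struct rib_entry {
    uint32_t prefix;      /* network address, host byte order for simplicity */
    uint8_t  plen;        /* prefix length */
    uint8_t  best;        /* index of current best path */
    uint8_t  npaths;      /* number of candidate BGP paths */
    uint32_t nexthop[8];  /* next hop per candidate path */
};

struct fib_entry {
    uint32_t prefix;
    uint8_t  plen;
    uint32_t nexthop;
};

size_t rib_to_fib(const struct rib_entry *rib, size_t n,
                  struct fib_entry *fib)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        fib[out].prefix  = rib[i].prefix;
        fib[out].plen    = rib[i].plen;
        fib[out].nexthop = rib[i].nexthop[rib[i].best];
        out++;
    }
    return out;
}

Scale n to ~100K prefixes, add locking against concurrent writers, and
repeat on every churn event, and "substantial" starts to look generous.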

> If the IETF is being at all effective, that should start now and
> finish sometime next year, so that we can start the 5-year
> technology roll-out cycle.

Roeland, the IETF is eagerly awaiting your solution. Send code. See
Tony Li's presentation at the Atlanta NANOG on why this solution of
jamming RAM and CPU into boxes is not a viable long-term answer:

  <http://www.nanog.org/mtg-0102/witt.html>

In short, state growth at each level must be constrained and must not
outstrip Moore's law; to be viable in an economic sense, it must lag
behind Moore's law. Things that cause heartache normally involve
memory bandwidth from CPU to RIB memory when you need to spend a whole
lot of time walking tables as an ever-larger percentage of your tables
sloshes around like a yo-yo.
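
One way to state that constraint, with assumed (not measured) doubling
times: let capacity double every T_m months (Moore: T_m ~ 18) and the
state table double every T_r months. Then

    \frac{\mathrm{state}(t)}{\mathrm{capacity}(t)}
      = \frac{2^{t/T_r}}{2^{t/T_m}}
      = 2^{t\left(\frac{1}{T_r}-\frac{1}{T_m}\right)}

which diverges whenever T_r < T_m. Economic viability wants T_r
comfortably greater than T_m, i.e. growth that lags Moore's law.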

> should either have his lips perma-bonded together, or have the lower
> part of his face covered in duct tape. Maybe then he can resist the
> urge to chew on his feet.

Hmm, always a good idea.

-adam

Leo is exactly right. The real reasons that folks multihome are:

1) Backbone and/or routing instability striking one upstream provider
2) Local loop/fiber cuts

That's pretty much it. Although there might be a small element of fear that
a provider might go out of business, there is normally plenty of notice that
a provider is going down, provided you are using a reputable business ISP
and service, as opposed to DSL.

I don't think multihoming needs to be limited, currently. The size of the
routing table is increasing at a more or less linear rate, now. Even at a
higher than linear rate, modern core routers used by most carriers (i.e.
Juniper M-series and Cisco GSRs) can certainly handle a much larger routing
table, even in base configurations, with no memory upgrades. The number of
routing updates is certainly not taxing router CPUs, either. We may run into
scalability problems with the algorithm at some point, but they haven't hit
us yet. And, for now, CPU speed has been growing faster than the processing
requirements of the table.

The fact is, bandwidth is really cheap now. It is a "best practice" to have
multiple providers of any resource that has a long lead time and that
some or all of your business functions are dependent on.

This article in NW is part of a distressing genre of similar screeds, which
have themes like "we are running out of IP address space, and must switch to
IPv6, right now" or "the routing table is too big and the internet will melt
down tomorrow". These articles appear in places like NW, Boardwatch, and
Interactive Week. A greybeard from the IETF is almost always trotted out as
part of these tabloid-esque little dramas. The uninformed and the
semi-informed have a moment (or longer) of panic, then resume their lives,
occasionally internalizing the misinformation. Multihoming is good for
almost everyone who needs it, and NW writers need to do better research.

- Daniel Golding

> I don't think multihoming needs to be limited, currently. The size of the
> routing table is increasing at a more or less linear rate, now.

please look at slides 11 and 15 of

    <http://psg.com/~randy/010809.ptomaine.pdf>

the /24s of small multihomers are half the routing table (see geoff's data)
and are growing rapidly (if you are silly enough not to filter that stuff).

randy

Does anyone have a graph of the number of allocated AS numbers? I
ask because in a perfect world each AS would originate 1 prefix
only, as they got enough address space in their first allocation
to service them forever. In that case growth of the AS table would
be the growth of the routing table.

The real world would never work like that, of course, but it is an
absolute lower bound on the table size, I think. I do believe we
can get much closer to this world with address space sizes like
those available in IPv6; however, it's not clear to me that people
are really trying to think that way.
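
Stated as an inequality (just formalizing the reasoning above): every
DFZ prefix has at least one origin AS, so

    |\text{DFZ prefixes}| \;\ge\; |\{\text{ASes originating a prefix}\}|

with equality exactly in the one-prefix-per-AS ideal; growth in active
origin ASes is therefore a floor under growth of the table.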

You mean there are more smaller guys than larger ones?

Boggle.

And what's small? CNN with a /24? eBay with a /24? Traffic-wise they are
certainly not small, visibility-wise they are certainly not small, and I'm
pretty sure no one here will claim that. Yet both announce /24s out of the
classful C space and get listened to (at least by Verio, which is a known
filterer).

Been following this discussion closely, as multihoming has always been an
interest of mine, going back to my first NANOG presentation in San
Francisco. I am working on a couple of things in this area, which were
presented at the last IETF:

  http://www.ietf.org/internet-drafts/draft-berkowitz-multireq-02.txt
  http://www.ietf.org/internet-drafts/draft-berkowitz-tblgrow-00.txt

The first deals with a broad framework for multihoming and related topics,
but from the user requirements standpoint. The second proposes heuristics
for trying to get a better understanding of why the table is growing --
multihoming, lack of clue, traffic engineering, etc.

Part of the problem is that user perceptions and desires may or may not result in picking the right tool to get what they really want, which typically is high availability and load distribution. Honest, I had a customer that insisted their Internet connection never go down. I arranged BGP multihoming to two ISPs, and had one of the connections engineered so that it came directly off a major provider's dual SONET that entered their office park.

Some time later, I was in their computer room, and found -- count 'em -- ONE application server. I inquired what they planned to do if it went down, and they assured me that they were OK because they backed up on tape. Horrified expressions when I inquired what they expected to restore the tape to.

I started the multihoming framework paper when I was consulting on pre-sales design for a major carrier that would get weird customer perceptions but had no independent references with which to educate them. I've also written books in this area -- not intended as a plug, but another resource.

What I sense we, as operators, vendors to operators, etc., need is some common vocabulary and methodology to understand what the customer is trying to do, and to help them see the correct picture and the correct tools. The tools might be server clustering, redundant and diverse local loops to the same provider, contractual requirements for upstream route diversity, DNS redirection, etc.

My increasing feeling is that we lack a focal point for such information. I've variously presented drafts to IDR, MULTI6, and PTOMAINE, but the fit is never quite right. My inclination is to suggest BOF-level activities both at the operational forums and IETF, and try to replicate some of what we did in the PIER working group on renumbering -- not invent tools, but provide operational guidance.

I'm thinking of requesting a BOF on this both at NANOG in Oakland, and then the IETF in Salt Lake City. Tentative name: "Multihoming User Requirements at All Layers (MURAL)".

Might this help the situation?

Howard Berkowitz
Nortel Networks