design of a real routing vs. endpoint ID separation

This is what I meant by suggesting that source routing was an original attempt at separating routing/locating from endpoint identifiers.

You can replace the concept of "source routing" below with MPLS TE, L2TPv3 or any other suitable encapsulation mechanism.

The concept is that there would need to be some hierarchy of global routing tables in order for routing to scale.

Currently circulated ideas for independence between routing and endpoint identifiers have certain modes of operation.

A)
- end node sends packet to destnode
- destnode location is looked up in <SOMEWHERE> <--- Today that is the Global routing tables, indexed by destnode ID

- sent to there

Most proposals wish to replace/supplement SOMEWHERE with some amorphous protocol and/or some external VeryLargeDatabase.

B)

- end node sends packet to destnode
- destnode is reached through normal routing-table lookups
- inband signalling provides other alternatives for session/transaction/conversation continuation

C)

- end node performs locator lookup
- end node encaps
- destnode decaps

This could be as simple as performing IP-in-IP with SRV records and DDNS.
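The three steps of mode C can be sketched in a few lines of Python. This is a toy IPv4-in-IPv4 framing exercise, not a real implementation: the locator would come from the SRV/DDNS lookup the text mentions, the addresses here are documentation placeholders, and checksums are omitted for brevity.

```python
import struct

PROTO_IPIP = 4  # IANA protocol number for IP-in-IP encapsulation

def ipv4_header(src: str, dst: str, payload_len: int, proto: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left as 0 for brevity)."""
    ver_ihl = (4 << 4) | 5                        # version 4, 5 x 32-bit words
    total_len = 20 + payload_len
    src_b = bytes(int(o) for o in src.split("."))
    dst_b = bytes(int(o) for o in dst.split("."))
    return struct.pack("!BBHHHBBH4s4s", ver_ihl, 0, total_len,
                       0, 0, 64, proto, 0, src_b, dst_b)

def encap(inner: bytes, locator_src: str, locator_dst: str) -> bytes:
    """End node wraps the endpoint-ID packet inside a locator-addressed packet."""
    return ipv4_header(locator_src, locator_dst, len(inner), PROTO_IPIP) + inner

def decap(outer: bytes) -> bytes:
    """Destnode strips the locator header to recover the endpoint-ID packet."""
    ihl = (outer[0] & 0x0F) * 4                   # header length in bytes
    return outer[ihl:]
```

The point of the sketch is that nothing in the inner (endpoint-ID) packet changes; only the outer (locator) header is added and removed.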

This is in direct contrast to all other proposals in that it is much closer to being implementable today with today's technology.

A chunk of IPv6 space is carved off. This is assigned to sites that
want to multihome.

All routers that currently carry global routing tables {can | should}
filter this space from their tables completely by default, except for
the single prefix covering the entire space.

A customer with a prefix assigned from this chunk has to connect with an
ISP who has

* a Very Large Multihoming router (to handle scaling concerns) somewhere
in its network that peers with other ISPs' Very Large Multihoming routers.

An ISP operating a VLMrouter to offer multihoming service to its
customers would originate the entire multihoming-space prefix to its
customers AND to all its peers.

These would have ALL the prefixes from the Multihoming Space.

* the customer would peer with the VLMrouter, receive no routes and
advertise their prefix.

* source routing allowed on ingress IF the destaddr is in the multihoming
space AND the route-option is the Very Large Multihoming router

* source routing is allowed within the ISP network

The VLMrouter would make a SOURCE routing decision, inserting a source
route toward the customer.

* The ISP allows egress source routed packets

What this means is that there are 2 tables on the internet: the table
that ALL internet routers need to have (like today) and the table that
only an ISP offering access to multihoming needs to have. The ISP
offering such access would only need, say, one box per POP or so.
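The two-table split can be modeled in a few lines. This is a hypothetical sketch only: the prefixes and next-hop labels are placeholders, and real longest-prefix matching is reduced to simple containment checks.

```python
import ipaddress

# Ordinary routers carry only the single covering route for the space...
MH_COVER = ipaddress.ip_network("2001:db8::/32")      # placeholder covering prefix
default_table = {MH_COVER: "nearest VLMrouter"}

# ...while the VLMrouter carries the more-specifics and picks the exit.
vlm_table = {ipaddress.ip_network("2001:db8:42::/48"): "customer-42 via ISP-X"}

def forward(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    # Step 1: the normal table sends anything in the space to the VLMrouter.
    hop = next(v for n, v in default_table.items() if addr in n)
    # Step 2: the VLMrouter makes the source-routing decision.
    for net, exit_ in vlm_table.items():
        if addr in net:
            return f"{hop} -> source-route -> {exit_}"
    return hop
```

Note that neither table contains the other's routes, which is the halving claimed below.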

So the scaling problem becomes much smaller in scope. Now only ISPs
wishing to offer multihoming services need to track the multihoming
table. Additionally, the tables are actually halved: the VLMrouter need
not contain the normal internet routes and vice versa.

The downside is that an ISP acting as a multihoming-table host would
be a magnet for traffic that would possibly transit in and out.

Smaller multihoming-hosting ISPs would probably try to prepend the
prefix mightily, or arrange not to originate it at all, and simply
receive the prefix source-routed from an ISP they connect to who also
hosts the multihoming table AND originates the prefix.

No changes to stacks, endpoint nodes or anything else are needed
(if source routing still works in IPv6?).
Some source-routing filtering capability would be needed for border
patrolling, something like this:

config-if# ip source-routing prefix-list multihoming-prefixes access-group allowed-source-routes

Joe

A customer with a prefix assigned from this chunk has to connect with an
ISP who has

* a Very Large Multihoming router (to handle scaling concerns) somewhere
in its network that peers with other ISPs' Very Large Multihoming routers.

An ISP operating a VLMrouter to offer multihoming service to its
customers would originate the entire multihoming-space prefix to its
customers AND to all its peers.

These would have ALL the prefixes from the Multihoming Space.

So... Let me get this straight. You think that significantly changing
the economic model of every ISP on the planet (or at least every large
ISP on the planet) is easier than changing the code in every core router?

ROTFLMAO

Owen

Owen DeLong wrote:

A customer with a prefix assigned from this chunk has to connect with an
ISP who has

* a Very Large Multihoming router (to handle scaling concerns) somewhere
in its network that peers with other ISPs' Very Large Multihoming routers.

An ISP operating a VLMrouter to offer multihoming service to its
customers would originate the entire multihoming-space prefix to its
customers AND to all its peers.

These would have ALL the prefixes from the Multihoming Space.

So... Let me get this straight. You think that significantly changing
the economic model of every ISP on the planet (or at least every large
ISP on the planet) is easier than changing the code in every core router?

ROTFLMAO

Owen

ISPs who wish to connect customers who have allocations from the multihoming space must

a) announce the whole space aggregated
b) peer with other providers who host other customers

ISPs who don't wish to connect these customers should feel free not to, and that will have no bearing on the rest of those who do.

If you are referring to the fact that this will attract "unwanted" traffic, that would be considered a COB (a cost of doing business).

In essence, the previous discussion about LNP suggested that telcos must do the same thing: attract unwanted traffic, traffic they must switch right back out of their network.

ISPs who wish to connect customers who have allocations from the
multihoming space must

a) announce the whole space aggregated
b) peer with other providers who host other customers

As mentioned, this huge aggregate attracts unwanted traffic.
It would make more sense if this so-called multihoming
aggregate were carved up into smaller aggregates
based on the geographical topology of the network. That
way, providers whose PoPs are geographically close to
each other (in the same city) could use multihoming
addresses from the same aggregate. Providers could then
choose to only offer multihoming services in those
cities where they peer with other multihoming providers.
The number of aggregate routes announced mushrooms to
about 5,000 because there are that many cities in the
world with a population greater than 100,000 people.

This geotopological address aggregation will still
result in some unwanted traffic but a provider is
at liberty to carry more detail internally and hand
it off closer to the source. For instance, a provider
with PoPs in New York and Paris, could elect to carry
all Paris routes in New York in order to shed peer
traffic before it crosses the Atlantic.
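The geotopological carving can be illustrated with Python's stdlib ipaddress module. The parent prefix is a documentation placeholder, and 13 extra prefix bits are chosen only because 2^13 = 8192 comfortably covers the ~5,000 cities mentioned above:

```python
import ipaddress

# Hypothetical: carve one multihoming block into per-city aggregates.
multihoming_block = ipaddress.ip_network("2001:db8::/32")   # placeholder prefix
city_aggregates = list(multihoming_block.subnets(prefixlen_diff=13))
assert len(city_aggregates) == 8192                          # room for ~5,000 cities

# A provider in "city 439" would announce that city's aggregate to peers;
# internally it may also carry another city's more-specifics to shed
# peer traffic closer to the source, as described above.
CITY_439 = 439
city_aggregate = city_aggregates[CITY_439]
assert city_aggregate.prefixlen == 45
```

The city index here is purely illustrative; a real scheme would need an agreed numbering plan for cities.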

I wonder if the solution to these issues would
be facilitated by carrying some additional policy
info in a routing protocol. Attributes like
ROUTE_COMES_FROM_A_PEER_WHO_SELLS_MULTIHOMING
or similar. If there are only 5000 or so
peering locations in the world, then perhaps
an attribute like ROUTE_HEARD_FROM_PEER_IN_CITY_439
would also be useful.

--Michael Dillon

<SNIP>

C)

- end node performs locator lookup
- end node encaps
- destnode decaps

This could be as simple as performing IP-in-IP with SRV records and DDNS.

There is an 'example possible alternate use' in the following document:
http://unfix.org/~jeroen/archive/drafts/draft-massar-v6ops-ayiya-02.txt

page 20, section 9.3 which describes something that could be called:
- double NAT
or:
- encapsulation

The problem though is that this requires the end-site/host to upgrade on
both sides; otherwise you lose this special multihoming capability. You
need to detect that, which costs overhead etc., and of course how do you
figure out where the other end is at that moment, and how do you know
that the path between them is optimal, and and and a lot more issues :-)

To repeat: that section is only an 'example possible alternate use', so
don't comment on it (except if you find typos or so ;-)

Greets,
Jeroen

(apologies to Owen for CC'ing the list; his points are valid concerns that I hadn't addressed or considered properly)

Owen DeLong wrote:

c) Carry a much larger table on a vastly more expensive set of routers
    in order to play.

ISPs who don't wish to connect these customers should feel free not to,
and that will have no bearing on the rest of those who do.

Somehow, given C) above, I am betting that most providers will be in this
latter category.

Considering that most people who are in favor of multihoming for IPv6 believe that there is customer demand for it, market forces would decide this one.

Additionally, until there are a few hundred thousand routes in the multihoming table, I don't see any more expense than today, merely an extra box in the POP. It could be years before the doomsday table growth that the anti-multihoming crowd predicts occurs. Only at that point would expensive separate routers be needed.

In fact, separate routers make the multihoming table very small, at least to start with. It would be an implementation detail. An ISP could easily start off by simply not announcing the more-specifics in the prefix space, without the new router systems.

The point is that the scaling problems multihoming brings would be limited to

a) ISPs who want to offer service to customers who want to multihome
b) The system that the ISP runs to provide this service.

This is in contrast to today's mechanism, where customers who want to multihome affect everyone who accepts a full BGP feed.

The point at which worldwide customer demand required separate routing tables would be the point at which ISPs could decide whether the ROI was sufficient for them to keep their investment.

Such a scheme would be a "put your money where your mouth is" test.

You say there is customer demand for multihoming? Well, here it is. Let's see which ISPs want to implement it and which customers want to pay extra (FSVO extra) for it.

In fact, customers who multihome in this way need not use the same ASN space as the rest of the world, just ASNs unique to the multihoming table.

(that might not work well if ISPs "faked" it by simply not advertising the more-specifics they carried internally)

This concept brings true hierarchy, and thus scalability, to the routing table.

If you are referring to the fact that this will attract "unwanted"
traffic, that would be considered a COB (a cost of doing business).

That too, but, primarily, c).

There are simple ways to minimize this.

1) standard BGP tricks (anti-social to be sure), such as prepending, MEDs, ...

2) "Transit" multihoming peering, where you depend on external parties who peer with you on the multihoming plane, and whose more "popular" advertisement brings you a higher ratio of the traffic you are interested in.

A small multihoming-table-carrying ISP would want to arrange things so that he pays a bit more per (Mb|Gb) to his multihoming-table peer, but does not have to attract large quantities of unwanted traffic from his non-multihoming-table peer.

In essence, the previous discussion about LNP suggested that telcos must
do the same thing: attract unwanted traffic, traffic they must switch
right back out of their network.

Except they don't. My formerly AT&T number does not go through AT&T's
network to reach me just because it was ported. Read up on how SS7
actually works before you make statements like this that simply aren't
true.

So I have been told... apparently I mistook the "conclusions" of the relevant threads. Apologies.

Considering that most people who are in favor of multihoming
for ipv6 believe that there is customer demand for it, the
market forces would decide this one.

We have nobody but ourselves to blame for this. If we all ran
networks that worked as well as our customers demand and didn't have
our petty peering squabbles every full moon, the market wouldn't
feel the need to have to dual home.

We have nobody but ourselves to blame for this. If we all ran
networks that worked as well as our customers demand and didn't have
our petty peering squabbles every full moon, the market wouldn't
feel the need to have to dual home.

that's the telco brittle network model, make it so it fails
infrequently. this has met with varied success.

the internet model is to expect and route around failure.

randy

that's the telco brittle network model, make it so it fails
infrequently. this has met with varied success.

One way to look at it:

the internet model is to expect and route around failure.

this has also met with varied success. :-)

Neil J. McRae wrote:

Considering that most people who are in favor of multihoming for IPv6 believe that there is customer demand for it, market forces would decide this one.

We have nobody but ourselves to blame for this. If we all ran
networks that worked as well as our customers demand and didn't have
our petty peering squabbles every full moon, the market wouldn't
feel the need to have to dual home.

There is not only the multihoming issue but also the PI address issue.
Even if every ISP ran its network very competently and there
were no outages, we would still face the ISP-switching issue. Again we
would end up with either PI addresses announced by the ISP or BGP
run by the customer. Either way the DFZ continues to grow. There is
just no way around it.

the internet model is to expect and route around failure.
  
randy

That presupposes agreement on a definition of "failure". In recent
weeks we have once again learned that a large fuzzy fringe around
any sort of 100% consensus makes life interesting.

For instance; was the withdrawal of certain routes from your BGP
sessions a "failure" for you? Was it for superwebhostingforfree.com,
who relies on a single provider for transit?

matto

--matt@snark.net------------------------------------------<darwin><
              The only thing necessary for the triumph
              of evil is for good men to do nothing. - Edmund Burke

the internet model is to expect and route around failure.

You cannot stop the last mile backhoes.

Tony

There is not only the multihoming issue but also the PI address issue.
Even if every ISP ran its network very competently and there
were no outages, we would still face the ISP-switching issue. Again we
would end up with either PI addresses announced by the ISP or BGP
run by the customer. Either way the DFZ continues to grow. There is
just no way around it.

The way around it is to stop growing the DFZ routing table with the
number of prefixes. If customers could have PI addresses and the DFZ
routing table was based, instead, on ASNs in such a way that customers
could use their upstreams' ASNs and not need their own, then a provider
switch would be a change to the PI->ASN mapping and not affect the
DFZ table at all.
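The PI->ASN indirection can be sketched as two tables. This is a toy model under stated assumptions: the mapping service itself (how PI->ASN bindings are distributed) is exactly the part left unspecified above, and the prefix, ASNs and path labels are placeholders.

```python
# Hypothetical: the DFZ holds one route per ASN, and a separate mapping
# ties each PI prefix to its current upstream ASN.
dfz_routes = {65001: "via peer A", 65002: "via peer B"}   # ASN -> path
pi_to_asn = {"2001:db8:100::/48": 65001}                  # PI prefix -> ASN

def lookup(pi_prefix: str) -> str:
    """Resolve a PI prefix to a path via its upstream's ASN."""
    return dfz_routes[pi_to_asn[pi_prefix]]

# Customer switches providers: only the mapping entry changes...
dfz_before = dict(dfz_routes)
pi_to_asn["2001:db8:100::/48"] = 65002
# ...and the DFZ table itself is untouched.
assert dfz_routes == dfz_before
assert lookup("2001:db8:100::/48") == "via peer B"
```

The design choice is the same locator/identifier split as earlier in the thread, with the ASN playing the locator role.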

Owen


Yo Neil!

If we all ran networks that worked as well as our customers demand...

Some demand low price and some demand high availability. No way to
please everyone.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
  gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676

the internet model is to expect and route around failure.

You cannot stop the last mile backhoes.

no, but if your facility is critical, you have redundant physical
and layer one exits from it. and you have parallel sites.

randy

> the market wouldn't
> feel the need to have to dual home.

the internet model is to expect and route around failure.

Seems to me that there is some confusion over the meaning
of "multihoming". We seem to assume that it means BGP multihoming
wherein a network is connected to multiple ASes and uses BGP
to manage traffic flows.

Other people use this term in very different ways. To some people
it means having multiple IP addresses bound to a single
network interface. To others it means multiple websites on one
server.

And to many consumers of network access it is a synonym for
redundancy or resiliency or something like that. BGP multihoming
is not the only way to satisfy the consumers of network access
and design a solution in which failure is expected and it is
possible for the customer to route around failure.

A single tier-2 ISP who uses BGP multihoming with several
tier-1 ISPs can provide "multihoming" to its customers
without BGP. For instance, if this tier-2 has two PoPs
in a city and peering links exist at both PoPs and they
sell a resilient access service where the customer has
two links, one to each PoP, then it is possible to route
around many failures. This is probably sufficient for most
people and if the tier-2 provider takes this service seriously
they can engineer things to make total network collapse extremely
unlikely.

Another way in which consumers could be "multihomed" would be
to have their single access link going to an Internet exchange
where there is a choice of providers. If one provider's network
fails, they could phone up another provider at the exchange and
have a cross-connect moved to restore connectivity in an hour or
so. This will satisfy many people.

Of course there are many variations on the above theme. This is
an issue with multiple solutions, some of which will be superior
to BGP multihoming. It's not a simple black or white scenario.
And being a tier-1 transit-free provider is not all good. It may
give some people psychological comfort to think that they are in
the number 1 tier, but customers have good reason to see tier-1
transit-free status as a negative.

--Michael Dillon

> the market wouldn't
> feel the need to have to dual home.

the internet model is to expect and route around failure.

Seems to me that there is some confusion over the meaning
of "multihoming". We seem to assume that it means BGP multihoming
wherein a network is connected to multiple ASes and uses BGP
to manage traffic flows.

As I understand it, the term multihoming in a network operations
context is defined as:

(A multihomed network is)
A network which is connected via multiple distinct
paths so as to eliminate or reduce the likelihood that a single
failure will significantly reduce reachability.

Note, this is independent of the protocols used, or, even of
whether or not what is being connected to is the internet.
So, it does not assume BGP. It does not assume an AS.

Now, in the context of an ARIN or NANOG discussion, I would expect
to be able to add the following assertions to the term:

1. The connections are to the internet. A connection which
  is not to the internet is of little operational
  significance to NANOG, and, ARIN has very little to
  do with multihoming in general, and, even less if it is
  not related to the internet.

2. The connections are likely to be to distinct ISPs, although, in
  some cases, not necessarily so. Certainly, if one is to
  say one is addressing the issues of multihoming, then,
  one must address both values for this variable.

3. Most multihoming today is done using BGP, but, many other
  solutions exist with various tradeoffs. In V6, there is
  currently only one known (BGP) and one proposed, but,
  unimplemented (Shim6) solution under active consideration
  by IETF. (this may be untrue, but, it seems to be the
  common perception even if not reality).

Other people use this term in very different ways. To some people
it means having multiple IP addresses bound to a single
network interface. To others it means multiple websites on one
server.

That is not multihoming. That may be an implementation artifact
of some forms of multihoming (using the addresses assigned by
multiple providers, a la the Shim6 proposal), but multiple addresses
on an interface do not necessarily imply multihoming. In fact,
more commonly, that is virtual hosting.

And to many consumers of network access it is a synonym for
redundancy or resiliency or something like that. BGP multihoming
is not the only way to satisfy the consumers of network access
and design a solution in which failure is expected and it is
possible for the customer to route around failure.

It certainly is one component of a redundancy/resiliency solution.

A single tier-2 ISP who uses BGP multihoming with several
tier-1 ISPs can provide "multihoming" to its customers
without BGP. For instance, if this tier-2 has two PoPs
in a city and peering links exist at both PoPs and they
sell a resilient access service where the customer has
two links, one to each PoP, then it is possible to route
around many failures. This is probably sufficient for most
people and if the tier-2 provider takes this service seriously
they can engineer things to make total network collapse extremely
unlikely.

As long as you are willing to accept that a policy failure in
said tier-2 ISP could impact both PoPs simultaneously, and,
accept that single point of failure as a risk, then, yes,
it might meet some customers' needs. It will not meet all
customers' needs.

Another way in which consumers could be "multihomed" would be
to have their single access link going to an Internet exchange
where there is a choice of providers. If one provider's network
fails, they could phone up another provider at the exchange and
have a cross-connect moved to restore connectivity in an hour or
so. This will satisfy many people.

Again, there are tradeoffs and risks to be balanced here as there
are multiple single points of failure inherent in such a
scenario. However, at the IP level, such a network would, indeed,
be multihomed, the layer 1 and 2 issues notwithstanding.

Of course there are many variations on the above theme. This is
an issue with multiple solutions, some of which will be superior
to BGP multihoming. It's not a simple black or white scenario.
And being a tier-1 transit-free provider is not all good. It may
give some people psychological comfort to think that they are in
the number 1 tier, but customers have good reason to see tier-1
transit-free status as a negative.

I'm not sure why you say some are superior to BGP multihoming.
I can see why some are more cost-effective, easier, simpler
in some cases, or possibly more hassle-free, but the term
"superior" is simply impossible to define in this situation,
so I'm unsure how you can categorize something as superior
when the term can't be defined sufficiently.

Owen

<SNIP>

3. Most multihoming today is done using BGP, but, many other
  solutions exist with various tradeoffs. In V6, there is
  currently only one known (BGP) and one proposed, but,
  unimplemented (Shim6) solution under active consideration
  by IETF. (this may be untrue, but, it seems to be the
  common perception even if not reality).

As for "multihoming" in the sense that one wants redundancy: getting two
uplinks to the same ISP, or, what I have done a couple of times already,
multiple tunnels between 2 sites (e.g. 2 local + 2 remote) running
BGP/OSPF/RIP/VRRP/whatever using (private) ASNs and just providing a
default to the upstream network, with them announcing their /48, works
perfectly fine.

The multihoming that people here seem to want though is the Provider
Independent one, and that sort of automatically implies some routing
method: read BGP.

Greets,
Jeroen

The way around it is to stop growing the DFZ routing table with the
number of prefixes. If customers could have PI addresses and the DFZ
routing table was based, instead, on ASNs in such a way that customers
could use their upstreams' ASNs and not need their own, then a provider
switch would be a change to the PI->ASN mapping and not affect the
DFZ table at all.

One way to do this is for two ISPs to band together
in order that each ISP can sell half of a joint
multihoming service. Each ISP would set aside a
subset of their IP address space to be used by many
such multihomed customers. Each ISP would announce
the subset from their neighbor's space which means
that there would be two new DFZ prefixes to cover
many multihomed customers.
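The two-prefix outcome can be checked with a toy model. This is a sketch under stated assumptions: the ISP names and aggregate prefixes are placeholders, and BGP is reduced to a set of announcements.

```python
# Hypothetical: ISPs A and B each announce BOTH multihoming subsets, so
# the DFZ sees two prefixes no matter how many customers share them.
AGG_A = "2001:db8:a::/48"      # carved from ISP A's space
AGG_B = "2001:db8:b::/48"      # carved from ISP B's space

announcements = {
    "ISP-A": {AGG_A, AGG_B},   # own subset AND the partner's
    "ISP-B": {AGG_A, AGG_B},
}

# The DFZ carries only the union: two routes covering many customers.
dfz = set().union(*announcements.values())
assert dfz == {AGG_A, AGG_B}

# If ISP-A drops out entirely, both aggregates are still reachable via ISP-B.
assert announcements["ISP-B"] == {AGG_A, AGG_B}
```

This is why the interconnect SLA mentioned below matters: reachability during a failure depends on the surviving partner carrying the other's subset.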

Each multihomed customer would run BGP using a private
AS number selected from a joint numbering plan. This
facilitates failover if one circuit goes down but
doesn't consume unnecessary public resources per customer.

This does require the two ISPs to maintain a strict
SLA on their interconnects in order to match the SLAs
on their customer contracts. The interconnect then
becomes more than "just" a peering connection, it also
becomes a mission critical service component.

Of course, the whole multihoming thing could
be outsourced to a 3rd-party Internet exchange
operator with some creativity at both the technical
level and the business level. The IP address aggregate
would then belong to the exchange. More than 2 ISPs could
participate. Customers could move from one ISP to another
without changing addresses. The SLA on interconnects could
be managed by the exchange. Etc.

--Michael Dillon

[snip...]

Except this completely disregards some customers' concerns about having
provider independence and being able to change providers without
having a major financial disincentive to do so. That _IS_ a real
business concern, no matter how much the IETF would like to pretend
it does not matter.

Owen