"Selfish Routing"

http://www.scienceblog.com/community/article1018.html

Thus spake "Pete Kruckenberg" <pete@kruckenberg.com>

---
This might be easier to understand if it were more technical,
but I'm only aware of a lot of disabled features on my
routers that are supposed, in theory, to do some of these
things.

And they're disabled because they often result in routing loops, usually
transient but sometimes permanent. With very careful planning, you can
create scenarios where these features help; however, it's usually cheaper to
add capacity than to improve efficiency when you include engineering and
operational costs.

Abstractions and analogies aside, is this really a problem,
and is it really worth solving? Sounds like a lot of
additional complexity for the supposed benefits.

Some carriers are solving this problem with MPLS-TE, but not the way the
author suggests.

Other than the MPLS-TE solution, I'm not aware of any ISPs that use
congestion- or RTT-based routing. [E]IGRP is the only IGP with a mechanism
to implement this on a packet level, and experience shows it is unstable in
most topologies.

S

Abstractions and analogies aside, is this really a problem,
and is it really worth solving? Sounds like a lot of
additional complexity for the supposed benefits.

A key comment to put this paper in perspective:
(from http://news.com.com/2100-1033-984694.html?tag=fd_top )
> "The extent to which the real Internet conforms to these
> mathematical models is not yet well understood," Roughgarden
> said.

In particular, BGP uses neither latency nor congestion as a parameter
to choose a path (assuming, of course, that conditions are such that
the BGP session can stay up ;).

Widespread deployment of a pure-performance criterion for path
selection could indeed have the type of problem described. Standard
BGP flap dampening mechanisms would mitigate, but not eliminate,
the potentially negative effects described.

Chasing the last ms of optimization tends both to focus traffic
on the single "best" link and to increase the rate of route
change as the "best" continually changes.

Considering alternate paths with roughly similar performance
significantly changes the picture. This not only reduces the
required rate of route change, but also tends to spread the
load across the range of valid (near-optimal) paths, and thus
significantly mitigates the concerns raised in the paper.
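
To make that concrete, here is a toy sketch (illustrative only, with
made-up latencies): strict best-path selection flips to a different
single link whenever the measurements wiggle, while accepting anything
within a small tolerance of the best keeps a stable set of near-optimal
paths to spread load across.

    import random

    PATHS = ["link-A", "link-B", "link-C"]

    def measure():
        # Simulated one-way latencies in ms; all three paths are comparable.
        return {p: random.gauss(50, 3) for p in PATHS}

    def strict_best(latency):
        # Single winner; changes whenever the measurement noise does.
        return [min(latency, key=latency.get)]

    def near_optimal(latency, tolerance_ms=5):
        # Everything within tolerance of the best; a stable, load-spreading set.
        best = min(latency.values())
        return [p for p, lat in latency.items() if lat <= best + tolerance_ms]

    random.seed(1)
    for _ in range(3):
        m = measure()
        print("strict:", strict_best(m), " near-optimal:", sorted(near_optimal(m)))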

cheers -- Sean

Thus spake "Sean Finn" <seanf@routescience.com>

Chasing the last ms of optimization tends both to focus traffic
on the single "best" link and to increase the rate of route
change as the "best" continually changes.

Considering alternate paths with roughly similar performance
significantly changes the picture. This not only reduces the
required rate of route change, but also tends to spread the
load across the range of valid (near-optimal) paths, and thus
significantly mitigates the concerns raised in the paper.

The problem is eliminating the possibility of a packet taking a "near
optimal" path from A to B, and then taking another "near optimal" path from
B back to A.

I suspect this is impossible to fix while retaining hop-by-hop routing.

S

"Roughgarden has a suggestion that wouldn't be expensive to implement.

so do i. forget all this "normal physics" nonsense and just use hyperspace.

Before deciding which way to send information, he says, routers should
consider not only which route seems the least congested, but also should
take into account the effect that adding its own new messages will have
on the route it has chosen.

i dunno, i don't think igrp would scale to the size of the internet. wasn't
there a 1/(n^2) relationship between metadata size and network capacity as
a function of total delay*bandwidth product in the whole system?
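
for what it's worth, here's a rough back-of-envelope (made-up numbers,
just to show the shape of the n^2 problem): if each of n routers floods a
state update to every other router, the metadata alone grows roughly with
n^2, so the share of capacity left for real traffic shrinks fast.

    # back-of-envelope: n routers each flooding periodic state updates to
    # every other router means ~n^2 messages per interval, so the share of
    # total capacity eaten by metadata grows quadratically with n.

    def overhead_fraction(n, update_bytes=64, updates_per_sec=1.0,
                          total_capacity_bps=1e12):
        msgs_per_sec = n * (n - 1) * updates_per_sec
        return msgs_per_sec * update_bytes * 8 / total_capacity_bps

    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} routers -> {overhead_fraction(n):.4%} of capacity on updates")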

That would be, he says, 'just a bit altruistic' in that some routers
would end up choosing routes that were not necessarily the fastest, but
the average time for all users would decrease."

if somebody was looking for a problem to work on, i'd suggest that needing
the shortest path to always have enough capacity makes planning crunchy.
(which sounds like the same thing as quoted above, but really isn't.)

Basically, injecting current traffic state information into the routing
system is a recipe for disaster. "Normal" flap due to equipment failures and
human interventions is bad enough already.

Somehow people in academia tend to underappreciate the sheer scale of the
Internet, and offer solutions which won't work at that scale.

--vadim

When I read the article I didn't recognize anything in it that correlated
with my understanding of how BGP et al. work. Am I the only one who,
after reading it, thought the author needs to cut back on whatever drugs
he is taking?

I have never seen any router "send test packets and time them", nor take
congestion into account when making routing decisions.

Thus spake "Mikael Abrahamsson" <swmike@swm.pp.se>

When I read the article I didn't recognize anything in it that correlated
with my understanding of how BGP et al. work. Am I the only
one who, after reading it, thought the author needs to cut back on
whatever drugs he is taking?

I have never seen any router "send test packets and time them", nor take
congestion into account when making routing decisions.

There are indeed commercial and in-house products which use probes and
traffic measurement to influence the BGP decision process. Watch tarpit
logs for sites that ping every subnet on a regular schedule. I hope this
doesn't catch on or all the pings may drown out real traffic...

As a general description of Internet routing, I agree it's laughable -- but
the technology does exist.
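
For illustration, the general shape of what these boxes do (hypothetical
names and thresholds, not any particular product): actively probe each
candidate exit for a prefix, and prefer the exit with the best measured
performance, e.g. by raising local-preference on routes learned via it.

    import random  # stand-in for real probe results

    def probe_rtt_ms(exit_name, prefix):
        # Placeholder for an active probe (ICMP/TCP) toward the prefix via one exit.
        return random.uniform(20, 200)

    def pick_exit(prefix, exits, samples=5):
        # Prefer the exit with the lowest median probe RTT for this prefix.
        def median_rtt(exit_name):
            rtts = sorted(probe_rtt_ms(exit_name, prefix) for _ in range(samples))
            return rtts[len(rtts) // 2]
        return min(exits, key=median_rtt)

    best = pick_exit("192.0.2.0/24", ["exit-nyc", "exit-lon", "exit-sjc"])
    print("prefer", best, "(e.g. bump local-preference on its routes)")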

S

Stephen Sprunk wrote:

The problem is eliminating the possibility of a packet taking a "near
optimal" path from A to B, and then taking another "near optimal" path from
B back to A

I suspect this is impossible to fix while retaining hop-by-hop routing.

Looping does indeed present a problem. If you want all nodes to play the
"near optimality" game, you have to move very slowly - if I choose you,
then we need a sensible way to guarantee that you won't choose me at the
same instant.

> i dunno, i don't think igrp would scale to the size of the internet.
> wasn't there a 1/(n^2) relationship between metadata size and
> network capacity as a function of total delay*bandwidth product
> in the whole system?

Combining the scale problem and the looping problem suggests optimization would fit most sensibly in an environment where the number of choices under consideration is small, and some loop-free properties already exist.

Such conditions happen to occur where autonomous systems meet, and especially for a stub AS. Careful inter-AS BGP optimization makes a good deal more practical sense than, say, hysterically self-reconfiguring OSPF.

I could observe that the edge of a BGP AS is also a place where money sometimes changes hands, but I wouldn't want to infect a nice academic discussion with anything too commercial :-)

Mike

Stephen Sprunk wrote:
> The problem is eliminating the possibility of a packet taking a "near
> optimal" path from A to B, and then taking another "near optimal" path from
> B back to A

> I suspect this is impossible to fix while retaining hop-by-hop routing.

Looping does indeed present a problem. If you want all nodes to play
the "near optimality" game, you have to move very slowly - if I choose
you, then we need a sensible way to guarantee that you won't choose me
at the same instant.

Or you can precompute potential paths and remove the paths that may
introduce loops. One way to do this would be with diffserv. You could
have different paths for different traffic classes. As long as each path
is loop-free and there can't be any oscillation between classes (router A
marks the packet with codepoint X so it must be routed over B, router B
rewrites the codepoint to Y so it must be routed over A), you're home free.
And this is easily accomplished by limiting the class transitions to one
way only.
For instance, the traffic class may be upgraded if there is room in a
higher class but never downgraded.
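
A minimal sketch of that one-way rule (my own illustration, hypothetical
class names): because a router may only ever raise a packet's class, the
class value is monotonic along the path, so two routers can never bounce a
packet back and forth between classes.

    # Ordered classes; a router may only upgrade, never downgrade, so the
    # class value is monotonic along the path and re-marking cannot oscillate.
    CLASSES = ["best-effort", "silver", "gold"]  # hypothetical names

    def maybe_upgrade(current, higher_class_has_room):
        # Upgrade one level if there is room in the next class up.
        idx = CLASSES.index(current)
        if higher_class_has_room and idx + 1 < len(CLASSES):
            return CLASSES[idx + 1]
        return current

    cls = "best-effort"
    for room_at_hop in (True, False, True, True):
        cls = maybe_upgrade(cls, room_at_hop)
    print(cls)  # at most len(CLASSES) - 1 upgrades are possible -> "gold"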

> i dunno, i don't think igrp would scale to the size of the internet.
> wasn't there a 1/(n^2) relationship between metadata size and
> network capacity as a function of total delay*bandwidth product
> in the whole system?

Combining the scale problem and the looping problem suggests
optimization would fit most sensibly in an environment where the number
of choices under consideration is small, and some loop-free properties
already exist.

Such conditions happen to occur where autonomous systems meet, and
especially for a stub AS. Careful inter-AS BGP optimization makes a
good deal more practical sense than, say, hysterically
self-reconfiguring OSPF.

Utilization-based routing between two ASes is an exercise in futility
if it's in a single location (load balancing is simpler and just as
effective) and isn't simpler than across multiple ASes if the
interconnects are in different places of the network topology.

Besides, the whole point is that you'd be able to go from New York to
London over Tokyo if the direct line is congested.

Iljitsch van Beijnum wrote:

Utilization-based routing between two ASes is an exercise in futility
if it's in a single location (load balancing is simpler and just as
effective) and isn't simpler than across multiple ASes if the
interconnects are in different places of the network topology

Pure utilization-based routing between 2 AS's is, in effect, load balancing, so I'm not sure how to interpret your first case. But in any case, adding performance objectives to a load optimization problem makes the solution more demanding. (It is possible in practice to observe performance differences between 2 paths from AS A to directly connected AS B - not as many as you see comparing different transit choices to a far point, but they still exist.)

Optimization across topologically distant egress points from AS A to AS B is harder, and if the choice is a nearby link to B and a far link to some other AS C, harder still. But in practical networking, this is not the common problem.

Practically speaking, most AS's fall between your two cases - that is, they have at least one place in their topology where they border several other AS's. This provides multiple pre-computed, loop-free paths to the same end point. Picking the "best" one for near-optimal performance and low congestion in this context is, I would suggest, beneficial, and if done correctly, not prone to the suboptimality discussed in the paper that started this thread.

Besides, the whole point is that you'd be able to go from New York to
London over Tokyo if the direct line is congested.

Given good end-to-end performance measurement as part of the path selection, a planet-circling route is unlikely. Few apps will perform well if packets have to circumnavigate (streaming and email are among the only candidates).

So to an extent, I agree - performing dynamic re-routing over arbitrary paths purely to avoid congestion isn't a great idea. You can indeed end up bouncing off satellites. But if you add performance sensitivity, you have a harder optimization challenge and a much more worthwhile result.
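
A quick back-of-envelope on the circumnavigation point (rough great-circle
distances and the usual ~200,000 km/s figure for light in fiber, so
illustrative only): the Tokyo detour adds on the order of 70-80 ms of
propagation delay each way before any queueing.

    SPEED_IN_FIBER_KM_S = 200_000  # roughly 2/3 the speed of light

    def one_way_ms(km):
        return km / SPEED_IN_FIBER_KM_S * 1000

    direct = one_way_ms(5_600)              # New York -> London, ~5,600 km
    via_tokyo = one_way_ms(10_800 + 9_600)  # New York -> Tokyo -> London, rough

    print(f"direct:    ~{direct:.0f} ms one-way")    # ~28 ms
    print(f"via Tokyo: ~{via_tokyo:.0f} ms one-way")  # ~102 ms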

Mike