Selfish routing

How do network operators maintain "fairness" in their networks in
the face of selfish behavior? Although this article concerns some
of the "smart routing" products, we see the same thing with other
applications (and even malicious applications like worms).

Every 5 years or so we discuss the need for something like a "penalty
box" for ill-behaved traffic. But in the end, that's too hard. It's
easier to add capacity than to solve the fairness problem.

http://www.nytimes.com/2003/04/24/technology/circuits/24next.html
  Like motorists who cut off other cars as they swerve onto residential
  streets to speed their own trips, an Internet based on what Dr.
  Roughgarden and Dr. Tardos call "selfish routing" might indeed speed up
  the journeys of some data packets. But over all, the two researchers
  found, the result is quite different. Those shortcuts through side
  streets often have the effect of delaying other drivers, or in the
  Internet's case, packets.
[...]
  One antidote to selfish routing, the two researchers found, is more
  capacity. Optimum overall system speeds can be restored despite selfish
  routing by either doubling the number of lanes on a highway or doubling
  the bandwidth of a communications link. Particularly in the case of
  roads, however, that is rarely practical or even desirable.

The article mentioned RouteScience's "product". RS didn't seem to talk
about doing anything bad (other than pinging/monitoring end-user
performance destinations). I suppose that could add an unacceptable amount of
overhead to some connections, but it looked like it just dynamically
adjusted certain BGP prefs in one network's edge routers out of the
available egress connections. It didn't talk about source routing or
anything that would attempt to make "in-between" hop decisions discretely.
How is this a bad thing? How is this different than what SAVVIS or Internap
claim to do?
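
For illustration only, the behaviour described above (measure each egress, then adjust BGP preferences at one network's edge) might look roughly like the sketch below. This is not RouteScience's actual method, which the article does not describe; the function name, the egress names, and the round-trip figures are all hypothetical. The only BGP fact relied on is that a higher LOCAL_PREF is preferred.

```python
def assign_local_pref(measurements_ms, base=100, step=10):
    """Rank egress links by measured round-trip time to one destination
    prefix and give the fastest link the highest LOCAL_PREF.
    (Hypothetical helper, not any vendor's algorithm.)"""
    ranked = sorted(measurements_ms, key=measurements_ms.get)
    return {
        egress: base + step * (len(ranked) - 1 - position)
        for position, egress in enumerate(ranked)
    }

# Hypothetical probe results, in milliseconds, for one destination prefix.
rtt = {"transit_a": 82.0, "transit_b": 47.5, "transit_c": 120.3}
prefs = assign_local_pref(rtt)
print(prefs)
```

Nothing here touches any hop beyond the local edge: the decision is only which of the available egress connections to prefer.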

Or did I miss the point of the discussion on selfish routing?

Deepak Jain
AiNET

Thus spake "Sean Donelan" <sean@donelan.com>

> How do network operators maintain "fairness" in their networks in
> the face of selfish behavior? Although this article concerns some
> of the "smart routing" products, we see the same thing with other
> applications (and even malicious applications like worms).
>
> Every 5 years or so we discuss the need for something like a "penalty
> box" for ill-behaved traffic. But in the end, that's too hard. It's
> easier to add capacity than to solve the fairness problem.

Are you asking about strategies to fix the uneven resource utilization
inherently caused by shortest-path routing, or are you asking about how to
deal with abusive traffic? I don't see a connection between the two.

S

Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

:: The article mentioned RouteScience's "product". RS didn't seem to talk
:: about doing anything bad (other than pinging/monitoring end-user
:: performance destinations). I suppose that could add an unacceptable amount of
:: overhead to some connections, but it looked like it just dynamically
:: adjusted certain BGP prefs in one network's edge routers out of the
:: available egress connections. It didn't talk about source routing or
:: anything that would attempt to make "in-between" hop decisions discretely.
:: How is this a bad thing? How is this different than what SAVVIS or Internap
:: claim to do?
::
:: Or did I miss the point of the discussion on selfish routing?
::

Not to say you missed the point, but i think the purpose of mentioning
RS was to suggest that there are people working on optimizing internet
performance within the existing routing framework of the net. Dr.
Roughgarden's work is completely different and has primarily been proved
on theoretical models, and not anyone's IP network.

see: http://wisl.ece.cornell.edu/ECE794/Apr2/roughgarden2002.pdf

cheers,
-jba

Deepak, Sean,

Deepak Jain wrote:

> The article mentioned RouteScience's "product". ...
> How is this a bad thing? How is this different than what SAVVIS or Internap
> claim to do?
>
> Or did I miss the point of the discussion on selfish routing?

No, I wouldn't say you missed the point at all :)

Dr Roughgarden's results are, in brief:

   1/ networks can, in principle, be routed for minimal latency

   2/ strict "selfish" routing will (under certain conditions) fall short of that ideal, but by a bounded amount - at most, a factor of 4/3

   3/ some simple workarounds exist to eliminate the suboptimality
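
A minimal worked example of where a factor of 4/3 can come from, assuming the textbook two-link setup usually called Pigou's example (it is not taken from this thread): one unit of traffic chooses between a link whose latency is always 1 and a link whose latency equals the fraction of traffic using it.

```python
from fractions import Fraction

def total_latency(x):
    """Average latency when a fraction x of the traffic uses the
    congestion-sensitive link (latency = x) and the remaining 1 - x
    uses the fixed link (latency = 1)."""
    return x * x + (1 - x) * 1

# Selfish equilibrium: the variable link never costs more than 1,
# so every user picks it and x = 1.
selfish = total_latency(Fraction(1))

# Coordinated optimum: minimise x^2 + (1 - x); setting the derivative
# 2x - 1 to zero gives x = 1/2.
optimal = total_latency(Fraction(1, 2))

ratio = selfish / optimal
print(selfish, optimal, ratio)  # 1 3/4 4/3
```

In this toy case the selfish outcome is exactly 4/3 worse than the coordinated one, which matches the bound quoted in point 2.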

Note what's missing from the list: if you just plug in and run a complex network, does it achieve the optimum from point 1? Dr Roughgarden doesn't say. On this list, I think I can leave it as a rhetorical question.

If you're part of a network that's not working optimally, you can attempt to optimize it centrally/globally, you can optimize locally, or you can leave it alone. Dr Roughgarden observes that the first answer is sometimes better than the second, but it's impractical. He certainly does not say that the second - local route optimization - is in any way a step backwards relative to the third - living with whatever your network happens to be doing.

So let me put this another way: I agree with Sean's original comment that adding more bandwidth makes networks better, but only on condition that you know how to use it.

Mike Lloyd
CTO, RouteScience

> So let me put this another way: I agree with Sean's original comment
> that adding more bandwidth makes networks better, but only on condition
> that you know how to use it.

Someone who built a rather good network used to say something along the
lines of "You are confused. QoS does not stand for Quality of Service. It
stands for Quantity of Service. What it means is that you don't have enough
capacity so you drop packets on the floor of those who pay you less money
before dropping on the floor packets of those who pay you more. At the end,
you still drop packets." Having capacity *always* makes a network better.

Alex

> Someone who built a rather good network used to say something along the
> lines of "You are confused. QoS does not stand for Quality of Service. It
> stands for Quantity of Service. What it means is that you don't have enough
> capacity so you drop packets on the floor of those who pay you less money
> before dropping on the floor packets of those who pay you more. At the end,
> you still drop packets." Having capacity *always* makes a network better.

While it's general knowledge, it should be pointed out that having capacity
at each and every millisecond is quite a different game than having your 5 minute
averages look nice.

Pete

Thus spake <alex@yuriev.com>

> What it means is that you don't have enough capacity so you drop
> packets on the floor of those who pay you less money before
> dropping on the floor packets of those who pay you more. At the
> end, you still drop packets." Having capacity *always* makes a
> network better.

Ah, but there are times when suboptimal paths have spare capacity but you
are dropping packets on the optimal path(s) due to congestion. An
"unselfish" routing model would allow you to use _all_ available capacity in
the network before packets get dropped.
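
As a rough sketch of the difference being described, with made-up capacities and a deliberately simplified model (a fixed demand in Mb/s, paths listed best first, no queueing): the first function sends everything down the best path and drops the excess, the second spills the excess onto the next path before dropping anything. The function names and numbers are hypothetical.

```python
def shortest_path_only(demand, paths):
    """Send everything down the best path; the excess is dropped."""
    best = paths[0]
    carried = min(demand, best["capacity"])
    return carried, demand - carried

def fill_in_preference_order(demand, paths):
    """Fill the best path first, then spill the excess onto the next
    best path, and so on; drop only when every path is full."""
    carried = 0
    remaining = demand
    for path in paths:
        sent = min(remaining, path["capacity"])
        carried += sent
        remaining -= sent
    return carried, remaining

# Hypothetical figures in Mb/s; paths are listed best first.
paths = [{"name": "primary", "capacity": 100},
         {"name": "secondary", "capacity": 60}]

print(shortest_path_only(130, paths))        # (100, 30)
print(fill_in_preference_order(130, paths))  # (130, 0)
```

Each tuple is (carried, dropped): with 130 Mb/s offered, the first approach drops 30 Mb/s while the secondary path sits idle.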

This isn't just theory; the ISPs using "unselfish routing" schemes today
consider that a competitive advantage and thus don't publish details.

S

Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

In a message written on Wed, Apr 23, 2003 at 10:52:16PM -0400, Sean Donelan wrote:

  One antidote to selfish routing, the two researchers found, is more
  capacity. Optimum overall system speeds can be restored despite selfish
  routing by either doubling the number of lanes on a highway or doubling
  the bandwidth of a communications link. Particularly in the case of
  roads, however, that is rarely practical or even desirable.

Primary path gets full.

Provider routes to secondary path, which is better than primary path
when full.

Users and provider discover life would be better when primary path had
more bandwidth.

News at 11.

I mean, really, the fact that a secondary path is worse than a primary
path with no capacity is a no brainer, couldn't these people be doing
something more useful?

> Ah, but there are times when suboptimal paths have spare capacity but you
> are dropping packets on the optimal path(s) due to congestion. An
> "unselfish" routing model would allow you to use _all_ available capacity in
> the network before packets get dropped.

Make optimal path have more capacity.
No need for FastPath(tm) or any other market hype.
Case closed.

> This isn't just theory; the ISPs using "unselfish routing" schemes today
> consider that a competitive advantage and thus don't publish details.

Or maybe it is because

"My outbound is your inbound"
"I always can control my outbound"
"Therefore, you cannot control your inbound"
"Therefore, your claims of 'traffic management' are marketing hype"

Alex

Yes but... I seem to recall a few things about the network you are
referring to (the "You are confused" is kind of a tip-off :P), and one of
those things was the complete and total meltdown with every fiber cut.

Sometimes you can win the battle with quantity of service, and sometimes
you just don't have a choice in the matter... Why throw away the ability
to keep your network alive in the event of fiber cuts, DoS, or just the
realities of business, because of engineering religion...

> Yes but... I seem to recall a few things about the network you are
> referring to (the "You are confused" is kind of a tip-off :P), and one of
> those things was the complete and total meltdown with every fiber cut.

That was a totally braindead design - claiming that no backup path is better
than an ATM backup path is just silly.

> Sometimes you can win the battle with quantity of service, and sometimes
> you just don't have a choice in the matter... Why throw away the ability
> to keep your network alive in the event of fiber cuts, DoS, or just the
> realities of business, because of engineering religion...

Agreed.

However, claiming "we have a special technology that magically avoids problems
in the networks that we do not control" is the engineering religion.

Alex

Magically, no. Random good luck just by having lots of paths to choose
from and a way to detect which one "works"... it's possible.

Theoretically there *IS* a use for optimized routing (yes, outbound only),
it's just that implementing it is a little more difficult than talking
about it. :P

> Someone who built a rather good network used to say something along the
> lines of "You are confused. QoS does not stand for Quality of Service. It
> stands for Quantity of Service. What it means is that you don't have enough
> capacity so you drop packets on the floor of those who pay you less money
> before dropping on the floor packets of those who pay you more. At the end,
> you still drop packets." Having capacity *always* makes a network better.

In general I agree here. Although it's not always practical to do the necessary
upgrades, especially in last mile where costs may be prohibitive.

> While it's general knowledge, it should be pointed out that having capacity
> at each and every millisecond is quite a different game than having your 5 minute
> averages look nice.

I'm not sure it is general knowledge. I explain to someone at least once a week
that just because your graph is at 70% doesn't mean you have 30% spare.
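
A small sketch of why a 70% graph need not mean 30% spare, using an invented traffic trace on a hypothetical 100 Mb/s link. Each sample is the throughput seen in one short interval inside the averaging window; the function name and the numbers are assumptions for illustration.

```python
def utilisation_report(samples_mbps, capacity_mbps):
    """Return (average utilisation, fraction of intervals in which the
    link was running at or above capacity)."""
    average = sum(samples_mbps) / len(samples_mbps) / capacity_mbps
    saturated = sum(1 for s in samples_mbps if s >= capacity_mbps)
    return average, saturated / len(samples_mbps)

# Hypothetical trace: the link is completely full in 6 of 10 intervals
# and carries 25 Mb/s in the other 4.
samples = [100] * 6 + [25] * 4
average, saturated = utilisation_report(samples, 100)
print(average, saturated)
```

The averaged graph reads 70%, yet in this trace the link had no headroom at all for 60% of the intervals, which is when queues build and packets drop.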

Steve

> > Someone who built a rather good network used to say something along the
> > lines of "You are confused. QoS does not stand for Quality of Service. It
> > stands for Quantity of Service. What it means is that you don't have enough
> > capacity so you drop packets on the floor of those who pay you less money
> > before dropping on the floor packets of those who pay you more. At the end,
> > you still drop packets." Having capacity *always* makes a network better.

> In general I agree here. Although it's not always practical to do the necessary
> upgrades, especially in last mile where costs may be prohibitive.

If this is the last mile which is the problem, it will *always* be a problem
no matter if one talks about primary or secondary path.

Alex

Thus spake <alex@yuriev.com>

> > Ah, but there are times when suboptimal paths have spare capacity
> > but you are dropping packets on the optimal path(s) due to
> > congestion. An "unselfish" routing model would allow you to use
> > _all_ available capacity in the network before packets get dropped.
>
> Make optimal path have more capacity.

If your lead time for ordering circuits is <1 day and your cost for excess
bandwidth is zero, that's certainly a viable strategy. Most of us, even
facilities-based carriers, don't live in that dreamland.

> No need for FastPath(tm) or any other market hype.
> Case closed.

Please distinguish between startups desperately marketing OSPF under a
trademark, and tier 1 carriers who use _significantly different_ routing
strategies and won't even acknowledge it without an NDA.

> "My outbound is your inbound"
> "I always can control my outbound"
> "Therefore, you cannot control your inbound"
> "Therefore, your claims of 'traffic management' are marketing hype"

A carrier can't exercise fine-grained control over what traffic levels their
peers/customers/upstreams send them, but it is possible to react in
real-time to varying traffic levels and prevent congestion (within your own
network) from flash crowds, link outages, peer flaps, etc.

Capacity, even in our current bandwidth glut, is expensive. If you can
maintain the same performance level with less capacity, you keep more
profits at the end of the day -- and that's the real goal, not design
purity.

S

Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

alex@yuriev.com wrote:

> If this is the last mile which is the problem, it will *always* be a problem
> no matter if one talks about primary or secondary path.

Except when the last mile has an alternate (eg, a multihomed corporate site).

Mike

> > Make optimal path have more capacity.
>
> If your lead time for ordering circuits is <1 day and your cost for excess
> bandwidth is zero, that's certainly a viable strategy. Most of us, even
> facilities-based carriers, don't live in that dreamland.

No, it takes me 60 days to get approval to order a circuit which won't be
delivered for another 90 days, and somehow I have no problem with it when
consulting for non-facilities-based carriers.

The reason for the problem is that there are people at the facilities-based
carriers who have no interest in saving money and making their network
more flexible, largely due to constant hand-greasing from the salespeople
who are selling them equipment to make marginal improvements in their
very broken networks.

No backbone ever should have congestion inside itself, and no backbone ever
gets to control someone else's network. These are the fundamentals of the
business case at hand, which cannot and should not be redefined. So figure
out how to

  (a) not have congestion inside the backbone itself

  (b) not have congestion on the interconnects

> Please distinguish between startups desperately marketing OSPF under a
> trademark, and tier 1 carriers who use _significantly different_ routing
> strategies and won't even acknowledge it without an NDA.

The problem with tier-1 carriers is that their networks are a mess since too
many of them have too many buyers that get too much gooey stuff stuck to
their hands for buying overpriced and wrong gear and services.

> A carrier can't exercise fine-grained control over what traffic levels
> their peers/customers/upstreams send them, but it is possible to react in
> real-time to varying traffic levels and prevent congestion (within your
> own network) from flash crowds, link outages, peer flaps, etc.

Business case requirement (a) - your internal outages should not cause your
backbone links to overflow, especially if you claim to be a tier-1 carrier.
If they do, you have not met requirement (a), so solving any other issues
is a waste of time.

> Capacity, even in our current bandwidth glut, is expensive. If you can
> maintain the same performance level with less capacity, you keep more
> profits at the end of the day -- and that's the real goal, not design
> purity.

Rubbish again.

Capacity (both long-haul and short-haul) and bandwidth are cheap for the
companies. However, if the buyers actually push sellers, the sellers won't
have a reason to take buyers to Morimoto's, give them Louis Vuitton
handbags and give them Super Bowl tickets.

Alex

Nope, then you either will get the congestion inside the corporate network
(which you should not) or the second path actually is the best path.

Alex

You know, Iron Chef references making it into a routing discussion, what a
great world we live in!