Can P2P applications learn to play fair on networks?

Much of the same content is available through NNTP, HTTP and P2P. The content part gets a lot of attention and outrage, but network engineers seem to be responding to something else.

If it's not the content, why are network engineers at many university, enterprise, and public networks concerned about the impact particular P2P protocols have on network operations? If it were just a single network, maybe they are evil. But when many different networks all start responding, then maybe something else is the problem.

The traditional assumption is that all end hosts and applications cooperate and fairly share network resources. NNTP is usually considered a very well-behaved network protocol: big bandwidth, but it shares network resources. HTTP is a little less well behaved, but still roughly seems to share network resources equally with other users. P2P applications seem to be extremely disruptive to other users of shared networks, and cause problems for other "polite" network applications.

While some fixes may seem trivial from an academic perspective, the tools available to network engineers are much more limited.

User/programmer/etc. education doesn't seem to work well. Unless the network enforces a behavior, the rules are often ignored. End users generally can't change how their applications work today even if they wanted to.

Putting something in-line across a national/international backbone is extremely difficult. Besides, network engineers don't like additional in-line devices, no matter how much the salespeople claim they are fail-safe.

Sampling is easier than monitoring a full network feed. Using NetFlow sampling or even SPAN port sampling is good enough to detect major issues. For the same reason, asymmetric sampling is easier than requiring symmetric (or synchronized) sampling. But it also means there is a limit on the information available for deciding what is good and what is bad.
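As a rough illustration of how that sampled data gets used, here is a minimal sketch that scales 1-in-N sampled records back up to estimated per-host volumes. The record format, addresses, and sampling rate are assumptions for illustration, not any vendor's export format.

```python
# Minimal sketch: estimating per-host traffic volume from 1-in-N sampled
# flow/packet records. Record format, addresses, and sampling rate are
# illustrative assumptions, not any vendor's export format.
from collections import defaultdict

SAMPLING_RATE = 1000  # 1-in-1000 packet sampling (assumed)

# (source IP, bytes in the sampled packet) as they might arrive from an exporter
sampled_records = [
    ("192.0.2.10", 1500),
    ("192.0.2.10", 1500),
    ("192.0.2.99", 64),
]

estimated_bytes = defaultdict(int)
for src, size in sampled_records:
    # Scale each sampled packet by the sampling rate to estimate actual volume.
    estimated_bytes[src] += size * SAMPLING_RATE

for host, est in sorted(estimated_bytes.items(), key=lambda kv: -kv[1]):
    # Heavy hitters show up clearly; small or short-lived flows may be missed
    # entirely, which is the information limit mentioned above.
    print(f"{host}: ~{est} estimated bytes")
```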

Out-of-band detection limits what controls network engineers can implement on the traffic. USENET has a long history of generating third-party cancel messages. IPS systems and even "passive" taps have long used third-party packets to respond to traffic. DNS servers have been used to redirect subscribers to walled gardens. If applications responded to ICMP Source Quench or other administrative network messages, that might be better; but they don't.
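For illustration only, here is a rough sketch of the kind of third-party packet such an out-of-band device can inject: a forged TCP RST that tears down an observed connection. All addresses, ports, and sequence numbers below are placeholders; Scapy is assumed to be available, and sending raw packets requires elevated privileges.

```python
# Rough sketch of a third-party TCP RST, the kind of out-of-band response an
# IPS or passive tap can generate without sitting in the forwarding path.
# All values are placeholders; scapy is assumed to be installed.
from scapy.all import IP, TCP, send

def send_spoofed_rst(src, dst, sport, dport, seq):
    """Forge a RST that appears to come from one endpoint of an observed flow."""
    rst = IP(src=src, dst=dst) / TCP(sport=sport, dport=dport, flags="R", seq=seq)
    send(rst, verbose=False)

# A monitoring device would fill these in from the packets it observed.
send_spoofed_rst("198.51.100.1", "203.0.113.7", sport=6881, dport=51413, seq=123456789)
```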

* Sean Donelan:

If it's not the content, why are network engineers at many university, enterprise, and public networks concerned about the impact particular P2P protocols have on network operations? If it were just a single network, maybe they are evil. But when many different networks all start responding, then maybe something else is the problem.

Uhm, what about civil liability? It's not necessarily a technical issue
that motivates them, I think.

The traditional assumption is that all end hosts and applications cooperate and fairly share network resources. NNTP is usually considered a very well-behaved network protocol: big bandwidth, but it shares network resources. HTTP is a little less well behaved, but still roughly seems to share network resources equally with other users. P2P applications seem to be extremely disruptive to other users of shared networks, and cause problems for other "polite" network applications.

So is Sun RPC. I don't think the original implementation performs
exponential back-off.

If there is a technical reason, it's mostly that the network as deployed
is not sufficient to meet user demands. Instead of providing more
resources, lack of funds may force some operators to discriminate
against certain traffic classes. In such a scenario, it doesn't even
matter much that the targeted traffic class transports content of
questionable legality. It's more important that the measures applied
to it have actual impact (Amdahl's law dictates that you target popular
traffic), and that you can get away with it (this is where the legality
comes into play).

Sean Donelan wrote:

Much of the same content is available through NNTP, HTTP and P2P. The content part gets a lot of attention and outrage, but network engineers seem to be responding to something else.

If it's not the content, why are network engineers at many university, enterprise, and public networks concerned about the impact particular P2P protocols have on network operations? If it were just a single network, maybe they are evil. But when many different networks all start responding, then maybe something else is the problem.

The traditional assumption is that all end hosts and applications cooperate and fairly share network resources. NNTP is usually considered a very well-behaved network protocol: big bandwidth, but it shares network resources. HTTP is a little less well behaved, but still roughly seems to share network resources equally with other users. P2P applications seem to be extremely disruptive to other users of shared networks, and cause problems for other "polite" network applications.

What exactly is it that P2P applications do that is impolite? AFAIK they are mostly TCP based, so it can't be that they don't have any congestion avoidance; is it just that they utilise multiple TCP flows? Or is it the view that the need for TCP congestion avoidance to kick in is bad in itself (i.e. raw bandwidth consumption)?
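To make the multiple-flows point concrete, here is a back-of-the-envelope sketch; the link speed and flow counts are made up for illustration.

```python
# Back-of-the-envelope: TCP shares a bottleneck roughly per *flow*, not per
# *user*, so a client with many flows takes a proportionally larger share even
# though each individual flow backs off correctly. Numbers are made up.
LINK_MBPS = 100.0

def per_user_share(flows_per_user):
    total_flows = sum(flows_per_user.values())
    return {user: LINK_MBPS * n / total_flows for user, n in flows_per_user.items()}

users = {f"web-user-{i}": 1 for i in range(9)}  # nine users with one flow each
users["p2p-user"] = 40                          # one user with forty flows

for user, share in per_user_share(users).items():
    print(f"{user}: ~{share:.1f} Mb/s")
# The P2P user ends up with roughly 82 Mb/s of the 100 Mb/s link, while each
# single-flow user gets about 2 Mb/s -- "fair" per flow, lopsided per user.
```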

It seems to me that the problem is more general than just P2P applications, and there are two possible solutions:

1) Some kind of magical quality is given to the network to allow it to do congestion avoidance on an IP basis, rather than on a TCP flow basis. As previously discussed on nanog there are many problems with this approach, not least the fact that the core ends up tracking a lot of flow information.

2) A QoS scavenger class is implemented so that users get a guaranteed minimum, with everything above this marked to be dropped first in the event of congestion. Of course, the QoS markings aren't carried inter-provider, but I assume that most of the congestion this thread talks about is occurring in the first AS?
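One building block for a scavenger class, whether set by the application or remarked by the network, is the DSCP code point itself. Here is a minimal sketch of an application voluntarily marking its own socket; whether any given network honors the marking is up to that network.

```python
# Minimal sketch: an application marking a bulk-transfer socket with the
# scavenger code point (CS1, DSCP 8) so that networks implementing such a
# class can deprioritize it. Honoring or re-marking is up to each network.
import socket

DSCP_CS1 = 8                 # scavenger / lower-effort code point
TOS_VALUE = DSCP_CS1 << 2    # DSCP sits in the top six bits of the old TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)
# ... connect and transfer as usual; outgoing packets now carry DSCP CS1.
```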

Sam

Sean

I don't think this is an issue of "fairness." There are two issues at play
here:

1) Legal Liability due to the content being swapped. This is not a technical
matter IMHO.

2) The breakdown of network engineering assumptions that are made when
network operators are designing networks.

I think network operators that are using boxes like the Sandvine box are
doing this due to (2). This is because P2P traffic hits them where it hurts,
aka the pocketbook. I am sure there are some altruistic network operators
out there, but I would be sincerely surprised if anyone else was concerned
about "fairness"

Regards

Bora

The problem with words is that all the good ones are taken. The word "fairness" carries some excess baggage; nevertheless, it is the word being used.

Network operators probably aren't operating from altruistic principles, but for most network operators, when the pain isn't spread equally across the customer base it represents a "fairness" issue. If 490 customers are complaining about bad network performance and the cause is traced to what 10 customers are doing, the reaction is to hammer the nails sticking out.

Whose traffic is more "important": World of Warcraft lagged, or P2P throttled? The network operator makes P2P a little worse and makes WoW a little better, and in the end do they end up somewhat "fairly" using the same network resources? Or do we just put two extremely vocal groups, the gamers and the P2P users, in a locked room and let the death match decide the winner?

I see your point. The main problem I see with the traffic shaping (or worse) boxes is that Comcast/ATT/... sells a particular bandwidth to the customer. Clearly, they don't provision their network as Number_Customers*Data_Rate; they provision it to a data rate capability that is much less than the maximum possible demand.
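For illustration, a back-of-the-envelope oversubscription calculation with made-up numbers (not any provider's actual provisioning):

```python
# Back-of-the-envelope oversubscription arithmetic; numbers are made up.
customers = 10_000
sold_rate_mbps = 8          # per-customer "up to" rate
provisioned_mbps = 2_000    # aggregate capacity actually provisioned

max_possible_demand = customers * sold_rate_mbps
ratio = max_possible_demand / provisioned_mbps
print(f"Maximum possible demand: {max_possible_demand} Mb/s")
print(f"Oversubscription ratio: {ratio:.0f}:1")
# 40:1 in this example. The model only works while most customers are idle
# most of the time; always-on P2P uploads are exactly what breaks it.
```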

This is where the friction in traffic that you mention below happens.

I have to go check on my broadband service contract to see how they word the
bandwidth clause.

Bora

Bora Akyol wrote:

1) Legal Liability due to the content being swapped. This is not a technical
matter IMHO.

Instead of sending an ICMP host unreachable, they are closing the connection via spoofing. I think it's kinder than just dropping the packets altogether.

2) The breakdown of network engineering assumptions that are made when
network operators are designing networks.

I think network operators that are using boxes like the Sandvine box are
doing this due to (2). This is because P2P traffic hits them where it hurts,
aka the pocketbook. I am sure there are some altruistic network operators
out there, but I would be sincerely surprised if anyone else was concerned
about "fairness"

As has been pointed out a few times, there are issues with CMTS systems, including maximum upstream bandwidth allotted versus maximum downstream bandwidth. I agree that there is an engineering problem, but it is not on the part of network operators. DSL fits in its own little world, but until VDSL2 was designed, there were hard caps set for down speed versus up speed. This is how many last mile systems were designed, even in shared bandwidth mediums: more downstream capacity will be needed than upstream. As traffic patterns have changed, the equipment and the standards it is built upon have become antiquated.

As a tactical response, many companies do not support the operation of servers for last mile, which has been defined to include p2p seeding. This is their right, and it allows them to protect the precious upstream bandwidth until technology can adapt to a high capacity upstream as well as downstream for the last mile.

Currently I show an average 2.5:1-4:1 ratio at each of my POPs. Luckily, I run a DSL network. I waste a lot of upstream bandwidth on my backbone. Most downstream/upstream ratios I see in last mile standards, and in equipment derived from such standards, aren't even close to 4:1. I'd expect such ratios if I filtered out the p2p traffic on my network. If I ran a shared bandwidth last mile system, I'd definitely be filtering unless my overall customer base was small enough to not care about maximums on the CMTS.

Fixed downstream/upstream ratios must die in all standards and implementations. It seems a few newer CMTSes are moving in that direction (though I note one I quickly found mentions its flexible ratio as a beyond-DOCSIS-3.0 feature, which implies the standard is still fixed ratio), but I suspect it will be years before networks can adapt.

Jack Bates

I'm a bit late to this conversation but I wanted to throw out a few bits of info not covered.

A company called Oversi makes a very interesting solution for caching Torrent and some Kad-based overlay networks, all done through some cool strategically placed taps and prefetching. This way you could "cache out" at whatever rates you want and mark traffic how you wish as well. This moves a statistically significant amount of traffic off of the upstream and onto a gigabit-ethernet (or something) attached cache server, solving large bits of the HFC problem. I am a fan of this method as it does not require a large footprint of inline devices, but rather a smaller footprint of statistics-gathering sniffers and caches distributed in places that make sense.

Also, the people at BitTorrent Inc have a cache discovery protocol so that their clients have the ability to find cache servers with their hashes on them.

I am told these methods are in fact covered by the DMCA but remember I am no lawyer.

Feel free to reply direct if you want contacts

Rich

Here's a few downstream/upstream numbers and ratios:
    ADSL2+: 24/1.5 = 16:1 (sans Annex.M)
DOCSIS 1.1: 38/9 = 4.2:1 (best case up and downstream modulations and
carrier widths)
      BPON: 622/155 = 4:1
      GPON: 2488/1244 = 2:1

Only the first is non-shared, so that even though the ratio is poor, a
person can fill their upstream pipe up without impacting their neighbors.

It's an interesting question to ask how much engineering decisions have led
to the point where we are today with bandwidth-throttling products, or if
that would have happened in an entirely symmetrical environment.

DOCSIS 2.0 adds support for higher levels of modulation on the upstream,
plus wider bandwidth
(http://i.cmpnet.com/commsdesign/csd/2002/jun02/imedia-fig1.gif), but still
not enough to compensate for the higher downstreams possible with channel
bonding in DOCSIS 3.0.

Frank

I don't see how this Oversi caching solution will work with today's HFC
deployments -- the demodulation happens in the CMTS, not in the field. And
if we're talking about de-coupling the RF from the CMTS, which is what is
happening with M-CMTSes
(http://broadband.motorola.com/ips/modular_CMTS.html), you're really
changing an MSO's architecture. Not that I'm dissing it, as that may be
what's necessary to deal with the upstream bandwidth constraint, but that's
a future vision, not a current reality.

Frank

Hey Rich.

We discussed the technology before but the actual mental click here is important -- thank you.

BTW, I *think* it was Randy Bush who said "today's leechers are tomorrow's cachers". His quote was longer but I can't remember it.

   Gadi.

Frank,

The problem caching solves in this situation is much less complex than what you are speaking of. Caching toward your client base brings down your transit costs (if you have any), or lowers congestion in congested areas if the solution is installed in the proper place. Caching toward the rest of the world gives you a way to relieve stress on the upstream, for sure.

Now of course it is a bit outside of the box to think that providers would want to cache not only for their internal customers but also for users of the open internet. But realistically that is what they are doing now with any of these peer-to-peer overlay networks; they just aren't managing the boxes that house the data. Getting it under control and off of problem areas of the network should be the first (and not just a future) solution.

There are both negative and positive methods of controlling this traffic. We've seen the negative, of course; perhaps the positive is to give the users what they want... just on the provider's terms.

my 2 cents

Rich

The problem here is that they seem to be using a sledge hammer: BitTorrent is essentially left dead in the water. And they deny doing anything, to boot.

A reasonable approach would be to throttle the offending applications to make them fit inside the maximum reasonable traffic envelope.

What I would like is a system where there are two diffserv traffic classes: normal and scavenger-like. When a user trips some predefined traffic limit within a certain period, all their traffic is put in the scavenger bucket which takes a back seat to normal traffic. P2P users can then voluntarily choose to classify their traffic in the lower service class where it doesn't get in the way of interactive applications (both theirs and their neighbor's). I believe Azureus can already do this today. It would even be somewhat reasonable to require heavy users to buy a new modem that can implement this.

I also would like to see a UDP scavenger service, for those applications that generate lots of bits but
can tolerate fairly high packet losses without replacement. (VLBI, for example, can in principle live with 10% packet loss without much pain.)

Drop it if you need to; if you have the resources, let it through. Congestion control is not an issue because, if there is congestion, it gets dropped.

In this case, I suspect that a "worst effort" TOS class would be honored across domains. I also suspect that BitTorrent could live with this TOS quite nicely.

Regards
Marshall

I also would like to see a UDP scavenger service, for those applications that generate lots of bits but
can tolerate fairly high packet losses without replacement. (VLBI, for example, can in principle live with 10% packet loss without much pain.)

Note that this is slightly different from what I've been talking about: if a user trips the traffic volume limit and is put in the lower-than-normal traffic class, that user would still be using TCP apps so very high packet loss rates would be problematic here.

So I guess this makes three traffic classes.

In this case, I suspect that a "worst effort" TOS class would be honored across domains.

If not always by choice. :-)

Iljitsch van Beijnum wrote:

Network operators probably aren't operating from altruistic principles, but for most network operators, when the pain isn't spread equally across the customer base it represents a "fairness" issue. If 490 customers are complaining about bad network performance and the cause is traced to what 10 customers are doing, the reaction is to hammer the nails sticking out.

The problem here is that they seem to be using a sledge hammer: BitTorrent is essentially left dead in the water. And they deny doing anything, to boot.

A reasonable approach would be to throttle the offending applications to make them fit inside the maximum reasonable traffic envelope.

What I would like is a system where there are two diffserv traffic classes: normal and scavenger-like. When a user trips some predefined traffic limit within a certain period, all their traffic is put in the scavenger bucket which takes a back seat to normal traffic. P2P users can then voluntarily choose to classify their traffic in the lower service class where it doesn't get in the way of interactive applications (both theirs and their neighbor's). I believe Azureus can already do this today. It would even be somewhat reasonable to require heavy users to buy a new modem that can implement this.

Surely you would only want to set traffic that falls outside the limit as scavenger, rather than all of it?

S

Comcast has come out with a little more detail on what they were doing:

http://bits.blogs.nytimes.com/2007/10/22/comcast-were-delaying-not-blocking-bittorrent-traffic/

Speaking on background in a phone interview earlier today, a Comcast Internet executive admitted that reality was a little more complex. The company uses data management technologies to conserve bandwidth and allow customers to experience the Internet without delays. As part of that management process, he said, the company occasionally – but not always – delays some peer-to-peer file transfers that eat into Internet speeds for other users on the network.

If the ISP gives you (say) 1 GB a month upload capacity and on the 3rd you've used that up, then you'd be in the "worse effort" traffic class for ALL your traffic the rest of the month. But if you voluntarily give your P2P stuff the worse effort traffic class, this means you get to upload all the time (although probably not as fast) without having to worry about hurting your other traffic. This is both good in the short term, because your VoIP stuff still works when an upload is happening, and in the long term, because you get to do video conferencing throughout the month, which didn't work before after you went over 1 GB.
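To make that accounting concrete, here is a rough sketch of the volume-based policy just described. The 1 GB cap, the class names, and the exemption for traffic the user already marked worse-effort are assumptions drawn from the example above, not any real deployment.

```python
# Rough sketch of the volume-based policy: count a subscriber's unmarked
# uploads per month and demote *all* their traffic to worse-effort once they
# cross the cap. Cap, class names, and the self-marking exemption are
# illustrative assumptions from the example above.
MONTHLY_CAP_BYTES = 1 * 1024**3  # 1 GB

class SubscriberAccount:
    def __init__(self):
        self.unmarked_upload_bytes = 0

    def record_upload(self, nbytes, marked_worse_effort=False):
        # Traffic the user already marked worse-effort doesn't count toward
        # the cap -- that's the incentive to mark it voluntarily.
        if not marked_worse_effort:
            self.unmarked_upload_bytes += nbytes

    def traffic_class(self, packet_marked_worse_effort=False):
        if packet_marked_worse_effort:
            return "worse-effort"
        if self.unmarked_upload_bytes > MONTHLY_CAP_BYTES:
            return "worse-effort"   # over the cap: everything gets demoted
        return "normal"

acct = SubscriberAccount()
acct.record_upload(2 * 1024**3)   # 2 GB of unmarked uploads
print(acct.traffic_class())       # -> worse-effort for the rest of the month
```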

>Network operators probably aren't operating from altruistic
>principles, but for most network operators when the pain isn't
>spread equally across the customer base it represents a
>"fairness" issue. If 490 customers are complaining about bad
>network performance and the cause is traced to what 10 customers
>are doing, the reaction is to hammer the nails sticking out.

The problem here is that they seem to be using a sledge hammer:
BitTorrent is essentially left dead in the water.

Wrong - seeding from scratch, that is, uploading without any download component, is being clobbered. Seeding back into the swarm works while one is still taking chunks down, then closes. Essentially, it turns all clients into a client similar to BitTyrant and focuses on, as Charlie put it earlier, customers downloading stuff.

From the perspective of the protocol designers, unfair sharing is indeed "dead", but to state it in a way that indicates customers cannot *use* BT for some function is bogus. Part of the reason why caching, provider-based, etc. schemes seem to be unpopular is that private trackers appear to operate much in the way that old BBS download/uploads used to... you get credits for contributing and can only pull down so much based on such credits. Not just bragging rights: users need to take part in the transactions to actually use the service. A provider-hosted solution which managed to transparently handle this across multiple clients and trackers would likely be popular with the end users.

Cheers,

Joe

Iljitsch van Beijnum wrote:

What I would like is a system where there are two diffserv traffic classes: normal and scavenger-like. When a user trips some predefined traffic limit within a certain period, all their traffic is put in the scavenger bucket which takes a back seat to normal traffic. P2P users can then voluntarily choose to classify their traffic in the lower service class where it doesn't get in the way of interactive applications (both theirs and their neighbor's).

Surely you would only want to set traffic that falls outside the limit as scavenger, rather than all of it?

If the ISP gives you (say) 1 GB a month upload capacity and on the 3rd you've used that up, then you'd be in the "worse effort" traffic class for ALL your traffic the rest of the month. But if you voluntarily give your P2P stuff the worse effort traffic class, this means you get to upload all the time (although probably not as fast) without having to worry about hurting your other traffic. This is both good in the short term, because your VoIP stuff still works when an upload is happening, and in the long term, because you get to do video conferencing throughout the month, which didn't work before after you went over 1 GB.

Oh, you mean to do this based on traffic volume, and not current traffic rate? I suppose an external monitoring/billing tool would need to track this and reprogram the necessary router/switch, but it's the sort of infrastructure most ISPs would need to have anyway.

I was thinking more along the lines of: everything above 512 kbps (that isn't already marked worse-effort) gets marked worse effort, all of the time.
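A minimal sketch of that rate-based variant: a per-subscriber token bucket refilled at 512 kb/s, with traffic beyond the bucket remarked worse-effort rather than dropped. The burst size and class names are illustrative assumptions.

```python
# Minimal sketch: per-subscriber token bucket at 512 kb/s; traffic that
# exceeds the bucket is remarked worse-effort instead of dropped.
# Burst size and class names are illustrative assumptions.
import time

class WorseEffortMarker:
    def __init__(self, rate_bps=512_000, burst_bytes=64_000):
        self.rate_bytes_per_sec = rate_bps / 8
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)
        self.last = time.monotonic()

    def classify(self, packet_len, already_marked=False):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate_bytes_per_sec)
        self.last = now
        if already_marked or packet_len > self.tokens:
            return "worse-effort"
        self.tokens -= packet_len
        return "normal"

marker = WorseEffortMarker()
print(marker.classify(1500))   # within the burst allowance -> "normal"
```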

Sam