[Nanog] Lies, Damned Lies, and Statistics [Was: Re: ATT VP: Internet to hit capacity by 2010]

Here in Comcast land, HDTV is actually averaging around 12 megabits a second. Still adds up to staggering numbers... :-)

Another disturbing fact in this entire mess is that consumers are noticing the degradation in quality caused by compressing HD content:

http://cbs5.com/local/hdtv.cable.compression.2.705405.html

So, we have a "Tragedy of the Commons" situation that is completely created by the telcos themselves: trying to force consumer decisions, then failing to deliver, while bemoaning the fact that infrastructure is being over-utilized by file-sharers (or "Exafloods", or whatever the apocalyptic issue of the day is for telcos).

A real Charlie Foxtrot.

- - ferg

Time to push multicast as transport for BitTorrent? If the downloads get better performance that way, I think the clients would appear quicker than multicast would be enabled for consumer DSL or cable.

Pete

Paul Ferguson wrote:

Time to push multicast as transport for bittorrent?

BitTorrent clients already do a form of multicast, only in a crude way that does not match network topology as well as it could. Moving to IP multicast raises a whole host of technical issues, such as the lack of multicast peering. Solving those technical issues requires ISP cooperation, i.e. support for global multicast.

But there is another way. That is for software developers to build a modified client that depends on a topology guru for information on the network topology. This topology guru would be some software that is run by an ISP, and which communicates with all the other topology gurus in neighboring ASes. These gurus learn the topology using some kind of protocol, like a routing protocol. They also have some local intelligence configured by the ISP, such as allowed traffic rates over certain paths at certain times of day. And they share all of that information in order to optimize the overall downloading of all files to all clients which share the same guru. Some ISPs have local DSL architectures in which it makes better sense to download a file from a remote location than from the guy next door. In that case, an ISP could configure a guru to prefer circuits into their data centre, then operate clients in the data centre that effectively cache files. But the caching part is optional.
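As a rough sketch of what such a guru might look like from a cooperating client's point of view (the API, prefixes, scores, and time-of-day policy below are all invented for illustration, not a specification): the client hands the guru its candidate peer list and gets back an ISP-ranked ordering plus an allowed rate.

# Hypothetical sketch of an ISP-run "topology guru" ranking candidate peers
# for a cooperating BitTorrent client. Everything here is illustrative.
import ipaddress
from datetime import datetime

class TopologyGuru:
    def __init__(self, preferences, off_peak_hours=range(1, 6)):
        # preferences: list of (prefix, score); higher score = more preferred
        self.preferences = [(ipaddress.ip_network(prefix), score)
                            for prefix, score in preferences]
        self.off_peak_hours = off_peak_hours

    def score(self, peer_ip):
        addr = ipaddress.ip_address(peer_ip)
        for net, score in self.preferences:
            if addr.version == net.version and addr in net:
                return score
        return 0                      # unknown peers get the lowest preference

    def rank_peers(self, candidate_peers):
        # Best peers first, according to ISP-configured policy.
        return sorted(candidate_peers, key=self.score, reverse=True)

    def allowed_rate_kbps(self):
        # Open up the throttle off-peak, as described above.
        return 8000 if datetime.now().hour in self.off_peak_hours else 1000

# Example policy: prefer the ISP's data-centre caching clients, then other
# on-net customers, then peers reached over settlement-free peering.
guru = TopologyGuru([("198.51.100.0/24", 100),   # data-centre caches (made up)
                     ("203.0.113.0/24", 50),     # same-ISP customer pool (made up)
                     ("192.0.2.0/24", 20)])      # reachable via peering (made up)
print(guru.rank_peers(["192.0.2.7", "203.0.113.9", "198.51.100.3", "8.8.8.8"]))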

Then a BitTorrent client doesn't have to guess how to get files quickly; it just has to follow the guru's instructions. Part of this would involve cooperating with all other clients attached to the same guru, so that no client downloads distant blocks of data that have already been downloaded by another local client. This is the part that really starts to look like IP multicast, except that it doesn't rely on all clients functioning in real time. It also looks like NNTP news servers, except that the caching is all done on the clients. The gurus never cache or download files.

For this to work, you need to start by getting several ISPs to buy in, help with the design work, and then deploy the gurus. Once this proves itself in terms of managing how and *WHEN* bandwidth is used, it should catch on quite quickly with ISPs. Note that a key part of this architecture is that it allows the ISP to open up the throttle on downloads during off-peak hours, so that most end users can get a predictable service of all downloads completed overnight.

--Michael Dillon

While the current BitTorrent implementation is suboptimal for large swarms (where the number of adjacent peers is significantly smaller than the number of total participants), I fail to see the mathematics by which topology information would bring superior results compared to the usual greedy algorithms, where data is requested from the peers from which it seems to be flowing at the best rates. If local peers with sufficient upstream bandwidth exist, the majority of the data blocks are already retrieved from them.

In many locales, ISPs tend to limit the available upstream on their consumer connections, usually causing more distant bits to be delivered instead.

I think the most important metric to study is the number of times the same piece of data is transmitted in a defined time period, and to figure out how to optimize for that. For a new episode of BSG, there are a few hundred thousand copies in the first hour and a million or so in the first few days. With the headers and overhead, we might already be hitting a petabyte per episode. RSS feeds seem to shorten the distribution ramp-up from release.
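A quick back-of-the-envelope check of that figure, assuming episode torrents in the 350 MB to 1.1 GB range and roughly 5% overhead (both assumptions):

# Back-of-the-envelope check of the "petabyte per episode" figure.
# Episode sizes and the overhead factor are assumptions for illustration.
copies = 1_000_000                      # "a million or so in the first few days"
overhead = 1.05                         # ~5% protocol/header overhead (assumed)
for label, size_gb in (("SD rip", 0.35), ("HD rip", 1.1)):
    total_pb = copies * size_gb * overhead / 1_000_000
    print(f"{label} ({size_gb} GB) x {copies:,} copies -> ~{total_pb:.2f} PB")

An HD-sized episode does indeed land around a petabyte for a million copies.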

The P2P world needs more high-upstream "proxies" to make it more effective. I think locality with current torrent implementations would happen automatically. However, there are quite a few parties who are happy to have it as bad as they can make it :-)

Is there a problem that needs to be solved that is not already solved by the Akamais of the world?

Pete

<snip>

Isn't TCP already measuring throughput and latency of the network for RTO etc.? Why not expose those parameters for peers to the local P2P software, and then have it select the closest peers with either the lowest latency, the highest throughput, or a weighted combination of both? I'd think that would create a lot of locality in the traffic.
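A minimal sketch of that weighting, assuming the client already tracks smoothed RTT and observed throughput per peer (the weights and normalisation below are invented for illustration):

# Sketch of a "weighted combination" peer ranking. The scoring constants
# are placeholders, not a recommendation.
def peer_score(rtt_ms, throughput_kbps, w_latency=0.4, w_throughput=0.6):
    # Normalise so that lower RTT and higher throughput both raise the score.
    latency_score = 1.0 / (1.0 + rtt_ms / 100.0)            # ~1.0 at 0 ms, 0.5 at 100 ms
    throughput_score = throughput_kbps / (throughput_kbps + 1000.0)
    return w_latency * latency_score + w_throughput * throughput_score

measurements = {            # peer -> (smoothed RTT in ms, observed kbit/s), assumed inputs
    "peer-local": (8, 900),
    "peer-national": (35, 2500),
    "peer-overseas": (180, 3000),
}
ranked = sorted(measurements,
                key=lambda p: peer_score(*measurements[p]),
                reverse=True)
print(ranked)

Note that with a throughput-heavy weighting the fast distant peer can still outrank the slow local one, which leads straight into the cost problem described below.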

Regards,
Mark.

This is where you hit a serious problem. If you implemented that in a
client, it could be much worse than naive P2P for quite a lot of networks -
for example all the UK ISPs. If you have a bitstream/IPStream architecture,
your bits get hauled from local aggregation sites to your routers via L2TP
and you get billed by the telco for them; now, if you strictly localise P2P
traffic, all the localised bits will be transiting the bitstream sector
TWICE, drastically increasing your costs.

(Assumption: your upstream costs are made up of X amount of wholesale
transit+Y amount of peering, unlike your telco costs which in this case are
100% transit-like and paid for by the bit.)
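A rough cost sketch of that point, with invented per-gigabyte prices (only the ratios matter): locally exchanged traffic crosses the billed bitstream segment twice, while an external fetch crosses it once plus transit or peering.

# Rough cost sketch of the bitstream/IPStream point above. All prices are
# invented placeholders.
BITSTREAM_PER_GB = 0.12   # paid to the telco per GB hauled over L2TP (assumed)
TRANSIT_PER_GB   = 0.05   # wholesale IP transit per GB (assumed)
PEERING_PER_GB   = 0.00   # settlement-free peering (assumed)

def cost_local_exchange(gb):
    # Seeder -> aggregation -> ISP router -> aggregation -> leecher:
    # the billed bitstream segment is crossed twice.
    return 2 * BITSTREAM_PER_GB * gb

def cost_external_fetch(gb, upstream_per_gb=TRANSIT_PER_GB):
    # External peer -> ISP (transit or peering) -> bitstream -> leecher.
    return (upstream_per_gb + BITSTREAM_PER_GB) * gb

gb = 1000
print(f"local exchange:    {cost_local_exchange(gb):.2f}")                    # 240.00
print(f"fetch via transit: {cost_external_fetch(gb):.2f}")                    # 170.00
print(f"fetch via peering: {cost_external_fetch(gb, PEERING_PER_GB):.2f}")    # 120.00

Under these assumptions, strict localisation is the most expensive option for the ISP, which is the point being made.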

Things also vary depending on the wholesale transit and peering market; for
example, someone like a customer of CityLink in Wellington, NZ would be
intensely relaxed about local traffic on the big optical ethernet pipes, but
very keen indeed to save on international transit due to the highly
constrained cable infrastructure. But if you were, say, a Dutch DSL operator
with incumbent backhaul, you might want to actively encourage P2Pers to
fetch from external peers because international peering at AMS-IX is
abundant.

Basically, it's bringing traffic engineering inside the access network.

Alex

I fail to figure
out the necessary mathematics where topology information
would bring superior results compared to the usual greedy
algorithms where data is requested from the peers where it
seems to be flowing at the best rates. If local peers with
sufficient upstream bandwidth exist, majority of the data
blocks are already retrieved from them.

First, it's not a mathematical issue. It is a network operational issue, where ISPs have bandwidth caps and enforce them by traffic shaping when thresholds are exceeded. Second, there are cases where it is not in the ISP's best interest for P2P clients to retrieve files from the client with the lowest RTT.

In many locales ISP's tend to limit the available upstream on
their consumer connections, usually causing more distant bits
to be delivered instead.

Yep, it's a game of whack-a-mole.

I think the most important metric to study is the number of
times the same piece of data is transmitted in a defined time
period and try to figure out how to optimize for that.

Or P2P developers could stop fighting ISPs and treating the Internet
as an amorphous cloud, and build something that will be optimal for
the ISPs, the end users, and the network infrastructure.

The p2p world needs more high-upstream "proxies" to make it
more effective.

That is essentially a cache, just like NNTP news servers or Squid web proxies. But rather than making a special P2P client that caches and proxies and fiddles with stuff, why not take all the network intelligence code out of the client and put it into a topology guru that runs in your local ISP's high-upstream infrastructure? Chances are that many ISPs would put a few P2P caching clients in the same rack as this guru if it pays them to take traffic off one direction of the last mile, or if it pays them to ensure that files hang around locally longer than they do naturally, thus saving on their upstream/peering traffic.

Is there a problem that needs to be solved that is not solved
by Akamai's of the world already?

Akamai is a commercial service that content senders can contract with to achieve the same type of multicasting (called a Content Delivery Network) as a P2P network provides to end users. ISPs don't provide Akamai service to their hosting customers, but they do provide those customers with web service, mail service, FTP service, etc. I am suggesting that there is a way for ISPs to provide a generic BitTorrent P2P service to any customer who wants to send content (or receive content). It would allow heavy P2P users to evade the crude traffic shaping which tends to be off on the 1st day of the month, then gets turned on at a threshold and stays on until the end of the month. Most ISPs can afford to let users take all they can eat during non-peak hours without congesting the network. Even an Australian ISP could use this type of system, because they would only open local peering connections during off-peak, not the expensive trans-oceanic links. This all hinges on a cooperative P2P client that only downloads from sites (or address ranges) which the local topology guru directs them to. Presumably the crude traffic shaping systems that cap bandwidth would still remain in place for non-cooperating P2P clients.

--Michael Dillon

The good news about a DillTorrent solution is that at least the user and ISP
interests are aligned; there's no reason for the ISP to have the guru lie to
the users (because you just know someone'll try it). However, it does
require considerable trust from the users that it will actually lead to a
better experience, rather than just cost-saving at their expense.

And as with any client-side solution, if you can write a client that listens to it and behaves differently, you can write one that pretends to listen :-)

Alex

  Isn't TCP already measuring throughput and latency of the network for RTO etc.? Why not expose those parameters for peers to the local P2P

This is where you hit a serious problem. If you implemented
that in a client, it could be much worse than naive P2P for
quite a lot of networks - for example all the UK ISPs. If you
have a bitstream/IPStream architecture, your bits get hauled
from local aggregation sites to your routers via L2TP and you
get billed by the telco for them; now, if you strictly
localise P2P traffic, all the localised bits will be
transiting the bitstream sector TWICE, drastically increasing
your costs.

This is where no amount of algorithmic tinkering with the P2P software can solve the problem. You need a way to insert non-technical information about the network into the decision-making process. The only way for this to work is to allow the network operator to have a role in every P2P transaction, and to do that you need a middlebox that sits in the ISP network which they can configure.
In the scenario above, I would expect the network operator to ban
connections to their DSL address block. Instead, they would put
some P2P clients in the rack with the topology guru middlebox
and direct the transactions there. Or to peers/upstreams. And
the network operator would manage all the block retrieval requests
from the P2P clients in order to achieve both traffic shaping (rate
limiting) and to ensure that multiple local clients cooperate in
retrieving unique blocks from the file to reduce total traffic from
upstreams/peers.

Basically, it's bringing traffic engineering inside the
access network.

Actually, it's bringing traffic engineering into the P2P service, which is where the problem exists. Or bringing the network operator into the P2P service, rather than leaving the netop as a reactive outsider.

--Michael Dillon

You could probably do this with a variant of DNS. Use an anycast address common to everyone to solve the discovery problem. The client sends a DNS request for a TXT record for, as an example, 148.165.32.217.p2ptopology.org. The topology box looks at the IP address that the request came from, does some magic based on the requested information, and returns a ranking score (maybe 0-255, worst to best) that the client can then use to rank where it downloads from. (You might have to run DNS on another port so that normal resolvers don't capture this.)
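A minimal sketch of that lookup using dnspython (2.x); the p2ptopology.org zone, the anycast address, the alternate port, and the 0-255 TXT payload are hypothetical, taken straight from the example above.

# Sketch of the DNS-based ranking lookup described above (dnspython >= 2.0).
# The zone, anycast address and score format are hypothetical.
import dns.exception
import dns.resolver

def topology_score(peer_ip, guru_ip="192.0.2.53", port=53, zone="p2ptopology.org"):
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [guru_ip]  # the well-known anycast topology box (placeholder)
    resolver.port = port              # "might have to run DNS on another port"
    qname = f"{peer_ip}.{zone}"       # e.g. 148.165.32.217.p2ptopology.org
    try:
        answer = resolver.resolve(qname, "TXT")
        # Expect a single TXT string holding "0".."255", worst to best.
        return int(answer[0].strings[0].decode())
    except (dns.exception.DNSException, ValueError):
        return 0                      # unknown or unreachable peers rank worst

def rank_download_sources(peer_ips):
    # Best-scored peers first.
    return sorted(peer_ips, key=topology_score, reverse=True)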

The great thing is that you can use it for other things.

MMC

Don't know about the word "ban"; what we need is more like BGP than DRM. Ideally, we want the clients to do sensible things because it works best, not because they are being coerced. Further, once you start banning things you get into all kinds of problems, not least that interests are no longer aligned and trust is violated.

If DillTorrent is working well with a localpref metric of -1 (where 0 is the free-running condition, with neither local nor distant preference), there shouldn't be any traffic within the DSL pool anyway, without coercion.

There is obvious synergy with CDNs here.

Alex

(I know, replying to your own email is sad ...)

You could probably do this with a variant of DNS. Use an anycast address common to everyone to solve the discovery problem. The client sends a DNS request for a TXT record for, as an example, 148.165.32.217.p2ptopology.org. The topology box looks at the IP address that the request came from, does some magic based on the requested information, and returns a ranking score (maybe 0-255, worst to best) that the client can then use to rank where it downloads from. (You might have to run DNS on another port so that normal resolvers don't capture this.)

The great thing is that you can use it for other things.
  

Since this could be dynamic (I'm guessing BGP and other things like SNMP feeding the topology box), you could then use it to balance traffic flows through your network to avoid congestion on certain links - that's a win for everyone. You could get web browsers to look at it when you've got multiple A records, to choose which one is best for things like Flash video etc.

MMC

NCAP = Network Capability (or Cost) Announcement Protocol

SNSP = Simple Network Selection Protocol

Alexander Harrowell wrote:

a message of 46 lines which said:

This is where all the algorithmic tinkering of the P2P software
cannot solve the problem. You need a way to insert non-technical
information about the network into the decision-making process.

It's strange that no one in this thread has mentioned P4P yet. Isn't there someone involved in P4P at NANOG?

http://www.dcia.info/activities/p4pwg/

IMHO, the biggest issue with P4P is the one mentioned by Alexander Harrowell. After users have been s.....d so many times by some ISPs, will they trust this service?

Personally I consider P4P a big step forward; it's good to see Big Verizon
engaging with these issues in a non-coercive fashion.

Just to braindump a moment: it strikes me that it would be very useful to be able to announce preference metrics by netblock (for example, to deal with networks with varied internal cost metrics, or to pref-in the CDN servers), but also risky. If that were done, client developers would be well advised to implement a check that the announcing network actually owns the netblock it is either preffing in (to send traffic via a suboptimal route, through a spook box of some kind, or onto someone else's pain-point) or preffing out (to restrict traffic from reaching somewhere); you wouldn't want a hijack, whether malicious or clue-deficient.
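A sketch of that sanity check, assuming the client has some trusted list of prefixes the announcing network legitimately originates (a stand-in for real IRR/RPKI-style validation data; the prefixes are documentation ranges):

# Sketch of the "does the announcing network actually own that netblock?"
# check. The trusted prefix list is a stand-in for real validation data.
import ipaddress

def announcement_is_plausible(announced_prefix, owned_prefixes):
    # Reject malformed announcements outright.
    try:
        announced = ipaddress.ip_network(announced_prefix)
    except ValueError:
        return False
    for owned in owned_prefixes:
        net = ipaddress.ip_network(owned)
        if announced.version == net.version and announced.subnet_of(net):
            return True
    return False

# Accept a pref-in for the ISP's own CDN pool; reject one aimed at
# somebody else's address space.
owned = ["203.0.113.0/24", "2001:db8::/32"]
print(announcement_is_plausible("203.0.113.128/25", owned))  # True
print(announcement_is_plausible("198.51.100.0/24", owned))   # False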

There is every reason to encourage the use of dynamic preference.

You can think of the scheduling process as two independent problems:

1. Given a list of all the chunks that all the peers you're connected to have, select the chunks you think will help you complete the fastest.
2. Given a list of all peers in a cloud, select the peers you think will help you complete the fastest.

Traditionally, peer scheduling (#2) has been to just connect to
everyone you see and let network bottlenecks drive you toward
efficiency, as you pointed out.

However, as your chunk scheduling becomes more effective, it usually
becomes more expensive. At some point, its increasing complexity will
reverse the trend and start slowing down copies, as real-world clients
begin to block making chunk requests waiting for CPU to make
scheduling decisions.

A more selective peer scheduler would allow you to reduce the inputs
into the chunk scheduler (allowing it to do more complex things with
the same cost). The idea is, doing more math on the best data will
yield better overall results than doing less math on the best + the
worse data, with the assumption that a good peer scheduler will help
you find the best data.
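A compressed sketch of that two-stage split (the scoring heuristics and data shapes are placeholders, not anyone's real algorithm): trim the peer set first, then spend the chunk scheduler's CPU only on the survivors.

# Illustrative two-stage scheduler: peer selection (the text's problem 2)
# feeds a smaller input into chunk selection (the text's problem 1).
from collections import Counter

def schedule(peers, have, max_peers=30, max_requests=10):
    # Problem 2 (peer scheduling): keep only the most promising peers,
    # here simply by measured rate; a topology score could feed in too.
    best_peers = sorted(peers.values(),
                        key=lambda p: p["rate_kbps"],
                        reverse=True)[:max_peers]

    # Problem 1 (chunk scheduling): rarest-first among chunks we still need,
    # counting availability only across the trimmed peer set.
    availability = Counter()
    for p in best_peers:
        availability.update(p["chunks"] - have)
    wanted = sorted(availability, key=lambda c: availability[c])[:max_requests]

    # Ask the fastest surviving peer that has each wanted chunk.
    requests = []
    for chunk in wanted:
        for p in best_peers:
            if chunk in p["chunks"]:
                requests.append((p["id"], chunk))
                break
    return requests

peers = {
    "a": {"id": "a", "rate_kbps": 900, "chunks": {1, 2, 3}},
    "b": {"id": "b", "rate_kbps": 300, "chunks": {3, 4}},
}
print(schedule(peers, have={1}))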

As seems to be a trend, Michael appears to be fixated on a specific implementation, and may end up driving many observers into thinking this idea is annoying :-) However, there is a mathematical basis for including topology (and other nontraditional) information in scheduling decisions.

But there is another way. That is for software developers to build a modified client that depends on a topology guru for information on the network topology. This topology guru would be some software that is run

number of total participants) I fail to figure out the necessary mathematics where topology information would bring superior results compared to the usual greedy algorithms where data is requested from the peers where it seems to be flowing at the best rates. If local peers with sufficient upstream bandwidth exist, majority of the data blocks are already retrieved from them.

It's true that in the long run p2p transfers can optimize data sources
by measuring actual throughput, but at any given moment this approach
can only optimize within the set of known peers. The problem is that
for large swarms, any given peer only knows about a very small subset
of available peers, so it may take a long time to discover the best
peers. This means (IMO) that starting with good peers instead of
random peers can make a big difference in p2p performance, as well as
reducing data delivery costs to the ISP.

For example, let's consider a downloader in a swarm of 100,000 peers,
using a BitTorrent announce once a minute that returns 40 peers. Of
course, this is a simple case, but it should be sufficient to make the
general point that the selection of which peers you connect to matters.

Let's look at the odds that you'll find out about the closest peer (in
network terms) over time.

With random peer assignment, the odds of any random peer being the
closest peer is 40/100,000, and if you do the math, the odds of
finding the closest peer on the first announce is 1.58%. Multiplying
that out, it means that you'll have a 38.1% chance of finding the
closest peer in the first half hour, and a 61.7% chance in the first
hour, and 85.3% chance in the first two hours, and so on out as a
geometric curve.
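Those figures can be reproduced in a couple of lines; the per-announce probability below is 1 - (1 - 40/100,000)^40, i.e. the chance that a 40-peer announce contains at least one of the ~40 nearest peers, which is the reading that matches the percentages quoted.

# Reproduces the percentages above, under the "one announce per minute"
# assumption stated in the example.
swarm, returned = 100_000, 40
p_announce = 1 - (1 - returned / swarm) ** returned
print(f"first announce: {p_announce:.2%}")          # ~1.58%
for minutes in (30, 60, 120):
    p = 1 - (1 - p_announce) ** minutes             # one announce per minute
    print(f"within {minutes:>3} min: {p:.1%}")      # ~38%, ~62%, ~85%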

In the real world there are factors that complicate the analysis (e.g.
most Trackers announce much less often than 1/minute, but some peers
have other discovery mechanisms such as Peer Exchange). But as far as
I can tell, the basic issue (that it takes a long time to find out
about and test data exchanges with all of the peers in a large swarm)
still holds.

With P4P, you find out about the closest peers on the first announce.

There's a second issue that I think is relevant, which is that
measured network throughput may not reflect ISP costs and business
policies. For example, a downloader might get data from a fast peer
through a trans-atlantic pipe, but the ISP would really rather have
that user get data from a fast peer on their local loop instead. This
won't happen unless the p2p network knows about (and makes decisions
based on) network topology.

What we found in our first field test was that random peer assignment moved 98% of data between ISPs and only 2% within ISPs (and for smaller ISPs, more like 0.1%), and that even simple network awareness resulted in an average of 34% same-ISP data transfers (i.e. a drop of roughly 32 percentage points in external transit). With ISP involvement, the numbers are even better.

You can think of the scheduling process as two independent problems:

1. Given a list of all the chunks that all the peers you're connected to have, select the chunks you think will help you complete the fastest.
2. Given a list of all peers in a cloud, select the peers you think will help you complete the fastest.

Traditionally, peer scheduling (#2) has been to just connect to
everyone you see and let network bottlenecks drive you toward
efficiency, as you pointed out.

However, as your chunk scheduling becomes more effective, it usually
becomes more expensive. At some point, its increasing complexity will
reverse the trend and start slowing down copies, as real-world clients
begin to block making chunk requests waiting for CPU to make
scheduling decisions.

A more selective peer scheduler would allow you to reduce the inputs
into the chunk scheduler (allowing it to do more complex things with
the same cost). The idea is, doing more math on the best data will
yield better overall results than doing less math on the best + the
worse data, with the assumption that a good peer scheduler will help
you find the best data.

Interesting approach. IMO, given modern computers, CPU is highly underutilized (PCs are 80% idle, and rarely CPU-bound when in use), while bandwidth is relatively scarce, so using more CPU to optimize bandwidth usage seems like a great tradeoff!

As seems to be a trend, Michael appears to be fixated on a specific implementation, and may end up driving many observers into thinking this idea is annoying :-) However, there is a mathematical basis for including topology (and other nontraditional) information in scheduling decisions.

_______________________________________________
NANOG mailing list
NANOG@nanog.org
http://mailman.nanog.org/mailman/listinfo/nanog

Laird Popkin
CTO, Pando Networks
520 Broadway, 10th floor
New York, NY 10012

laird@pando.com
c) 646/465-0570

However, as your chunk scheduling becomes more effective, it usually becomes more expensive. At some point, its increasing complexity will reverse the trend and start slowing down copies, as real-world clients begin to block making chunk requests waiting for CPU to make scheduling decisions.

This is not a bad thing. The intent is to optimize the whole system, not to provide the fastest copies. Those who promote QoS often talk of some kind of scavenger level of service that sweeps up any available bandwidth after all the important users have gotten their fill. I see this type of P2P system in a similar light, i.e. it allows the ISP to permit as much bandwidth use as is economically feasible and block the rest. Since the end user ultimately relies on the ISP having a stable network that functions in the long term (and does not drive the ISP to bankruptcy), this seems to be a reasonable tradeoff.

As seems to be a trend, Michael appears to be fixated on a specific implementation, and may end up driving many observers into thinking this idea is annoying :-) However, there is a mathematical basis for including topology (and other nontraditional) information in scheduling decisions.

There is also precedent for this in manufacturing scheduling, where you optimize the total system by identifying the prime bottleneck and carefully managing that single point in the chain of operations. I'm not hung up on a specific implementation, just trying to present a concrete example that could be a starting point. And until today, I knew nothing about the P4P effort, which seems to be working in the same direction.

--Michael Dillon

In case anyone's curious, there's more info on P4P at http://cs-www.cs.yale.edu/homes/yong/p4p/index.html.

- Laird Popkin, CTO, Pando Networks
  mobile: 646/465-0570