Internet diameter?

Hi folks,

Does anybody have more or less recent data on the average, median and
maximum diameter (ip hop count) of the Internet? My google fu is
failing me: I've only found stuff from the '90s.

Thanks,
Bill Herrin

Hi William,

We don’t have the number sitting around, but you can get a pretty good feel by clicking
through a few of the ark monitors (http://www.caida.org/projects/ark/locations). Click on “data”
to the right of each monitor.

Bradley

Considering 40% of the “internet” is sitting in my backyard in CDN caching, I’d say the perceived diameter for that content is... 3 or 4 hops. ;)

...but something tells me that isn’t the response you were seeking...

... but seriously, it is interesting that, with local caching, that much of the Internet is now sitting locally in the subscriber’s ISP.

Aaron

I’d argue that’s just content (though admittedly a lot of it). You can’t cache, e.g., a SIP trunk, and offices which need to connect to each other can’t cache one another in a CDN either.

I would further argue that you can't cache active Web content, like bank
account statements, utility billing, help desk request/responses,
equipment status, and other things that change constantly.

Yes I agree Ross/Stephen. I didn’t mean to overstate the CDN fact.

I wonder what the answer to Bill’s question is: “average, median and maximum diameter (ip hop count) of the Internet?”

Aaron

42

now, why does it matter?

I don’t have any hard statistics, but I notice that for a majority of ASes on bgp.he.net, the average AS path length is between 4 and 5. As for the average number of hops, it clearly depends on the type of traffic, and many ASNs have more than one router. Going on my own experience, I would say between 8 and 10 hops would be the average for non-cached content. If you included cached content such as CDNs and caches, then the actual average might be closer to 5 to 7. This is only an estimate from my own network and those of my clients, so the actual value may be completely different.

As with what others have said, I’m not sure on what use this data, if collected, would be. Latency is the most important.

Obligatory XKCD:

https://xkcd.com/908/

-Andy

This begs the question, which is more meaningful a metric, AS-path or hop count? Many networks have a large number of routers but the packets don’t stay in them very long.

-Ben.

> I’m not sure on what use this data, if collected, would be. Latency is the most important.
It’s not operationally useful in any way that I can think of, but it is interesting (at least to me). It’s possible that Bill has something in mind, though.

> […] which is more meaningful a metric, AS-path or hop count?
Good point. Before we can decide which is more useful, we have to decide what they would be useful for.
But, I think it’s not really feasible to analyze hop count, because you would have to collect that data with a huge number of traceroutes. Average/median AS-path length can be estimated by static analysis of BGP tables from various routers.
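
For what it's worth, the static analysis mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the paths are made-up strings built from documentation ASNs (RFC 5398), not real BGP data, the helper names are hypothetical, and AS_SETs are not handled. Prepending is collapsed so that a repeated ASN counts once.

```python
from statistics import mean, median


def as_path_length(path: str) -> int:
    """Count ASNs in a space-separated AS path, collapsing consecutive repeats (prepends)."""
    collapsed = []
    for asn in path.split():
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    return len(collapsed)


def summarize(paths):
    """Return mean, median and maximum AS-path length for an iterable of path strings."""
    lengths = [as_path_length(p) for p in paths]
    return {"mean": mean(lengths), "median": median(lengths), "max": max(lengths)}


# Made-up sample paths (documentation ASNs only); a real run would read a table dump.
SAMPLE_PATHS = [
    "64496 64497 64498",
    "64496 64499 64499 64499 64500",  # prepended three times, counts as 3 ASNs
    "64496 64497 64501 64502 64503",
    "64496 64504",
]

if __name__ == "__main__":
    print(summarize(SAMPLE_PATHS))
```

Whether prepends should be collapsed depends on what the number is meant to describe: collapsing gives the number of networks crossed, while the raw length is what BGP best-path selection actually compares.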

> Many networks have a large number of routers but the packets don’t stay in them very long.
It wouldn’t be a very good router if the packets hung around for a long time before leaving :)

Good question! It matters because a little over two decades ago we had
some angst as equipment configured to emit a TTL of 32 stopped being
able to reach everybody. Today we have a lot of equipment configured
to emit a TTL of 64. It's the default in Linux, for example. Are we
getting close to the limit where that will cause problems? How close?
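
To make the arithmetic behind that question concrete, here is a minimal sketch. It assumes the usual traceroute convention that a destination at hop N is reached by a packet sent with an initial TTL of at least N. The hop counts are illustrative values, not measurements, and the function names are hypothetical.

```python
def headroom(initial_ttl: int, hop_count: int) -> int:
    """Hops to spare when a packet sent with initial_ttl travels hop_count hops."""
    return initial_ttl - hop_count


def reachable(initial_ttl: int, hop_count: int) -> bool:
    """True if the TTL does not run out before the destination at hop hop_count."""
    return headroom(initial_ttl, hop_count) >= 0


# Illustrative path lengths only; plug in measured values to get a real answer.
ILLUSTRATIVE_HOP_COUNTS = [5, 10, 22, 40]

if __name__ == "__main__":
    for ttl in (32, 64):
        for hops in ILLUSTRATIVE_HOP_COUNTS:
            status = "ok" if reachable(ttl, hops) else "expires in transit"
            print(f"initial TTL {ttl}, {hops} hops: headroom {headroom(ttl, hops)} ({status})")
```

The interesting input is the maximum hop count rather than the average, since the longest paths are the ones that run out of TTL first.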

Regards,
Bill Herrin

^
This

If it’s hop-count that’s interesting, I think that raises a question on the potential for a sudden large change in the answer, potentially with unforeseen consequences if we do have a lot of devices with TTL=64.

Imagine a “tier-1” carrying some non-trivial fraction of Internet traffic who is label-switching global table, with no TTL-propagation into MPLS, and so looks like a single layer-3 hop today. In response to traceroute-whingeing, they turn on TTL-propagation, and suddenly look like 10 layer-3 hops.

Having been in the show/hide MPLS hops internal debate at more than one employer, I’d expect flipping the switch to “show” to generate a certain support load from people complaining that they are now “more hops” away from something they care about (although RTT, packet-loss, throughput remain exactly the same). I wouldn’t have expected to break connectivity for a whole class of devices.
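
A back-of-the-envelope sketch of that scenario, taking the "one hop versus ten" figures above at face value. This is a deliberately simplified model with hypothetical names and numbers; real MPLS TTL handling depends on platform and configuration.

```python
def visible_hops(hops_outside_core: int, core_routers: int, ttl_propagation: bool) -> int:
    """Layer-3 hops seen end to end.

    Simplifying assumption: with TTL propagation off, the whole label-switched
    core looks like a single hop; with it on, every core router is counted.
    """
    return hops_outside_core + (core_routers if ttl_propagation else 1)


def remaining_ttl(initial_ttl: int, hops: int) -> int:
    """TTL budget left on arrival; a negative value means the packet expired in transit."""
    return initial_ttl - hops


if __name__ == "__main__":
    # Hypothetical path: 12 hops outside the carrier, 10 routers inside it.
    for propagate in (False, True):
        hops = visible_hops(12, 10, propagate)
        print(f"propagation={propagate}: {hops} hops, "
              f"{remaining_ttl(64, hops)} left from an initial TTL of 64")
```

In this model, flipping the switch adds nine hops to every path through the carrier at once, which only matters for connectivity if some paths were already within nine hops of the sender's initial TTL.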

Regards,

Tim.

Hello,

> Does anybody have more or less recent data on the average, median and
> maximum diameter (ip hop count) of the Internet? My google fu is
> failing me: I've only found stuff from the '90s.

In the past 2 years of running `traceroute` towards YouTube from
home networks, the maximum IP path length we have seen is around 22 IP hops.

For details, see:
https://vaibhavbajpai.com/documents/papers/proceedings/youtube-traceroutes-commag-2018.pdf

> Thanks,
> Bill Herrin

PS: This (path lengths) is only towards YouTube destinations.

-- Vaibhav

Hi,

>> Does anybody have more or less recent data on the average, median and maximum diameter (ip hop count) of the Internet?

First, to give some hints regarding the initial question: a year ago I did some analysis based on CAIDA’s routed /24 topology data set, using data from the beginning of Jan. 2015. It does not use all available data, only traceroutes that reached the destination. The attached figure shows violin plots for each day; vertically, they show the distribution of hop counts (it looks a little odd because hop counts are integers). However, please keep in mind:

i.) the data set is collected from only a few physical locations but has traceroutes towards every routed /24 prefix.
ii.) only a subset of the entire data set is shown.
iii.) it’s from 2015.
iv.) there is a good chance that it is not representative of the “entire” Internet.

Secondly, regarding the ongoing discussion: +1 for Tim’s answer. IMHO, neither AS paths nor IP paths are, in general, reliable proxies for e.g. latency or physical distance. In addition, keep in mind that we are only able to observe a certain part of the Internet, and thus it’s hard to make claims about the “entire” Internet.

best regards,
Lars

> I'd argue that's just content (though admittedly a lot of it).

"just static content" would be more accurate ...

> I would further argue that you can't cache active Web content, like
> bank account statements, utility billing, help desk request/responses,
> equipment status, and other things that change constantly.

There were many attempts at this by Johnny-come-lately ISPs back in the '90s -- particularly Telcos and Cablecos -- with their "transparent proxies". Eventually they discovered that it was more cost efficient to actually provide the customer with what the customer had purchased.

It is indeed hard to say how useful it is to know hop counts when a large fraction of IXP members are remote and plenty of content is cached, but that question was bugging me too and I have been looking into it. From what we could see, it has been pretty stable at around 5 hops.

https://arxiv.org/abs/1810.10963

Disclaimer: this is ongoing work. Feedback welcome!

“Eventually they discovered that it was more cost efficient to actually provide the customer with what the customer had purchased.”

Sometimes yes, sometimes no. Big content has been making this more complicated.

To get back to the original question regarding the "diameter" of the Internet, it would appear to me that we are easily looking at about 30 to 40 hops just within North America -- and easily double that to reach the rest of the Internet outside of North America. Of course, the "Top 5 Channels" are probably only a few hops away due to CDNs, but this is for the most part irrelevant (unless one only wants to watch the Top 5 channels) ...