Traffic locality and other questions

Traffic locality has long been a concern of mine, since one would expect
local public library traffic to be dominated by local users. What's
interesting is that over the last five years the network topology has
become less local. Once upon a time, almost every site within a
geographic region 'bought' service from a single mid-level provider.
MIDNET served the Midwest, MERIT served Michigan, SURANET served the
Southeast, BARRNET served the San Francisco Bay Area, and so on. The
concept of the regional provider fell out of favor as MCI, Sprint and
UUNET entered the marketplace with significantly lower charges than the
regional providers.

Now network topology, like the airline industry, favors tail circuits
feeding into large hubs. With facilities-based providers it's not
uncommon to see a >1,000-mile back-haul for the local loop. Even if both
end-users bought service from the same provider, they may end up being
served from hubs thousands of miles apart. For example, it's not
uncommon for St. Louis customers to be served from Chicago, Kansas City,
or Fort Worth.

This is most pronounced in the individual user marketplace. Even if the
individual user dials a local phone number, often their connection is
hauled to someplace like Northern Virginia (AOL) before heading out on
the Internet. So the best performance may come from a provider with
connections closer to Northern Virginia than to some place near the user's
dialup.

What's also interesting is the asymmetry between traffic generators and
traffic consumers. Although DRA has an even mix of inbound and outbound
traffic overall, I see tremendous imbalances with individual providers.
But it is very difficult to tell who is getting a 'free ride' from whom.
Other than SYN-flood attacks and the dreaded default route, traffic only
flows if there is some kind of customer demand on both sides.

I have some questions about whether it is better to aggregate traffic
into a single huge flow, or whether it is better to have lots of smaller
paths. Would I be better off buying a single cross-country OC3 from one
provider, or trying to do a deal for several T1s with Turner, Gannett,
or other places public libraries are interested in using? Over the last
five years, I've found that building small bypasses works better. But
maybe things have changed and I can meet end-to-end performance
requirements by other means.

I also have a question about how consolidated the traffic really is. If
I only bought circuits from the top three providers, how much of our
traffic would that really cover? What if I bought circuits from the top
10 providers? Perhaps the public library market is an aberration, but I
see much less traffic aggregation than others report. I use both
'managed' and 'un-managed' connections in both 'transit' and 'peer'
provider relationships. My business relationship with the next-hop
provider seems to have very little correlation with the traffic flows.

DRA's top 20 traffic generators/sinks by end AS, alphabetical order

   AOL, BBN, CONCERT, DIGEX, DISI, EDS, GANNETT, IACNET, ISI, JVNC, MCI,
   MERIT, NETCOM, OARNET, PSI, SPRINT, SUPERNET, TURNER, UNISYS, UUNET

The top 20 end ASs are extremely volatile. NASA Science Internet can
leap to the top for a week (e.g., during Pathfinder), IBM jumps during
the Olympics, and UK providers jumped the week following Princess
Diana's death. Since most of the long-term large traffic generators are
multi-homed among several providers, a very small network topology
change can dramatically shift traffic between network backbone
providers.

Percentage of traffic destined for the top 10 next-hop ASs. This
includes transit ASs, so some next-hop traffic may be inflated by
traffic destined to another provider via the transit provider. Since
most peering agreements prohibit the release of identifiable
information, I'm not identifying which provider goes with which
percentage.

  14.4%
  13.4%
  12.0%
   7.9%
   3.2%
   1.9%
   1.8%
   1.2%
   1.1%
   1.1%
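
Taking those ten figures at face value, a quick tally answers the
earlier question about coverage: circuits to the top three next-hop ASs
would carry roughly 40% of the traffic, and the top ten only about 58%.
A minimal Python sketch of the arithmetic (the shares are simply the
ten figures listed above):

    # Cumulative coverage of DRA's traffic by the top next-hop ASs,
    # using the ten shares listed above.
    shares = [14.4, 13.4, 12.0, 7.9, 3.2, 1.9, 1.8, 1.2, 1.1, 1.1]

    cumulative = 0.0
    for rank, share in enumerate(shares, start=1):
        cumulative += share
        print(f"top {rank:2d} next-hop ASs cover {cumulative:5.1f}% of traffic")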

BGP isn't very good at showing 'other' transit paths to networks. It's
like adding another lane to a road: traffic that didn't exist before
appears. The next-hop rankings are less volatile, but over time they do
show significant trends. In the last year, the traffic among next-hops
has become flatter at the top, and the tail has become longer. That is
the opposite of what I would expect in a market experiencing
consolidation. What I find interesting is that the rankings of traffic
flows I see don't match what the pundits rank as the largest network
providers. I don't know what that means, though.

... Once upon a time, almost every site within a geographic region
'bought' service from a single mid-level provider. MIDNET served the
Midwest, MERIT served Michigan, SURANET served the Southeast, BARRNET
served the San Francisco Bay Area, and so on. The concept of the
regional provider fell out of favor as MCI, Sprint and UUNET entered the
marketplace with significantly lower charges than the regional
providers.

This is a political and economic side effect, not a traffic issue. The
regional networks were formed back in the days of the NSFNET with the
express charter of serving constituents in a specific (although
informally drawn) geographic region, eventually covering the entire US.
There was no point in competition, although several regional networks
did compete in some areas because we were nascent commercial entities
(recall NYSERNET and NEARNET, now PSI and GTE).

The regionals offered service to all comers whether in the metro areas or
the hinterlands. The idea was to get "everyone" on the one Net asap.
Competition was beside the point. Cost recovery and managing growth on
bootstrap budgets were the prime concerns.

Today, commercial providers go where the density of prospective customers
is highest. That is why the top 20 metro areas have fifty ISPs to choose
from and the most rural areas have at most one.

The geographic density of the Internet is still fairly low, and not all
ISPs use geographically based addressing or MEDs. Therefore, the density
of exchange points is still rather low, but eventually there will be
many more exchanges. It is not a strategic thing; it's strictly
tactical.

Now network topology, like the airline industry, favors tail circuits
feeding into large hubs. With facilities-based providers it's not
uncommon to see a >1,000-mile back-haul for the local loop.

Bandwidth is nearly free for facilities-based providers. I have had
facilities-based providers quote me recurring costs of zero for
bandwidth (but only if we are speaking about their bandwidth, not
mine).

I have some questions about whether it is better to aggregate traffic
into a single huge flow, or whether it is better to have lots of smaller
paths.

Then you need to buy a ringside-seat ticket to the World Wide Wrestling
Federation match between Sean Doran, representing the hierarchical
network builders, and Mike O'Dell, representing the dense-mesh network
builders.

:-)

--Kent

Sean Donelan wrote:

In the last year, the traffic among next-hops has become flatter at the
top, and the tail has become longer. That is the opposite of what I
would expect in a market experiencing consolidation.

I have a suspicion that this is because of the limitations of the
current backbone technology. I.e., traffic is not determined by
aggregate customer demand, but rather by the capacity of the backbones'
connections to the IXPs.

If so, that would mean that ISPs with larger market share have the
poorest bit-transport service, and so boasting about having 60% of the
market or so is seriously misguided :-)

What I find interesting is that the rankings of traffic flows I see
don't match what the pundits rank as the largest network providers. I
don't know what that means, though.

"Pundit" sounds vaguely scatological for a slavic-speaking
person (and very much like Russian "pizdit", a rude word meaning
"[he] bullshits", fortified with reference to female genitalia).
I'm sure there must be some deeper meaning in that :slight_smile:

Thanks for posting the real data, Sean!

--vadim

Ghod help us all; I read that as "capacity of backhoe's connections..."

Cheers,
-- jra

Sean Donelan <SEAN@SDG.DRA.COM> writes:

I have some questions about whether it is better to aggregate traffic
into a single huge flow, or whether it is better to have lots of smaller
paths.

Ah, good question.

What seems to scale better is aggregating things into
large buckets which are switched together rather than
merely filtering lots of small individually-switched
buckets of data through single pieces of equipment.

There are some assumptions driving this. Firstly, it is
easier to move large numbers of symbols per second than it
is to make large numbers of switching decisions.
Secondly, it is possible to build a hierarchical routing
scheme that can take advantage of aggregation
opportunities to put similar traffic into big buckets
along particular path segments. Thirdly, there are
economies of scale which can be exploited when one uses
large data pipes that reduce the cost of moving traffic
intelligently.
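
To put rough numbers on the first assumption, consider how many
forwarding decisions per second a given line rate implies. This is an
illustrative Python sketch; the 300-byte average packet size is an
assumption for the example, not a figure from this discussion:

    # Per-packet switching decisions implied by a line rate: the faster
    # the pipe and the smaller the packets, the more lookups per second.
    AVG_PACKET_BYTES = 300  # assumed average packet size

    line_rates_bps = {
        "T1":      1_544_000,
        "DS3":    44_736_000,
        "OC-3":  155_520_000,
        "OC-12": 622_080_000,
    }

    for name, bps in line_rates_bps.items():
        packets_per_sec = bps / (AVG_PACKET_BYTES * 8)
        print(f"{name:>5}: ~{packets_per_sec:,.0f} switching decisions per second")

At OC-12 that is already roughly a quarter-million lookups per second,
which is the sense in which moving symbols is easier than making
per-packet decisions.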

If all of these assumptions prove to be invalid, and in particular if
it is cheaper to build equipment which is better at switching very small
amounts of data across many diverse physical paths, if a routing scheme
that can fully exploit this can be developed, and if it is more
economical to use many small pipes than a few large pipes, then
obviously one would be better off not aggregating traffic, and perhaps
even deaggregating it and its complementary reachability information.

BGP isn't very good at showing 'other' transit paths to
networks.

BGP is a distance vector protocol and not a map-exchanging
protocol. You cannot build a completely accurate map of
the Internet from a subset of BGP distance vectors.

BGP routers will also only announce their paths of choice
to their neighbours, and therefore any other paths they
may know will be hidden.
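
A toy model of that hiding behavior, in Python (the prefix and AS paths
are invented for illustration): each speaker selects a single best path
per prefix and announces only that, so a listener collecting distance
vectors never learns the alternatives.

    # A BGP speaker knows several paths to a prefix but announces only
    # its best one; the other paths stay invisible to its neighbours.
    known_paths = {
        "10.0.0.0/8": [
            (1, [701, 1239]),        # rank 1: best by local policy; announced
            (2, [3561, 174, 1239]),  # known, but never announced
        ],
    }

    def announced(paths):
        """Return only the best path per prefix, as a BGP speaker would."""
        return {prefix: min(candidates)[1] for prefix, candidates in paths.items()}

    print(announced(known_paths))   # {'10.0.0.0/8': [701, 1239]}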

What I find interesting is that the rankings of traffic flows I see
don't match what the pundits rank as the largest network providers. I
don't know what that means, though.

I suppose it depends on what people describe when they say
"largest".

  Sean.

Sean M. Doran wrote:

If all of these assumptions prove to be invalid, and in particular if
it is cheaper to build equipment which is better at switching very small
amounts of data across many diverse physical paths,

The cost of building a 1 Tbps/line single-data-path router at the
present level of technology: infinity.

Everything is cheaper than that :-)

if a routing scheme that can fully
exploit this can be developed,

There's no need for L3 routing to be aware of the multiplicity of
physical paths underneath.

and if it is more
economical to use many small pipes than a few large pipes,

For some reason I doubt it. The general rule: use the transmission
technology presently at the bottom of the price/performance curve, and
replicate it as needed to reach the desired performance level.

then obviously one would be better off not aggregating
traffic, and perhaps even deaggregating it and its
complementary reachability information.

You can have deaggregated traffic and still keep aggregated
reachability information, as long as you constrain topologies to
multiple parallel links in an otherwise small general graph. There are
no routing technologies which would scale for large general graphs, to
my knowledge.
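
A sketch of that constraint, assuming the common trick of hashing flows
across parallel links (the link names and addresses are illustrative):
the routing layer sees one aggregated next hop, while the forwarding
layer spreads flows over however many physical circuits happen to be in
the bundle.

    import hashlib

    # One edge in the routing graph, backed by several physical links.
    PARALLEL_LINKS = ["link-0", "link-1", "link-2", "link-3"]

    def pick_link(src_ip: str, dst_ip: str) -> str:
        """Hash a flow onto one parallel link; the same flow always
        lands on the same link, so its packets stay in order."""
        digest = hashlib.md5(f"{src_ip}->{dst_ip}".encode()).digest()
        return PARALLEL_LINKS[digest[0] % len(PARALLEL_LINKS)]

    print(pick_link("192.0.2.10", "198.51.100.7"))
    print(pick_link("192.0.2.11", "198.51.100.7"))

Routing advertises a single route either way; only the hashing function
needs to know how many links exist.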

--vadim