Questions about Internet Packet Losses

From: Bob Metcalfe
Is Merit's packet loss data (NetNow) credible?

The measurement needs more data points located inside other providers,
but is pretty accurate regarding links into and out of the major NAPs.

Do packet losses in the
Internet now average between 2% and 4% daily?

This is nothing new. In fact, it used to be much worse. Remember 1986?
1989? 1994? We've had pretty serious loss rates at times.

Are 30% packet losses common
during peak periods?

I've personally measured 40% pretty regularly, and 80% at times. But it
is much better now than a few years back.

Is there any evidence that Internet packet losses are
trending up or down?

Losses are on the rise again (past few months), but there was a downward
trend during 1996 for my own connections. I specifically saw MCI and
Sprint put in faster links last year, and add more private interconnects.
Improved my life immensely.

Were Merit's data correct, what would be the impact of 30% packet losses on
opening up TCP connections?

TCP will keep working. It's much harder on UDP, as the UDP
applications often aren't very good on recovery. Lots of applications
use UDP that should have used TCP. TCP is very robust.

On TCP throughput, say through a 28.8Kbps

Compared to what? I remember using TCP pretty well over 300 bps and
1200 bps modems. Even at 28.8 Kbps, the delay in the modem swamps the
delay of the net.

On Web throughput, since so many TCP connections are involved?

The Web is actually a major source of the problem. It was not very well
designed, protocol-wise. It causes rapid transient congestion.

On DNS look-ups?

Yes, DNS uses UDP, and fails a lot more often. So, I just try again.
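
To put a rough number on "I just try again", here is a minimal sketch.
It assumes each packet is lost independently with the same probability
in each direction (real losses are bursty, so this is optimistic) and
that a lookup succeeds only if both the query and the reply arrive. The
function name and the 30% figure taken from the questions above are
illustrative only.

```python
def lookup_success(loss: float, attempts: int) -> float:
    """Probability that at least one of `attempts` UDP query/reply
    exchanges completes, assuming independent per-packet loss."""
    one_try = (1.0 - loss) ** 2          # query and reply must both survive
    return 1.0 - (1.0 - one_try) ** attempts


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"{n} attempt(s): {lookup_success(0.30, n):.3f}")
```

Under these assumptions a single attempt succeeds about 49% of the
time, and four attempts succeed about 93% of the time.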

On email transport?

SMTP uses TCP, and still works great. Neither delay nor packet loss is
an issue.

How big a problem is HTTP's opening of so many TCP connections?

It is a terrible problem. There is no time for the congestion and round
trip estimation algorithms to kick in, as each connection is so short.

But, HTTP is being fixed.
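
A rough illustration of why short connections never get past startup,
assuming classic slow start: the congestion window begins at one
segment and doubles every round trip, with no losses. The 512-byte
segment size and the function name are hypothetical choices for the
sketch, not taken from any particular stack.

```python
import math


def round_trips_in_slow_start(object_bytes: int, segment_bytes: int = 512,
                              initial_window: int = 1) -> int:
    """Round trips needed to deliver an object when the window starts at
    `initial_window` segments and doubles every round trip (no losses)."""
    segments_left = math.ceil(object_bytes / segment_bytes)
    window = initial_window
    trips = 0
    while segments_left > 0:
        segments_left -= window   # send one window's worth this round trip
        window *= 2               # slow start: double the window
        trips += 1
    return trips


if __name__ == "__main__":
    print(round_trips_in_slow_start(10 * 1024))
```

With these numbers a 10 KB object takes 5 round trips, not counting the
connection handshake, and the connection closes while the window is
still ramping up.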

Does TCP
need to operate differently than it does now when confronted routinely with
30% packet losses and quarter-second transit delays?

No. It works fine. BTW, 1/4 second delays are not a problem; as I said
earlier, this is actually better than we had when we developed the code.
2 second delays were typical in modems (576 byte payloads at 2400 bps).
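
The 2 second figure can be checked with simple arithmetic. This sketch
counts only the time to clock the payload bits onto the line and
ignores framing overhead and modem compression; the function name is
illustrative.

```python
def serialization_delay(payload_bytes: int, line_bps: int) -> float:
    """Seconds needed to clock `payload_bytes` onto a line of `line_bps`."""
    return payload_bytes * 8 / line_bps


if __name__ == "__main__":
    print(serialization_delay(576, 2400))    # 2400 bps modem
    print(serialization_delay(576, 28800))   # 28.8 Kbps modem
```

576 bytes is 4608 bits, which takes 1.92 seconds at 2400 bps and 0.16
seconds at 28.8 Kbps.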

What is the proper
response of an IP-based protocol, like TCP, as packet losses climb? Try
harder or back off or what?

Exponential backoff. This was all well described by Van Jacobson nearly
a decade ago. This kind of question is what makes folks think you
haven't done your homework, Bob.
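
For readers who want to see the idea, here is a minimal sketch of a
retransmit timer in the spirit of Jacobson's work: a smoothed round
trip estimate plus a mean deviation term, with the timeout doubled on
every expiry. The class name, the gains of 1/8 and 1/4, and the 3, 1
and 64 second limits are assumptions chosen for illustration, not the
code of any particular stack.

```python
class RetransmitTimer:
    """Smoothed RTT estimate plus exponential backoff on timeout."""

    def __init__(self, initial_rto=3.0, min_rto=1.0, max_rto=64.0):
        self.srtt = None          # smoothed round-trip time, seconds
        self.rttvar = 0.0         # smoothed mean deviation, seconds
        self.rto = initial_rto    # current retransmit timeout, seconds
        self.min_rto = min_rto
        self.max_rto = max_rto

    def on_rtt_sample(self, rtt):
        """Fold in a measurement. Samples from retransmitted segments
        are ambiguous and should not be passed in (Karn's rule)."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            err = rtt - self.srtt
            self.rttvar += (abs(err) - self.rttvar) / 4
            self.srtt += err / 8
        rto = self.srtt + 4 * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, rto))

    def on_timeout(self):
        """Back off: double the timeout, up to the cap."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto
```

Each expiry doubles the wait before the next retransmission, so a
congested path sees less retransmitted traffic, not more.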

How robust are various widespread TCP/IP
implementations in the face of 30% packet loss and quarter-second transit

BSD 4.4 and derivatives work fine. Newer implementations with Sack work
a bit better.

Karn's KA9Q stack (used in many small enterprise routers and some MS-DOS
hosts) is even more robust, as it was developed for amateur radio. High
losses and high delay are typical in radio.

FTP Software's stack seems to perform very well, and they update it

MacTCP is terrible. And many early WWW servers and clients were Mac
based, so they have terrible TCP characteristics. But, they still
actually worked!

But Apple replaced MacTCP last year with Open Transport, which works
much better. Everyone should upgrade to MacOS 7.6.

I've also had some problems with Win'95, and actually removed it and
went back to Win 3.1. Since I gave up on it, I never really got to the
bottom of why it was performing so badly. As an ISP, we actually charge
more to setup folks with Win'95, while we charge nothing for Macs with
Open Transport. Amazing difference in user support costs.

I've noticed that there are rather a lot of old SunOS systems out
there. They don't have modern versions of TCP/IP, with MTU Discovery
et cetera. But they should have been upgraded years ago, and are
probably about to fall over anyway.

Is the Internet's sometimes bogging down due mainly to packet losses or
busy servers or what, or does the Internet not bog down?

Busy servers are the worst problem I've had in the past year. Much worse
(more frequent) than the packet losses.

What fraction of Internet traffic still goes through public exchange points
and therefore sees these kinds of packet losses? What fraction of Internet
traffic originates and terminates within a single ISP?

From personal experience at my clients, virtually all traffic is
distributed across multiple providers. Hardly any is local. This has
always been true of email, and is also true of WWW traffic.

But as far as I know, there is no serious data available on internal
versus external traffic of ISPs. In the ISP I partly own, most of the
traffic is external. Where dial-up users POP their email from the
server locally, the email still came via other providers to the server.

Where is the data on packet losses experienced by traffic that does not go
through public exchange points?

As far as I know, there is no data on how much traffic goes through
public versus private exchange points. But, we should encourage more
exchange points of every kind.

If 30% loss impacts are noticeable, what should be done to eliminate the
losses or reduce their impacts on Web performance and reliability?

The losses are noticeable. What _should_ be done is fairly well known.
We've been talking about it for years. I've been fairly active on the

Link speeds do not increase at the rate of Internet traffic. Merely
making the links faster is doomed to fail.

Routing performance will not increase at the rate of Internet traffic.
Merely adding links between the same places is doomed to fail.

Resource reservation on already congested links is doomed to fail.

Resource reservation on many short flows is doomed to fail.

We need providers to share faster links, such as inter-continental.
By the very nature of Internet traffic multiplexing, it is better to
share one bigger link than many smaller ones. Traffic shaping would
ensure that each provider gets its "fair share".
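
The multiplexing argument can be illustrated with a textbook queueing
model. This is a sketch under an M/M/1 assumption (Poisson arrivals,
exponentially distributed service times), which is a simplification and
not a claim about how real Internet traffic behaves. The function names
and the example rates are hypothetical.

```python
def mm1_delay(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (queueing plus transmission) for M/M/1."""
    if arrival_rate >= service_rate:
        raise ValueError("link is overloaded; the queue grows without bound")
    return 1.0 / (service_rate - arrival_rate)


def separate_vs_shared(providers: int, per_provider_load: float,
                       per_link_capacity: float):
    """Return (delay on each small link, delay on one link that is
    `providers` times as fast and carries all the traffic)."""
    separate = mm1_delay(per_provider_load, per_link_capacity)
    shared = mm1_delay(providers * per_provider_load,
                       providers * per_link_capacity)
    return separate, shared


if __name__ == "__main__":
    # Four providers, each offering 80 packets/s to a 100 packets/s link.
    print(separate_vs_shared(4, 80.0, 100.0))
```

In this model, four providers at 80% load each see a mean delay of 50
ms on their own links, but 12.5 ms on one shared link four times as
fast, at the same total cost in capacity.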

We need more exchanges, both public and private. There should be one or
more major public exchanges in every metropolitan area. Massive
parallel inter-connections. More robust in the case of link failure and
as backhoe protection. It's the only way we can scale at the rate of
Internet traffic.

Unfortunately, both these solutions require cooperation, which is in
short supply.

Are packet losses due mainly to transient queue buffer overflows of user
traffic or to discards by overburdened routing processors or something else?

Most of the packet losses I see and have verified are _link_
underprovisioning! That is, providers have sold more subscriber
connections than they can carry to other providers, and subscribers have
bought links that are too small for the amount of traffic they generate.
I've seen the provider version of the problem a lot more often than I've
seen the subscriber problem.

What does Merit mean when they say that some of these losses are
intentional because of settlement issues?

In some cases, the decision to keep the small congested link to other
providers appears to be political and deliberate.

Are ISPs cooperating
intelligently in the carriage of Internet traffic, or are ISPs competing
destructively, to the detriment of them and their customers?

Major ISPs are not cooperating very well. Regional ISPs are doing a
better job of cooperating. There are plenty of examples of both.

Assuming HTTP traffic is 50%-60% of data traffic on the Internet, how much
of a win will we get once streaming HTTP comes along?

Hank Nussbacher

Let me get this straight: Spammers are now sending millions upon
millions of unsolicited, for all intents and purposes unpaid for, junk
email &c ads daily, growing at a phenomenal rate as they perfect their
24 by 7 mass mailing robots, unchecked, and the concern here is that
relatively minor vendor implementation details in TCP stacks are
contributing to congestion on the net?

Talk about re-arranging deck chairs on the Titanic, this really is
just a morbid, shuddering case of denial, right? Hey, why build an
ark, Noah, when we can have such productive discussions about sponge

I suppose the reasoning is that the vendors we can whine at to fix
poorly chosen default MTUs, but spammers, well, if they want to show
up at these all you can eat buffets with 18-wheel semis and payloaders
that's their business, we'll just have to spend our time building more
stuff and socking the costs to the legitimate customers, perhaps we
can improve the gas mileage on those semis, yeah that should help.

This party really is going downhill fast, it's astounding to watch,
like boy scouts dropped off in Somalia proceeding to organize litter