Keynote/Boardwatch Results

I hate to say I told you so. Actually, I don't, but I can be such a pain
in the ass anyway, what difference does it make?

It would appear that everyone is pretty smugly satisfied by consensus that
the performance series we ran actually measures server performance, and
that since all ISPs run weeny home servers, this was not "really" a test,
flawed methodology, etc. I corresponded with Doug the Hump at Digex about
this.
I've liked this guy since I first met him largely because he's funny and
doesn't take himself too seriously. He's got a yen for black helicopters
that still has me in stitches.

In any event, he didn't appear to be emotionally involved, but noted that
he did think their web server was the problem in their case and that it was
a real issue with the other backbones as well. He said that if we had
measured one of their honking customer web servers, it would all have
been better.

Now I was clear with Doug, as I have been in this mailing list, that
servers DO have an impact. When you are attempting to measure end to end,
you certainly hope that the results cumulatively represent everything that
has an effect. But I have been very clear that I didn't buy the "measuring
server" theory. The server will have an effect, but nowhere near the effect
you all have apparently dreamed up amongst yourselves with no data at all.

Anyway, Doug coughed up the names of a couple of "honkin" customer sites:
one on a UNIX machine running Apache, one on an NT server. As it so happens
they are the NIKE site and the FORBES site. I agreed to run them all for a
few days and publish the results rather openly. Here they are:

www.digex.com

Metro Area            Abbr  Population  Mean (sec)  Std Dev (sec)  Data Pts.  Rating
Omaha                 OMA       640000        -            -           -        -
Norfolk               ORF      1443000     0.917        0.787        477        -
Milwaukee             MKE      1607000     2.641        5.545        427        -
Cleveland             CLE      2860000     2.893        4.155        432        -
Washington D.C.       WAS      6727000     3.326       10.274        436        -
Kansas City           MKC      1583000     3.334        3.881        440        -
Detroit               DTT      5187000     3.424       11.593        470        -
Atlanta               ATL      2960000     3.434        5.774        447        -
Denver                DEN      1980000     3.497        2.725        433        -
Tampa                 TPA      2068000     3.541        9.753        440        -
Minneapolis-St. Paul  MSP      2539000     3.697        7.263        420        -
Pittsburgh            PIT      2395000     4.039        9.484        441        -
Miami                 MIA      3193000     4.336       11.227        409        -
Chicago               CHI      8240000     4.433        7.977        432        -
Philadelphia          PHL      5893000     5.687       36.621        446        -
Columbus              CMH      1345000     6.582       10.275        352        -
San Diego             SAN      2498000     7.124        6.563        111        -
Houston               HOU      3731000     7.156       19.197        426        -
Boston                BOS      5455000     8.147       68.522        342        -
New York              NYC     19550000     8.911       48.549        436        -
San Francisco         SFO      6253000     9.462       70.834        367        -
Phoenix               PHX      2238000    12.226       26.802        447        -
Seattle               SEA      2970000    14.989       74.818        201        -
Dallas-Ft. Worth      DFW      4037000    16.24        36.844        421        -
Los Angeles           LAX     14532000    33.879       78.169        572        -
Salt Lake City        SLC      1072000    48.719       80.656        383        -
Portland              PDX      1793000    65.23        88.641        446        -

Overall                                   11.287       42.959      10654
www.nike.com

Metro Area            Abbr  Population  Mean (sec)  Std Dev (sec)  Data Pts.  Rating
Omaha                 OMA       640000        -            -           -        -
Norfolk               ORF      1443000     1.359        6.704        385        -
Washington D.C.       WAS      6727000     2.838        6.486        426        -
Cleveland             CLE      2860000     2.892        5.958        424        -
Detroit               DTT      5187000     3.032        8.818        380        -
Milwaukee             MKE      1607000     3.262       10.087        420        -
Tampa                 TPA      2068000     3.263        5.876        427        -
Philadelphia          PHL      5893000     3.697        7.305        433        -
Kansas City           MKC      1583000     3.82         7.477        429        -
Los Angeles           LAX     14532000     3.979        7.142        441        -
Denver                DEN      1980000     4.22         8.725        421        -
Miami                 MIA      3193000     4.338       13.774        398        -
Pittsburgh            PIT      2395000     4.392       10.68         431        -
Minneapolis-St. Paul  MSP      2539000     4.545        8.688        409        -
Atlanta               ATL      2960000     4.577       16.279        438        -
New York              NYC     19550000     5.163       13.025        427        -
Boston                BOS      5455000     5.725       16.02         347        -
Chicago               CHI      8240000     5.95        11.926        425        -
San Francisco         SFO      6253000     7.254       21.74         353        -
Houston               HOU      3731000     7.557       14.675        421        -
Seattle               SEA      2970000     9.359       22.033        184        -
Columbus              CMH      1345000     9.991       21.586        358        -
Phoenix               PHX      2238000    14.089       25.896        432        -
Dallas-Ft. Worth      DFW      4037000    17.677       44.361        407        -
San Diego             SAN      2498000    19.029       10.833         16        -
Salt Lake City        SLC      1072000    37.156       92.765        302        -
Portland              PDX      1793000    79.404      166.148        437        -

Overall                                    9.868       44.374       9971
www.forbes.com

Metro Area            Abbr  Population  Mean (sec)  Std Dev (sec)  Data Pts.  Rating
Omaha                 OMA       640000        -            -           -        -
Kansas City           MKC      1583000     0.0050       0.0           14        -
Norfolk               ORF      1443000     2.165       11.616        380        -
Miami                 MIA      3193000     2.715       10.382        396        -
Washington D.C.       WAS      6727000     3.151       22.964        419        -
Philadelphia          PHL      5893000     3.177       14.547        423        -
Milwaukee             MKE      1607000     3.204       19.791        416        -
Atlanta               ATL      2960000     3.364       17.068        434        -
Minneapolis-St. Paul  MSP      2539000     3.847       10.974        405        -
Cleveland             CLE      2860000     3.879       20.768        415        -
Denver                DEN      1980000     3.974       22.303        417        -
Tampa                 TPA      2068000     4.001       19.877        421        -
Pittsburgh            PIT      2395000     4.333       25.179        426        -
Detroit               DTT      5187000     4.428       20.891        376        -
New York              NYC     19550000     4.912       21.919        423        -
Boston                BOS      5455000     5.347       30.106        340        -
Seattle               SEA      2970000     6.324       12.105        179        -
Chicago               CHI      8240000     6.378       26.431        421        -
San Francisco         SFO      6253000     6.575       35.339        351        -
Houston               HOU      3731000     7.49        20.885        417        -
Columbus              CMH      1345000     7.726       19.924        351        -
Dallas-Ft. Worth      DFW      4037000    13.068       22.705        404        -
Phoenix               PHX      2238000    15.445       50.526        427        -
Los Angeles           LAX     14532000    23.117      150.429        460        -
Salt Lake City        SLC      1072000    47.801      227.832        297        -
San Diego             SAN      2498000    85.378      153.109         14        -
Portland              PDX      1793000    89.043      241.734        425        -

Overall                                   11.53        79.068       9451

The bottom line is that there is some slight variation, but as I predicted,
not much. And as it so happens, it was generally in the wrong direction.
Nike was a little better than Digex on the mean and a little worse on the
standard deviation. Forbes was a little worse than Digex on the mean (by
fractions of a second), and more so on the deviation. Digex's original
figures for the April 20 to May 20 period were 9.162 seconds on the mean
and 31.752 on the standard deviation - slightly better than average. Note
that 30 days and five days are apples and oranges if you comprendo fruit.
The means for the five-day period were roughly 9.9, 11.3, and 11.5 seconds,
with the "weeny" Digex server in the middle.
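
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of how per-metro summary rows of the form (mean, std dev, data
points) can be pooled into an overall mean and standard deviation. Whether
Keynote aggregates exactly this way is my assumption; the sample rows are
taken from the Digex table above:

    import math

    # (mean_sec, std_dev_sec, n_samples) per metro area; a subset of the
    # www.digex.com rows above, for brevity.
    rows = [
        (0.917, 0.787, 477),    # Norfolk
        (3.424, 11.593, 470),   # Detroit
        (33.879, 78.169, 572),  # Los Angeles
        (65.23, 88.641, 446),   # Portland
    ]

    def pool(rows):
        """Combine per-group (mean, std, n) into one overall mean and std.

        Uses the identity that the pooled second moment is the
        sample-weighted average of (std**2 + mean**2) per group.
        (Ignores the n-1 correction, negligible at these sample sizes.)
        """
        n_total = sum(n for _, _, n in rows)
        mean = sum(m * n for m, _, n in rows) / n_total
        second_moment = sum(n * (s * s + m * m) for m, s, n in rows) / n_total
        std = math.sqrt(second_moment - mean * mean)
        return mean, std, n_total

    mean, std, n = pool(rows)
    print(f"pooled: mean={mean:.3f}s  std={std:.3f}s  n={n}")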

So yeah, servers do have an impact, but not nearly what you had hoped and
believed. I would say minuscule in a universe where our results ran from
1.5 to 26.8 seconds. And while I'm not shy about "I told you so's," my
real reason for putting this out is that I have heard from several
backbones that are scrambling to upgrade and move their home page servers,
etc. I personally would get them up to what you think they ought to be
anyway, but if you go to extraordinary measures, you're probably going to
be disappointed in how little the numbers move - as I predicted. A little
perhaps, and if you're not careful - potentially in the wrong direction.
Doug was pretty emphatic that these customer servers were the "good" ones
on the good part of the net, and the home server was the weeny one. Maybe
Doug Mohney can jump in and remind me which was which as far as NT and
UNIX goes if anyone is interested. I would guess offhand that Forbes is
taking a little more load than Nike, but I may be reading messages from
God in standard deviation cloud formations. There just isn't that much
difference - certainly not in the mean.

The bottom line is that if we are actually measuring server performance,
we should be able to measure three different servers, two avowed muscle
boxes and one avowed weeny one, all on the same network (connected
differently, I'm told), and get at least as wide a variation as we saw
between networks. We didn't, by about a mile and a fortnight. At the
LEAST, the variation ought to have been in the predicted direction. It
clearly was not. Good theory - but not so - even in the lab.
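
As a rough back-of-the-envelope check on that argument (my framing, using
only numbers already published in the tables above), compare the spread of
the three overall server means against the spread of per-metro means
within the Digex table alone:

    # Overall five-day means (seconds) from the three tables above.
    server_means = {"www.digex.com": 11.287,
                    "www.nike.com": 9.868,
                    "www.forbes.com": 11.53}

    # A few per-metro means (seconds) for www.digex.com alone,
    # from the best (Norfolk) to the worst (Portland).
    digex_metro_means = [0.917, 2.641, 3.424, 33.879, 48.719, 65.23]

    server_spread = max(server_means.values()) - min(server_means.values())
    metro_spread = max(digex_metro_means) - min(digex_metro_means)

    print(f"spread across servers (same network): {server_spread:.2f}s")  # ~1.66s
    print(f"spread across metros (same server):   {metro_spread:.2f}s")   # ~64.31s

If the server were the dominant variable, the first spread should dwarf
the second; instead it is the other way around by well over an order of
magnitude.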

Again, I think you guys should take a look at this stuff a little more
open-mindedly and professionally. It's certainly NOT to scientific
laboratory standards, but it is certainly interesting and I would claim
very VALID information - better information than you have previously had
at your disposal. It's an attempt to look at the FOREST, not tree limb
diameters, leaf patterns, and nutrient flows - all very interesting though
those may be, I do grant you. A great deal of this network has operated on
theories that, once scaled up, nobody really knows whether they work that
way or not. I can tell you from personal experience that most of what I
know is wrong, and I find that out over, and over, and over again. I might
also mention that a lot of what I'm told turns out to be wrong as well.
I'm only going to SUGGEST that I may not be alone.

We'll continue to work on it. I discussed the universal test page
suggestion with Gene Shklar this afternoon and we will make it so. Again,
I don't think it will move any numbers around much, but certainly, as
Forrest Gump says, ONE LESS THING.... And if you can make a case for a
different server ON YOUR OWN NETWORK, we will certainly entertain requests
to shoot at another machine. Nominally July 15 to August 15, though I'm
not signing up to those precise dates at this time.

Regards

jack.rickard@boardwatch.com (Jack Rickard) writes:

> Again, I think you guys should take a look at this stuff a little more
> open-mindedly and professionally. It's certainly NOT to scientific
> laboratory standards,

You'll pardon me, I'm sure, but I must point out that our professional
standards _are_ scientific laboratory standards. These appear to be
distinct from magazine standards.

Tony

Jack,

If you supplied a test page, we would be willing to make it available on
our server. Assuming that server testing is the best thing available at
this time, we should try to reduce the variables. If you selected a page
whose content loaded slightly faster than pages selected now, you
would encourage others to make your test page available.

I am still wondering how the 27 sites are connected, not so much what the
cities are but which path their connectivity takes to my network, i.e.,
whose network do they have to go through before reaching me. I do not have
my copy of the complete study yet; I guess it's on its way... :)
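
One way to answer the "whose network do they go through" question directly
is to trace the route from your own host toward each measurement city. A
minimal Python sketch (the probe hostnames are hypothetical placeholders
for the actual measurement agents):

    import subprocess

    # Hypothetical probe hostnames; substitute the real agent addresses.
    # Router names along each path usually reveal which backbone the
    # traffic crosses.
    probes = ["probe-sea.example.net", "probe-lax.example.net"]

    for host in probes:
        result = subprocess.run(["traceroute", host],
                                capture_output=True, text=True)
        print(f"--- path to {host} ---")
        print(result.stdout)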

Best Regards,
Robert Laughlin

On Mon, 7 Jul 1997, Jack Rickard wrote:

> I discussed the universal test page
> suggestion with Gene Shklar this afternoon and we will make it so.

A universal test page is certainly a [good] step - but it is a step (one
among many) towards [comparative] testing of the performance between the
test machine and a *SPECIFIC* web server .... but not the performance of
the backbone (even though the backbone clearly is a major component of the
result). There are simply too many uncontrolled variables still
outstanding.
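
For concreteness, what a test agent can actually time is one full HTTP
fetch, which lumps DNS lookup, TCP setup, server think time, and backbone
transfer into a single number. A minimal Python sketch, with a
hypothetical URL standing in for the standardized page:

    import time
    import urllib.request

    # Hypothetical standardized test page; every participating server
    # would host an identical copy so the content variable drops out.
    URL = "http://www.example.net/standard-test-page.html"

    start = time.monotonic()
    with urllib.request.urlopen(URL, timeout=60) as resp:
        body = resp.read()  # time the full transfer, not just first byte
    elapsed = time.monotonic() - start

    # Note what this one number still mixes together: DNS, handshake,
    # server response time, and path transfer. A standard page removes
    # only the content variable, none of the rest.
    print(f"fetched {len(body)} bytes in {elapsed:.3f}s")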

By the way, we have given considerable thought to the attributes of such a
standardized test page ... starting with what the test is about. It would
be a good idea if a subgroup of interested parties could get offline and
work on this under some framework (IETF, NANOG, CAIDA, ... or just an ad
hoc BOF group) and then report the findings back to the mailing list.

So, if anyone is interested in working on a standardized web page, send me
mail and I will try to get a separate discussion list going, do some
serious work ... and improve the S/N ratio of this list.

Regards,
John Leong