latency (was: RE: cooling door)

Please clarify. To which network element are you referring in connection with
extended lookup times? Is it the collapsed optical backbone switch, or the
upstream L3 element, or perhaps both?

Certainly, some applications will demand far less latency than others. Gamers and
some financial (program) traders, for instance, will not tolerate delays caused
by access provisions that are extended over vast WAN, or even large Metro,
distances. But in a local/intramural setting, where optical paths amount to no
more than a klick or so, the impact is almost negligible, even to the class of
users mentioned above. Worst case, run the enterprise over the optical model and
treat those latency-sensitive users as the one-offs that they actually are by
tying them into colos that are closer to their targets. That's what a growing
number of financial firms from around the country have done in NY and CHI colos,
in any case.

As for cost, while individual ports may be significantly more expensive in one
scenario than another, the architectural decision is seldom based on a single
element cost. It's the TCO of all architectural considerations that must be taken
into account. Going back to my original multi-story building example-- better
yet, let's use one of the forty-story structures now being erected at Ground Zero
as a case in point:

When all is said and done, it will house a minimum of two internal data
centers (main/backup/load-sharing) and a minimum of eighty (80) LAN enclosures,
with each room containing two L2 access switches (each of which possesses
multiple 10Gbps uplinks, anyway), plus UPS/HVAC/raised flooring, firestopping,
sprinklers, and a commitment to consume power for twenty years in order to
keep all this junk purring. I think you see my point.

So even where cost may appear to be the issue when comparing the costs of
discrete elements, in most cases that qualify for this type of design (i.e.,
where an organization reaches critical mass beyond a certain number of users),
I submit that it really is not an issue. In fact, a pervasively-lighted
environment may actually cost far less.

Frank A. Coluccio
DTI Consulting Inc.
212-587-8150 Office
347-526-6788 Mobile

On Sat Mar 29 19:20, Mikael Abrahamsson sent:

I am talking about the fact that the following topology:

server - 5 meter UTP - switch - 20 meter fiber - switch - 20 meter fiber - switch - 5 meter UTP - server

has worse NFS performance than:

server - 25 meter UTP - switch - 25 meter UTP - server

Imagine bringing this into metro with 1-2ms delay instead of 0.1-0.5ms.

This is one of the issues that the server/storage people have to deal with.
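To put rough numbers on this (the operation counts and RTTs below are purely illustrative assumptions, not measurements from the setup described above): a chatty protocol like NFS that keeps only one request outstanding pays a full round trip per operation, so total time scales directly with RTT.

```python
# Back-of-the-envelope sketch: cost of strictly serialized request/response
# operations at LAN, multi-switch, and metro round-trip times.

def sync_transfer_time(n_ops: int, rtt_s: float) -> float:
    """Wall-clock time for n_ops serialized request/response round trips."""
    return n_ops * rtt_s

N_OPS = 10_000  # e.g. 10,000 sequential NFS READs, one outstanding at a time

for label, rtt in [("back-to-back switch, 0.1 ms", 0.0001),
                   ("three-switch path,   0.5 ms", 0.0005),
                   ("metro,               2.0 ms", 0.002)]:
    print(f"{label}: {sync_transfer_time(N_OPS, rtt):5.1f} s")
```

The bulk bandwidth of the links never enters the picture; only the number of synchronous turns and the RTT do, which is why the metro case hurts so much.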

That's because the LAN protocols need to be re-jiggled a little to start
looking less like LAN protocols and more like WAN protocols. Similar
things need to happen for applications.

I helped a friend debug an NFS throughput issue between some Linux servers
running Fortran-77 based numerical analysis code and a 10GE storage backend.
The storage backend can push 10GE without too much trouble, but the application
wasn't poking the kernel in the right way (large fetches and prefetching, basically)
to fully utilise the infrastructure.
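A crude model shows why request size and prefetch depth matter so much here. All figures below (chunk sizes, RTT, outstanding-request count) are my own illustrative assumptions, not numbers from the debugging session described above.

```python
# Sketch: time to move data in fixed-size requests over a path with a given
# RTT, with `outstanding` requests pipelined at once.

def fetch_time(total_bytes: float, chunk_bytes: float, rtt_s: float,
               line_rate_bps: float, outstanding: int = 1) -> float:
    """Crude model: one RTT per batch of outstanding requests, plus wire time."""
    n_chunks = total_bytes / chunk_bytes
    batches = n_chunks / outstanding
    wire = total_bytes * 8 / line_rate_bps
    return batches * rtt_s + wire

GB = 1 << 30
RTT = 0.0005           # 0.5 ms intra-building round trip
RATE = 10e9            # 10GE line rate

naive = fetch_time(1 * GB, 4096, RTT, RATE)          # 4 KB reads, serialized
tuned = fetch_time(1 * GB, 1 << 20, RTT, RATE, 16)   # 1 MB reads, 16 in flight
print(f"4 KB serial reads   : {naive:6.1f} s")
print(f"1 MB x16 prefetched : {tuned:6.1f} s")
```

With small serialized reads the transfer is entirely RTT-bound; with large prefetched reads it approaches the wire time, which is the behaviour the application above was failing to get.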

Oh, and kernel hz tickers can have similar effects on network traffic, if the
application does dumb stuff. If you're (un)lucky, you may see 1 or 2 ms
of delay between packet input and scheduled processing. This doesn't matter
much over 250 ms+ latency links, but it matters on 0.1-1 ms latency links.
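The asymmetry is easy to see as a ratio (the figures are illustrative): a fixed scheduler-tick delay is noise on a WAN path but can dominate the effective RTT on a LAN path.

```python
# Sketch: how much a fixed scheduling delay inflates the effective RTT
# at LAN versus WAN base latencies.

def rtt_inflation(base_rtt_s: float, sched_delay_s: float) -> float:
    """Factor by which effective RTT grows when a fixed delay is added."""
    return (base_rtt_s + sched_delay_s) / base_rtt_s

SCHED_DELAY = 0.001  # assumed 1 ms of kernel-tick / scheduling delay

for label, rtt in [("LAN, 0.1 ms", 0.0001),
                   ("LAN, 1 ms  ", 0.001),
                   ("WAN, 250 ms", 0.250)]:
    print(f"{label}: effective RTT x{rtt_inflation(rtt, SCHED_DELAY):.2f}")
```

An 11x RTT inflation on a 0.1 ms path translates (per the throughput discussion below in the thread) into roughly an 11x throughput loss, while the WAN path barely notices.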

(Can someone please apply some science to this and publish best practices please?)


There's been a lot of work done on TCP throughput. Roughly speaking,
and holding everything else constant, throughput is inversely proportional
to the round-trip time. That is, if you double the RTT -- even from
0.1 ms to 0.2 ms -- you halve the throughput of (large) file transfers.
Feed "tcp throughput equation" into your favorite search engine for a
summary and many more references. Another good reference is RFC 3448,
which relates throughput to packet size (also a linear factor, but if
serialization delay is significant then increasing the packet size will
increase the RTT), packet loss rate, the TCP retransmission timeout
(which can be approximated as 4x the RTT), and the number of packets
acknowledged by a single TCP acknowledgement.
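The inverse-RTT relationship shows up directly in the simplified steady-state throughput estimate of Mathis et al. (a cruder cousin of the fuller RFC 3448 formula). The segment size and loss rate below are illustrative assumptions.

```python
import math

# Simplified steady-state TCP throughput estimate:
#   rate ~ MSS / (RTT * sqrt(p)),  p = packet loss probability.

def tcp_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate achievable TCP throughput in bits per second."""
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

MSS, LOSS = 1460, 1e-4   # illustrative segment size and loss probability
r1 = tcp_throughput_bps(MSS, 0.0001, LOSS)   # 0.1 ms RTT
r2 = tcp_throughput_bps(MSS, 0.0002, LOSS)   # 0.2 ms RTT
print(f"0.1 ms RTT: {r1 / 1e9:5.2f} Gbit/s")
print(f"0.2 ms RTT: {r2 / 1e9:5.2f} Gbit/s  (exactly half)")
```

Doubling the RTT halves the estimate, exactly as stated above; the same formula also shows the linear gain from larger segments and the penalty from loss.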

On top of that, there are lots of application issues, as a number of
people have pointed out.

    --Steve Bellovin,

... feed "tcp throughput equation" into your favorite search
engine for a lot more references.

There has been a lot of work in some OS stacks
(Vista and recent linux kernels) to enable TCP
auto-tuning (of one form or another), which is
attempting to hide some of the worst of the TCP
uglinesses from the application/end-users. I
am not convinced this is always a good thing,
since having the cruft exposed to the developers
(in particular) means one needs to plan for
errors and less than ideal cases.

Gary ("Buhrmaster, Gary") writes:

> ... feed "tcp throughput equation" into your favorite search
> engine for a lot more references.

> There has been a lot of work in some OS stacks
> (Vista and recent linux kernels) to enable TCP
> auto-tuning (of one form or another), ...

On <>, I'd read that FreeBSD 7 also has some TCP auto-tuning logic.

There are certain things that the stack can do, like auto-adjusting the
window size, tuning retransmission intervals, etc. But other problems
are at the application layer, as you noted a few posts ago.
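To make concrete what window auto-tuning is adjusting: the window needed to keep a path busy is the bandwidth-delay product, which varies by three orders of magnitude across the latencies discussed in this thread (the figures below are illustrative).

```python
# Sketch: bandwidth-delay product (BDP) -- the bytes that must be in flight,
# and hence the TCP window a stack must allow, to fill a given path.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_s / 8

RATE = 10e9  # 10GE line rate
for label, rtt in [("LAN,  0.5 ms", 0.0005),
                   ("metro,  2 ms", 0.002),
                   ("WAN,  250 ms", 0.250)]:
    print(f"10GE, {label}: window >= {bdp_bytes(RATE, rtt) / 1024:>10,.0f} KiB")
```

A fixed hand-tuned window that fills a LAN path starves a WAN path (or wastes memory in the other direction), which is exactly the gap the Vista/Linux/FreeBSD auto-tuning work is trying to close.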

    --Steve Bellovin,