Why the temptation for dial users to crank back rwin/mtu?

Hi Folks,

As we know[1], packet sizes on the network fall almost exclusively into one
of 4 categories:

   Small (<50 byte) dataless packets.. tcp acks/syns/rsts..
   576 byte packets
   1500 byte packets
   runt packets less than sender MTU..

It's always been my belief that the 576 values were generated by hosts
that didn't support Path MTU discovery and wanted to be conservative
and avoid fragmentation. Fair enough..

But, as I'm sure everybody knows, there's a plethora of
websites/utilities ([2],[3],[4],[5],[6],..) imploring Windows dial users
to use any one of a number of tools/techniques to play with their
mtu/rwin settings to 'make things faster'.. In typical Windows fashion
nothing is quantified in any meaningful way and the only
motivation provided is typically some garbage like "Windows comes
set up for LAN not internet use".. These tools tend to crank down both
rwin and mtu an awful lot.

apparently there's some performance value in this (at least to the
immediate user) because they keep doing it in droves. It's not obvious
to me why the heck this would be. (warning: I am a protocol guy, but
I'm not a dialup guy at all.. and even less of a windows guy)

MTU - at least this makes a little bit of sense.. If they're doing
HTTP/1.0 stuff with parallel connections then a smaller MTU is going
to make that parallelization latency much more effective and perceived
performance will go up some.. it doesn't impact full document
retrieval time though (at least not positively!).. Are dial links
really lossy enough that chopping the segment size to 1/3 is a big win
in retransmit time? Or are the win95/98 stacks really braindead enough
that they don't do pmtud and so are just trying to dodge fragmentation? I
found it really odd that [7], which I use all the time to track
features in a myriad of shipped OSes, actually has a blank entry for
pmtud on both of those (neither yes nor no..)
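
To put rough numbers on the per-packet latency argument, here is a
back-of-the-envelope sketch in Python. It assumes a 28.8 kbit/s link
with no modem compression and ignores PPP framing overhead; the
function name and the link speed are illustrative assumptions, not
taken from any of the tools above.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0

LINK_BPS = 28800  # assumed modem rate, no compression

for mtu in (576, 1500):
    delay = serialization_delay_ms(mtu, LINK_BPS)
    print(f"{mtu:5d} byte packet: {delay:6.1f} ms on the wire")
# 576 bytes comes out at 160.0 ms, 1500 bytes at about 416.7 ms
```

So a full-size 1500 byte packet from one connection holds the link for
over 400 ms before a packet from any parallel connection can follow it,
which is roughly where the perceived-latency benefit of the smaller MTU
would come from. Total bytes moved are not reduced at all.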

RWIN - this is the one that boggles my mind.. it gets set way way way
down by the above mentioned tools.. I've seen it as low as 2500 bytes
recently. Anyone have any insight into the value of pushing this all
the way down? The web pages generally mumble about capping the amount
of data that needs to be resent in case of a failure.. which is of
course true in the extreme case, but I'd much rather have the
congestion window providing the throttle than the hard-limit of rwin
that can just cap transfer rates on you.. about the only reason I can
think of for small RWINs is to conserve the buffer space, but it sure
seems worth a few K to me to be sure I can work with high latency
links. You could argue that 3 or 4 K is sufficient for any reasonable
latency that is bottlenecked by a modem's throughput.. and eventually
I might give in (or maybe not ;)).. What I don't get is why this
results in any kind of perceived performance increase on the part of
the user under any condition.. It almost implies that TCP congestion
control is too conservative, although almost all work on that
indicates it's a little too aggressive (which would be the side to err
on..) Any thoughts?
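
For reference, the "3 or 4 K is sufficient" argument is just the
bandwidth-delay product. A minimal sketch, assuming a 33.6 kbit/s
modem and a couple of made-up round-trip times (both are assumptions
for illustration, as are the function names):

```python
def bdp_bytes(link_bps, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_s / 8

def window_limited_bps(rwin_bytes, rtt_s):
    """Upper bound on throughput imposed by the receive window alone."""
    return rwin_bytes * 8 / rtt_s

LINK_BPS = 33600  # assumed modem rate
RWIN = 2500       # the low value mentioned above

for rtt in (0.5, 1.0):  # assumed round-trip times in seconds
    need = bdp_bytes(LINK_BPS, rtt)
    cap = window_limited_bps(RWIN, rtt)
    print(f"rtt={rtt}s: need {need:.0f} bytes in flight, "
          f"rwin {RWIN} caps at {cap:.0f} bit/s")
```

Under these assumptions a 2500 byte window still fills the modem at a
500 ms RTT (2100 bytes needed), but at a 1 s RTT it caps the transfer
at 20000 bit/s, well under the line rate. That is the hard-limit
effect I'd rather leave to the congestion window.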

-P

[1] http://www.caida.org/Papers/Inet98/
[2] http://www.cerberus-sys.com/~belleisl/tune_faq/tuning.htm
[3] http://www.trumpet.com.au/wsk/faq/config.htm
[4] http://www.mc-pro.com/hardware/windialup.shtml
[5] http://www0.delphi.com/pccompat/mtu.html
[6] http://www.pattersondesigns.com/tweakdun/index.html
[7] http://www.psc.edu/networking/perf_tune.html

> apparently there's some performance value in this (at least to the
> immediate user) because they keep doing it in droves. It's not obvious
> to me why the heck this would be. (warning: I am a protocol guy, but
> I'm not a dialup guy at all.. and even less of a windows guy)

This has generally been beaten to death on NANOG in the past. If you
weren't around back then, dig around in the archive. I remember one of
the subjects being "PC Bozoworld strikes again" or something like that.

The short recap is that for some unknown reason the Microsoft TCP/IP stack
is broken in some bizarre way, such that turning down the MTU on a good
chunk of the machines out there results in a dramatic speed increase. Why
this occurs, I'm not sure anyone really knows. It would be really
interesting to see a study of what the MS stack is doing and why it's
faster.

> MTU - at least this makes a little bit of sense.. If they're doing
> HTTP/1.0 stuff with parallel connections then a smaller MTU is going
> to make that parallelization latency much more effective and perceived
> performance will go up some.. it doesn't impact full document

Just for my information, does the MTU setting affect <received> packets
in some way? My understanding was that a machine wouldn't send packets
over the MTU size, but could receive anything up to whatever the TCP/IP
stack writer included in the stack. Guess I'll have to go dig out the
RFCs.
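
For what it's worth, one way the MTU can reach received TCP traffic: a
stack normally advertises an MSS in its SYN that is derived from the
local MTU, and the far end sizes its segments to fit. A minimal sketch
of that arithmetic, assuming IPv4 and TCP headers without options (the
helper name is hypothetical):

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options

def advertised_mss(mtu):
    """MSS a host would typically announce for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

for mtu in (576, 1500):
    print(f"MTU {mtu} -> advertised MSS {advertised_mss(mtu)}")
# MTU 576 gives an MSS of 536, MTU 1500 gives 1460
```

Whether a particular Windows stack actually derives its MSS this way
is an assumption here, not something I've verified.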

- Forrest W. Christian (forrestc@imach.com)