[pfSense Support] Strange TCP connection behavior 2.0 RC2 (+3)

----- Forwarded message from William Salt <williamejsalt@googlemail.com> -----

This is a well-known issue with so-called "Long Fat Networks" (long fat pipes).

There are many university papers about it and many tricks to work around
it on software-based boxes.

Adjusting your TCP window size is the best place to start, provided it's set
properly. The basic formula is provided in this forum post:
http://forums.whirlpool.net.au/archive/116165
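(For reference, the formula is essentially the bandwidth-delay product:

    window (bytes) >= bandwidth (bytes/sec) x round-trip time (sec)

As a rough illustration, and assuming a transatlantic RTT of about 80 ms,
a 1 Gbit/s path needs roughly (1e9 / 8) x 0.08 = 10 MB of window to stay full.)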

Good luck!

Indeed, we had similar issues on a 3G radio network. Long RTTs made it impossible to reach the maximum potential throughput of the network. I installed one of these:

http://www.fastsoft.com/home/

And the problem just went away.

Even disabling window scaling and setting the window to 16MB makes no difference.

If you disable window scaling, you're limiting it to 64k (the TCP window field is only 16 bits, so without the window scale option the advertised window cannot exceed 65,535 bytes).

However, we have tried different hardware (L3 switches, media converters +
laptops, etc.), and the symptoms still persist...

You should dump the traffic and analyse it in Wireshark; then you'll see whether you have packet loss or not. Most likely you do not, and the reason for your problem is TCP-settings related (see other posts with links).

Before you start replacing hardware, you should diagnose what your problem is. If you're not losing packets and the delay variation is constant, then it's not related to the network. Both of these factors can be seen using the built-in tools in Wireshark (Analyze -> Expert Info, and Statistics -> TCP Stream Graph).
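For example, on a FreeBSD-based pfSense box a capture could look like this
(interface name, test host, and iperf port are assumptions; adjust to your setup):

    # grab full packets for one iperf run, then open the file in Wireshark
    tcpdump -i em0 -s 0 -w /tmp/iperf.pcap host 192.0.2.10 and port 5001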

Since UDP works I have my doubts it is a driver/interface link issue.

This sounds more like a latency/packet-loss issue (especially since it is a transatlantic link).

What kind of latency, packet loss, and/or packet error rates are you seeing?

Sounds like TCP RTT and/or packet-loss - should be easy to determine the issue with a bit of traffic capture.

Obviously not helping if you are trying to tune standard TCP, but I
lament that protocols like Tsunami are not in wider use.
http://tsunami-udp.sourceforge.net/ The short of it: a TCP control channel
takes care of error checking and resends, while the data channel is a UDP
stream, specifically built to max out LFNs.

Hi,

----- Forwarded message from William Salt <williamejsalt@googlemail.com> -----
From: William Salt <williamejsalt@googlemail.com>
Date: Tue, 28 Jun 2011 08:03:25 +0100
To: support@pfsense.com
Subject: [pfSense Support] Strange TCP connection behavior 2.0 RC2 (+3)
Reply-To: support@pfsense.com

Each TCP connection starts very slowly, and will max out at around 190 Mbps,
taking nearly 2 minutes to climb to this speed before *plateauing*.

We have to initiate many (5+) TCP connections with iperf to saturate the
link.
----- End forwarded message -----

You pretty much solved your own puzzle right there: the throughput on a
single TCP connection will max out at the value determined by the bandwidth
delay product (excluding other strange conditions, such as deep buffers).
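Put the other way around, a single stream tops out at roughly window / RTT.
Purely as an illustration (both numbers are assumptions, not measurements):
an effective window of 2 MB over an 80 ms transatlantic RTT gives about
2 MB / 0.08 s ~= 25 MB/s ~= 200 Mbit/s, which is in the same neighbourhood
as the ~190 Mbps plateau quoted above.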

Here is a calculator online:

-andreas
[who has to explain this about once a week to customers who think
that they bought a GigE connection but then can't "ftp" a file from
coast to coast at 1Gbps throughput. Use multiple TCP streams!]
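The iperf run above shows the difference directly; a sketch (hostname,
window size, and stream count are just examples):

    # one stream, large requested window
    iperf -c server.example.net -w 8M -t 60
    # five parallel streams, which together can fill the pipe
    iperf -c server.example.net -w 8M -P 5 -t 60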

Yeah, try explaining to a VSAT customer why they don't get 10Mb/s on their 10Mb/s VSAT connection with a 600ms RTT!

The response is... "I don't care, I want my 10Mb/s!"...

In the 3G world, I have had good results overcoming longish RTTs by
using the Hybla TCP algorithm: http://hybla.deis.unibo.it/

I am hoping it gets more traction as a default, especially in wireless,
where the radio link is a pretty big latency source.

Cameron

How do you implement this for lots of clients and servers that have out of the box implementations? The FastSoft box is a TCP man-in-the-middle box that essentially implements the FAST TCP algorithm without either end having to worry about it.

I have also used home-fudged TCP proxies with some success.

Some 3G/wireless/VSAT vendors implement their own TCP modification stacks but they usually only fiddle with window sizes and such.

How do you implement this for lots of clients and servers that have out of the box implementations? The FastSoft box is a TCP man-in-the-middle box that essentially implements the FAST TCP algorithm without either end having to worry about it.

You don't, the full benefits only come with a Linux kernel patch. The
good news is that it only has to be implemented on the client end.
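For what it's worth, mainline Linux has shipped Hybla as a pluggable
congestion-control module for a while, so on a kernel built with it the
client-side switch is just (a sketch, not pfSense-specific advice):

    modprobe tcp_hybla
    sysctl -w net.ipv4.tcp_congestion_control=hybla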

That's why I said I hope it catches on as a default :-) If Android
implemented Hybla, I think it would be a great improvement for user
experience. Nobody likes the middleboxes that proxy TCP: they cost
money, don't scale well, and are generally fragile. Hybla is not a
solution for the OP's issue, just a solution for high-RTT links where
the client can do Hybla. It's an evolutionary step that I think would
be a great fit in smartphone platforms like Android.

Cameron

I have found most/all modern 3G networks can achieve optimal download speed
within their latency limitations (<200 ms domestic end-to-end is normal for
most today) when combined with a modern operating system that does automatic
TCP receive window adjustment based on per-flow characteristics. I never
had a problem getting ~2 megabit from EVDO Rev. A, and can get ~20 megabit
without issue from the new Verizon LTE network. (Windows XP is not modern.)
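For reference, on Linux that per-flow autotuning is governed by sysctls
along these lines (the names are real, the comments just summarize their effect):

    sysctl net.ipv4.tcp_moderate_rcvbuf   # 1 = autotune the receive buffer per flow
    sysctl net.ipv4.tcp_window_scaling    # 1 = allow windows larger than 64 KB
    sysctl net.ipv4.tcp_rmem              # min / default / max receive buffer, in bytes

On Windows Vista and later the equivalent knob is
"netsh interface tcp set global autotuninglevel=normal".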

As for VSAT, most every VSAT equipment manufacturer has TCP
acceleration/proxy support built into the satellite modem. They basically
forge ACKs at the hub site to buffer data from the server, then deliver it
to the remote end in a continuous flow. Many also have protocol
optimizations for some of the more "chatty" protocols. If you use it, your
10 megabit should be achievable for typical HTTP/FTP consumer internet
activities, and it's surprisingly fast. I've sustained 6 Mb/s without issue on
VSAT, limited only by the bandwidth available, doing a simple SCP file transfer.

Of course, none of this is to the scale of transatlantic gigabit transfers
with a single flow...

AFAIK, Verizon and the rest of the largest mobile networks in the USA
have transparent TCP proxies in place.

My point was that if end-hosts had Hybla or something similar, these
proxies can be removed providing a better end-to-end solution.

Cameron

Well, then you run into the nice problem of the RNCs having only 400 kilobytes of buffer per session: they will drop packets if they receive more than that, or sometimes even just because they receive a burst of a few tens of packets at GigE line rate (because a customer with a large TCP window size is talking to a GigE-connected server).

The recommended "solution" from the vendor was to tune the client to a smaller TCP window size so that their RNC wouldn't get such a large burst.

*sigh*

This reminds me of the work I did in 1999 on getting T3 satellite links to fully utilize the 45 Mb/sec:
http://www.interall.co.il/internet2-takes-to-the-air.pdf

-Hank

Excessively large buffers are a problem because they break TCP's RTT
measurement. Also TCP cannot measure the available bandwidth without
packet loss.

Tony.

?

TCP stacks will figure out available bandwidth just fine by measuring
return acks - there's no need to drop any packets.

Nick

Well, actually it can. ECN.

And regarding RTT measurement: since mobile networks vary delay a lot, that's not going to work well anyway.
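A quick sketch of enabling ECN on a Linux endpoint (whether the routers in
the path actually mark instead of drop is a separate question):

    sysctl -w net.ipv4.tcp_ecn=1   # negotiate ECN on both outgoing and incoming connections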

Do you have a reference for that information? Neither AT&T nor Sprint
seems to have transparent *HTTP* proxies according to
http://www.lagado.com/tools/cache-test. I would have thought that
would be the first and most important optimization a mobile carrier
could make. I used to see "mobile-optimized" images and HTTP
compression for sites that weren't using it at the origin on Verizon's
3G network a few years ago, so Verizon clearly had some form of HTTP
proxy in effect.

Aside from that, how would one check for a transparent *TCP* proxy? By
looking at IP or TCP option fingerprints at the receiver? Or comparing
TCP ACK RTT versus ICMP ping RTT?
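One crude check (host and port here are placeholders): compare the TCP
handshake RTT against ICMP to the same destination; if the SYN/SYN-ACK
exchange consistently completes much faster than ping over a long path,
something closer to you is answering the handshake.

    ping -c 5 www.example.com
    hping3 -S -p 80 -c 5 www.example.com   # prints the RTT of SYN -> SYN/ACK probes (needs root)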