TCP receive window set to 0; DoS or not?

New listener, first-time caller.

I've been seeing some systems that stop serving pages, and I also see
the Linux "Treason Uncloaked!" kernel messages that indicate a remote
system reduced its rcv win from 1 to 0... is there a non-malicious
explanation for this, aside from a remote host running out of socket
buffers? Seems to happen too often for that to be the case, and
my googling has shown that it may be outside of spec. Certainly
the warning is clear enough...
--
The whole point of the Internet is that different kinds of computers
can interoperate. Every time you see a web site that only supports
certain browsers or operating systems, they clearly don't get it.

> I've been seeing some systems that stop serving pages, and I also see
> the Linux "Treason Uncloaked!" kernel messages that indicate a remote
> system reduced its rcv win from 1 to 0... is there a non-malicious
> explanation for this, aside from a remote host running out of socket
> buffers? Seems to happen too often for that to be the case, and
> my googling has shown that it may be outside of spec. Certainly
> the warning is clear enough...

I've seen this, quite a bit, on some heavy traffic web clusters. Some
impolite web browsers will shrink the TCP window to kill the socket
connection instead of a proper fin/reset.

- billn

Advertising a window of 0 is a perfectly valid way of telling the other
side that you are temporarily out of resources, and would like them to
stop sending you data. This can be caused by any number of things, from a
completely bogged down box, to an application which isn't read()ing off
its socket buffer (thus for all intents and purposes the kernel is out of
resources to buffer any more data for that socket). It doesn't kill the
TCP session, it just throttles it back. The sender then goes into
zero-window probe mode, waiting for this condition to go away. It is
described in RFC 1122 section 4.2.2.17:

            Probing of zero (offered) windows MUST be supported.

            A TCP MAY keep its offered receive window closed
            indefinitely. As long as the receiving TCP continues to
            send acknowledgments in response to the probe segments, the
            sending TCP MUST allow the connection to stay open.

            etc etc etc

Looking at the Linux code which calls the error message (tcp_timer.c
tcp_retransmit_timer()), the condition which triggers it is:

         if (!tp->snd_wnd && !sock_flag(sk, SOCK_DEAD) &&
             !((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))) {
                 /* Receiver dastardly shrinks window. Our retransmits
                  * become zero probes, but we should not timeout this
                  * connection. If the socket is an orphan, time it out,
                  * we cannot allow such beasts to hang infinitely.
                  */
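
As a rough illustration of what that condition tests, here is a small
Python model of it. This is a sketch, not kernel code: the function name
is made up for this example, and the state numbers are assumed to follow
the Linux TCP state enum (TCP_ESTABLISHED = 1, TCP_SYN_SENT = 2,
TCP_SYN_RECV = 3), with each TCPF_* flag being 1 shifted by the state
number.

```python
# Illustrative model of the quoted kernel condition; not kernel code.
# State numbers are assumed to match the Linux TCP state enum.
TCP_ESTABLISHED = 1
TCP_SYN_SENT = 2
TCP_SYN_RECV = 3

# Each TCPF_* flag is a one-hot bit indexed by the state number.
TCPF_SYN_SENT = 1 << TCP_SYN_SENT
TCPF_SYN_RECV = 1 << TCP_SYN_RECV


def retransmit_is_zero_probe(snd_wnd: int, sock_dead: bool, state: int) -> bool:
    """Hypothetical helper: True when the retransmit timer fires while the
    peer's advertised window is zero, the socket is still attached to a
    process, and the connection is past the SYN handshake states."""
    in_handshake = bool((1 << state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
    return snd_wnd == 0 and not sock_dead and not in_handshake
```

The shift-and-mask form lets one comparison test membership in a set of
states, which is why the kernel writes it that way instead of chaining
equality checks.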

It looks like it is just detecting this condition and changing its
behavior in accordance with the RFC. Since the actual printing of the
message is wrapped in #ifdef TCP_DEBUG, it probably isn't intended to be
displayed to end users at all. As for the cute "Treason Uncloaked"
message, that's what you get for running an OS written by/for 14 year
olds. :)

Or at least that's my 15-minute take on it, having not touched Linux
(gleefully) in many, many years.

This makes sense when taken in combination with our earlier assessment
that it was coming from mobile devices, like PDAs or smart phones, which
are limited in CPU and optimized to save power.

- billn

Advertising a zero window is perfectly proper. Probing one is mandatory,
per RFC 793:

  The sending TCP must be prepared to accept from the user and send at
  least one octet of new data even if the send window is zero. The
  sending TCP must regularly retransmit to the receiving TCP even when
  the window is zero. Two minutes is recommended for the retransmission
  interval when the window is zero. This retransmission is essential to
  guarantee that when either TCP has a zero window the re-opening of the
  window will be reliably reported to the other.

But closing an open window is a bad idea:

  The mechanisms provided allow a TCP to advertise a large window and to
  subsequently advertise a much smaller window without having accepted
  that much data. This, so called "shrinking the window," is strongly
  discouraged. The robustness principle dictates that TCPs will not
  shrink the window themselves, but will be prepared for such behavior
  on the part of other TCPs.
    
    --Steven M. Bellovin, http://www.cs.columbia.edu/~smb

Richard A Steenbergen <ras@e-gerbil.net> writes:

> Advertising a window of 0 is a perfectly valid way of telling the other
> side that you are temporarily out of resources, and would like them to
> stop sending you data....

Except that that's not what's going on here. This message appears
when the TCP peer shrinks the window, withdrawing a previously granted
permission to send bytes -- a protocol violation. For example, you're
free to tell me (with your window advertisement) that you're
authorizing me to send you 32K bytes, and then, after I've sent you
32K bytes, to close the window until you're ready to accept more.
You're not free to tell me it's OK to send 32K bytes, then change your
mind and advertise a window size of 0 after I've sent you only 16K
bytes.
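
The distinction can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the helper names are invented for
this example, the right window edge is taken as ACK number plus advertised
window, and the comparison uses 32-bit sequence arithmetic so that
wraparound is handled.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def right_edge(ack: int, window: int) -> int:
    """Highest sequence number (exclusive) the receiver has permitted."""
    return (ack + window) % SEQ_MOD


def window_shrank(old_ack: int, old_window: int,
                  new_ack: int, new_window: int) -> bool:
    """True if the new advertisement moves the right edge to the left,
    i.e. withdraws permission that was previously granted."""
    diff = (right_edge(new_ack, new_window)
            - right_edge(old_ack, old_window)) % SEQ_MOD
    # A difference in the upper half of the 32-bit space is "negative".
    return diff >= SEQ_MOD // 2


K = 1024
# Peer granted 32K, received all 32K, then closed the window: legitimate.
closed_after_full = window_shrank(0, 32 * K, 32 * K, 0)
# Peer granted 32K, received only 16K, then advertised 0: shrinking.
closed_after_half = window_shrank(0, 32 * K, 16 * K, 0)
```

In the first case the right edge stays at 32K, so nothing was taken back;
in the second it moves from 32K to 16K, which is the shrink described
above.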

To address the "DoS" question, I don't see how this protocol violation
enables a DoS attack. More likely, it's simply somebody's buggy
TCP stack misbehaving. That "somebody" is unlikely to be Windows, MacOS,
FreeBSD, or Linux. My money is on some flavor of $50 NAT/"home router"
box.

Jim Shankland

> Richard A Steenbergen <ras@e-gerbil.net> writes:
> > Advertising a window of 0 is a perfectly valid way of telling the other
> > side that you are temporarily out of resources, and would like them to
> > stop sending you data....
>
> Except that that's not what's going on here. This message appears
> when the TCP peer shrinks the window, withdrawing a previously granted
> permission to send bytes -- a protocol violation. For example, you're
> free to tell me (with your window advertisement) that you're
> authorizing me to send you 32K bytes, and then, after I've sent you
> 32K bytes, to close the window until you're ready to accept more.
> You're not free to tell me it's OK to send 32K bytes, then change your
> mind and advertise a window size of 0 after I've sent you only 16K
> bytes.

Ok, looking at the error condition in further detail I do believe that
you're right. So, per RFC 1122:

4.2.2.16 Managing the Window: RFC-793 Section 3.7, page 41

       A TCP receiver SHOULD NOT shrink the window, i.e., move the
       right window edge to the left. However, a sending TCP MUST
       be robust against window shrinking, which may cause the
       "useable window" (see Section 4.2.3.4) to become negative.

It is a warning message generated by a "SHOULD NOT" violation, during the
"MUST be robust against this behavior" section of code.
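
To make the "negative usable window" concrete, here is a small worked
example in Python. It assumes the definition from RFC 1122 section
4.2.3.4, U = SND.UNA + SND.WND - SND.NXT, ignores sequence-number
wraparound for simplicity, and uses made-up byte counts.

```python
def usable_window(snd_una: int, snd_wnd: int, snd_nxt: int) -> int:
    """Usable window per RFC 1122 4.2.3.4: how much more the sender may
    transmit right now. Wraparound is ignored in this sketch."""
    return snd_una + snd_wnd - snd_nxt


K = 1024
# The sender was offered 32K and has transmitted 24K, none acknowledged yet.
before = usable_window(snd_una=0, snd_wnd=32 * K, snd_nxt=24 * K)
# The peer now ACKs the first 16K but advertises a zero window.
after = usable_window(snd_una=16 * K, snd_wnd=0, snd_nxt=24 * K)
```

Here the usable window goes from 8K to minus 8K: the sender already has
8K in flight beyond the new right edge, which is the situation the "MUST
be robust" clause is about.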

Looking at other such messages in the Linux kernel which are wrapped in
#ifdef TCP_DEBUG, they all appear to be equally esoteric and probably not
worth mentioning to the end user. However it looks like TCP_DEBUG is
enabled by default (don't ask me why), which when combined with a
relatively inane message using "alarm provoking" words, serves only to
confuse. :)

> To address the "DoS" question, I don't see how this protocol violation
> enables a DoS attack. More likely, it's simply somebody's buggy
> TCP stack misbehaving. That "somebody" is unlikely to be Windows, MacOS,
> FreeBSD, or Linux. My money is on some flavor of $50 NAT/"home router"
> box.

Did a little poking into this condition on other platforms as well, and as
previously mentioned it does appear to be fairly contained to "mobile
devices" (not sure which ones though). I guess if you have a small
portable device with limited memory, this may be an issue.

Jim Shankland wrote:

> To address the "DoS" question, I don't see how this protocol violation
> enables a DoS attack. More likely, it's simply somebody's buggy
> TCP stack misbehaving. That "somebody" is unlikely to be Windows, MacOS,
> FreeBSD, or Linux. My money is on some flavor of $50 NAT/"home router"
> box.

The part where it becomes a DoS is when they tie up all the listeners
on a socket (e.g. apache), and nothing happens for several minutes until
their connections time out. Whether intentional or not, it does have
a negative effect.
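
A back-of-the-envelope way to see how quickly this bites, sketched in
Python. The worker count, timeout, and arrival rate below are purely
hypothetical, and the estimate uses Little's law (average number held =
arrival rate x hold time) rather than anything measured on these systems.

```python
def stalled_workers(arrival_rate_per_s: float, hold_time_s: float) -> float:
    """Average number of workers held by stalled peers (Little's law)."""
    return arrival_rate_per_s * hold_time_s


def pool_exhausted(workers: int, arrival_rate_per_s: float,
                   hold_time_s: float) -> bool:
    """True if stalled peers alone would occupy every worker on average."""
    return stalled_workers(arrival_rate_per_s, hold_time_s) >= workers


# Hypothetical numbers: 256 workers, connections held 300 s before timing out.
one_per_second = pool_exhausted(256, 1.0, 300.0)
one_per_two_seconds = pool_exhausted(256, 0.5, 300.0)
```

With a multi-minute timeout, even one stalled client per second is enough
to hold more than 256 workers, while half that rate still ties up about
150 of them.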

It's insidious in that it leaves no traces in the application logs;
in particular, apache never logs anything because they never
complete a transaction (it logs when they finish).

Travis Hassloch <travis.hassloch@rackspace.com> writes:

> The part where it becomes a DoS is when they tie up all the listeners
> on a socket (e.g. apache), and nothing happens for several minutes until
> their connections time out. Whether intentional or not, it does have
> a negative effect.

Ah, that makes sense. I was assuming a deliberate attack, which is
not actually implicit in the term "DoS". A deliberate denial of
service is not made easier by shrinking the window. But an implementation
that advertises a 0 window in lieu of sending FIN or RST can certainly
deny service inadvertently by tying up resources that should have been
freed.

Jim Shankland