SYN Resisting

I know this may not be strictly on-topic here because it deals with
"host-stuff" rather than "router-stuff", but here goes...

I will have some comments on how to track where SYN storms are coming
from a bit later.

In order to build a SYN-resistant BSD kernel, you need to modify one
file in src/sys/os, uipc_socket2.c, and you also need to modify
src/sys/netinet/tcp_timer.h and you have to rebuild tcp_usrreq.c and
tcp_input.c in the netinet directory.

For those without SunOS source, I will get Sun4c (Sparc 1/1+/2/IPC/IPX/
ELC/SLC) binaries online; for those running BSD on other platforms, you
probably have source.

From the bottom level up, change TCPTV_KEEP_INIT from 75*PR_SLOWHZ
to 7*PR_SLOWHZ (or whatever # you want). This timeout (the 75) is
the number of seconds that the kernel will keep un-established TCP
PCB/sockets around for... When the SYN is received, it is acknowledged
and the PCB && socket are set up for the embryonic session; the goal
is to rip those things out of any queues they're in more aggressively.
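
To put rough numbers on that, here is a small sketch. It assumes the
stock BSD value of PR_SLOWHZ (2 slow-timeout ticks per second); the two
helper functions are made up for illustration and are not in the kernel.
The backlog estimate also assumes that none of the attacking SYNs ever
complete the handshake.

```c
/* Sketch only: mirrors the tcp_timer.h constant, not the kernel source. */
#define PR_SLOWHZ        2                /* slow timeouts per second (assumed stock BSD value) */
#define TCPTV_KEEP_INIT  (7 * PR_SLOWHZ)  /* was 75*PR_SLOWHZ; measured in ticks */

/* Hypothetical helper: seconds an un-established PCB/socket lingers. */
static int keep_init_seconds(void)
{
    return TCPTV_KEEP_INIT / PR_SLOWHZ;
}

/*
 * Hypothetical helper: roughly how many embryonic entries pile up when
 * SYNs arrive at a steady rate and each one sits until the timer fires.
 */
static long embryonic_backlog(long syns_per_sec, int timeout_secs)
{
    return syns_per_sec * timeout_secs;
}
```

At 400 SYNs/sec, embryonic_backlog(400, 75) is 30000 entries, while
embryonic_backlog(400, 7) is 2800, which is the whole point of the change.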

At the top (socket) level, instead of modifying SOMAXCONN, I decided to
just see what happened if I removed the limit. What you do is up to your
own personal taste. I commented out:

  if (head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2)
    goto bad;

in src/sys/os/uipc_socket2.c.

Head in this case points to a 'server' socket (the socket for your
web, mail, news, ... server). so_qlimit is set to the smaller of the
backlog that the server's listen() call requested and SOMAXCONN.
I had some funkiness increasing SOMAXCONN to 8096 or so when I was
playing with it - and didn't want to recompile inetd, sendmail, etc...
to ask for more slots in the listen() queue (just a linked list or two),
so I figured I'd *try* to make the queue size infinite and see what
happened. so_qlen and so_q0len are the lengths of the linked lists of
sockets waiting to be accept()ed and of sockets for the embryonic (not
established) TCP connections aimed at this server socket, respectively.
The code uses a 3/2 fudge factor to make the comparison, and is saying
"if the number of queued requests is > 3/2 times the limit for this
socket, don't stick this requesting socket in the queue - just destroy
it and exit".

I just commented those two lines out.
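
For anyone who wants to see what that test does before removing it, here
is a stand-alone sketch of the same comparison. The struct is a cut-down
stand-in for the real struct socket and queue_is_full() is a made-up
name; only the three field names and the expression come from the kernel
code quoted above.

```c
/* Simplified stand-in for the struct socket fields the check touches. */
struct listen_sock {
    short so_q0len;   /* embryonic (not yet established) connections */
    short so_qlen;    /* established connections waiting for accept() */
    short so_qlimit;  /* min(listen() backlog, SOMAXCONN) */
};

/* Returns 1 when the original check would "goto bad" and drop the request. */
static int queue_is_full(const struct listen_sock *head)
{
    /* Integer arithmetic: a limit of 5 gives a threshold of 7, not 7.5. */
    return head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2;
}
```

With so_qlimit at 5, the listener keeps taking requests until 8 are
queued and then refuses new ones, so a handful of unanswered SYNs is
enough to shut out real clients until the timer clears them.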

On a Sparc 1+ w/ 4.1.4, I could sustain a 200-400 SYN-packet/sec attack
and still remain functional (and quick for a 1+), but the machine didn't
normally run web servers... Even when I nailed it with 1000 SYNs/sec,
the machine continued functioning but I couldn't connect to the socket
being nailed. A second after stopping the heavier attack, I could.

I've had trouble compiling and getting these modified modules to work on a
Sun4m architecture (Sparc 5 and 10) but may play more with that today.

The best solution is to implement a better data structure than a linked
list for storing the embryonic connections per socket. A large-ish array
with appropriate hashing, perhaps. Either per socket or for the whole
kernel. If anyone wants to attack that problem, please do; otherwise,
I'll blow BSD on a laptop so I can play with it when I'm next on a plane/
train.
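
As a starting point for that, here is a very rough user-space sketch of
a hashed table of embryonic connections with chaining. Everything in it
is hypothetical: the names, the table size, the key (remote address,
remote port, local port) and the mixing function are illustrative
choices, not anything taken from the BSD or SunOS source.

```c
#include <stddef.h>
#include <stdint.h>

#define EMB_BUCKETS 1024   /* hypothetical table size; must be a power of two */

/* One half-open connection, keyed on remote endpoint and local port. */
struct emb_conn {
    uint32_t src_addr;
    uint16_t src_port;
    uint16_t dst_port;
    struct emb_conn *next;   /* chain of entries sharing a bucket */
};

static struct emb_conn *emb_table[EMB_BUCKETS];

/* Mix the key fields into a bucket index. */
static unsigned emb_hash(uint32_t src_addr, uint16_t src_port, uint16_t dst_port)
{
    uint32_t h = src_addr ^ ((uint32_t)src_port << 16) ^ dst_port;
    h ^= h >> 16;
    h *= 0x45d9f3bU;
    h ^= h >> 16;
    return h & (EMB_BUCKETS - 1);
}

/* Push a new entry onto the front of its bucket's chain. */
static void emb_insert(struct emb_conn *c)
{
    unsigned b = emb_hash(c->src_addr, c->src_port, c->dst_port);
    c->next = emb_table[b];
    emb_table[b] = c;
}

/* Find the entry matching an incoming segment, or NULL if there is none. */
static struct emb_conn *emb_lookup(uint32_t src_addr, uint16_t src_port,
                                   uint16_t dst_port)
{
    struct emb_conn *c = emb_table[emb_hash(src_addr, src_port, dst_port)];
    for (; c != NULL; c = c->next)
        if (c->src_addr == src_addr && c->src_port == src_port &&
            c->dst_port == dst_port)
            return c;
    return NULL;
}

/* Unlink an entry (on timeout or on completion); returns 1 if it was found. */
static int emb_remove(struct emb_conn *c)
{
    struct emb_conn **pp =
        &emb_table[emb_hash(c->src_addr, c->src_port, c->dst_port)];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            c->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

Lookups then walk one short chain instead of the whole per-socket list.
Under a flood the chains still grow, so this would need to be paired
with the shorter timeout to stay useful.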

Avi

In order to build a SYN-resistant BSD kernel, you need to modify one
file in src/sys/os, uipc_socket2.c, and you also need to modify
src/sys/netinet/tcp_timer.h and you have to rebuild tcp_usrreq.c and
tcp_input.c in the netinet directory.

For those of you running Solaris 2.5, this can be done using ndd. The man
page and the "ndd /dev/tcp \?" command will get you started. You will have
to tweak the following variables "tcp_conn_req_max" and
"tcp_conn_grace_period". This will have roughly the same effects as Avi's
patches.

From the bottom level up, change TCPTV_KEEP_INIT from 75*PR_SLOWHZ
to 7*PR_SLOWHZ (or whatever # you want). This timeout (the 75) is
the number of seconds that the kernel will keep un-established TCP
PCB/sockets around for... When the SYN is received, it is acknowledged
and the PCB && socket are set up for the embryonic session; the goal
is to rip those things out of any queues they're in more aggressively.

On web servers, remote users routinely take longer than 7 seconds to set
up connections. Anything less than 15-20 seconds and you will start losing
hits from those ISPs that Metcalfe seems to frequent. This isn't a
criticism of Avi's patch. It's just something to be aware of.

On a Sparc 1+ w/ 4.1.4, I could sustain a 200-400 SYN-packet/sec attack
and still remain functional (and quick for a 1+), but the machine didn't
normally run web servers... Even when I nailed it with 1000 SYNs/sec,
the machine continued functioning but I couldn't connect to the socket
being nailed. A second after stopping the heavier attack, I could.

I have no idea what this will do for performance on Solaris 2.5 machines.

-chris

PS Does anyone have a good source of info on the Solaris implementation
for those of us not lucky enough to have source licenses?