RE: Let's talk about Distance Sniffing/Remote Visibility

Gironda_Andre · March 28, 2002, 11:14pm

> I'm imagining that even with a relatively speedy box, if you were
> trying to do analysis from multiple interfaces you'd at least choke
> the disk I/O. There's always stringent filters, I guess.

Disk I/O on a sniffer box? Sounds like you've been sniffing something
other than packets my friend.

Why do you say that? In the 10/100 range, yes, no problems. But at
the Gigabit range (say with two GbE cards or a single OC-48 card) on
an x86 box with IDE disks (or even SCSI RAID0), doesn't disk I/O
become a severe problem? Under Solaris or Linux, scaling disk seems
relatively easy with Veritas Foundation Suite on Solaris or GFS under
Linux.

However, I don't think Linux or Solaris can handle the packet capture
capabilities like FreeBSD and BPF can. I've heard things about the
new LPF capabilities and turbopacket, but it's just hard to believe
coming from such a joke/toy operating system.

Whether you are passively tapping a gigabit ethernet or SONET fiber,
or even spanning an entire VLAN or mirroring a gigabit ethernet or
SONET port on a router/switch -- you've got a lot of packets/frames
to deal with, especially if you want to keep all of them for analyzing
later. Sounds like a disk I/O problem to me. Are you doing packet
capture at these rates without running into disk problems? How did you
deal with that? We are doing so right now, but only with the IP
headers and some "top N" information. Getting full packets and
keeping them for a while (a day or two even) is going to take a lot
of I/O and disk space. I don't think it's worth it, really.

You can build your own box like that easily enough. If you're going
for FastE sniffing I highly recommend the Adaptec Quartet 4-port
cards. If you're going for GigE sniffing, I STILL highly recommend
anything Alteon Tigon 2 based (NetGear GA620's were the cheapest
if you can still find them, not the 621/622).

You can accomplish almost anything with the Tigon2's and FreeBSD, agreed.

Another vendor I'm sort of looking at now is Endace (DAG cards):

http://www.endace.com/products/dag42ge.html

This only does IP headers, but that's the fun stuff anyways ;>

You don't even have to do anything fancy with the card firmware,
there is a native command for receiving only part of the frame.
Check out the programming manuals at
Index of /~wpaul/Alteon/, and I recommend you use
FreeBSD for this of course. Just add in a PARTIAL_RX_CNT command,
and the card will only DMA part of the packet (say 64 bytes for
full headers) across the PCI bus. Combined with interrupt coalescing
(or luigi's device polling and tuning the card to allocate all
memory to RX and remove the TX functionality completely), you can
sniff quite a few "gigabits" of traffic on a single cheap PC server.
You can dump it through the BPF mechanism and still maintain support
for all your favorite sniffer programs. Or if you're comfortable
writing kernel code, I recommend you make a character device for
sniffer device control, and use it to pass page-aligned malloc'd
memory pointers from userland into the nic driver, which you then
pass to the card as the RX ring buffers. This will let you DMA your
packets directly into userland. If not, at least unhook ether_input().

Can you post more details or catch up with me offline about this? I'm
really very interested in your implementations and results.

Or you can buy these things commercially. My favorite was from a
company called Tekelec, who sold a VERY expensive box which turned
out to be a pentium 200ish box running solaris x86 and completely
useless sniffing software, with a bunch of ISA ethernet cards hooked
up by proprietary (and VERY expensive) cables, all in a box made
out of what I swear was some kind of lead/neutron star material
alloy. Of course that was a couple years ago, maybe they've upgraded
to the current market's $50 processor.

We've been looking at NetVCR from Niksun which sounds similar except
that it actually is FreeBSD-based. Somebody needs to put together a
list of all these companies and do some comparisons of the product
offerings. Like you, I'd rather just build my own box and run with it ;>

-dre

Richard_A_Steenbegen · March 29, 2002, 12:21am

> Why do you say that? In the 10/100 range, yes, no problems. But at
> the Gigabit range (say with two GbE cards or a single OC-48 card) on
> an x86 box with IDE disks (or even SCSI RAID0), doesn't disk I/O
> become a severe problem? Under Solaris or Linux, scaling disk seems
> relatively easy with Veritas Foundation Suite on Solaris or GFS under
> Linux.

Capturing packets for realtime analysis is an attainable goal using cheap
off the shelf hardware and a little bit of clue. Storing many Gbps of data
on a hard drive is a much harder task. Even using 160Gig drives, 1Gbps fills
one in about 20 minutes (10 if you're recording full duplex). Unless
you're the FBI, I really don't think you want to store that much data for
any reason. Be smart in what you write to disk, and how you write it.
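
To make the arithmetic above concrete, here is a minimal sketch in C. The function names are made up for illustration, and it assumes decimal units (1 GB = 10^9 bytes, 1 Gbps = 10^9 bits/s) and that every captured bit lands on disk with no extra headers or compression.

```c
#include <assert.h>
#include <stdio.h>

/* Seconds needed to fill a disk of disk_gb gigabytes (10^9 bytes) when
 * capturing at rate_gbps gigabits per second (10^9 bits/s).
 * Assumption: all captured data is written to disk unmodified. */
double fill_seconds(double disk_gb, double rate_gbps)
{
    double bytes_per_sec = rate_gbps * 1e9 / 8.0;
    return disk_gb * 1e9 / bytes_per_sec;
}

/* Print the fill time in minutes for a given drive size and line rate. */
void print_fill_time(double disk_gb, double rate_gbps)
{
    printf("%.0f GB at %.0f Gbps: %.1f minutes\n",
           disk_gb, rate_gbps, fill_seconds(disk_gb, rate_gbps) / 60.0);
}
```

Calling print_fill_time(160, 1) gives roughly 21 minutes, and print_fill_time(160, 2) (both directions of a full duplex link) roughly half that, which lines up with the "about 20 minutes (10 if full duplex)" figure.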

> However, I don't think Linux or Solaris can handle the packet capture
> capabilities like FreeBSD and BPF can. I've heard things about the
> new LPF capabilities and turbopacket, but it's just hard to believe
> coming from such a joke/toy operating system.

The data capture mechanism of BPF is pretty simple (the filter language is
what's complex), I doubt even Linux can get it too wrong.

All you need is a buffer in the kernel (FreeBSD defaults to 4096 bytes;
libpcap turns it up to 32768 I believe but doesn't expose the value to the
user, so you should probably turn that up a bit if you want to capture at
high speed). Read data from the NIC, copy it into the buffer (or
preferably have the NIC be responsible for transferring it into the buffer
:P), and increment the offset. Then when someone comes along to read for
more data, copy out the buffer into the userland buffer, use the offset
value to indicate the total length, and reset the offset to 0. If you need
more than 20 lines to do that part, you're probably doing it wrong.
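
The store-and-drain logic described above can be sketched in plain userland C. This is only an illustration of the idea, not the actual BPF code: the names (capbuf, capbuf_store, capbuf_read) are hypothetical, and the real thing also prepends a per-packet header and swaps between two buffers, both left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP_BUFSIZE 32768  /* illustrative size, same as the libpcap figure */

struct capbuf {
    unsigned char data[CAP_BUFSIZE];
    size_t off;            /* number of bytes currently stored */
    unsigned long drops;   /* packets discarded because they did not fit */
};

/* Capture side: append one packet and advance the offset,
 * or count a drop if there is not enough room left. */
int capbuf_store(struct capbuf *cb, const void *pkt, size_t len)
{
    if (len > CAP_BUFSIZE - cb->off) {
        cb->drops++;
        return -1;
    }
    memcpy(cb->data + cb->off, pkt, len);
    cb->off += len;
    return 0;
}

/* Reader side: copy everything out, report the total length via the
 * return value, and reset the offset to 0.
 * The caller's buffer must hold at least CAP_BUFSIZE bytes. */
size_t capbuf_read(struct capbuf *cb, void *out)
{
    size_t n = cb->off;
    memcpy(out, cb->data, n);
    cb->off = 0;
    return n;
}
```

A reader would call capbuf_read() in a loop, much like read() on a BPF descriptor, and walk the returned bytes packet by packet.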

In normal use of bpf the data is copied 3 times, from the NIC to an mbuf,
from the mbuf to the bpf buffer if there is a configured bpf reader, and
then from the bpf buffer to the user supplied buffer when the user does a
read() on the BPF descriptor. Fortunately multiple packets are buffered
into a single copy in stages 2 and 3. If you want to eliminate some of
those copies, you have to make a dedicated reader mechanism. Malloc the
memory in userland so you get a nice page-aligned chunk, allocate the
counters in userland and pass them in via a character device similar to BPF.
You probably want to go with a ring structure, using 2 counters as a
producer and consumer index. The kernel updates the producer index, and
you update the consumer index as you process data. When both values are
equal, the ring is empty. When the producer index is 1 below the consumer
index (modulo the ring size), the ring is full. With an intelligent card,
you pass the memory address of your
userland allocated memory as where you want the RX data to be DMA'd. The
kernel updates the producer index, discarding any data which the consumer
can't read. Then you just have your userland program constantly scanning
the ring for new data, put a usleep(1); in there and you'll stay below
0.01% cpu.
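
The producer/consumer ring described above might look roughly like this. It is a single-threaded sketch of the index logic only: the names and the slot count are hypothetical, the slots hold plain integers instead of packet buffers, and a ring really shared between a driver and userland would also need proper memory barriers and the DMA setup, none of which is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8   /* hypothetical size; one slot always stays unused */

struct ring {
    uint32_t prod;              /* advanced by the producer (kernel/NIC side) */
    uint32_t cons;              /* advanced by the consumer (userland reader) */
    uint32_t slot[RING_SLOTS];  /* stand-in for packet descriptors */
    unsigned long dropped;      /* items discarded while the ring was full */
};

/* Empty when both indices are equal. */
int ring_empty(const struct ring *r)
{
    return r->prod == r->cons;
}

/* Full when the producer is one slot behind the consumer, modulo the size. */
int ring_full(const struct ring *r)
{
    return (r->prod + 1) % RING_SLOTS == r->cons;
}

/* Producer: store one item, or discard it if the consumer has fallen behind. */
int ring_put(struct ring *r, uint32_t v)
{
    if (ring_full(r)) {
        r->dropped++;
        return -1;
    }
    r->slot[r->prod] = v;
    r->prod = (r->prod + 1) % RING_SLOTS;
    return 0;
}

/* Consumer: fetch the oldest item if there is one. */
int ring_get(struct ring *r, uint32_t *v)
{
    if (ring_empty(r))
        return -1;
    *v = r->slot[r->cons];
    r->cons = (r->cons + 1) % RING_SLOTS;
    return 0;
}
```

The userland reader would loop on ring_get() and sleep briefly (the usleep(1) mentioned above) whenever it returns -1. Keeping one slot unused is what lets "equal" mean empty and "one below" mean full without a separate count.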

Think there would be a benefit to writing this as an extension to BPF?