vFlow :: IPFIX, sFlow and Netflow collector

Mehrdad_Arshad_Rad · May 15, 2017, 6:31pm

Hi all,

I just wanted to share vFlow, an IPFIX, sFlow and Netflow collector. It's
scalable and reliable, and written in pure Go.
It has no library dependencies and works w/ Kafka and NSQ (you can
write your own MQ plugin).

https://github.com/VerizonDigital/vflow

For more information
https://www.linkedin.com/pulse/high-performance-scalable-reliable-ipfix-sflow-open-arshad-rad

It integrates easily w/ MemSQL, and you can then run SQL queries like the
one below:

Vitaly_Nikolaev · May 16, 2017, 3:04pm

Hello,

Interesting. What receives the flows at the other end of the messaging bus,
and where do you keep them?

PS: in my case I am talking about hundreds of kiloflows/s that I would
like to keep for at least a few weeks, so MemSQL or any other SQLs are out of
the picture.

Thank you

Mehrdad_Arshad_Rad · May 16, 2017, 5:33pm

I tried standalone MemSQL w/ 100K IPFIX samples per second and it works.
If you pay for a MemSQL license you can have more than one node (a cluster).
Another solution is ClickHouse https://clickhouse.yandex/ but I haven't
tested it yet; I'm going to soon.
A nice feature of MemSQL is its built-in Kafka consumer w/ a transform
feature.

Avi_Freedman1 · May 16, 2017, 7:34pm

> Hello,
>
> Interesting, what receives and where do you keep flows at the other end of
> messaging bus ?
>
> PS: in my case I am talking about hundreds of kilo flows/s that I would
> like to keep for at least few weeks, so MemSQL or any other SQLs are out of
> the picture.
>
> Thank you

I've seen a lot of different approaches for people trying to build their
own at that scale (taking off of a bus and storing for medium-long term
analysis), so I'll share some data re: what I've seen (not specific to vFlow).

MemSQL as shown is one option, and is super fast even multi-tenant for the
in-ram row store. They have a to-disk column store as well but it is less
optimized for massively indexed retrieval. Still, it's worth noting that
it's not only an in-RAM solution. And it does batched inserts from row to
column store so can keep up with pretty high ingest rates to diskful column
store.

Another option in the "native" SQL-y space is citusdb, though the high
ingest rate was an issue last I looked, and it didn't have multi-tenancy/
rate-limiting support so any 1 monster query could slow everything down.

Both MemSQL and Citus are commercial, though a lot of Citus functionality
is OSS.

And for just forensics (vs. ad hoc fast querying for operational or BI purposes),
they can be a good augment, but they are well behind on performance vs.
at least one commercial solution, especially for multi-tenant use (reports,
peering analysis, spelunking via portal use, alerting, DDoS detection,
etc. all going on at once).

There are plenty of Hadoop-ecosystem column stores as well that can take
directly from Kafka or with light translation: Presto, Impala, Drill,
and others. Most of them can do multi-column indexing and support SQL
as an interface, but multi-tenancy support is also lacking and if you don't
get indexes right, many kinds of queries can take minutes to hours over months
of data (even from a relatively few routers).

But they can all do multi hundred k FPS from Kafka. You'll also need to
run a Hadoop cluster.

And there are HDFS-topped column store implementations running at pretty large scale.

Spark I've never seen people stick with - it can compute real-time
aggregates with streaming, and if you try to store from RAM to disk, it's
less badly slow than Hadoop for map/reduce patterns, but it's slower than
just about every column store for accessing trillions of records and doing
specific sub-selections to query or dynamically aggregate.

ClickHouse from Yandex is interesting, but for flow, people generally get
hung up on its single column for indexing. It can scan VERY fast, though,
and that still makes it a bit better suited to 100% forensics use cases at
the data scale you're asking about.

The leading DIY option we see for store-all is actually the Elastic stack.

There are still issues with security (everyone who can access the Elastic
backend can access all of the data), and it can require a tremendous #
of machines to keep it fast - easily tens of machines for hundreds of
k FPS over months.

But it's doable and can be pretty fast, if a bit less network-savvy.
There's some support for storing prefixes now, but it still lacks some network
savviness (projecting across AS paths, multi hop lookup for finding ultimate
exit, flexibility in variable prefixlen querying) and you need to frontend with
something like pmacct to do fusion and then build that into an HA architecture
if it's really important.
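
To make "variable prefixlen querying" concrete, here is a minimal Python sketch (not taken from any of the tools above) that rolls the same flow records up at whatever prefix length is chosen at query time. The `(dst_ip, bytes)` record shape and the `aggregate_by_prefix` name are assumptions made up for illustration.

```python
import ipaddress
from collections import Counter


def aggregate_by_prefix(flows, prefixlen):
    """Sum bytes per destination prefix of the requested length.

    `flows` is an iterable of (dst_ip, byte_count) pairs. This record
    shape is hypothetical, not the schema of vFlow or any store above.
    """
    totals = Counter()
    for dst_ip, nbytes in flows:
        # strict=False masks off the host bits: 10.1.2.3/16 -> 10.1.0.0/16
        net = ipaddress.ip_network(f"{dst_ip}/{prefixlen}", strict=False)
        totals[str(net)] += nbytes
    return dict(totals)


flows = [
    ("10.1.2.3", 1500),
    ("10.1.200.9", 500),
    ("10.2.0.1", 700),
    ("192.0.2.10", 100),
]
by_16 = aggregate_by_prefix(flows, 16)  # three /16 buckets
by_8 = aggregate_by_prefix(flows, 8)    # two /8 buckets
```

The point is that the prefix length is a query parameter, not something fixed at ingest; a store without native prefix handling has to emulate this with precomputed columns or full scans.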

But there are a number of DIY setups we've seen that are Elastic-based, more
of them than are Hadoop/SQL-based.

And then, the biggest flow store I know of (1 or 2 carriers may want to argue
but I haven't seen theirs) is at DISA for DoD: more than a decade of un-sampled flow
coming from SiLK. All stored in hourly un-indexed files, essentially nothing
but CLI to access, and cluster-able with work (there is a non-OSS add-on to
do it). But it works and is pretty neat in its own way, which is optimized,
again, around a forensics-only set of queries (vs. operations, BGP, peering,
cost analytics and optimization also).

And it can certainly ingest at more than the scale you're talking about and
is pretty efficient in storing it on disk. And if you ran it on top of a
big MapR-ish NFS cluster (no flames please, though I'm not completely joking)
you can effectively cluster it. Still will be pretty slow for anything but
time-bounded forensic queries.
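
As a rough illustration of why hourly un-indexed files suit time-bounded queries: the query window alone decides which files get opened, with no index involved. A minimal Python sketch, assuming a hypothetical `root/YYYY/MM/DD/flows-YYYYMMDD.HH` layout (the path scheme and the `hourly_files` name are invented here and are not necessarily how SiLK lays out its repository):

```python
from datetime import datetime, timedelta


def hourly_files(start, end, root="/data/flows"):
    """Return the hourly file paths that overlap the window [start, end).

    The directory layout is a made-up example, not SiLK's actual one.
    """
    # Round the start down to the top of its hour.
    hour = start.replace(minute=0, second=0, microsecond=0)
    paths = []
    while hour < end:
        paths.append(f"{root}/{hour:%Y/%m/%d}/flows-{hour:%Y%m%d.%H}")
        hour += timedelta(hours=1)
    return paths


window = hourly_files(
    datetime(2017, 5, 16, 14, 30),
    datetime(2017, 5, 16, 17, 0),
)
# window holds the files for hours 14, 15 and 16 only
```

A query without a time bound has no such shortcut: every file has to be scanned, which is where the slowness comes from.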

And then (separate topic and equally long potential survey) there is a new
wave of streaming databases that can be used, which can consume directly from
Kafka.

If you don't mind having to pre-define queries, or using it to augment a
column store, they can be MUCH more lightweight than any of the above options,
though also lacking in some networking primitives. And if you're running on
sampled flow already, the extra lack of precision might not be an issue (they
pretty much all use probabilistic data structures like HLLs to do count and
topN).
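
For readers who haven't met HLLs: a HyperLogLog estimates the number of distinct items in a stream using a small fixed array of registers instead of remembering every item. Below is a minimal, unoptimized Python sketch of the idea, not the implementation of any particular streaming database; the class name, the choice of SHA-1 as the hash, and the register count are all assumptions for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a SHA-1 digest as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)               # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64-p bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
# 40,000 flow records, but only 20,000 distinct source addresses.
for i in range(40_000):
    j = i % 20_000
    hll.add(f"10.{j // 256}.{j % 256}.1")
estimate = hll.count()  # close to 20,000, not exact
```

With 4096 registers the typical error is on the order of a couple of percent, and repeats of the same address never change the registers, which is why the memory cost stays fixed no matter how many flows arrive.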

And MemSQL can operate in that mode as well though I don't think that was how
Mehrdad was showing it working with vFlow.

But again you can't ever go 'back in time' for an ad hoc query with
them so it's probably more interesting as an augment and offloader for most
uses where you'd normally think of storing many billions or a few trillion flows.
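
For a sense of scale, the arithmetic behind "many billions or a few trillion" is short. A back-of-envelope Python sketch, assuming 300k flows/s as a stand-in for "hundreds of k FPS" (that rate and the retention periods are assumptions, not figures from this thread):

```python
SECONDS_PER_DAY = 86_400


def flows_stored(flows_per_second, days):
    """Total flow records accumulated at a constant rate over `days`."""
    return flows_per_second * SECONDS_PER_DAY * days


three_weeks = flows_stored(300_000, 21)   # 544,320,000,000 (about 0.5 trillion)
six_months = flows_stored(300_000, 180)   # 4,665,600,000,000 (about 4.7 trillion)
```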

Happy flow-ing...

Avi Freedman
CEO, Kentik

Joe_Loiacono · May 16, 2017, 8:08pm


> I've seen a lot of different approaches for people trying to build their
> own at that scale (taking off of a bus and storing for medium-long term
> analysis), so I'll share some data re: what I've seen (not specific to
> vFlow).

Nice analysis of the current state of the art.

> And then, the biggest flow store I know of (1 or 2 carriers may want to argue
> but I haven't seen theirs) is at DISA for DoD - > a decade of un-sampled flow
> coming from SiLK. All stored in hourly un-indexed files, essentially nothing
> but CLI to access,

FlowViewer provides a web GUI for invoking SiLK analysis tools. Provides
textual and graphical analysis with the ability to track filtered subsets
over time. Screenshots, etc.:

Joe

Avi_Freedman1 · May 16, 2017, 8:40pm

Commercial options are a different thread, and I'm conflicted so shouldn't
try to summarize those...

>> And then, the biggest flow store I know of (1 or 2 carriers may want to argue
>> but I haven't seen theirs) is at DISA for DoD - > a decade of un-sampled flow
>> coming from SiLK. All stored in hourly un-indexed files, essentially nothing
>> but CLI to access,
>
> FlowViewer provides a web GUI for invoking SiLK analysis tools. Provides
> textual and graphical analysis with the ability to track filtered subsets
> over time. Screenshots, etc.:
>
> FlowViewer download | SourceForge.net

Sorry, forgot about flowviewer - I've never seen it in use and asked at a bunch
of Flocons - but it looks updated more recently than I had thought.

On a related topic, I'd love to see NANOGers and general netops and perf-minded
people go to Flocon (put on by CERT, and heavily but not exclusively SiLK- and
security-focused).

Cross-pollination of interests, tools, and techniques will help us all...

> Joe

Thanks,

Avi

i_mawsog · May 17, 2017, 3:48pm

A few questions and comments.
1. Is there any good open repository of netflow data?
2. How about an open repository of raw packet captures?
3. There are many companies that help collect raw packets: Gigamon, BigSwitch, ... Do folks on this list have any experience with these vendors?
4. xFlows are apparently the only detailed metric collected on a wider scale. I have even heard that it is often considered a nuisance for the value it provides. What are the experiences of the folks on this list? Where and how is netflow usually collected?
SG