We are looking at replacing our NetFlow collector. I am wondering what other service providers are using to collect NetFlow data off their core and edge routers. Pros/cons, things to watch out for; any info would help.
We are mainly looking to analyze the NetFlow data. Bonus if it does DDoS detection and mitigation.
Been looking at ManageEngine NetFlow Analyzer lately and liking it. We might also be buying some time on Calix flowanalyze, which might be an improved version of Xangati.
We use kentik and we’re very happy. Works great, tons of new features coming along all the time. Going to start looking into ddos detection and mitigation soon.
I personally recommend Kentik.
We mainly got it for DDoS detection, which so far has been 100% reliable for us.
Now we also use it for other traffic analysis.
Query is extremely fast.
Support is also fantastic. If you're looking for a feature that they may not have, just ask...
We do have a minimum for commercial service that's more like $1500/mo, but we are coming out with a free tier in Q1 with lower retention (among other deltas, but including full slice-and-dice flow analytics + BGP, which it sounded like Erik might be looking for).
Feel free to ping me if anyone would like to help us test the free tier in January.
PMACCT (works awesome)
push to InfluxDB (works awesome)
With some custom scripts to add/match interface descriptions. And you can query whatever you want in Grafana.
And Grafana has a nice API for rendering a dashboard graph to a PNG, and you can send this PNG to whatever chat/bot or mail you want.
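As a rough sketch of the first hop in that pipeline, here is how flow aggregates could be turned into InfluxDB line protocol before being written via Influx's HTTP write endpoint. The tag and field names (iface_in, peer_as_dst, etc.) are illustrative assumptions, not pmacct's exact output keys.

```python
# Hypothetical sketch: convert one flow aggregate into an InfluxDB
# line-protocol record (measurement,tag=... field=... timestamp).
# Integer fields carry the "i" suffix per the line-protocol format.

def flow_to_line(measurement, tags, fields, ts_ns):
    """Build a single InfluxDB line-protocol record."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = flow_to_line(
    "flows",
    {"iface_in": "xe-0/0/0", "peer_as_dst": "15169"},  # assumed tag names
    {"bytes": 123456, "packets": 98},
    1500000000000000000,
)
print(line)
# flows,iface_in=xe-0/0/0,peer_as_dst=15169 bytes=123456i,packets=98i 1500000000000000000
```

In practice a batch of such lines would be POSTed to Influx's /write endpoint; the custom interface-description matching mentioned above would happen before the tags are built.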
I would advise against InfluxDB in this case - flow data has a very high (and open-ended) tag cardinality, which is not well suited to Influx (although their recent new index format has improved this).
I’m currently pushing sFlow through pmacct -> Kafka -> ClickHouse (a columnar store) with a SummingMergeTree table engine.
Clickhouse is very fast for queries across columns as well as aggregating down them (e.g. summing number of bytes).
For example, these are the results of a query over nearly a year’s worth of MAC-to-MAC flows (7-tuple), queried for the last 7 days between two given sets of MACs:
2016 rows in set. Elapsed: 0.208 sec. Processed 17.56 million rows, 1.03 GB (84.51 million rows/s., 4.97 GB/s.)
There is also a Grafana datasource plugin for ClickHouse.
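Conceptually, what SummingMergeTree does at merge time is collapse rows sharing the same sorting key and sum their numeric columns, which is why the row counts stay manageable even over a year of flows. A minimal sketch of that behaviour (key and counter column names are illustrative):

```python
from collections import defaultdict

# Conceptual model of ClickHouse's SummingMergeTree merge step:
# rows with the same sorting key (here src_mac, dst_mac) are collapsed
# and their numeric columns (bytes, packets) are summed.

def summing_merge(rows, key_cols=("src_mac", "dst_mac"), sum_cols=("bytes", "packets")):
    acc = defaultdict(lambda: dict.fromkeys(sum_cols, 0))
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        for c in sum_cols:
            acc[key][c] += row[c]
    return dict(acc)

rows = [
    {"src_mac": "aa", "dst_mac": "bb", "bytes": 100, "packets": 1},
    {"src_mac": "aa", "dst_mac": "bb", "bytes": 400, "packets": 3},
    {"src_mac": "cc", "dst_mac": "bb", "bytes": 50, "packets": 1},
]
print(summing_merge(rows)[("aa", "bb")])  # {'bytes': 500, 'packets': 4}
```

ClickHouse does this lazily during background merges rather than on insert, which is why queries typically still wrap the counters in sum() to be safe.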
> I would advise against InfluxDB in this case - flow data has a very high (and open) tag cardinality which is not suited to Influx (although their recently new index format has improved this).
I'm not entirely sure I understand. Does this mean the permutations of tags are high, i.e. the series count is high? If so, isn't this a general problem and advice against all TSDBs? If so, I fully agree: you couldn't/shouldn't make, for example, IP addresses your tags, potentially creating 2**32 * 2 series without any other tags; it's a rather nonsensical proposal in a TSDB.
Influx themselves comment that >10M series is likely infeasible. So you need unique tag combinations to be in the low millions at most.
That absolutely depends on the number of tags you use, how you aggregate, etc.
I am collecting DST AS, SRC AS, and DST AS per IP. And Influx is not even sweating a single drop…
We have 4 Tbps of traffic during peak, and both pmacct and InfluxDB are running very, very smoothly.
(With the mentioned aggregations I can see what a single customer costs across transit, peering and IX, per IP even if needed.)
And DST AS per port/description/ethernet name.
From your mail I gather that you just pushed everything from flows straight into Influx; you have to be a bit smarter with the layout, aggregations and continuous queries.
(Collect what you need.)
With a flow database you want to be able to say: “show me all HTTP traffic from subnet a.b.c.0/24” which requires you to either keep individual IPs or aggregate subnets. Combined with port and protocol data for both source and destination, the series count shoots way above 10M.
That’s a much better cardinality (AS-based), but it’s not the general case. Even if you want per-prefix information, I’d argue that Influx would still not handle the load (~700k^2 cardinality). For limited tag sets it would do the trick.
I never did attempt to push it to Influx, with some foresight that it’d be suboptimal for my ultimate use cases. I wanted a solution that could handle a wide range of use cases without having to worry about limits on tag sets.
I found Clickhouse able to do what I wanted in a performant way.