We are looking at replacing our NetFlow collector. I am wondering what other service providers are using to collect NetFlow data off their core and edge routers. Pros/cons, things to watch out for; any info would help.
We are mainly looking to analyze the NetFlow data. Bonus if it does DDoS detection and mitigation.
Been looking at ManageEngine NetFlow Analyzer lately and liking it. We might also be buying some time on Calix flowanalyze, which might be an improved version of Xangati.
We use kentik and we’re very happy. Works great, tons of new features coming along all the time. Going to start looking into ddos detection and mitigation soon.
I personally recommend Kentik.
We mainly got it for DDoS detection, which so far has been 100% reliable for us.
Now we also use it for other traffic analysis.
Query is extremely fast.
Support is also fantastic. If you're looking for a feature that they may not have, just ask...
We do have a minimum for commercial service that's more like $1500/mo, but we are coming out with a free tier in Q1 with lower retention (among other deltas, but including full slice-and-dice flow analytics + BGP, which it sounded like Erik might be looking for).
Feel free to ping me if anyone would like to help us test the free tier in January.
PMACCT (works awesome)
push to InfluxDB (works awesome)
With some custom scripts to add/match interface descriptions. And you can query whatever you want in Grafana.
And Grafana has a nice API for rendering a dashboard graph to a PNG, and you can send this PNG to whatever chat/bot or mail you want.
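As a rough sketch of the first hop in that pipeline, here is how flow aggregates could be turned into InfluxDB line protocol before being written via Influx's HTTP write endpoint. The tag and field names (iface_in, peer_as_dst, etc.) are illustrative assumptions, not pmacct's exact output keys.

```python
# Hypothetical sketch: convert one flow aggregate into an InfluxDB
# line-protocol record (measurement,tag=... field=... timestamp).
# Integer fields carry the "i" suffix per the line-protocol format.

def flow_to_line(measurement, tags, fields, ts_ns):
    """Build a single InfluxDB line-protocol record."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = flow_to_line(
    "flows",
    {"iface_in": "xe-0/0/0", "peer_as_dst": "15169"},  # assumed tag names
    {"bytes": 123456, "packets": 98},
    1500000000000000000,
)
print(line)
# flows,iface_in=xe-0/0/0,peer_as_dst=15169 bytes=123456i,packets=98i 1500000000000000000
```

In practice a batch of such lines would be POSTed to Influx's /write endpoint; the custom interface-description matching mentioned above would happen before the tags are built.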
I would advise against InfluxDB in this case - flow data has a very high (and open-ended) tag cardinality, which is not well suited to Influx (although their recent new index format has improved this).
I’m currently pushing sFlow through pmacct -> Kafka -> ClickHouse (a columnar store) with a SummingMergeTree table engine.
Clickhouse is very fast for queries across columns as well as aggregating down them (e.g. summing number of bytes).
For example, these are the results of a query over nearly a year’s worth of MAC-to-MAC flows (7-tuple), queried for the last 7 days between two given sets of MACs:
2016 rows in set. Elapsed: 0.208 sec. Processed 17.56 million rows, 1.03 GB (84.51 million rows/s., 4.97 GB/s.)
There is also a Grafana datasource plugin for ClickHouse.
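Conceptually, what SummingMergeTree does at merge time is collapse rows sharing the same sorting key and sum their numeric columns, which is why the row counts stay manageable even over a year of flows. A minimal sketch of that behaviour (key and counter column names are illustrative):

```python
from collections import defaultdict

# Conceptual model of ClickHouse's SummingMergeTree merge step:
# rows with the same sorting key (here src_mac, dst_mac) are collapsed
# and their numeric columns (bytes, packets) are summed.

def summing_merge(rows, key_cols=("src_mac", "dst_mac"), sum_cols=("bytes", "packets")):
    acc = defaultdict(lambda: dict.fromkeys(sum_cols, 0))
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        for c in sum_cols:
            acc[key][c] += row[c]
    return dict(acc)

rows = [
    {"src_mac": "aa", "dst_mac": "bb", "bytes": 100, "packets": 1},
    {"src_mac": "aa", "dst_mac": "bb", "bytes": 400, "packets": 3},
    {"src_mac": "cc", "dst_mac": "bb", "bytes": 50, "packets": 1},
]
print(summing_merge(rows)[("aa", "bb")])  # {'bytes': 500, 'packets': 4}
```

ClickHouse does this lazily during background merges rather than on insert, which is why queries typically still wrap the counters in sum() to be safe.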
> I would advise against InfluxDB in this case - flow data has a very high (and open) tag cardinality which is not suited to Influx (although their recently new index format has improved this).
I'm not entirely sure I understand. Does this mean the permutations of tags are high, i.e. the series count is high? If so, isn't this a general problem and advice against all TSDBs? If so, I fully agree: you couldn't/shouldn't make, for example, IP addresses your tags, potentially creating 2**32 * 2 series without any other tags; it's a rather nonsensical proposal in a TSDB.
Influx themselves comment that >10M series is likely infeasible. So you need unique tag combinations to be in the low millions at most.
That absolutely depends on the number of tags you use, how you aggregate, etc.
I am collecting DST AS, SRC AS, and DST AS per IP. And Influx is not even sweating a single drop…
We have 4 Tbps of traffic during peak, and both pmacct and InfluxDB are running very, very smoothly.
(With the mentioned aggregations I can see what a single customer costs across transit, peering and IX, per IP even if needed.)
And DST AS per port/description/ethernet name.
From your mail I gather that you just pushed everything from flows straight into Influx; you have to be a bit smarter with the layout, aggregations and continuous queries.
(Collect what you need.)
With a flow database you want to be able to say: “show me all HTTP traffic from subnet a.b.c.0/24” which requires you to either keep individual IPs or aggregate subnets. Combined with port and protocol data for both source and destination, the series count shoots way above 10M.
That’s a much better cardinality (AS-based), but it’s not the general case. Even if you want per-prefix information, I’d argue that Influx would still not handle the load (~700k^2 cardinality). For limited tag sets it would do the trick.
I never did attempt to push it to Influx, with some foresight that it’d be suboptimal for my ultimate use cases. I wanted a solution that could handle a wide range of use cases without having to worry about limits on tag sets.
I found Clickhouse able to do what I wanted in a performant way.