Data Mining/Crawling through a Mailing List

Hello,

A bit off topic, but I am looking for a way or tool that could crawl through
NANOG (or other) archives and pick out the most common discussions and
similar patterns. If anyone is aware of such a tool, please let me know.

Thanks,
Kim
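
For what it's worth, here is a minimal sketch of one way to approach the "most common discussions" part. It assumes the archive has already been downloaded as an mbox file (the path you pass in is whatever you saved it as), and it treats messages with the same normalized Subject line as one discussion. The function names are made up for illustration, and grouping by subject is only a rough stand-in for real thread reconstruction.

```python
import mailbox
import re
from collections import Counter

# Matches leading reply/forward markers ("Re:", "Fwd:", "Fw:") and
# bracketed list tags such as "[NANOG]", in any order and any number.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*|\[[^\]]*\]\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Reduce a Subject header to a lowercase key shared by a whole thread."""
    subject = _PREFIX.sub("", str(subject or ""))
    return re.sub(r"\s+", " ", subject).strip().lower()


def top_threads(messages, n=10):
    """Count messages per normalized subject and return the n most common.

    `messages` is any iterable of objects with a dict-style .get(),
    e.g. the messages yielded by mailbox.mbox.
    """
    counts = Counter()
    for msg in messages:
        key = normalize_subject(msg.get("Subject", ""))
        if key:  # skip messages with no usable subject
            counts[key] += 1
    return counts.most_common(n)


def top_threads_in_mbox(path, n=10):
    """Convenience wrapper; `path` is assumed to point at a local mbox file."""
    return top_threads(mailbox.mbox(path), n)
```

Calling top_threads_in_mbox on a saved archive returns (subject, message count) pairs, busiest discussion first.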

That tool will have its work cut out for it... ;)

Dump it all into Hadoop and run a word cloud analysis :3.

Honestly, it sounds like a cool idea, and I'm sure someone has worked on it before, but I don't know of anything off the top of my head.

Cheers,
Joshua

Were you thinking about parsing NANOG and creating a word-based streamgraph
like this?
http://www.benfarahmand.com/2012/12/psl-listserv-streamgraph.html
The author of that streamgraph did provide some additional information on
the steps he took to create it,
but it may be too long (including attachments) to post directly to NANOG.

Tony Patti
CIO
S. Walter Packaging Corp.
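
As a rough illustration of the kind of data a word-based streamgraph needs, the sketch below counts words per month and turns the most frequent ones into one series per word. This is not the method used by the author of the linked streamgraph, whose steps are not reproduced here. The function names, the tiny stopword list, and the input format (pairs of Date header and plain-text body) are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from email.utils import parsedate_to_datetime

# Deliberately tiny, illustrative stopword list; a real run would need more.
STOPWORDS = frozenset(
    "the a an and or of to in is it that this for on with as be are was i you".split()
)


def monthly_word_counts(records):
    """Build {"YYYY-MM": Counter(word -> count)} from (date_header, body) pairs."""
    by_month = defaultdict(Counter)
    for date_header, body in records:
        try:
            month = parsedate_to_datetime(date_header).strftime("%Y-%m")
        except (TypeError, ValueError):
            continue  # skip messages whose Date header cannot be parsed
        words = re.findall(r"[a-z][a-z0-9'-]+", body.lower())
        by_month[month].update(w for w in words if w not in STOPWORDS)
    return by_month


def series_for(by_month, top_n=5):
    """Return (sorted months, {word: [count per month]}) for the top_n words overall."""
    totals = Counter()
    for counts in by_month.values():
        totals.update(counts)
    months = sorted(by_month)
    series = {
        word: [by_month[m][word] for m in months]
        for word, _ in totals.most_common(top_n)
    }
    return months, series
```

The months list gives the x-axis and each word's list of counts gives one layer, which is the shape most stacked-area or streamgraph plotting tools expect.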