O oracle of nanog: unlike, say, a rogue process eating tons of CPU,
it seems to me that network monitoring is essentially a black art for the
average schmuck home network operator (among whom I count myself). That
is: if the "network is slow", it's really hard to tell why that might be and
which of the eleventy-seven devices on my wifi is sucking the life out of my
bandwidth. And even if I get an idea of who the perp is, my remediation
options seem to be "find that device, smash it with a sledgehammer".
It seems that there really ought to be a better way to manage my
home network here. Like, for example, the ability to get stats from the
router and tell it to shape various devices/flows to play nice. Right now,
it seems to me that the state of the art is pretty bad -- static-y kinds of
setups for static-y kinds of flows that people-y kinds of users don't
understand or touch on their home routers.
The ob-nanog is that "my intertoobs r slow" most likely becomes a call to
your support desks, which is expensive, of course. Is anything happening on
this front? Is OpenWrt, for example, paying much attention to this problem?
Someone created an application for U-verse users that logs into the gateway and pulls the relevant information. The information (link retrains, for example) is then color-coded to flag caution and out-of-range values. The application is called UvRealTime -- not something peddled by AT&T to show how "great" your connection is. People unfortunately believe a speed test is a reliable way to measure connection quality. There may be other utilities like this out there that look at signal levels and statistics to tell the user their connection blows. I believe the UvRealTime application can actually show the provider sending resets as a deterrent to using BitTorrent.
It would be nice for such a thing to tell me when my ISP connection is
having trouble too, but I'm mostly interested in understanding the things
that are nominally under my control on my home network. It seems that
most routers have (gratuitous?) apps these days, but given the awfulness
of their web UIs and their configurability, I don't have much hope that
they do what I want.
I've been using per-connection queues on a Mikrotik 450G; this permits
shaping based on the destination/source IP, so no one device can nom
all of the bandwidth on the link unless it's uncontested; should more
than one device want all the bandwidth, they each get half, and so on
(in a typical config). It's not flawless, but it's a massive
improvement over no shaping whatsoever.
The gotcha is that you need to configure your link speed in the router
for it to be aware of the capacity it has to play with, but that's not
something you have to touch very often (though if your connection
speed/upstream capacity varies, there's not a lot that'll help you at
that point). But it does, most of the time, stop the
"X is watching HD YouTube videos and now I can't check my email" sort
of problem. It's a nice set-and-forget solution.
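Roughly, that setup looks like this in RouterOS (a sketch only -- the subnet and the 18M down / 4.5M up rates are illustrative assumptions, not from my actual config; the point is that max-limit sits slightly below the real link speed so the router, not the modem, owns the queue):

```
# Hypothetical RouterOS sketch of per-connection queuing (PCQ).
# PCQ splits the parent limit evenly across active client IPs.
/queue type add name=pcq-down kind=pcq pcq-classifier=dst-address
/queue type add name=pcq-up kind=pcq pcq-classifier=src-address
# Parent simple queue pinned just under the real link speed (up/down).
/queue simple add name=home-shape target=192.168.88.0/24 queue=pcq-up/pcq-down max-limit=4500k/18M
```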
ntop or similar on a Linux boxen, in concert with flows from said
Mikrotik, tends to help more than anything for analysis of usage etc.,
but it's still an inelegant solution to the problem of analyzing
links in this scenario. I'd be interested in what other people are
using for home connection debugging.
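For anyone wanting to replicate the flows part: RouterOS calls NetFlow export "traffic-flow". A rough sketch (the collector address/port are illustrative, and the exact parameter names vary a bit by RouterOS version):

```
# Hypothetical RouterOS sketch: export NetFlow v5 to a collector box.
/ip traffic-flow set enabled=yes interfaces=all
/ip traffic-flow target add dst-address=192.168.88.10 port=2055 version=5
```

On the Linux side, something like nfdump's collector (`nfcapd -p 2055 -l /var/cache/nfdump`) can capture the flows, and `nfdump -R /var/cache/nfdump -s ip/bytes -n 10` will rank the top talkers.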
I've had great luck with Cisco's fair-queue option (and similar
techniques). Using RED, small queues (think on the order of 10-20
packets), and creating a choke point in and out of the network, I've
implemented similar behavior on plenty of DSL lines on the CPE side. My
most successful deployment shared one 7 Mbps line with 120 technical
employees. Before the improved queuing was in place, web pages took 60
seconds or more to load during peak usage. Afterward, people didn't know
they were on a shared DSL line unless they tried streaming video
(fortunately not a business requirement) or a bulk download (which worked
fine, just slowly if several others were going on at the same time). I
suspect I could even have made a VoIP call across the line with a MOS in
the high 3's, easily.
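In MQC terms the recipe is a hierarchical policy: shape to just below the line rate, with fair-queue and WRED in the child. A sketch (the 7 Mbps line rate is from the anecdote above; shaping to ~6.8 Mbps and the interface name are assumptions):

```
! Hypothetical IOS sketch, not my actual config.
! Child policy: per-flow fairness plus WRED on the shaped queue.
policy-map FQ-WRED
 class class-default
  fair-queue
  random-detect
! Parent policy: choke point slightly below the real DSL rate.
policy-map DSL-OUT
 class class-default
  shape average 6800000
  service-policy FQ-WRED
interface FastEthernet0
 service-policy output DSL-OUT
```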
A second issue is poor wireless retransmission and buffering
implementations in consumer wireless. For my home, to make VoIP work with
low-end gear, I had to break most HTTP sessions and switch to a delay-based
congestion control algorithm inside my network - due to the 5+ second
buffers on the wifi gear. That would probably have been enough, but
turning on WMM really took the rest of the pain out of wifi-VoIP.
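I won't claim there's one right delay-based algorithm; on Linux the classic option is TCP Vegas, which backs off on rising RTT instead of waiting for loss, so it refuses to fill those 5-second buffers. A sketch of switching a sending host over (assumes the kernel ships the tcp_vegas module):

```shell
# Hypothetical sketch: use a delay-based congestion control algorithm
# on the sending hosts. Vegas is one such option on Linux; the choice
# of algorithm here is an assumption, not what I actually ran.
modprobe tcp_vegas
sysctl -w net.ipv4.tcp_congestion_control=vegas
sysctl net.ipv4.tcp_congestion_control   # verify the change took
```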
I don't know how to fix the home wifi problems (WMM helps with some
applications, certainly, but it's not a full solution if you still have 5
second buffers in the default traffic class). But for the other problems,
it would be nice if my provider didn't give me huge buffers and no RED on
the output queue (I have no idea if they are doing the best they can with
the gear they have or not, so there may not be any option here). But, even
without that, home routers can do better than they do now. My router knows
what speed it's connected at. It can create an internal bottleneck
slightly slower, prioritize small packets, implement RED, and use
reasonably-sized buffers (fast downloads should not increase ping times by
hundreds of ms). I shouldn't need to hang a Linux box between it and my
home network.
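If you do end up hanging a Linux box in the path, the internal-bottleneck-plus-RED idea is a few lines of tc. A sketch (interface name and the assumed 20 Mbit/s downlink are illustrative; shape to ~95% of the real rate so the queue builds where you control it):

```shell
# Hypothetical Linux tc sketch: an artificial bottleneck just below
# the real link speed, with RED keeping the standing queue short.
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 19mbit ceil 19mbit
tc qdisc add dev eth0 parent 1:10 handle 10: red \
    limit 60000 min 15000 max 45000 avpkt 1000 burst 25 \
    bandwidth 19mbit probability 0.02
```

A prio or SFQ layer on top would handle the "prioritize small packets" part; the sketch above only fixes the buffer problem.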
Large buffers have broken the average home internet. I can't tell you how
many people are astonished when I say "one of your family members
downloading a huge Microsoft ISO image (via TCP or another congestion-aware
algorithm) shouldn't even be noticed by another family member doing web
browsing. If it is noticed, the network is broken. Even if it's at the end
of a slow DSL line."
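The arithmetic behind the astonishment is easy to check: queuing delay is just buffer size divided by link rate. With a (not unusual) 256 KB buffer ahead of a 1 Mbit/s DSL uplink, a single bulk upload adds two seconds to everyone's ping:

```python
# Back-of-the-envelope: how much latency does a full buffer add?
def queue_delay_ms(buffer_bytes: int, link_bits_per_sec: float) -> float:
    """Time to drain a full buffer through the link, in milliseconds."""
    return buffer_bytes * 8 / link_bits_per_sec * 1000

# A 256 KB buffer ahead of a 1 Mbit/s uplink (illustrative numbers):
print(round(queue_delay_ms(256 * 1024, 1_000_000)))  # 2097 -- about 2 s
```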
For Mikrotik routers, use the Winbox application and the Torch function on the interface. You can set it to show flows by various criteria such as source IP. That will tell you which client is chewing up the bandwidth at any instant.
Another way to go, which I have not tried with Mikrotik, is the SolarWinds NetFlow analyzer. It tracks 60 minutes of data.
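For the terminal-inclined, Torch is also reachable from the RouterOS CLI (interface name is illustrative); it shows live per-flow rates on the interface:

```
# Hypothetical sketch: live top-talkers on ether1, grouped by source IP.
/tool torch ether1 src-address=0.0.0.0/0 port=any
```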
This is true only up to a point: if you have 5 people streaming movies
on a broadband connection sized for 2, you're going to have problems
regardless of the queuing discipline. That said, it's pretty awful that in
this day and age router vendors can't be bothered to set the default Linux
kernel queuing parameters to something reasonable.
In any case, my point was really about wanting to deal with what happens
when your ISP bandwidth is saturated: being able to track it down and/or
kill off the offenders. I haven't bought a router in the last year or two as
"apps" have become de rigueur, but it sure seems like it would be nice to
be able to do that. I'm pretty sure that I still can't (me being a dumb
consumer, not a net geek jockey), but I would like to hear otherwise.
This was in 2005 so things may have changed/progressed.
It wasn't hard to give out some static DHCP leases, look at graphs
to see who the bandwidth piggies were, and then set some throttling.
Housemates who weren't kicking in any money for the DSL line were running
p2p sharing apps... Not good for latency-sensitive gaming!