Has anyone else observed Netflix sessions attempting to come into customer CPE devices at well in excess of the customer's throttled plan?
I'm not talking about error retries on the line. I'm talking like two to three times in excess of what the customer's CPE device can handle.
I'm observing massive buffer overruns in some of our switches that appear to be directly related to this, and I can't figure out what possible good purpose Netflix would have for attempting to do this.
The second part. Fixed wireless is not even on their radar.
So they are trying to stuff every last bit as an end device modulates up and down?
Or are you saying that's how they determine if they can scale up the resolution, "because there is more throughput available now"?
Adaptive bandwidth detection.
Pardon my ignorance of WISP-specific bits here, but how are they supposed to know to back off on their bitrate ramp-up if you keep buffering rather than dropping packets when the TX rate exceeds the customer's service rate? Or what am I missing?
I'm not buffering; switches have packet buffers. I'm seeing switch buffers getting overrun by what appears to be Netflix traffic coming in at rates faster than the subscriber's throttled speeds.
How big are your buffers (preferably the answer would be in milliseconds)? What access speeds are you providing?
It's standard behavior for network traffic to sometimes arrive at higher speeds than the customer access speed. The sender only knows about congestion if there is an increase in RTT or if there is packet loss (or, in the case of ECN, a CE mark), and the only way to find out is to probe (= run faster than the customer access speed).
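To put numbers on the buffer-size question, here's a rough Python sketch of the bytes-to-milliseconds conversion; the 256 KB figure and the rates are purely illustrative, not measurements from anyone's gear:

# Rough buffer-depth arithmetic: how many milliseconds of queueing a
# per-port buffer represents at a given drain (access) rate.
# The byte counts and rates below are illustrative assumptions.

def buffer_ms(buffer_bytes: int, drain_rate_bps: int) -> float:
    """Milliseconds it takes to drain buffer_bytes at drain_rate_bps."""
    return buffer_bytes * 8 / drain_rate_bps * 1000

# e.g. 256 KB of buffer draining at a 4 Mbps subscriber rate
print(buffer_ms(256 * 1024, 4_000_000))       # ~524 ms of queueing
# the same 256 KB draining at a 1 Gbps uplink rate
print(buffer_ms(256 * 1024, 1_000_000_000))   # ~2 ms of queueing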
I believe others have observed a similar situation with at least one other CDN and the situation continued solid for hours, not just occasional capacity detection.
Yes, ABR video attempts to fill the entire channel.
This has been problematic as peak edge speeds have increased and pushed the statistical multiplexing logic/plans.
There are also bufferbloat issues that exacerbate the problem by allowing elephant flows to be too greedy at the expense of others on the access segment.
Netflix is streaming video. It will try to do the best data rate it can. If the connection can handle 4 megs a second, it is going to try to do 4 megs a second. If the network can't handle it, then Netflix will back off and adapt to try and fit.
Keep in mind, at least last I knew, a full HD stream was somewhere around 5 megs a sec. If the customer has a 4 meg plan, it will try to fill up that 4 megs unless the algorithm backs off and steps it down. ISPs who run into this on lower packages need to implement QoS at the customer level to deal with streaming. This can be done several ways. This is one reason an endpoint the ISP controls is a huge asset, especially if it does QoS.
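For illustration, here's a minimal Python sketch of the generic "pick the highest rung that fits" selection being described; the bitrate ladder and safety margin are assumptions for the example, not Netflix's actual algorithm:

# Minimal adaptive-bitrate (ABR) selection sketch. Not Netflix's actual
# logic, just the generic "pick the top rung that fits" behaviour the
# thread is describing. Ladder values and safety margin are assumptions.

BITRATE_LADDER_KBPS = [235, 560, 1050, 1750, 3000, 4300, 5800]  # hypothetical
SAFETY_MARGIN = 0.85  # keep ~15% headroom below measured throughput

def pick_bitrate(measured_throughput_kbps: float) -> int:
    """Return the highest ladder rung that fits under throughput * margin."""
    budget = measured_throughput_kbps * SAFETY_MARGIN
    usable = [b for b in BITRATE_LADDER_KBPS if b <= budget]
    return usable[-1] if usable else BITRATE_LADDER_KBPS[0]

# On a 4 Mbps plan the client settles around the 3000 kbps rung, but only
# after probing above the plan rate to see whether more was available.
print(pick_bitrate(4000))   # 3000
print(pick_bitrate(1200))   # 560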
As I understand it, the problem being discussed is an oscillation that is created when the reaction occurs faster than the feedback, resulting in a series of dynamically increasing overcompensations.
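A toy Python sketch of that effect: the sender corrects toward the shaped link rate using an error measurement that is one feedback interval old, with an aggressive gain. Every number is invented just to show the overshoot/undershoot pattern; this is not a TCP model:

# Toy illustration of reacting faster than the feedback arrives.
# The sender adjusts its rate using last interval's error, so each
# correction lands late and overshoots the shaped rate.

LINK_KBPS = 4000                  # subscriber's shaped rate (assumed)
GAIN = 1.2                        # how hard the sender corrects per interval

rate = 6000.0                     # sender arrives well above the shaped rate
stale_error = rate - LINK_KBPS    # feedback the sender sees is one step behind

for step in range(8):
    print(f"step {step}: send rate {rate:6.0f} kbps (link is {LINK_KBPS})")
    current_error = rate - LINK_KBPS   # what the network sees right now
    rate -= GAIN * stale_error         # but the sender reacts to old news
    stale_error = current_error        # ...which only arrives next interval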
I haven't done packet dumps to verify the behavior (too busy catching up on holiday email), but I can't help but wonder if IW10 (on by default in FreeBSD 10, which I believe might be what Netflix has underneath) is causing this problem, and whether a more gentle CWND ramp-up (or otherwise tweaking the slow-start behavior) for prefixes that are known to be in networks with weak hardware might be a good choice.
Of course this would be a change on Netflix's end... as for things the ISP could do to alleviate the problem, the answer is always "sure, but it'll cost ya".
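To make the IW10 concern concrete, a back-of-the-envelope Python sketch of bytes on the wire per RTT during textbook slow start, comparing an initial window of 10 segments with the older 3-segment default; it assumes 1448-byte payloads and no loss, so treat it as illustration only:

# Slow-start burst sizes per RTT for IW10 vs the older 3-segment default.
# Assumes 1448-byte payloads and textbook doubling with no loss; real
# stacks differ, so this is only a rough comparison.

MSS = 1448  # bytes of payload per segment (typical with TCP timestamps)

def slow_start_bytes(initial_window_segments: int, rtts: int):
    """Bytes sent in each of the first `rtts` round trips during slow start."""
    cwnd = initial_window_segments
    per_rtt = []
    for _ in range(rtts):
        per_rtt.append(cwnd * MSS)
        cwnd *= 2                  # classic slow start: double each RTT
    return per_rtt

for iw in (3, 10):
    print(f"IW{iw}:", [f"{b / 1000:.0f} kB" for b in slow_start_bytes(iw, 4)])
# IW3:  roughly 4, 9, 17, 35 kB per RTT
# IW10: roughly 14, 29, 58, 116 kB per RTT, a much bigger burst into a small buffer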
The most obvious things would be to make feedback faster… implement congestion controls further upstream with reduced buffering throughout the network, selective-drop technologies like WRED, etc.
As RS said, sure, but all come at a cost either in performance, equipment, support, or some combination thereof.
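As one concrete example of the selective-drop idea, here's a Python sketch of the classic RED drop-probability curve (WRED applies a curve like this per traffic class); the thresholds and max probability are illustrative, not tuned recommendations for any platform:

# Sketch of the basic RED drop-probability curve. Thresholds and MAX_P
# below are illustrative assumptions, not tuned values.

MIN_TH = 20    # avg queue depth (packets) below which nothing is dropped
MAX_TH = 60    # avg queue depth at which the curve hands off to tail drop
MAX_P = 0.10   # drop probability as the average queue approaches MAX_TH

def red_drop_probability(avg_queue_depth: float) -> float:
    """Probability of dropping/marking an arriving packet under basic RED."""
    if avg_queue_depth < MIN_TH:
        return 0.0
    if avg_queue_depth >= MAX_TH:
        return 1.0   # beyond the max threshold, drop everything (tail-drop region)
    # Linear ramp from 0 at MIN_TH up to MAX_P at MAX_TH
    return MAX_P * (avg_queue_depth - MIN_TH) / (MAX_TH - MIN_TH)

for depth in (10, 30, 50, 70):
    print(f"avg depth {depth:2d} pkts -> drop prob {red_drop_probability(depth):.3f}")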