Netflix stuffing data on pipe

Has anyone else observed Netflix sessions attempting to come into customer CPE devices at rates well in excess of the customer's throttled plan?

I'm not talking about error retries on the line. I'm talking two to three times what the customer's CPE device can handle.

I'm observing massive buffer overruns in some of our switches that appear to be directly related to this. And I can't figure out what possible good purpose Netflix would have for attempting to do this.

Curious if anyone else has seen it?

Adaptive bandwidth detection.

So they are trying to stuff every last bit as an end device modulates up and down?

Or are you saying that's how they determine if they can scale up the resolution, "because there is more throughput available now"?

The second part. Fixed wireless is not even on their radar.

Pardon my ignorance of WISP-specific bits here, but how are they supposed to know to back off on their bitrate ramp-up if you keep buffering rather than dropping packets when the TX rate exceeds the customer's service rate? Or what am I missing?

It's a long and ugly story...

1Gbps FD feeds -> switch -> 100Mbps FD radio port -> fluctuating-PHY-rate half-duplex wireless link/CPE (shaped here).

Netflix is microbursting, and it's really nasty on this kind of network, especially with the shaping being toward the end of his network.

I'm not buffering deliberately; switches have packet buffers. I'm seeing switch buffers getting overrun by what appears to be Netflix traffic coming in at rates faster than the subscriber's throttled speed.

By what mechanism is the throttling accomplished? QoS on routers, or some kind of middlebox, or . . . ?

How big are your buffers (preferably answered in milliseconds)? What access speeds are you providing?

It's standard behavior for network traffic to sometimes run at higher speeds than the customer's access speed. The sender only learns about congestion from an increase in RTT, from packet loss, or (with ECN) from a CE mark, so the only way to find out what's available is to probe, i.e. run faster than the customer's access speed.
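To put the buffer question in concrete terms, here's a quick back-of-envelope helper. The 128 KB figure is just a hypothetical small switch buffer, not anyone's actual hardware:

```python
# Back-of-envelope: how many milliseconds of traffic a switch buffer
# represents at a given access speed. The 128 KB buffer is hypothetical.

def buffer_ms(buffer_bytes: int, access_mbps: float) -> float:
    """Milliseconds of queue a buffer can absorb at the access rate."""
    return buffer_bytes * 8 / (access_mbps * 1_000_000) * 1000

print(round(buffer_ms(128 * 1024, 4.0), 1))    # at a 4 Mbps plan rate
print(round(buffer_ms(128 * 1024, 100.0), 1))  # at a 100 Mbps radio port
```

The same buffer that is a comfortable ~10 ms of queue at a 100 Mbps port is over a quarter second of queue at a 4 Mbps plan rate, which is why buffer depth is better discussed in milliseconds than in bytes.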

I believe others have observed a similar situation with at least one other CDN, and it persisted for hours, not just as occasional capacity detection.

Yes, ABR video attempts to fill the entire channel.

This has been problematic as peak edge speeds have increased and pushed the statistical multiplexing logic / plans.

There are also bufferbloat issues that exacerbate the problem by allowing elephant flows to be too greedy at the expense of others on the access segment.

It is actually buffer-based, as it picks the video rate as a function of
the current buffer occupancy.

See here http://yuba.stanford.edu/~nickm/papers/sigcomm2014-video.pdf
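For anyone who doesn't want to read the whole paper, the core idea can be sketched in a few lines. This is a heavily simplified illustration of a buffer-based rate map, not Netflix's actual algorithm; the bitrate ladder and thresholds below are made up:

```python
# Minimal sketch of a buffer-based rate map in the spirit of the linked
# paper: the video rate is a pure function of current buffer occupancy.
# The bitrate ladder and thresholds here are hypothetical.

RATES_KBPS = [235, 375, 750, 1750, 3000, 4300, 5800]  # example ladder
RESERVOIR_S = 10.0  # below this buffer level, always pick the minimum rate
CUSHION_S = 50.0    # above reservoir + cushion, always pick the maximum

def pick_rate(buffer_s: float) -> int:
    """Map buffer occupancy (seconds of video) to a bitrate (kbps)."""
    if buffer_s <= RESERVOIR_S:
        return RATES_KBPS[0]
    if buffer_s >= RESERVOIR_S + CUSHION_S:
        return RATES_KBPS[-1]
    # Linear interpolation across the ladder inside the cushion region.
    frac = (buffer_s - RESERVOIR_S) / CUSHION_S
    return RATES_KBPS[int(frac * (len(RATES_KBPS) - 1))]
```

The point is that a deep buffer pushes the client toward the top of the ladder regardless of what the instantaneous access rate is, which is consistent with seeing sustained high-rate bursts rather than a neat throttled flow.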

Netflix is streaming video. It will try to do the best data rate it can. If the connection can handle 4 megs a second it is going to try and do 4 megs a second. If the network can’t handle it then Netflix will back off and adapt to try and fit.

Keep in mind, at least last I knew, a full HD stream was somewhere around 5 megs a sec. If the customer has a 4 meg plan it will try and fill up that 4 megs unless the algorithm backs off and steps it down. ISPs who run into this on lower packages need to implement QoS at the customer level to deal with streaming. This can be done several ways. This is one reason an endpoint the ISP controls is a huge asset, especially if it does QoS.

Justin Wilson
j2sw@mtin.net
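For what it's worth, the per-customer QoS described above usually boils down to some form of token-bucket policing or shaping at the subscriber's plan rate. A toy model, with a made-up 4 Mbps plan and burst allowance:

```python
# Toy token-bucket policer of the kind a per-customer QoS policy might
# use. The plan rate and burst size below are hypothetical.

class TokenBucket:
    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # maximum token (burst) depth
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, pkt_bytes: int, now: float) -> bool:
        """Admit the packet if enough tokens have accrued; else drop it."""
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True
        return False

# A hypothetical 4 Mbps plan with a 15 KB burst allowance:
tb = TokenBucket(rate_bps=4_000_000, burst_bytes=15_000)
```

In practice this lives in the ISP's router or a managed CPE (e.g. tc/HTB-style shapers), but the arithmetic is the same: bursts beyond the bucket depth get dropped or queued, which is exactly the feedback the sender needs.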

As I understand it, the problem being discussed is an oscillation that is created when the reaction occurs faster than the feedback resulting in a series of dynamically increasing overcompensations.

Owen
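Owen's point about reaction outpacing feedback can be shown with a toy control loop. This is purely illustrative (made-up gain, capacity, and delay), not a model of any real TCP or ABR implementation:

```python
# Toy control loop: a sender steers its rate toward link capacity, but
# only sees an observation that is `delay` steps stale. With fresh
# feedback it converges; with stale feedback each correction lands late,
# so the rate overshoots and swings. All numbers are illustrative.

def simulate(delay: int, steps: int = 60, gain: float = 0.8) -> list[float]:
    capacity = 100.0
    rates = [10.0]
    for t in range(steps):
        observed = rates[max(0, t - delay)]  # stale view of the rate
        rates.append(max(0.0, rates[-1] + gain * (capacity - observed)))
    return rates

fresh = simulate(delay=0)  # settles smoothly at capacity
stale = simulate(delay=4)  # overshoots far past capacity, then swings
```

With zero delay the rate converges monotonically to capacity; with a few steps of delay the same gain produces a series of dynamically increasing overcompensations, which is the oscillation being described.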

Very succinctly put, Owen!

I concur.

Is there anything the ISP can do to alleviate this, or is it entirely a 'server-side' issue to resolve?

Pete

I haven't done packet dumps to verify the behavior (too busy catching up on holiday email), but I can't help but wonder if IW10 (on by default in FreeBSD 10, which I believe might be what Netflix has underneath) is causing this problem, and whether a more gentle CWND ramp-up (or otherwise tweaking the slow-start behavior) for prefixes known to be in networks with weak hardware might be a good choice.

Of course this would be a change on Netflix's end... as for things the ISP could do to alleviate the problem the answer is always "sure, but it'll cost ya".

-r
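For concreteness, IW10 (RFC 6928) raises TCP's initial window to roughly 10 segments, so the very first burst goes out before the sender has seen any feedback at all. A rough sketch of no-loss slow-start arithmetic, assuming a 1460-byte MSS (purely illustrative):

```python
# Rough no-loss slow-start arithmetic: the congestion window doubles
# each RTT from its initial value. MSS of 1460 bytes assumed; no loss,
# pacing, or receive-window limits are modeled.

MSS = 1460

def cwnd_after_rtts(initial_segments: int, rtts: int) -> int:
    """Congestion-window size in bytes after `rtts` round trips."""
    return initial_segments * MSS * (2 ** rtts)

iw10_first = cwnd_after_rtts(10, 0)  # 14600 bytes in the very first burst
iw3_first = cwnd_after_rtts(3, 0)    # 4380 bytes with the older IW3
```

At a 100 ms RTT, 14600 bytes per RTT is already roughly 1.2 Mbps in flight before any congestion signal, and three round trips later the window is 116800 bytes (about 9.3 Mbps at that RTT), easily enough to overrun a small buffer sitting in front of a slow plan.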

The most obvious things would be to make feedback faster: implement congestion controls further upstream, with reduced buffering throughout the network, selective technologies like WRED, etc.

As RS said, sure, but all come at a cost either in performance, equipment, support, or some
combination thereof.

Owen
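To illustrate the WRED suggestion: RED-style AQM starts dropping probabilistically before the queue is full, giving senders early feedback instead of a sudden burst of tail drops. A minimal sketch of the drop-probability curve, with hypothetical thresholds and max probability:

```python
# Minimal sketch of a (W)RED drop-probability curve: no drops below the
# minimum threshold, linearly rising drop probability up to the maximum
# threshold, then drop everything. Thresholds here are hypothetical;
# WRED applies a curve like this per traffic class.

def red_drop_prob(avg_queue: float, min_th: float, max_th: float,
                  max_p: float = 0.1) -> float:
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

The early, probabilistic drops are what make the feedback arrive sooner: a greedy flow starts seeing losses while the queue is still shallow, rather than after it has already filled the buffer on everyone else's behalf.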