Soooo..... Netflix

Armchair quarterbacking...

Discussions I've seen from operators on Facebook show that some with PNIs were just fine, while others with PNIs and cache boxes didn't fare so well. Some with just cache boxes were fine, while others were not.

What were your educated observations, preferably with supporting data?

Did we have a problem with congestion wherever the cache boxes phone home to, and thus they just fell over?

AWS used to be the data source of last resort. Did anyone notice congestion going from AWS to cache boxes?

-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

Yeah, normally I hold them up as the poster child for a scalable CDN but I’m hoping they release an RCA explaining what happened.

To understand the (historic) event by the numbers:
https://about.netflix.com/en/news/60-million-households-tuned-in-live-for-jake-paul-vs-mike-tyson

Nonstop, fast 502 and 504 errors here on a Mac with Chrome. That points to the edge having enough sockets and the tier just inside that proxy not having enough: some ratio that was expected and then exceeded. IMHO.

Pete

> What were your educated observations, preferably with supporting data?

If it was capacity issues, they learned a hard lesson that you should have other CDNs available to shed traffic over to if yours hits a problem that can’t be quickly solved in real time.

If it was a server/software/livestream technical issue, then /shrug. Fix those. :)

My experience over a home internet fiber connection wasn’t great (like everyone else’s) but my son was watching it over his mobile device without any issues.

> Yeah, normally I hold them up as the poster child for a scalable CDN but I’m hoping they release an RCA explaining what happened.

I guess for them it was the difference between the pre-recorded content they are used to and a live event. I wonder what the latency between live and the stream looked like – a few seconds, or more like 30+ seconds?

JL

> My experience over a home internet fiber connection wasn’t great (like everyone else’s) but my son was watching it over his mobile device without any issues.

That may be an interesting data point – because mobile networks typically rate-shape video streams.

JL

Perhaps, sometimes, less is more.

Perhaps the “greedy” nature of adaptive bit-rate, always trying to bid up the data rate, is counterproductive… at a system level.

Also, how far have we come that 65 MILLION streams were active at the same time and we’re like “omg, so bad!”

5 years ago, never possible.

That being said. I watched it on my iPad with no problems whatsoever. Not even one hiccup. Meanwhile I had X open in a side by side and I saw people complaining about it.

Something that would be interesting to see (particularly if someone has eyes in Comcast’s network) is how customers in areas where L4S trials are happening fared in comparison to others.

That's part of the narrative of what we are seeing in the CDN market, i.e., some not necessarily being prepared for a streaming era.

We have several caches and PNIs.

Call volume was high.

> Something that would be interesting to see (particularly if someone has eyes in Comcast’s network) is how customers in areas where L4S trials are happening fared in comparison to others.

The sample area of the deployment is still too small to draw conclusions from (~20K homes). We’ll know more in a few weeks about how things look in comparison. But in this example, I think the bottleneck was more likely on the server/CDN side of things, so CPE and last-mile AQM and/or dual-queue L4S would probably not have made a difference. But you never know without knowing the full root cause. I have no doubt the Netflix folks will sort it – they’ve got some very smart transport layer and CDN folks.

JL

I have three OCAs: two connected at 100G each, and one on a dual-100G LAG. The operational throughput capacity of the nodes is something less than that; I forget the exact per-node throughput specs, but anyway...

about the 11/15/2024 Tyson/Paul Netflix fights....

from 6 - 7 p.m. central time i saw extreme ramp up on my OCA utilization...reaching an all-time high
- 15g
- 27g
- 50g
= 92g

at 7:31 p.m. i saw what equated to a ~40g dive, total, across all 3 of my oca caches
- 10g
- 17g
- 27g
= 54g
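As a quick sanity check on the two sets of figures above, here is a minimal Python sketch. It only uses the per-cache numbers quoted in this post; the variable and function names are illustrative and do not come from any Netflix or OCA tooling. It sums each set, computes the aggregate drop, and shows the drop per cache as a percentage.

```python
# Per-cache throughput in Gbit/s, copied from the figures quoted above.
peak_gbps = [15, 27, 50]        # all-time high reached between 6 and 7 p.m.
after_drop_gbps = [10, 17, 27]  # levels seen after the 7:31 p.m. drop


def summarize(before, after):
    """Return totals, the aggregate drop, and per-cache drop percentages."""
    total_before = sum(before)
    total_after = sum(after)
    drop = total_before - total_after
    return {
        "total_before": total_before,
        "total_after": total_after,
        "drop": drop,
        "drop_pct": round(100 * drop / total_before, 1),
        "per_cache_drop_pct": [
            round(100 * (b - a) / b, 1) for b, a in zip(before, after)
        ],
    }


summary = summarize(peak_gbps, after_drop_gbps)
print(summary)
```

The totals come out to 92g and 54g, so the aggregate dive is 38g, consistent with the "~40g" estimate. The per-cache percentages suggest the largest cache lost proportionally the most traffic, though three data points are too few to read much into that.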

I never saw the utilization ramp up to the same level again after that. actually the first one did get back to 16g, but the other 2 never ramped up that much again

I was waiting for the main event (Paul/Tyson) to generate an even higher load than originally seen at 7 p.m., but it didn't happen.

The aforementioned ramp-up seen from 6-7 p.m. was a clean scaling graph, as you would expect as more and more eyeballs were "tuning in".... After the sharp drop at 7:31 p.m., the graphs never really cleaned up. They were just down and up.

- 7:31 p.m. - sharp sag/drop

- 7:51 p.m. - sharp sag/drop

- 8:18 p.m. - sharp sag/drop

- 9:04 p.m. - sharp sag/drop

- 9:53 p.m. - ramp up

- 10:08 p.m. - aggressive ramp down

I wonder if the overall nationwide/worldwide issues affected even my local caches. I figured my local caches would have been "protected" or unaffected by issues outside of my network, but I'm not so sure about it

I can say that we didn't get a ton of customer complaints from our 60k resi bb subs. I did hear about some, but I don't think it was many.

I wonder if there were some sort of adaptive rate changes in the streams, altering the overall raw bandwidth utilization I observed and causing the main event not to show up as a higher peak on the graph, or if it was just that Netflix was having issues everywhere. I don't know.
Hopefully Netflix NFL Christmas Day is much better.

Aaron