wow, lots of akamai

That was a lot of traffic coming out of Akamai AANP clusters the last couple nights! What was it?

Aaron

aaron1@gvtc.com

Peace,

HAHA

Yes, we have had a number of big customer traffic events in recent days. Hopefully traffic is flowing well for many networks. If this is negatively impacting you, please reach out.

- Jared

Gaming update... I had a feeling. Thanks for the feedback folks.

Thanks Jared, it's running well, before, during and after. We have a lot of capacity there.

-Aaron

I remember working for a big ISP in Europe offering cable TV + internet with 20M+ subscribers.

Every time there was a huge power outage in major cities, all the TVs would go off at the same time. I don't have stats on power grid stability in Europe vs. North America.

The problem was when the power came back in big cities: all the TV subscribers would come back online at the exact same second or minute, more or less within the same 2 or 3 minutes.

What happened is that it would create a kind of internal DDoS, and they would all time out and give a weird error message. Something very useful like "Error Code 0x8098808. Please call our support line at this phone number."

The server sysadmins would go into a panic because all systems were overloaded. They often needed to do overtime because a DB crashed here, key servers crashed there, whatever... there was always something crashing. This was before the cloud, where you can just push a slider and have tons of VMs or containers absorb the load in real time. (In my dreams.)

This would every time create frustration from the clients, the help desk, the support teams and also the upper management. Every time the teams were really tired after that. It was draining juice.

Anyway, after some years of talking internally (red tape), we finally managed to install a random artificial penalty in the set-top boxes when they boot after a power outage. Nothing like 20 minutes, but just enough to spread the load over a longer period of time. For the end user it was transparent: if the set-top box booted in 206 seconds instead of the super aggressive 34 seconds, well, it booted and they could watch TV.
Vs

My system is totally frozen and it's been like that for 20 minutes with weird messages because all your systems are down and the error message said to call the help desk.

This simple change, 3 lines of code adding a random artificial boot penalty of a few seconds, completely solved the problem. When a city blacked out, we no longer DDoSed ourselves, because the systems would slowly ramp up. The set-top boxes would all reboot but wait a random time before asking for the DRM package to unlock the cable TV service and validate whether billing is right.
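
For anyone who wants to see the effect, here is a minimal sketch of that kind of random boot penalty. It is not the original set-top box code; the function names, the 180-second maximum and the 100,000-box city are all assumptions made up for illustration. It just simulates a fleet powering up at once and counts the busiest second on the backend, with and without the random wait.

```python
import random
from collections import Counter
from typing import Optional


def boot_delay_seconds(max_jitter: float, rng: Optional[random.Random] = None) -> float:
    """Pick a random wait, uniform in [0, max_jitter] seconds, before contacting the backend."""
    if rng is None:
        rng = random.Random()
    return rng.uniform(0.0, max_jitter)


def peak_requests_per_second(num_boxes: int, max_jitter: float, seed: int = 1) -> int:
    """Simulate num_boxes powering up together; return the busiest one-second bucket."""
    rng = random.Random(seed)
    # Each box's request lands in the whole second in which its random wait ends.
    arrivals = Counter(int(boot_delay_seconds(max_jitter, rng)) for _ in range(num_boxes))
    return max(arrivals.values())


if __name__ == "__main__":
    boxes = 100_000  # hypothetical number of boxes in one blacked-out city
    print("no jitter:   ", peak_requests_per_second(boxes, 0.0))
    print("180 s jitter:", peak_requests_per_second(boxes, 180.0))
```

On a real box the equivalent would be a single `time.sleep(boot_delay_seconds(...))` before the DRM request. With no jitter every request lands in the same second; with the wait spread over a few minutes the peak drops to roughly the fleet size divided by the window length.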

I'm no Call of Duty or Akamai expert, but many times I've seen the same question here:

What's happening?
Call of Duty!
Okay.

Would a kind of throttle help here?

An artificial roll out penalty somehow? Probably not at the ISP level, but more at the game level. Well, ISP could also have some mechanisms to reduce the impact or even Akamai could force a progressive roll out.

I'm not sure that the proposed solutions could work, but it seems to impact NANOG frequently and/or at least generate a call overnight or on a weekend. It also seems to happen just before long holidays, when operations are sometimes on reduced personnel.

Are big game rollouts really impacting NANOG? Or is it more a: Hey, I was curious what happened and I thought to ask here on NANOG?

#JustCurious

Jean

iOS 7 seemed to be sent to everyone at once, causing large spikes and saturating many links for smaller ISPs.

I believe after that it moved to more of a phased distribution of sorts, though I could be wrong. Maybe it was that 7 was so vastly different everyone was itching to try it.


* nanog@nanog.org (Jean St-Laurent via NANOG) [Thu 01 Apr 2021, 21:03 CEST]:

An artificial roll out penalty somehow? Probably not at the ISP level, but more at the game level. Well, ISP could also have some mechanisms to reduce the impact or even Akamai could force a progressive roll out.

It's an online game. You can't play the game with outdated assets. You'd not see walls where other players would, for example.

What you're suggesting is the ability of ISPs to market Internet access at a certain speed but not have to deliver it based on conditions they create.

  -- Niels.

It’s actually worse. You can’t even log in to play multiplayer without having the latest version.

There are a couple things going on that all combine together.

  • Competition between CDNs has pushed $/byte numbers down a lot. (Good or bad, depending on which side you’re on. :) )
  • Game developers are under constant pressure to deliver content to users quicker
  • Games are graphically much higher resolution and multi resolution, which means more assets that don’t compress well.

The only real pressure on a developer to shrink their file sizes comes from users running out of disk space on their consoles. Otherwise it’s cheaper to just pay for the content delivery than to hire more developers to reduce the file sizes.

No I didn't suggest that.

There likely is some amount of time between the product being “done” and the activation date. That time could be used (and may very well be for some platforms) to distribute the content ahead of when people need it. If too many points of congestion arise, the above mentioned time would need to be longer.
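
A rough sketch of what that could look like, purely as an illustration and not a description of how any game platform or CDN actually does it: each client gets a stable download slot somewhere inside a pre-load window, and the game only unlocks once the activation time has passed. The function names, the client ID, the one-week window and the dates are all hypothetical.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def preload_start(client_id: str, window_start: datetime, window: timedelta) -> datetime:
    """Map a client to a stable start time spread roughly uniformly across the window."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a fraction between 0 and 1.
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return window_start + window * fraction


def can_play(now: datetime, activation: datetime, has_new_assets: bool) -> bool:
    """The new version unlocks only after activation, and only if already downloaded."""
    return has_new_assets and now >= activation


if __name__ == "__main__":
    window_start = datetime(2021, 3, 25, tzinfo=timezone.utc)  # hypothetical
    window = timedelta(days=7)                                 # hypothetical
    activation = window_start + window
    print(preload_start("console-123", window_start, window))
    print(can_play(activation, activation, has_new_assets=True))
```

Hashing the client ID instead of drawing a fresh random number keeps each client's slot the same across retries, so the load stays spread out even if clients check in repeatedly.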

Of course as an IX operator, I encourage everyone (CDNs and eyeballs) to join IXes and push them bits at maximum speed! ;)

As an eyeball ISP, sometimes the congestion is in the home, creating a poor experience, yet no one above them is to blame.

It’s in fact better, because the one with the new assets can actually play and not be totally frozen at loading and/or suffering lag and being killed in action.

Progressive rollout is the key to happiness.

This would be a good compromise for all.

Slowly deliver the assets a few days/weeks ahead.

Then, on April 1st at this exact same second, you open the gate.

@Mike: bull’s eye!

Jean

* jean@ddostest.me (Jean St-Laurent) [Thu 01 Apr 2021, 21:41 CEST]:

This would be a good compromise for all.
Slowly deliver the assets a few days/weeks ahead.

Excellent compromise except for the people who paid for the game. Why do they need to spend storage to solve your bandwidth problem?

CoD is being played on lots of devices with limited storage space, like PlayStation 4. Needing to have two versions of the game would be a heavy burden on owners. And not everybody has infinite disk space in their gaming PCs either.

  -- Niels.

Niels,

I think to clarify Jean’s point, when you buy a 300 Mbps circuit, you’re paying for 300 Mbps of internet access.

That does not mean that a network should (and in this case small-medium ones simply can’t) build all of their capacity to service a large number of customer circuits at line rate at the same time for an extended period, ESPECIALLY to the exact same endpoint. It’s just not economically reasonable to expect that. Remember we’re talking about residential service here, not enterprise circuits.

Therefore, how do you prevent this spike of [insert large number here] gigabits traversing the network at the same time from causing issues? Build more network? That sounds easy, but there are plenty of legitimate reasons why ISPs can’t or don’t want to do that, particularly for an event that only occurs once per quarter or so.

Does Akamai bear some burden here to make these rollouts less troublesome for the ISPs they traverse through the last mile(s)? IMO yes, yes they do. When you’re doing something new and unprecedented, as Akamai frequently brags about on Twitter, like having rapid, bursty growth of traffic, you need to consider that just because you can generate it, doesn’t mean it can be delivered. They’ve gotta be more sophisticated than a bunch of servers with SSD arrays, ramdisks, and 100 gig interfaces, so there’s no excuse for them here to just blindly fill every link they have after sitting idle for weeks/months at a time and expect everything to come out alright and nobody to complain about it.

Matt:

I am going to disagree with your characterization of how Akamai - and many other CDNs - manage things. First, to be blunt, if you really think Akamai nodes are “sitting idle for weeks” before CoD comes out with a new game, you are clearly confused.

More importantly, I know for a fact Akamai has spent ungodly amounts of money & resources putting content precisely where the ISPs ask them to put it, delivering it over the pipes the ISPs ask them to use, at precisely the capacity the ISPs tell them.

On the other hand, I agree with your characterization of residential broadband. It is ridiculous to expect a neighborhood with 1,000 homes each with 1 Gbps links to have a terabit of uplink capacity. But it also should have a lot more than 10 Gbps, IMHO. Unfortunately, most neighborhoods I have seen are closer to the latter than the former.
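
To put rough numbers on that, here is a small back-of-the-envelope sketch. The 1,000 homes, 1 Gbps links, terabit and 10 Gbps uplinks come from the paragraph above; the function names and the figure of 200 homes downloading at once are assumptions for illustration only.

```python
def oversubscription_ratio(homes: int, access_gbps: float, uplink_gbps: float) -> float:
    """Sum of sold access capacity divided by the shared uplink capacity."""
    return homes * access_gbps / uplink_gbps


def per_home_mbps(uplink_gbps: float, active_homes: int) -> float:
    """Even share of the uplink, in Mbps, when active_homes all download at once."""
    return uplink_gbps * 1000 / active_homes


if __name__ == "__main__":
    print(oversubscription_ratio(1000, 1.0, 1000.0))  # terabit uplink: 1.0
    print(oversubscription_ratio(1000, 1.0, 10.0))    # 10 Gbps uplink: 100.0
    print(per_home_mbps(10.0, 200))                   # 200 homes downloading: 50.0
```

So a 10 Gbps uplink means 100:1 oversubscription, and if a fifth of the neighborhood grabs the same update at the same time, each of those homes gets about 50 Mbps out of its 1 Gbps link.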

Finally, this could quickly devolve into finger pointing. You say the CDNs bear some responsibility? They may well respond that the large broadband providers ask for cash to interconnect - but still require the CDNs to do all the work. The CDNs did not create the content, or tell the users which content to pull. When I pay $NATIONAL_PROVIDER, I expect them to provide me with access to the Internet. Not just to the content that pays that provider.

Personally, I have zero problems with the ISPs saying “give me a cache to put here with this sized uplink” or “please deliver to these users over this xconn / IX / whatever”. I have a huge problem with the ISPs blaming the CDNs for delivering what the ISP’s users request.

Of course, this could all be solved if there were more competition in broadband in the US (and many other countries). But that is a totally different 10,000 post thread (that we have had many dozens of times).

IXes don’t really help if the source doesn’t use them.

Akamai traffic:

  • 17G via local cache
  • 17G via transit
  • 8G via IXes

Plenty of room on IXs for more on our side.
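
As a quick sanity check on those figures, a tiny sketch that turns them into shares of the total. The three numbers are the ones quoted above; the variable and function names are just for illustration.

```python
traffic_gbps = {"local cache": 17, "transit": 17, "IX": 8}


def path_shares(traffic: dict) -> dict:
    """Return each path's share of total traffic as a percentage, rounded to 0.1."""
    total = sum(traffic.values())
    return {path: round(100 * gbps / total, 1) for path, gbps in traffic.items()}


if __name__ == "__main__":
    print("total Gbps:", sum(traffic_gbps.values()))
    for path, pct in path_shares(traffic_gbps).items():
        print(f"{path}: {pct}%")
```

That works out to 42G in total, with about 40.5% each from the local cache and transit and only about 19% over the IXes.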

Does Akamai bear some burden here to make these rollouts less troublesome for the ISPs they traverse through the last mile(s)? IMO yes, yes they do. When you’re doing something new and unprecedented, as Akamai frequently brags about on Twitter, like having rapid, bursty growth of traffic, you need to consider that just because you can generate it, doesn’t mean it can be delivered.

Akamai, and other CDNs, do not generate traffic; they serve the requests generated by users.