FCC: Staff Report on T-Mobile Outage on June 15 2020

FCC Issues Staff Report On T-Mobile Outage

https://www.fcc.gov/document/fcc-issues-staff-report-t-mobile-outage-0

The outage was initially caused by an equipment failure and then exacerbated by a network routing misconfiguration that occurred when T-Mobile introduced a new router into its network. In addition, the outage was magnified by a software flaw in T-Mobile's network that had been latent for months and interfered with customers' ability to initiate or receive voice calls during the outage.

[...]
44. While fiber link failures are common, PSHSB finds that these steps, taken together, will reduce the likelihood that a fiber link failure could result in the recurrence of a similar event in T-Mobile's network because traffic would be routed to an alternative path that could handle it. Moreover, if such an event recurred on T-Mobile's network, it would not cause such a large service disruption because T-Mobile would have improved its networks' ability to manage congestion in the case of a similar event and would have increased network capacity to maintain the network in a working state even with an increased volume of traffic.

Hi,

This part, I find most interesting as well:

However, they were unable to resolve the issue by restoring the link because
the network management tools required to do so remotely relied on the same
paths they had just disabled.

I can't begin to tell you how often I battled senior mgmt to get some investment
into an OOB network. This only proves the point.

Parantap, are you reading this? I know you are.

Thanks,

Sabri
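
To make the shared-fate problem in the quoted passage concrete, here is a minimal sketch, not anything taken from the report. The topology, node names, and the `reachable` helper are all hypothetical. It only illustrates the general idea: if the management station reaches a router solely over the production links, disabling one of those links can also cut off the tools needed to re-enable it, while a separate out-of-band path keeps the router reachable.

```python
from collections import deque


def reachable(links, source, disabled=frozenset()):
    """Return the set of nodes reachable from source over links not in disabled.

    links and disabled are collections of (node, node) pairs, treated as
    undirected. This is a plain breadth-first search, not a routing protocol.
    """
    adjacency = {}
    for a, b in links:
        if (a, b) in disabled or (b, a) in disabled:
            continue  # skip links that have been taken out of service
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical topology: names are illustrative assumptions only.
IN_BAND = [("noc", "core1"), ("core1", "router_a"), ("router_a", "router_b")]
OOB = [("noc", "console_server"), ("console_server", "router_b")]

# An operator disables a production link while troubleshooting.
disabled = {("core1", "router_a")}

in_band_only = reachable(IN_BAND, "noc", disabled)
with_oob = reachable(IN_BAND + OOB, "noc", disabled)

print("router_b reachable in-band only:", "router_b" in in_band_only)
print("router_b reachable with OOB:", "router_b" in with_oob)
```

Running the same check against a real inventory before disabling a link would be one way to catch this kind of dependency, assuming the link data is accurate.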

The larger story here is...

"7. Routing. Routers connect T-Mobile’s LTE towers to T-Mobile’s LTE
network. These routers utilize a routing protocol called Open
Shortest Path First."

Calling Vijay Gill to the courtesy phone.

The larger story here is...

"7. Routing. Routers connect T-Mobile’s LTE towers to T-Mobile’s LTE
network. These routers utilize a routing protocol called Open
Shortest Path First."

you can blow it with is-is, just as you can with ospf, just as you can
with pretty much any dynamic [routing] protocol. though i am an is-is
fanboy, i would not blame the protocol. and if they can not manage the
currently deployed protocol, i am not sure i would recommend they try a
delicate transition.

randy

Absolutely all of these guns pointed at toes can be problematic.
I don't often get a chance to poke fun at vijay though :)

On the bright side that write up by TMO was pretty great as a read...
good details (or more than I expected from telco).