Estimated Time To Repair (was Re: History: lengthy outages)

I deliberately included them for a reason.

Historically, when I look at a lot of network problems, I notice
an interesting coincidence. Outages involving "operator error"
tend to take the longest time to fix, while outages involving
equipment failure tend to take the shortest.

Complete hardware box failure (smoke makes debugging easy): 1 hour
Power failure (utility, generator, etc): 3 hours
External malicious attack (DDoS, etc): 4 hours
Fiber/Cable cut: 5 hours
Electronic DCS failure: 18 hours
Operator error: 1 business day (24-72 hours depending on whether the operator
made the change before leaving on a Friday night or a Tuesday night)
Vendor software error: 2 business days (1 day to "escalate" the problem
through customer channels, 1 day to actually get the fix, can take as
long as 5 days if the problem happens after 3pm on Friday)
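
To make the "1 business day" arithmetic concrete, here is a rough sketch of
how the same one-business-day fix turns into 24 or 72 elapsed hours. The
function names, the 18:00 "leaving for the night" time, and the example
dates are my own assumptions for illustration. It only skips Saturdays and
Sundays, ignores holidays, and does not try to model the vendor "after 3pm
on Friday" case.

```python
from datetime import datetime, timedelta


def add_business_days(start, days):
    """Advance start by whole business days, skipping Saturday and Sunday."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


def elapsed_hours(start, business_days):
    """Wall-clock hours between start and start + N business days."""
    end = add_business_days(start, business_days)
    return (end - start).total_seconds() / 3600


# Hypothetical change times: 18:00 on a Tuesday and on a Friday.
tuesday_night = datetime(2024, 1, 2, 18, 0)
friday_night = datetime(2024, 1, 5, 18, 0)

print(elapsed_hours(tuesday_night, 1))  # 24.0
print(elapsed_hours(friday_night, 1))   # 72.0
```

The Friday case is three times longer purely because of the calendar, not
because the problem is any harder to fix.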

Psychologists study why people have a difficult time recognizing their
own mistakes. It is a very difficult problem. The problem is worse with
"smart" people.