San Francisco Power Outage

Seth wrote:

Jonathan Lassoff wrote:

Just a heads up to anyone on list that PG&E has just sustained a large
outage in San Francisco that has caused a few hiccups (both network,
electrical, infrastructural, etc.) around the city.

I've confirmed that customers in both 365 Main and parts of Telecom 1
have sustained brief blackouts. No word yet from 200 Paul.

If anyone in the area could use a hand with anything, I'll probably
be wrapping up fixes for my stuff soon and would be glad to help
however I can.

I have a question: does anyone seriously accept "oh, power trouble" as a
reason your servers went offline? Where are the generators? The UPS? The
testing of said combination of UPS and generators? What if it was
important? I honestly find it hard to believe anyone runs a facility like
that and that people actually *pay* for it.

If you do accept this is a good reason for failure, why?

Unfortunate real-world lesson: there is a functional difference between
pushing the UPS test cutover button and some of the stuff that can happen
out on the power lines (including rapid voltage swings, harmonics, etc.).
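(Purely to illustrate that difference -- nothing measured at 365 Main, every
number below is assumed -- here's a rough Python sketch contrasting a clean
test transfer with a made-up grid disturbance: a momentary sag plus some
5th/7th harmonic content riding on the fundamental.)

# Illustrative only: contrast an ideal test transfer (clean sine) with a
# hypothetical grid disturbance (a 30% sag plus 5th/7th harmonic content).
import numpy as np

F_NOMINAL = 60.0      # Hz, fundamental frequency
V_NOMINAL = 120.0     # V RMS, nominal single-phase voltage
FS = 7680             # samples per second (128 samples per 60 Hz cycle)
t = np.arange(0, 0.5, 1.0 / FS)   # half a second of waveform

def rms(x):
    return np.sqrt(np.mean(x ** 2))

# Case 1: clean supply, the sort of thing a well-behaved test cutover sees.
clean = V_NOMINAL * np.sqrt(2) * np.sin(2 * np.pi * F_NOMINAL * t)

# Case 2: assumed disturbance -- 30% sag from 200-300 ms, plus harmonics.
sag = np.where((t > 0.2) & (t < 0.3), 0.7, 1.0)
dirty = sag * V_NOMINAL * np.sqrt(2) * (
    np.sin(2 * np.pi * F_NOMINAL * t)
    + 0.08 * np.sin(2 * np.pi * 5 * F_NOMINAL * t)   # 8% 5th harmonic
    + 0.05 * np.sin(2 * np.pi * 7 * F_NOMINAL * t)   # 5% 7th harmonic
)

thd = np.sqrt(0.08 ** 2 + 0.05 ** 2)   # harmonic RMS relative to the fundamental

print(f"clean RMS: {rms(clean):6.1f} V")
print(f"dirty RMS: {rms(dirty):6.1f} V  (the sag drags it down)")
print(f"assumed THD: {thd * 100:.1f} %")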

I know 365 Main has the equipment and tests it; I've stood outside
while the generators spooled up.

I've had generator firmware upgrades generate reporting info on the
serial uplink that flipped the UPSes into a permanent error state
until the Liebert guys got off the plane with the replacement
mainboard. I've had grid voltage fluctuations that toasted VSDs
in chillers. I watched a building's electrical service go "pop"
when a transformer blew and ran 10 kV into the 220 V mains for a
fraction of a second as it arced. I was at home but got called in
after a 5 MW generator popped under a UPS and PDU load of only
about 2.4 MW with sufficiently bad harmonics. I had a client who
forgot to wire the A/C into the UPS and nearly melted a whole
server room.
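(A hedged back-of-the-envelope for that generator story -- these are not the
incident's numbers, and the harmonic fractions below are pure assumptions --
showing how a nonlinear load's harmonic currents inflate RMS current and
I^2*R heating even when the fundamental power stays put. That's the usual
reason generator sets get derated for rectifier-heavy loads.)

import math

I_FUND = 1.0                                # fundamental current, per unit
HARMONICS = {3: 0.45, 5: 0.30, 7: 0.15}     # assumed per-unit harmonic currents

# Total RMS current is the root-sum-square of fundamental plus harmonics.
i_rms = math.sqrt(I_FUND ** 2 + sum(h ** 2 for h in HARMONICS.values()))
heating = i_rms ** 2                        # relative I^2*R losses

print(f"RMS current vs. a clean sine load:  {i_rms:.2f}x")
print(f"Resistive heating vs. a clean load: {heating:.2f}x")
# With these made-up fractions, the same fundamental power produces roughly
# 1.3x the heating -- nameplate MW alone doesn't tell the whole story.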

And the stories that the power guy I'm working with tells about
foreign facilities, particularly in middle east war zones,
are really scary...

We fundamentally do not have the facilities problem completely
nailed down to the point that things will never drop. Level 4
datacenters can, and will, fail. Nothing you can do, including
just doing 48V DC for everything, is a truly foolproof solution.

-george william herbert
gherbert@retro.com

A single level 4 datacenter is a Single Point of Failure!

Two of those middle-eastern style facilities is... ?
Has anyone actually kept track of all these data center failures over
the years and done some statistical analysis on it? Maybe two half-baked
data centers are better than one over the long run?
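(To make that comparison concrete, a quick Python back-of-the-envelope with
assumed availability figures and the big assumption that the two sites fail
independently:)

TIER4_SINGLE = 0.99995   # often-quoted availability target for a Tier IV site
HALF_BAKED = 0.999       # assumed availability of one mediocre site

# Probability that at least one of two independent mediocre sites is up.
two_sites = 1 - (1 - HALF_BAKED) ** 2

MINUTES_PER_YEAR = 8766 * 60
for label, a in [("single Tier IV site", TIER4_SINGLE),
                 ("one mediocre site", HALF_BAKED),
                 ("two independent mediocre sites", two_sites)]:
    print(f"{label:32s} {a:.6f}  ~{(1 - a) * MINUTES_PER_YEAR:.0f} min/yr down")

Of course, independence is exactly what shared grids, shared carriers, and
shared backhoes take away.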

Remember that one 10-12 years ago in (Palo Alto? Mountain View?) where a
lady in a car caused a backhoe driver to move out of the way, he cut a
gas line, and the fire department ended up evacuating the data center,
cutting off electricity in the area, and forbidding the diesel generators
to be switched on?

--Michael Dillon

I know a guy who was at the US Data Centers Inc facility in
Marlborough, MA (before USDCI failed). Soon after they first opened
it up, they had a fire. The problem was that the fire was *in* the giant
APC/Silicon system they had. They had to kill the APC, and that took
the load down too.

  So they installed an external transfer switch, rather than depending
on the one built into the APC system. There was some SNAFU with the
wiring, so right after the install there was an electrical fire --
this time in the external transfer switch panel.

  While I suspect poor planning/testing contributed to their woes, it
still goes to show: Some days you're the windshield, and some days
you're the bug.

-- Ben