Fiber cut in SF area

Joe Greco wrote:

> My point was more the inverse, which is that a determined, equipped,
> and knowledgeable attacker is a very difficult thing to defend against.

"The Untold Story of the World's Biggest Diamond Heist" published
recently in Wired was a good read on that subject:

http://www.wired.com/politics/law/magazine/17-04/ff_diamonds

Thanks, *excellent* example.

> Which brings me to a new point: if we accept that "security by obscurity
> is not security," then, what (practical thing) IS security?

Obscurity as a principle works just fine provided the given token is
obscure enough.

Of course, but I said "if we accept that". It was a challenge for the
previous poster. ;-)

Ideally there are layers of "security by obscurity" so
compromise of any one token isn't enough by itself: my strong ssh
password (1 layer of obscurity) is protected by the ssh server key (2nd
layer) that is only accessible via a VPN which has its own encryption key
(3rd layer). The loss of my password alone doesn't get anyone anything.
The compromise of either the VPN or server ssh key (without already
having direct access to those systems) doesn't get them my password either.
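
To put rough numbers on that intuition (the per-layer odds below are invented
purely for illustration, not anything from this thread), here is a minimal
Python sketch of why independent layers multiply:

```python
# Back-of-the-envelope sketch of layered obscurity: if an attacker must obtain
# every token, and the layers fail independently, the odds of full compromise
# are the product of the per-layer odds. All numbers below are made up.
layers = {
    "ssh password":   0.01,    # guessed or phished on its own
    "ssh server key": 0.005,   # stolen off the host
    "vpn key":        0.005,   # extracted from the VPN concentrator
}

combined = 1.0
for name, probability in layers.items():
    combined *= probability

print(f"Worst single layer falling: {max(layers.values()):.3%}")
print(f"All three layers falling together: {combined:.7%}")
```

Losing any one token still leaves the attacker well short of access, which is
the whole point of the layering.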

I think the problem is that the notion of "security by obscurity isn't
security" was originally meant to convey to software vendors "don't rely
on closed source to hide your bugs" and has since been mistakenly
applied beyond that narrow context. In most of our applications, some
form of obscurity is all we really have.

That's really it, and bringing us back to the fiber discussion, we are
forced, generally, to rely on obscurity. Talk to a hundred people on the
street, and few of them are likely to be able to tell you how
fiber gets from one city to another, or that a single fiber may be
carrying immense amounts of traffic. Most people expect that it just
all works somehow. The fact that it's buried means that it is
sufficiently inaccessible to most people. It will still be vulnerable
to certain risks, including backhoes and anything else that disrupts the
ground (freight derailments, earthquakes, etc.), but those are all more
or less natural hazards that you protect against with redundancy. The
guy who has technical specifics about your fiber network, and who picks
your vulnerable points and hits you with a hacksaw, that's just always
going to be much more complex to defend against.

... JG

One thing that is missing here is that before we can define "security" we
need to define the "threat" and the "obstruction" the security creates.
With an ATM, the threat is that someone comes and steals the machine
for the cash. The majority of assailants in an ATM case are not
interested in the access passwords, so the bank does not view those as a
threat. The bank then says, "If we set really complicated passwords,
our repair guys (or contractors) will not be able to fix them." So
setting hard passwords is an obstruction. This happens every day, in
every IT department in the world.

So let's define the "threat" to the fiber network. We know it isn't
monetary, as there isn't much value in selling cut sections of fiber. That
rules out your typical ATM thief. That leaves us with a directed attack,
revenge, or pure vandalism.

In a directed attack or revenge scenario, which is what this case looks
like, how are manhole locks going to help? If it was the fiber union,
wouldn't they already have the keys anyway? If this was some kind of
terrorism scenario wouldn't they also have the resources to get the
keys, either by getting employed by the phone company or the fiber union
or any one of the other thousand companies that would need those keys?

Manhole locks are only going to stop vandalism, and I think the
threat-to-obstruction calculation just doesn't add up for that small number
of isolated cases.

Here in Qwest territory, manhole locks would be disastrous for repair
times. We have had times when our MOE network has an outage and Qwest
cannot fix the problem because their repair guys don't have the keys to
their own buildings. Seriously. Their own buildings.

Ultimately, what really needs to be addressed is the redundancy problem.
And this needs to be addressed by everyone who was affected, not just
AT&T and Verizon, etc.

A few years ago we had a site go down when a Sprint DS-3 was cut. This
was a major wake-up call for us because we had two T1s for the site and
they were supposed to have path divergence. And they did, up to the Qwest
CO where they handed off the circuits to Sprint. In the end, we built in
workflow redundancies so if any site goes down, we can still operate at
near 100% capacity.

My point is, it is getting harder and harder to guarantee path divergence
and sometimes the redundancies need to be built into the workflow
instead of IT.

But that doesn't mean we cannot try. I remember during Katrina a
datacenter in downtown New Orleans managed to stay online for the
duration of the disaster. These guys were on the ball and it paid off for
them.

In the end, as much as I like to blame the phone companies when we have
problems, I also have to take some level of responsibility. And with
each of these types of incidents we learn. For everyone affected, you
now know that even though you have two carriers, you do not have path
divergence. And for everyone who colos at an affected datacenter and
gets service from that center, you know they don't have
divergence. So we need to ask ourselves, "where do we go from here?"

It will be easier to get more divergence than secure all the manholes in
the country.

Dylan Ebner, Network Engineer
Consulting Radiologists, Ltd.
1221 Nicollet Mall, Minneapolis, MN 55403
ph. 612.573.2236 fax. 612.573.2250
dylan.ebner@crlmed.com
www.consultingradiologists.com

It doesn't stop it, it just makes it slightly harder, and they'll go after another point.

<http://swm.pp.se/bayarea.jpg>

This is the bay area as well... How long do you need to spend with a torch to cut thru that? A couple of minutes?

There is absolutely no way you can stop a determined attacker, and it would increase cost a lot more than it's worth. Time is better spent stopping the few people who actually do these kinds of things, same way as it's not worth it for regular people to wear body armour all the time, just in case they might get shot, or have parachutes and emergency exits that work in mid-flight on commercial airliners. The various police agencies and the NTSB cost less in a cost/benefit analysis.

It all comes down to money... It will cost them lots of it to get power and some type of readers installed to monitor manhole access... There has always been a lack of security on the telco side, this incident just brings it to light... In my town many of the Verizon FiOS boxes are not locked and neither are the wiring frame boxes for POTS lines... It's all a matter of how much cash they wanna throw at it...

IMHO, I think manhole locks would only serve to HEIGHTEN the threat, not minimize it. Flag this under the whole "obscurity" category, but think about this - if you're a vandal itching to do something stupid, and you see a bunch of manhole covers and a couple of them have locks on them, which ones are you going to target? The ones with the locks, of course. Why? Because the very existence of the locks implies there's something of considerable value behind them.

-Andy

Actually, in many ways it's getting easier; now, you can sign an NDA
with your fiber providers and get GIS data for the fiber runs which you can
pop into Google Earth, and verify path separation along the entire run;
you put notification requirements into the contract stipulating that the
fiber provider *must* notify you and provide updated GIS data if the
path must be physically moved, and the move deviates the path by
more than 50 feet from the previous GIS data; and you put escape
clauses into the contract in case the re-routing of the fiber unavoidably
reduces or eliminates your physical run diversity from your other
providers.

In years past, trying to overlay physical map printouts to validate
path separation was a nightmare. Now, standardized GIS data
formats make it a breeze.
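
As a rough illustration of what that verification can look like once you have
both vendors' runs as GIS geometry (the coordinates and threshold below are
invented, and a real check would project into a local metric CRS before
measuring in feet), a short Python/shapely sketch:

```python
# Sketch of a path-separation check between two providers' fiber runs,
# assuming each run has been exported from the vendor's GIS data (KML,
# GeoJSON, shapefile) as a list of (longitude, latitude) vertices.
from shapely.geometry import LineString

provider_a = LineString([(-122.40, 37.78), (-122.10, 37.40), (-121.90, 37.33)])
provider_b = LineString([(-122.42, 37.77), (-122.20, 37.50), (-121.95, 37.35)])

# Minimum separation anywhere along the two runs (in degrees here; project
# to a metric coordinate system for a real answer in feet or meters).
print(f"Minimum separation: {provider_a.distance(provider_b):.5f} degrees")

# Flag the stretches of run B that come within a chosen threshold of run A.
THRESHOLD_DEG = 0.001   # roughly 100 m at these latitudes; pick your own
too_close = provider_a.buffer(THRESHOLD_DEG).intersection(provider_b)
if not too_close.is_empty:
    print("Warning: runs converge below the separation threshold:", too_close)
```

Re-running the same check against each updated GIS drop is also a cheap way to
catch the 50-foot deviation clause being tripped.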

"protected rings" are a technology of the past. Don't count on your
vendor to provide "redundancy" for you. Get two unprotected runs
for half the cost each, from two different providers, and verify the
path separation and diversity yourself with GIS data from the two
providers; handle the failover yourself. That way, you *know* what
your risks and potential impact scenarios are. It adds a bit of
initial planning overhead, but in the long run, it generally costs a
similar amount for two unprotected runs as it does to get a
protected run, and you can plan your survival scenarios *much*
better, including surviving things like one provider going under,
work stoppages at one provider, etc.
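
A hedged sketch of the "handle the failover yourself" part, assuming nothing
more than a probe-able next hop on each unprotected run (the addresses are
documentation examples, and the actual traffic shift depends entirely on your
gear, so it is left as a log line):

```python
# Sketch of a do-it-yourself failover monitor for two unprotected runs.
# The real switchover (BGP local-pref change, static route swap, etc.)
# is router-specific and intentionally not shown here.
import subprocess
import time

LINKS = {
    "provider_a": "192.0.2.1",     # hypothetical next hop on run A
    "provider_b": "198.51.100.1",  # hypothetical next hop on run B
}

def link_up(next_hop: str) -> bool:
    """One ICMP probe; True if the next hop answers within two seconds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", next_hop],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

while True:
    status = {name: link_up(hop) for name, hop in LINKS.items()}
    if status["provider_a"] and not status["provider_b"]:
        print("Run B down -- shift traffic onto provider_a here")
    elif status["provider_b"] and not status["provider_a"]:
        print("Run A down -- shift traffic onto provider_b here")
    elif not any(status.values()):
        print("Both runs down -- so much for path diversity")
    time.sleep(30)
```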

Sometimes a little bit of paranoia can help save your butt...or at
least keep you out of the hot seat.

Matt

I guess the next generation fiber networks will need to be installed with
tunnel boring machines and just not surface anywhere except the endpoints
:-) After all, undersea cables get along just fine without convenient access
along their length...

Or skip the locks and fill the manholes with sand. Then provide the service
folks those big suction trucks to remove the sand for servicing :-)

Boat anchors and earthquakes do a pretty effective job of cutting submarine cables.

jms

I still think skipping the securing of manholes and access points in favor
  of active monitoring with offsite access is a better solution. You can't
  keep people out, especially since these manholes and tunnels are designed
  FOR human access. But a better job can be done of monitoring and knowing
  what is going on in the tunnels and access points from a remote location.

     Cheap: light sensor + cell phone = knowing exactly when and where the
     amount of light in the tunnel changes. Detects unauthorized
     intrusions. Make sure to detect all visible and IR spectrum, should
     someone very determined use night vision and IR lights to disable the
     sensor.

     Mid-Range: Webcam + cell phone = SEEING what is going on plus
     everything above.

     High-end: Webcam + cell phone + wifi or wimax backup both watching the
     entrance and the tunnels.

     James Bond: Lasers.

  Active monitoring of each site makes sure each one is online.

  Pros:
     * Knowing immediately that there is a change in environment in your
       tunnels.
     * Knowing who or at least THAT something is in there
     * Being able to proactively mitigate attempts
     * Availability of Arduino, SIM card adapters, and sophisticated sensor
       and camera equipment at low cost

  Cons:
     * Cell provider outage or spectrum blocker removes live notifications
     * False positives are problematic and can lower monitoring thresholds
     * Initial expense of deployment of monitoring systems

  Farmers use tiny embedded devices on their farms to monitor moisture,
  rain, etc. in multiple locations to customize irrigation and to help avoid
  loss of crops. These devices communicate with each other, eventually
  getting back to a main listening post which relays the information to the
  farmer's computers.

  Tiny, embedded, networked devices that monitor the environment in the
  tunnels that run our fiber to help avoid loss of critical communications
  services seems to be a good idea. Cheap, disposable devices that can
  communicate with each other as well as back to some HQ is a way to at
  least know about problems of access before they happen. No keys to lose,
  no technology keeping people out and causing repair problems.
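
  A small sketch of what the per-node logic could amount to; the sensor read
  and the uplink below are simulated stand-ins, since the real ones depend on
  whatever hardware (Arduino plus SIM adapter, a mesh radio, etc.) gets picked:

```python
# Sketch of a per-node tunnel monitor: learn a light-level baseline, then
# alert HQ when the level jumps (e.g. a manhole cover being lifted). The
# sensor and uplink functions are simulated placeholders, not a real API.
import random
import time

BASELINE_SAMPLES = 60   # learn what "dark tunnel" looks like at startup
ALERT_DELTA = 50        # arbitrary units above baseline that trip an alert

def read_light_level() -> int:
    """Stand-in for a visible+IR sensor read; simulated with a rare spike."""
    return 200 if random.random() < 0.01 else random.randint(0, 5)

def send_alert(message: str) -> None:
    """Stand-in for the cellular/mesh uplink back to a monitoring HQ."""
    print("ALERT:", message)

def monitor() -> None:
    baseline = sum(read_light_level() for _ in range(BASELINE_SAMPLES)) / BASELINE_SAMPLES
    while True:
        level = read_light_level()
        if level > baseline + ALERT_DELTA:
            send_alert(f"light level {level} vs. baseline {baseline:.0f} -- possible entry")
        time.sleep(1)

if __name__ == "__main__":
    monitor()
```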

  Some other things that could detect access problems:
     * Pressure sensors (maybe an open manhole causes a detectable change in
       air pressure in the tunnel)
     * Temperature sensors (placed near access points, detects welding and
       thermite use)
     * Audio monitor (can help determine if an alert is just a rat squealing
       or people talking -- could even be automated to detect certain types of
       noises)
     * IR (heat) motion detection, as long as giant rats/rodents aren't a problem
     * Humidity sensors (sell the data to weatherbug!)

  One last thought inspired by the guy who posted about pouring quick-set
  concrete in to slow repair. Get some heavy-duty bags, about 10 feet long
  and large enough to fill the space in the tunnel. More heavily secure the
  fiber runs directly around the access space, then inflate two bags on
  either side of the access point. Easily deflated, these devices also have
  an electronic device which can notify HQ that they are being deflated or
  the pressure inside is changing (indicating pushing or manipulation).
  That way you only need to put these bags at access points, not throughout
  the whole tunnel.

  Kinda low-tech, but could be effective. No keys needed, could be
  inflated/deflated quickly, and you still get notification back to a
  monitoring point.

Beckman

The only thing missing from your plan was a cost analysis. Cost of each, plus operational costs, * however many of each type. How much would that be?

Then amortize that out to our bills. Extra credit: would you pay for it?

Chris


It will be easier to get more divergence than secure all the
manholes in the country.

I still think skipping the securing of manholes and access
points in favor of active monitoring with offsite access is a
better solution.

The only thing missing from your plan was a cost analysis. Cost of each,
plus operational costs, * however many of each type. How much would that
be?

  So, let's see. I'm pulling numbers out of my butt here, but basing it on
  non-quantity-discounted hardware available off the shelf.

  $500,000 to get it built with off-the-shelf components, tested in hostile
  tunnel environments and functioning.

  Then $350 per device, which would cover 1000 feet of tunnel, or about
  $2000 per mile for the devices. I'm not sure how things are powered in
  the tunnels, so power may need to be run, or the system could run off
  sealed-gel batteries (easily replaced and cheap, powers device for a
  year), system can be extremely low power. Add a communication device
  ($1000) every mile or two (the devices communicate between themselves back
  to the nearest communications device).

  Total cost, assuming 3 year life span of the device, is about $3000 per
  mile for equipment, or $1000 per year for equipment, plus $500 per year
  per mile for maintenance (batteries, service contracts, etc). Assumes
  your existing cost of tunnel maintenance can also either replace devices
  or batteries or both.
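
  Spelling out that per-mile arithmetic with the same guessed figures
  (nothing below is new data, just the post's own numbers run through):

```python
# The post's own back-of-the-envelope numbers, made explicit per mile.
FEET_PER_MILE = 5280
device_cost, device_coverage_ft = 350, 1000        # "$350 per device, 1000 feet"
comm_device_cost, comm_spacing_miles = 1000, 1.5   # "$1000 every mile or two"

devices_per_mile = FEET_PER_MILE / device_coverage_ft            # ~5.3 devices
sensor_cost_per_mile = devices_per_mile * device_cost            # ~$1,850
comm_cost_per_mile = comm_device_cost / comm_spacing_miles       # ~$667
equipment_per_mile = sensor_cost_per_mile + comm_cost_per_mile   # ~$2,500 ("about $3000")

lifespan_years = 3
equipment_per_mile_per_year = equipment_per_mile / lifespan_years  # ~$840 ("about $1000")
maintenance_per_mile_per_year = 500

print(f"Equipment per mile: ${equipment_per_mile:,.0f}")
print(f"Per mile per year (equipment + maintenance): "
      f"${equipment_per_mile_per_year + maintenance_per_mile_per_year:,.0f}")
```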

  Add a speedy Roomba-like RC device in the tunnel with an HD cam and a 10
  or 20 mile range between charging stations that can move to the location
  where an anomaly was detected, and save some money on the per-device cost.
  It could run on an overhead monorail, or just wheels, depending on the
  tunnel configuration and moisture content.

  Add yet another system -- an alarm of sorts -- that goes off upon any
  anomaly being detected, and shuts off after 5 minutes of no detection, to
  thwart teenagers and people who don't know how sophisticated the
  monitoring system really is. Put the alarm half way between access
  points, so it is difficult to get to and disable.

  Network it all, so that it can be controlled and updated from a certain
  set of IPs, make sure all changes are authenticated using PKI or
  certificates, and now you've made it harder to hack. Bonus points -- get
  a communication device that posts updates via SSL to multiple
  pre-programmed or random Conficker-type domains to make sure the system
  continues to be able to communicate in the event of a large outage.

Then amortize that out to our bills. Extra credit: would you pay for it?

  Assuming bills in the hundreds of thousands of dollars per month, maybe to
  the millions of dollars, and then figure out what an outage costs you
  according to the SLAs.

  Then figure out how much a breach and subsequent fiber cut costs you in
  SLA payouts or credits, multiply by 25%, and that's your budget. If the
  proposed system is less, why wouldn't you do it?

  The idea is inspired by the way Google does their datacenters -- use
  cheap, off-the-shelf hardware, network it together in smart ways, make it
  energy efficient, ... profit!

  Anyone want to invest? Maybe I should start the business.

Beckman

Hi Peter,

You wrote:

So, let's see. I'm pulling numbers out of my butt here,

<snip>

Total cost...is about $3000 per mile for equipment

<snip>

It could run on an overhead monorail

<snip>

Network it all

<snip>

Conficker-type domains to make sure

I get the feeling you haven't deployed or operated large networks. You never did say what the multiplier was. How many miles or detection nodes there were. Think millions. The number that popped into my head when thinking of active detection measures for the physical network is $billions.

Joel is right: the thing about the outdoors is there's a lot of it. The cost-over-time investment in copper and fiber communications networks, power transmission networks, and cable transmission networks is pretty well documented elsewhere. Google around a little for them. The investment is tremendous.

All for a couple of minutes advanced notice of an outage? Would it reduce the risk? No. Would it reduce the MTBF or MTTR? No. Of all outages, how often does this scenario (or one that would trigger your alarm) occur? I'm sure it's down on the list.

Then amortize that out to our bills. Extra credit: would you pay for it?

Assuming bills in the hundreds of thousands of dollars per month, maybe to
the millions of dollars, and then figure out what an outage costs you
according to the SLAs.

Then figure out how much a breach and subsequent fiber cut costs you in
SLA payouts or credits, multiply by 25%, and that's your budget. If the
proposed system is less, why wouldn't you do it?

SLAs account for force majeure (including sabotage), so I really doubt there will be any credits. In fact, there will likely be an uptick in spending as those who really need nines build multi-provider multi-path diversity. Here come the microwave towers!

The idea is inspired by the way Google does their datacenters -- use
cheap, off-the-shelf hardware, network it together in smart ways, make it
energy efficient, ... profit!

Works great inside four walls.

Anyone want to invest? Maybe I should start the business.

Nahh, I already have a web cam on my Smarties orb. What else do I really need?

Chris

I get the feeling you haven't deployed or operated large networks.

  Nope.

You never did say what the multiplier was. How many miles or detection
nodes there were. Think millions. The number that popped into my head
when thinking of active detection measures for the physical network is
$billions.

  It depends on where you want to deploy it and how many miles you want to
  protect. I was thinking along the lines of $1.5 million for 1000 miles of
  tunnel, equipment only. It assumes existing maintenance crews would
  replace sensors that break or go offline, and that those expenses already
  exist.

All for a couple of minutes advanced notice of an outage? Would it
reduce the risk? No. Would it reduce the MTBF or MTTR? No. Of all
outages, how often does this scenario (or one that would trigger your
alarm) occur? I'm sure it's down on the list.

  What if you had 5 minutes of advanced notice that something was happening
  in or near one of your tunnels that served hundreds of thousands of people
  and businesses and critical infrastructure? Could you get someone on site
  to stop it? Maybe. Is it worth it? Maybe.

  Given my inexperience with large networks, maybe fiber cuts and outages
  due to vandals, backhoes and other physical disruptions are just what we
  hear about in the news, and that it isn't worth the expense to monitor for
  those outages. If so, my idea seems kind of silly.

SLAs account for force majeure (including sabotage), so I really doubt
there will be any credits. In fact, there will likely be an uptick in
spending as those who really need nines build multi-provider multi-path
diversity. Here come the microwave towers!

  *laugh* Thank goodness for standardized GIS data. :-)

This all implies that the majority of fiber is in "tunnels" that can be monitored. In my experience, almost none of it is in tunnels.

In NYC, it's usually buried in conduits directly under the street, with no access except through the manholes, which are located about every 500 feet.

In LA, a large amount of the fiber is direct bored under the streets, with access from handholes and splice boxes located in the grassy areas between the street and the sidewalks.

Along train tracks, the fiber is buried in conduits which are direct-buried in the dirt alongside the train tracks, with handholes every 1000 feet or so.

In any of these scenarios, and especially the third, where the fiber might run through a rural area with no road access and no cellphone coverage, it's easy: simply walk through the woods to the train tracks, pop open a handhole, and snip, snip, snip, fiber cut.

Shane Ronan

Matthew Petach writes:

"protected rings" are a technology of the past. Don't count on your
vendor to provide "redundancy" for you. Get two unprotected runs
for half the cost each, from two different providers, and verify the
path separation and diversity yourself with GIS data from the two
providers; handle the failover yourself. That way, you *know* what
your risks and potential impact scenarios are. It adds a bit of
initial planning overhead, but in the long run, it generally costs a
similar amount for two unprotected runs as it does to get a
protected run, and you can plan your survival scenarios *much*
better, including surviving things like one provider going under,
work stoppages at one provider, etc.

This completely ignores the grooming problem.

About five years ago, we had a major WebEx outage caused by
our diverse path routed fibers both being groomed into the
same new cable / new path.

We had the contracts. We paid the money. We got the data.
We got updates to the data. The updates said we were still
fine and all good.

The new data lied. A backhoe hit in downtown SJ damaged the cable
and took down one of our two links. As nobody was sure what was
in it, they failed to notify us that they were about to
chop the rest of it to repair the bundle. So, about an hour
after we lost the first leg, we went dark, and there was no
coming back until the splices were all done.

(Typically, the whole operations team was out at an
offsite teambuilding event at the time. Pagers go beep beep beep,
and everyone hops back in the cars...)

We ran it up the flagpole to CEO level of the fiber vendor
(aggregator) and fiber physical plant owner (big 4 ISP),
as we were paying $$$ for bandwidth and were a Highly
Visible Client, and were told that they'd been making
a best effort and couldn't guarantee any better in the
future, no matter how much we paid or who we sued.

They were very apologetic, but insisted that best effort
means just that.

The only way to be sure? Own your own fiber. Use a microwave
link backup.

You have to get out of the game the fiber owners are playing.
They can't even keep score for themselves, much less accurately
for the rest of us. If you count on them playing fair or
right, they're going to break your heart and your business.

-george william herbert
gherbert@retro.com

But you are ignoring the cost of designing, procuring, installing, monitoring, and maintaining such a solution for the THOUSANDS of manholes and handholes in even a small fiber network.

The reality is, the types of outages that these things would protect against (intentional damage to the physical fiber) just don't happen often enough to warrant the cost. These types of solutions don't protect against backhoes digging up the fiber, as even if they gave a few minutes of advance notice, the average telco can't get someone to respond to a site in an hour, let alone minutes.

Matthew Petach writes:
>"protected rings" are a technology of the past. Don't count on your
>vendor to provide "redundancy" for you. Get two unprotected runs
>for half the cost each, from two different providers, and verify the
>path separation and diversity yourself with GIS data from the two
>providers; handle the failover yourself. That way, you *know* what
>your risks and potential impact scenarios are. It adds a bit of
>initial planning overhead, but in the long run, it generally costs a
>similar amount for two unprotected runs as it does to get a
>protected run, and you can plan your survival scenarios *much*
>better, including surviving things like one provider going under,
>work stoppages at one provider, etc.

This completely ignores the grooming problem.

Not completely; it just gives you teeth for exiting your
contract earlier and finding a more responsible provider
to go with who won't violate the terms of the contract
and re-groom you without proper notification. I'll admit
I'm somewhat simplifying the scenario, in that I also
insist on no single point of failure, so even an entire
site going dark doesn't completely knock out service;
those who have been around since the early days will
remember my email to NANOG about the gas main cut
in Santa Clara that knocked a good chunk of the area's
connectivity out, *not* because the fiber was damaged,
but because the fire marshal insisted that all active
electrical devices be powered off (including all UPSes)
until the gas in the area had dissipated. Ever since then,
I've just acknowledged you can't keep a single site always
up and running; there *will* be events that require it to be
powered down, and part of my planning process accounts
for that, as much as possible, via BCP planning. Now, I'll
be the first to admit it's a different game if you're providing
last-mile access to single-homed customers. But sitting
on the content provider side of the fence, it's entirely possible
to build your infrastructure such that having 3 or more OC192s
cut at random places has no impact on your ability to carry
traffic and continue functioning.

You have to get out of the game the fiber owners are playing.
They can't even keep score for themselves, much less accurately
for the rest of us. If you count on them playing fair or
right, they're going to break your heart and your business.

You simply count on them not playing entirely fair, and penalize
them when they don't; and you have enough parallel contracts with
different providers at different sites that outages don't take you
completely offline.

Matthew Petach wrote:

Matthew Petach writes:
>"protected rings" are a technology of the past. Don't count on your
>vendor to provide "redundancy" for you. Get two unprotected runs
>for half the cost each, from two different providers, and verify the
>path separation and diversity yourself with GIS data from the two
>providers; handle the failover yourself. That way, you *know* what
>your risks and potential impact scenarios are. It adds a bit of
>initial planning overhead, but in the long run, it generally costs a
>similar amount for two unprotected runs as it does to get a
>protected run, and you can plan your survival scenarios *much*
>better, including surviving things like one provider going under,
>work stoppages at one provider, etc.

This completely ignores the grooming problem.

Not completely; it just gives you teeth for exiting your
contract earlier and finding a more responsible provider
to go with who won't violate the terms of the contract
and re-groom you without proper notification.

That's a post-facto financial recovery / liability limitation
technique, not a high availability / hardening technique...

I'll admit
I'm somewhat simplifying the scenario, in that I also
insist on no single point of failure, so even an entire
site going dark doesn't completely knock out service;
those who have been around since the early days will
remember my email to NANOG about the gas main cut
in Santa Clara that knocked a good chunk of the area's
connectivity out, *not* because the fiber was damaged,
but because the fire marshal insisted that all active
electrical devices be powered off (including all UPSes)
until the gas in the area had dissipated. Ever since then,
I've just acknowledged you can't keep a single site always
up and running; there *will* be events that require it to be
powered down, and part of my planning process accounts
for that, as much as possible, via BCP planning.

I was less than a mile away from that, I remember it well.
My corner cube even faced in that direction.

I heard the noise then the net went poof. One of those
"Oh, that's not good at all" combinations.

Now, I'll
be the first to admit it's a different game if you're providing
last-mile access to single-homed customers. But sitting
on the content provider side of the fence, it's entirely possible
to build your infrastructure such that having 3 or more OC192s
cut at random places has no impact on your ability to carry
traffic and continue functioning.

You have to get out of the game the fiber owners are playing.
They can't even keep score for themselves, much less accurately
for the rest of us. If you count on them playing fair or
right, they're going to break your heart and your business.

You simply count on them not playing entirely fair, and penalize
them when they don't; and you have enough parallel contracts with
different providers at different sites that outages don't take you
completely offline.

The problem with grooming is that in many cases, due to provider
consolidation and fiber vendor consolidation and cable swap and
so forth, you end up with parallel contracts with different
providers at different sites that all end up going through
one fiber link anyways.

I had (at another site) separate vendors with fiber going
northbound and southbound out of the two diverse sites.

Both directions from both sites got groomed without notification.

Slightly later, the northbound fiber was then rerouted a bit up the road
into a southbound bundle (the same one as our now-groomed southbound link),
south to another datacenter, then north again via another path, all to
improve northbound route redundancy overall for the providers'
customer links.

And the shared link south of us was what got backhoed.

This was all in one geographical area. Diversity out of area will get
you around single points like that, if you know the overall topology
of the fiber networks around the US and choose locations carefully.

But even that won't protect you against common-mode vendor hardware
failures, or a large-scale BGP outage, or the routing chaos that comes
with a very serious regional net outage (exchange points, major
undersea cable cuts, etc)....

There may be 4 or 5 nines, but the 1 at the end has your name on it.

-george william herbert
gherbert@retro.com

Matthew Petach wrote:
>> Matthew Petach writes:

[much material snipped in the interests of saving precious electron
resources...]

This was all in one geographical area. Diversity out of area will get
you around single points like that, if you know the overall topology
of the fiber networks around the US and choose locations carefully.

But even that won't protect you against common-mode vendor hardware
failures, or a large-scale BGP outage, or the routing chaos that comes
with a very serious regional net outage (exchange points, major
undersea cable cuts, etc)....

There may be 4 or 5 nines, but the 1 at the end has your name on it.

Ultimately, I think a .sig line I saw years back summed it up very
succinctly:

"Earth is a single point of failure."

Below that, you're right, we're all just quibbling about which digits to put
to the right of the decimal point. If the entire west coast of the US drops
into the ocean, yes, having my data backed up on different continents
will help; but I'll be swimming with the sharks at that point, and won't
really be able to care much, so the extent of my disaster planning
tends to peter out around the point where entire states disappear,
and most definitely doesn't even wander into the realm of entire continents
getting cut off, or the planet getting incinerated in a massive solar flare.

Fundamentally, though, I think it's actually good we have outages
periodically; they help keep us employed. When networks run too
smoothly, management tends to look upon us as unnecessary
overhead that can be trimmed back during the next round of
layoffs. The more they realize we're the only bulwark against
the impending forces of chaos you mentioned above, the less
likely they are to trim us off the payroll.

Matt

Note--tongue was firmly planted in cheek; no slight was intended
against those who may have lost jobs recently; post was intended
for humorous consumption only; any resemblance to useful
content was purely coincidental and not condoned by any present
or past employer. Repeated exposure may be habit forming. Do
not read while operating heavy machinery.