Alexei Roudnev wrote:
We had a 6509 that failed because the backplane failed (it can
not happen - but it happened) - of course, no 'dual CPU,
dual power' could prevent it... Imagine a broken line card
- it can crash the whole box no matter how many 'dual' things
you have. The same goes for software errors (I crashed one of
the 6509s just by running 'snmpwalk' on it).
I lost a 7507 dual power dual RSP earlier this year: one of the cards
died, something in the power circuitry. It put the entire router in
short-circuit, both power supplies decided to go south and would not
power back up again until the faulty card was physically removed. After
the card was removed it worked fine again. It does not happen often, but
it does happen.
Redundancy is not a slam dunk with IOS though; same as dCEF, don't
expect RPR-compatible images to run every config you'll bump into, YMMV.
There is an annoying number of things that do not work on RPR images
or that fall back to route cache instead of distributed cache.
So, I always prefer to have 2 boxes and application-level
reliability instead of playing with 'dual everything'
solutions (last example - 2 days ago one of our dual-power
Intel servers failed because of a single power supply failure -
it did not break, but it did something wrong and system
Actually, what I try to do for routers is having a "dual everything" for
production and an "el-cheapo eBay special" sitting in the same rack for
backup. The reason I still do dual power and dual CPU is that over the
last 20 years I have seen very few failures of redundant systems
(although I have seen some); however, a dual-something has saved my
bottom several times. That part of my body is priceless.
For PCs, for example, I install dual Xeons on every production machine,
even though the CPU power some of them need is that of a 486. Intel
processors do die like anything else; a dying processor will typically
crash the system, but the machine does reboot in single-processor mode
when the graveyard dude pushes the reset button. I also try to have
RAID-10 arrays span two RAID cards; same as CPUs, a RAID card that dies
will likely crash the system, but it will reboot in degraded mode.