Teaching/developing troubleshooting skills

It's also important that one avoid:

* The faulty assumption there is but one problem
* Incorrectly-formed causal relationships (NANOG-L has some
examples of these)
* Making too many changes in one iteration
* Attempting to tackle a system with more unknowns than are
absolutely necessary.

These words should be hanging on a wall in every IT department. You
wouldn't believe how many times I've had to gently correct someone
because of these mistakes, particularly the first two.


>It's also important that one avoid:
>* The faulty assumption there is but one problem

Here's an interesting example that I came across
several years ago. It was in an office with lots
of PCs plugged into RJ45 10BASE-T ports near each desk.
One PC had lost connectivity.

I came and checked that the software was
installed and running. I probably did something
like a ping to satisfy myself that it
wasn't a problem on the PC itself. Then I unplugged
the cable from the RJ45 port in the wall and tried
another port. It still did not work. I swapped
in a new cable and it worked fine.

Most people would stop right there, but I
followed up and tested the existing cable
in the lab. It worked just fine. Why did
it not work before? There had to be some problem
with the switch or the wall wiring, because somehow
two RJ45 ports did not work. After a bit of
poking and discussions with the employee at
that desk, it turned out that the cable lay
in a bad spot and often got caught on her foot
as she rushed off somewhere. The little metal
pins inside the RJ45 socket had been bent, and it
was just sheer luck that swapping the cable caused
contact to be made again. The second socket was
also bent. When that one ceased to work, the
employee had swapped cables herself.

The real solution was to replace both sockets
and install a longer patch cable that could be
placed where feet would not get caught up in it.

Troubleshooting is made easier by methodically
doing the work and following through. If I had
not had the lab handy I probably would have
swapped the "bad" cable back in to verify that
"trouble" accompanied the cable. But it is also
easier to troubleshoot when you have a stock of
interesting war stories in your memory to encourage
you to "think outside the box". It's the blend of
creativity and methodical work practices that makes
a good troubleshooter, technical or otherwise.

--Michael Dillon
