Temperature monitoring

All,

We had an issue with a DC where temps were elevated. The one bit of
hardware that wasn't watched much was the one that sent out the initial
alert. Looking for recommendations on hardware that I can mount/hang in
each cabinet that is easy to set up and will alert us if temps go beyond a
certain point.

TIA.

Dovid

Yo Dovid!

Nearly everything has temperature sensors: switches, servers, and most
modern PDUs. A dedicated solution just recreates the problem in the
future. Monitor the temps on everything and gain knowledge related to
failure rates. Most companies with physical infrastructure could pay for
another engineer to discover these unexpected expenses. Also note that
modern air conditioning and refrigeration have SNMP or BACnet protocol
support; just download the manual.

Are you running ntpd on your boxes? See what happens when you plot its
drift against other temperature sensors. The closer the sensor is to the
clock chip, the better.

If you do this on enough boxes, you should have an easy time seeing what
happens on boxes where you have an easier time watching ntpd's drift
value than you have watching a nearby dedicated temperature sensor.
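As a rough illustration of the drift-versus-temperature comparison, here is a minimal Python sketch. It assumes you already sample ntpd's drift file (a single frequency value in PPM; its location is whatever your `driftfile` setting points at) alongside a temperature reading at the same times. The function names and the sample numbers are made up for the example.

```python
from statistics import mean


def read_drift_ppm(path):
    """Read the frequency offset (PPM) that ntpd stores in its drift file.

    The path is an assumption: use whatever `driftfile` is set to on your box.
    """
    with open(path) as fh:
        return float(fh.read().split()[0])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical samples taken at the same moments on one box.
drift_ppm = [12.10, 12.15, 12.40, 12.90, 13.20]
sensor_c = [21.0, 21.5, 23.0, 26.0, 28.0]
correlation = pearson(drift_ppm, sensor_c)
```

A correlation near 1 or -1 on boxes that do have a nearby sensor suggests the drift value can stand in for temperature on boxes that do not.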

We have Sensaphones (sensaphone.com) in remote offices. We use IMS-4000s. Each is a 1RU box with RJ45 jacks on the front. You can run CAT-5 to where you want to monitor something, and stick a module on the end of the cable. They have temp, humidity, and generic NO/NC sensors, plus power sensors to graph voltage and alert on voltage swings, etc. They can send emails, SNMP traps, or dial out with a modem. They also have a built-in mic that can alert on noise increases. Some models allow you to dial in and listen to the room.

WeatherGoose by IT Watchdogs. These are 1U rackmount devices with a very shallow depth of about an inch or two. Sensors are cheap and varied, and you can daisy-chain dozens of them together, so one box can monitor an entire row of racks. Loads of other features too for notification and escalation, and they are SNMP manageable.

-mel via cell

http://tyconsystems.com/index.php/products/tycon-power/tpdin-monitor-web/751-tpdin-monitor-web2

Is what I use in my cabinets. It has two temp sensors, one internal and one
external. I put the external one near the AC cold air output so I can get a
diff and know if the AC is on. Cacti graphs them nicely over SNMP. I use one
of the voltage sensors to monitor the cabinet doors via reed switches. At
remote mountain sites I also use it for battery/solar voltages and to monitor
wall warts for utility power loss.
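The "diff" check between the two sensors can be sketched in a few lines of Python. This is only an illustration: the function name and the 3 degree C default are assumptions, and the right delta depends on the cabinet and on where the external probe sits.

```python
def ac_appears_on(internal_c, supply_air_c, min_delta_c=3.0):
    """Guess whether the AC is running from two temperature readings.

    internal_c:   reading from the sensor inside the cabinet
    supply_air_c: reading from the external probe at the cold air output
    min_delta_c:  assumed minimum difference that counts as "cooling"
    """
    return (internal_c - supply_air_c) >= min_delta_c


# Hypothetical readings: cold supply air means the AC is probably running.
running = ac_appears_on(27.0, 18.0)
# Supply air as warm as the cabinet suggests the AC has stopped.
stopped = not ac_appears_on(27.0, 26.5)
```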

/rh

If all that you require is temperature monitoring, I recommend going
through the SNMP MIBs and doing an snmpwalk of your devices to identify the
sensors at the air intake... Unfortunately there are some devices which do
not have air intake sensors, but only a sensor somewhere generally in the
center of the motherboard. But other devices have temperature diodes nearly
everywhere. This chart from an Arista 1U switch shows a device that is
really good about identifying the location of the individual sensors in
the MIB:

http://imgur.com/a/4CfpK
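To illustrate picking intake sensors out of a walk, here is a small Python sketch that filters saved snmpwalk output for likely air-intake descriptions. It assumes Net-SNMP style `OID = STRING: value` lines from walking ENTITY-MIB `entPhysicalDescr`; the keyword list and the sample lines are invented for the example, and real devices word their descriptions differently.

```python
import re

# Matches Net-SNMP style lines such as:
#   ENTITY-MIB::entPhysicalDescr.2 = STRING: Front-panel inlet temp sensor
LINE_RE = re.compile(r'^(?P<oid>\S+)\s+=\s+STRING:\s+"?(?P<descr>.*?)"?\s*$')

# Assumed keywords; check your own devices' descriptions and adjust.
INTAKE_HINTS = ("inlet", "intake", "ambient")


def find_intake_sensors(walk_text):
    """Return (oid, description) pairs whose description hints at air intake."""
    matches = []
    for line in walk_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and any(hint in m.group("descr").lower() for hint in INTAKE_HINTS):
            matches.append((m.group("oid"), m.group("descr")))
    return matches


# Made-up sample, not output from a real switch.
SAMPLE_WALK = """\
ENTITY-MIB::entPhysicalDescr.1 = STRING: Cpu temp sensor
ENTITY-MIB::entPhysicalDescr.2 = STRING: Front-panel inlet temp sensor
ENTITY-MIB::entPhysicalDescr.3 = STRING: "Rear exhaust temp sensor"
"""

intake = find_intake_sensors(SAMPLE_WALK)
```

Once the right index is known, the matching temperature value can be polled on its own instead of walking the whole tree.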

When purchasing a temperature monitoring standalone device, I highly
recommend something that is capable of not only temperature sensors but
also highly useful things like relay controls, wire contacts for other
equipment alarms, contacts for things like door/cabinet opening sensors,
etc. With the right high-frequency SNMP polling and trap setup you can use
such a thing for a great deal more than just temperature. I have seen
examples of the Tinycontrol v3 used by NOCs to grant third parties access
to POPs via remotely triggered relays and magnetic strike door locks.

Here's a couple of good examples:

http://tinycontrol.pl/en/lan-controller/

http://tinycontrol.pl/en/accessories-lk-3-sensor/

http://www.controlbyweb.com/x332/

Harlan Stenn wrote:

If you do this on enough boxes, you should have an easy time seeing what
happens on boxes where you have an easier time watching ntpd's drift
value than you have watching a nearby dedicated temperature sensor.

Sweet from a technical point of view, but if you have elevated
temperatures in a DC (happens all the time with CRACs tripping out) and
you need to report this to the DC operations centre, the conversation
would be hilarious if you started talking about ntpd drift:

customer: ohai, we're seeing the sort of clock drift from ntp monitoring
that suggests there is a temperature issue near cabinet X.

datacentre: huh, what's ntp?

customer: it's a time control protocol.

datacentre: the time is 12:23. Closing the ticket now.

customer: but you have a temperature problem! Our clocks said so!

datacentre: ticket is closed. please open a new ticket.

Repeat until customer gives up in despair.

Three weeks later, the customer is billed for 14 smart hands tickets
relating to asking what the time was.

Nick

We use Asentria.

We use https://serverscheck.com/sensors/ - simple setup, and they graph
nicely in Cacti. I went with ServersCheck wired base units plus an external
temp+humidity probe. The base unit displays the temperature, which is a
nice quick reference if you are in the room.

+1 for the serverscheck.com gear. Been running it as a humidity monitor in the plant for a year or so now and it's been rock solid. If you're the kind of shop that requires calibration for that sort of equipment, they'll handle that as well. Great company to work with. Pair it with Cacti + thold plugin or whatever other SNMP monitoring you like - or the base units can handle alerting on their own.

FYI for those interested - the stated max length of connecting cable between the base station and the sensor units (30 ft, IIRC) is way under what it'll do in the real world. I've got at least one sensor unit that's a good 500 ft away from the base station and it's been working just fine.

Ed Pers

Would be pretty great if mobile worked... :sunglasses:

Agreed -- there are already tons of temp sensors throughout old and new
hardware. I've used SCSI drive queries via sdparm and more recently hddtemp
to get the current temperature of the drives. No need for SNMP or iLO,
though that can give you a more detailed picture where possible.
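For anyone wanting to script the hddtemp approach, here is a minimal Python sketch. It assumes hddtemp is running in daemon mode on its default TCP port 7634 and emits pipe-delimited records such as `|/dev/sda|MODEL|35|C|`; the function names are made up, and drives that report no number (a sleeping drive, for instance) are skipped.

```python
import re
import socket

# One record looks like |device|model|value|unit| and records sit back to back.
RECORD_RE = re.compile(r"\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|")


def parse_hddtemp(payload):
    """Map device name to (temperature, unit) for records with a numeric value."""
    temps = {}
    for device, _model, value, unit in RECORD_RE.findall(payload):
        if not value.lstrip("-").isdigit():
            continue  # no usable reading for this drive
        temps[device] = (int(value), unit)
    return temps


def read_hddtemp(host="127.0.0.1", port=7634, timeout=5.0):
    """Fetch and parse readings from a local hddtemp daemon (assumed setup)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_hddtemp(b"".join(chunks).decode("ascii", errors="replace"))


# Made-up payload for illustration; model strings are placeholders.
example = parse_hddtemp("|/dev/sda|MODEL-A|35|C||/dev/sdb|MODEL-B|SLP|*|")
```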

You first monitor and record for 24 hours to get your baseline temp for a
given rack or server, then set your threshold, then let your monitoring
platform do the rest.
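The baseline-then-threshold step can be sketched as follows. This is one possible way to turn 24 hours of readings into an alert level, not a prescribed method: the "mean plus three standard deviations, with a 2 degree C floor" rule and the function name are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def threshold_from_baseline(samples_c, sigmas=3.0, min_margin_c=2.0):
    """Derive an alert threshold from a baseline window of readings.

    samples_c:    temperatures recorded during the baseline period
    sigmas:       assumed multiple of the standard deviation to allow
    min_margin_c: assumed floor so a very steady baseline does not alert
                  on the first small wobble
    """
    if not samples_c:
        raise ValueError("need at least one baseline sample")
    baseline = mean(samples_c)
    margin = max(sigmas * pstdev(samples_c), min_margin_c)
    return baseline + margin


# Hypothetical baseline readings for one server.
baseline_readings = [31, 32, 31, 33, 32, 31, 32, 32]
alert_at = threshold_from_baseline(baseline_readings)
should_alert = 38 > alert_at
```

Feed `alert_at` to whatever monitoring platform you use as the warning level, and re-run the baseline after any change to the rack or the cooling.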

Since I use hosted dedicated servers, I don't want to pay for yet another
device. In monitoring only those disk temps I've caught two cooling issues
before they became a crisis, one of which my hosting provider was not aware
of.

If you control the hardware, or at least have access to it, there should be
enough sensors to let you know at least something is causing a problem.

Beckman