Rack rails on network equipment

This. Once they’re racked, they’re not going anywhere. I would summarize it as: they’re certainly nice, but more of a nice-to-have. The only racking systems I try to avoid are the WECO (Western Electric COmpany) standard, the ones with the square “holes”.

Warm regards,

-M

* cma@cmadams.net (Chris Adams) [Sat 25 Sep 2021, 00:17 CEST]:

Which raises the question: why do I have to order different part numbers for back-to-front airflow? It's just a fan; can't it be made reversible? Seems like that would be cheaper than stocking alternate part numbers.

The fan is inside the power supply right next to the high-voltage capacitors. You shouldn't be near that without proper training.

Meh… Turn off the power supply input switch, open the chassis carefully, and apply a high-wattage 1Ω resistor across the capacitor terminals for 10 seconds.

There isn’t going to be any high voltage left after that.

Owen

Didn't require any additional time at all when equipment wasn't bulky
enough to need rails in the first place....

I've never been happy about that change.

If dealing with a charged capacitor, do not use a low resistance such as 1 ohm. That is effectively the same as using a screwdriver and will cause a big arc. You want to use something like a 100 kΩ resistor instead; that will bleed it off over 5-10 seconds.

Most (all?) power supplies will have a bleeder resistor across any large-value caps, and the input section will likely be shielded/encased anyway. If you let it sit for 5-10 minutes, the leakage resistance will dissipate the charge in any typical capacitor.
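For anyone who wants to sanity-check those times, the discharge follows the usual RC curve. A minimal Python sketch, with purely illustrative capacitor and resistor values (not measurements from any particular supply):

import math

def discharge_time(c_farads, r_ohms, v_start, v_safe=50.0):
    """Seconds for a cap to decay from v_start to v_safe through r_ohms,
    using V(t) = V0 * exp(-t / (R * C))  =>  t = R * C * ln(V0 / V_safe)."""
    return r_ohms * c_farads * math.log(v_start / v_safe)

# Illustrative values only -- not taken from any specific power supply.
v_bus = 400.0                       # typical rectified/PFC bus voltage
for c in (100e-6, 470e-6):          # plausible bulk capacitances
    for r in (100e3, 1e6):          # 100 kOhm bleed vs ~1 MOhm leakage path
        t = discharge_time(c, r, v_bus)
        print(f"C={c*1e6:.0f} uF, R={r/1e3:.0f} kOhm -> ~{t:.0f} s to reach 50 V")

As the numbers show, the time depends heavily on the actual capacitance and resistance, so leaving the unit unplugged for several minutes and verifying with a meter remains the conservative play.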

Hi,

Seriously, the physical build of network equipment is not entirely
competent.

Except, sometimes there is little choice. Look at 400G QSFP-DD for
example. Those optics can generate up to 20 watts of heat that needs
to be dissipated. For 800G that can go up to 25 watts.

That makes back-to-front cooling, as some people demand, very
challenging, if not impossible.
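Rough numbers make the point (a back-of-the-envelope sketch; the 32-port count is an assumption for a typical 1U box, the per-optic wattages are the figures above):

# Back-of-the-envelope faceplate heat load for a hypothetical 1U switch.
ports = 32                  # assumed port count for a typical 400G QSFP-DD box
w_400g, w_800g = 20, 25     # per-optic worst-case figures cited above

print(f"400G optics alone: {ports * w_400g} W")   # 640 W
print(f"800G optics alone: {ports * w_800g} W")   # 800 W
# That heat is concentrated in the optic cages at the faceplate, before
# counting the switch ASIC, CPU, and the fans themselves -- which is what
# makes reversing the airflow direction so hard.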

Thanks,

Sabri

Hi,

Well, folks, the replies have certainly been interesting. I did get my answer, which seems to be “no one cares”, which, in turn, explains why network equipment manufacturers pay little to no attention to this problem. A point of clarification: I’m talking about the problem in the context of operating a data center with cabinet racks, not a telecom closet with 2-post racks.

Let me just say from the get-go that no one is making toolless rails a priority to the point of shutting vendors out of the evaluation process. I am not quite sure why that assumption was made by at least a few folks. With that said, when all things are equal or fairly equal, which they rarely are, that’s when the rails come in as a factor.

We operate over 1000 switches in our data centers, and hardware failures that require a switch swap are common enough that the speed of the swap starts to matter to some extent. We probably swap a switch or two a month. Furthermore, those switches several of you referenced, which run for 5+ years, are not the ones we use. I think you are thinking of the legacy days when you paid $20k-plus for a top-of-rack switch from Cisco and then sweated that switch until it died of old age. I used to operate exactly like that in my earlier days. This does not work for us for a number of reasons, so we don’t go down that path.

We use Force10-family Dell switches, which are basically Broadcom TD2+/TD3-based switches (S4000-ON and S5200-ON series), and we run Cumulus Linux on those, so swapping hardware without swapping the operating system is quite plausible and very much possible for us. We just hadn’t had a need to switch away from Dell until recently, after Cumulus Networks (now Nvidia) had a falling-out with Broadcom and will effectively cease support for Broadcom ASICs in the near future.

We have loads of network config automation rolled out and very little of it is tied to anything Cumulus Linux specific, so there is a fair chance we can switch over to SONiC with low to medium effort on our part, thus returning to the state where we can switch hardware vendors with fairly low effort. We are looking at Nvidia (formerly Mellanox) switches, which hardly have any toolless rails, and we are also looking at all the other usual suspects in the “white box” world, which is why I asked how many of you care about the rail kit. I got my answer: “very little to not at all”.

In my opinion, if you never ask, you’ll never get it, so I am asking my vendors for toolless rails, even if most of them will likely never get there, since I’m probably one of the very few who even brings that question up to them. I’d say network equipment has always been in a sad state compared to, well, just about any other equipment, and for some reason we are all more or less content with it. May I suggest you all at least raise that question with your suppliers, even if you know full well the answer is “no”. At least it will start showing the vendors there is demand for this feature.
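To illustrate what “not tied to anything Cumulus Linux specific” can look like (a hypothetical sketch, not our actual tooling; the intent data and templates below are made up), the vendor-specific rendering sits behind a per-NOS template while the intent stays the same:

from jinja2 import Template

# Hypothetical intent data; real automation is more involved than this.
intent = {"hostname": "leaf01", "asn": 65101, "uplinks": ["swp49", "swp50"]}

templates = {
    # Cumulus/FRR-style BGP unnumbered config
    "cumulus": Template(
        "hostname {{ hostname }}\n"
        "router bgp {{ asn }}\n"
        "{% for port in uplinks %} neighbor {{ port }} interface remote-as external\n{% endfor %}"
    ),
    # SONiC is driven by config_db JSON; this is a heavily simplified stand-in.
    "sonic": Template(
        '{"DEVICE_METADATA": {"localhost": '
        '{"hostname": "{{ hostname }}", "bgp_asn": "{{ asn }}"}}}'
    ),
}

for nos, tmpl in templates.items():
    print(f"--- {nos} ---\n{tmpl.render(**intent)}")

Swapping the NOS then mostly means swapping the template layer, not the intent data or the workflows built around it.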

On the subject of new builds: over the course of my career I have hired contractors to rack/stack large build-outs, and a good number of them treat your equipment the same way they treat their 2x4s. They torque all the screws to such a degree that when you have to undo them, you are sweating like a pig trying to undo one screw, eventually stripping it, so you have to drill it out, etc., etc. How is that acceptable? I’m not saying that every contractor does this, but enough do that it matters. I have no interest in discussing how to babysit contractors so they don’t screw up your equipment.

I will also concede that operating 10 switches in a colo cage probably doesn’t warrant consideration of toolless rails. Operating 500 switches and growing per site? It slowly starts to matter. And when your outlook is expansion, it starts to matter even more.

Thanks to all of you for your contribution. It definitely shows the perspective I was looking for.

Special thanks to Jason How-Kow, who linked the Arista toolless rails (ironically, we have Arista evals in the pipeline and I didn’t know they do toolless, so it’s super helpful).

The “niceness” of equipment does factor in, but it might be invisible. For example, if you like Juniper’s CLI environment, you will look at their stuff first even if you do not have it explicitly in your requirements list.

Better rack rails will make slightly more people prefer your gear, although it might be hard to measure exactly how much. Which is probably the problem.

Our problem with racking switches is how vendors deliver NO rack rails and expect us to leave them hanging on just the front posts. I have a lot of switches on rack shelves for that reason. It does not look very professional, but neither do rack posts bent out of shape.

My personal itch is how new equipment seems to have even worse boot times than previous generations. I am currently installing Juniper ACX710s, and while they are nice, they also make me wait 15 minutes for them to boot. This is a tremendous waste of time during installation. I cannot leave the site without verification, and typically I also have some tasks to do after boot.

Besides, if you have a crash or power interruption, the customers are not happy to wait an additional 15 minutes to get online again.

Desktop computers used to take ages to boot until Microsoft declared that you needed to be ready in 30 seconds to be certified. And suddenly everything could boot in 30 seconds or less. There is no good reason to waste a tech's time scanning the SCSI bus in a server that does not even have the hardware.

Regards

Baldur

Sat, 25 Sep 2021 at 21:49, Andrey Khomyakov <khomyakov.andrey@gmail.com> wrote:

Switches in particular have a lot of ASICs that need to be loaded on boot. This takes time and they're really not optimized for speed on a process that occurs once.

Perhaps from this paragraph?

Owen

If I was going to rule any out based on rails, it'd be their half-width model. Craziest rails I've seen. It's actually a frame that sits inside the rack rails, so you need quite a bit of space above to angle it in to fit between the rails.

Once you have stuff above and below, the frame isn't coming out (at least the switches just slide into it).

brandon

It doesn't seem like it would take too many reboots to really mess with your reliability numbers for uptime. And what on earth are the developers doing with that kind of debug cycle time?

Mike

How about things like the Cisco 4500 switch series, which is almost as long as a 1U server but only has mounts for a relay-type rack?

As far as boot times go, try an ASR920. Wait 15 minutes and decide whether it’s time to power cycle again or wait 5 more minutes.

(Crying, thinking about racks and racks and racks of AT&T 56k modems strapped to shelves above PM-2E-30s…)

The early 90s were a dangerous place, man.

-George

(Crying, thinking about racks and racks and racks of AT&T 56k modems strapped to shelves above PM-2E-30s…)

And all of their wall-warts and serial cables....

The early 90s were a dangerous place, man.

Yes, but the good news is that shortly thereafter you got to replace all of that gear with Ascend TNT space heaters, which did double duty as modem banks.

You were doing it wrong, then. :-)

ExecPC had this down to a science, and had used a large transformer
to power a busbar along the back of two 60-slot literature organizers,
with 4x PM2E30's on top, a modem in each slot, and they snipped off
the wall warts, using the supplied cable for power. A vertical board
was added over the top so that the rears of the PM2s were exposed, and
the board provided a mounting point for an ethernet hub and three Amp
RJ21 breakouts. This gave you a modem "pod" that held 120 USR Courier
56K modems, neatly cabled and easily serviced. The only thing coming
to each of these racks was 3x AMP RJ21, 1x power, and 1x ethernet.

They had ten of these handling their 1200 (one thousand two hundred!)
modems before it got unmanageable, and part of that was that US Robotics
offered a deal that allowed them to be a testing site for Total Control.

At which point they promptly had a guy solder all the wall warts back on
to the power leads and proceeded to sell them at a good percentage of
original price to new Internet users.

The other problem was that they were getting near two full DS3's worth
of analog lines being delivered this way, and it was taking up a TON of
space. A full "pod" could be reduced to 3x USR TC's, so two whole pods
could be replaced with a single rack of gear.

... JG

We operate over 1000 switches in our data centers, and hardware failures that require a switch swap are common enough where the speed of swap starts to matter to some extent. We probably swap a switch or two a month.

Having operated a network of over 2000 switches, where we would see maybe one die a year (and let me tell you, some of those switches were not in nice places... no data-centre air-handled clean rack spaces, etc.), this failure rate is very high and would certainly be a factor in vendor choice.

For the initial install, there are quicker ways of dealing with cage-nut installs... but when a switch dies in service, the mounting isn't the speed factor, it's the cabling (and, as others have said, the startup time of some modern switches: you can patch every cable back in before the thing has even booted these days).

alan

I can install an entire 384lb 21U core router in 30 minutes.

Most of that time is removing every module to lighten the chassis, then re-installing every module.

We can build an entire POP in a day with a crew of 3, so I’m not sure there are worthwhile savings to be had here. Also consider that the network engineers babysitting it later cost more than the installers (usually), who don’t have to be terribly sophisticated at, say, BGP.

Those rapid-rails are indeed nice for servers and make quick work of putting ~30+ 1U pizza boxes in a rack. We use them on 2U servers and like them a lot.

And these days everyone is just buying merchant silicon and throwing a UI around it, so there’s less of a reason to pick any particular vendor, however there still is differentiation that can dramatically increase the TCO.

I don’t think they’re needed for switches, and for onesie-twosie, they’ll probably slow things down compared with basic (good, bad ones exist) rack rails.

I write all of this from the perspective of a network engineer, businesswoman, and telecom carrier - not necessarily that of a hyperscale cloud compute provider, although we are becoming one of those too it seems, so this perspective may shift for that unique use-case.

-LB

* Andrey Khomyakov

An interesting tidbit is that we actually used to manufacture custom rails for our Juniper EX4500 switches so the switch could be inserted from the back of the rack (you know, where most of your server ports are...) and not be blocked by the zero-U PDUs and all the cabling in the rack. Stock rails didn't work at all for us unless we used wider racks, which, in turn, reduced floor capacity.

As far as I know, Dell is the only switch vendor doing toolless rails so it's a bit of a hardware lock-in from that point of view.

Amen.

I suspect that Dell is pretty much alone in realising that rack-mount kits that require insertion/removal from the hot aisle are pure idiocy, since the rear of the rack tends to be crowded with cables, PDUs, and so forth.

This might be due to Dell starting out as a server manufacturer. *All*
rack-mount servers on the market are inserted into (and removed from)
the cold aisle of the rack, after all. The reasons that make this the
only sensible thing for servers apply even more so for data centre
switches.

I got so frustrated with this after having to remove a couple of
decommissioned switches that I wrote a post about it a few years back:

Nowadays I employ various strategies to facilitate cold aisle
installation/removal, such as: reversing the rails if possible,
attaching only a single rack ear (for four-post mounted equipment) or
installing rivet nuts directly in the rack ears (for shallow two-post
mounted equipment).

(Another lesson the data centre switch manufacturers could learn from
the server manufacturers is to always include a BMC. I would *much
rather* spend my serial console infrastructure budget on switches with
built-in BMCs. That way I would get remote power control, IPMI Serial-
Over-LAN and so on – all through a *single* Ethernet management cable.)
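For what it's worth, that single management path looks roughly like this in practice (a sketch only; the hostname and credentials are made up, and it assumes a stock ipmitool install):

import subprocess

# Made-up BMC address and credentials, for illustration only.
BMC = ["ipmitool", "-I", "lanplus", "-H", "switch01-bmc.example.net",
       "-U", "admin", "-P", "secret"]

def power_status() -> str:
    """Remote power state, fetched over the single Ethernet management cable."""
    out = subprocess.run(BMC + ["chassis", "power", "status"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def power_cycle() -> None:
    """Remote power cycle -- no smart PDU or console server required."""
    subprocess.run(BMC + ["chassis", "power", "cycle"], check=True)

if __name__ == "__main__":
    print(power_status())
    # An interactive serial console would be: ipmitool ... sol activate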

Tore

...

This level of failure surprises me. While I can't say I have 1000
switches, I do have hundreds of switches, and I can think of a failure
of only one or two in at least 15 years of operation. They tend to be
pretty reliable, and have to be swapped out for EOL more than anything.