Arista filesystem rewinding back 3 years

Hello,

It must be my lucky day. I’ve run into quite a strange situation where what shows up in flash: isn’t the same as what is actually in /mnt/flash on an Arista switch.

I found out about this when I reloaded the switch and, in Aboot, the filesystem looked like it had rewound itself to 2022.

I’m guessing that the SSD stopped taking writes sometime in 2022 and EOS never found out about it.

Anyway, if anyone has any information on this feel free to send it to me offlist. I don’t want to create a spamwave or anything.

Thanks and have a nice day.

-Drew

I've seen this before with MicroSD cards in a Raspberry Pi. The card
stops accepting writes but continues to report write success to the
OS. On the Pi, this eventually shows up as seeming filesystem
corruption when blocks are flushed and then reloaded to the disk
cache. Upon reboot, the Pi reverts to the state it was in when the
writes actually stopped happening.

I'm not really sure what the theory behind designing cards this way
is. It does mean that the OS will boot even if the boot process must
write to succeed, but it also means that the OS has no idea that the
flash drive has failed and experiences odd random faults instead.
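
One way to probe for this from the OS side is to write a random token, ask the kernel to drop its cached copy, and read the token back. The following is only a rough sketch under assumptions: the function name write_survives_cache_drop is made up for illustration, posix_fadvise is only a hint to the kernel, and a device that answers reads from its own internal cache could still pass. A True result is therefore not proof that the data reached the flash.

```python
import os
import secrets
import tempfile


def write_survives_cache_drop(directory: str) -> bool:
    """Write a random token, try to evict it from the page cache, read it back.

    Hypothetical helper for illustration only. Returns True when the bytes
    read back match the bytes written.
    """
    token = secrets.token_bytes(4096)
    fd, path = tempfile.mkstemp(dir=directory, prefix="wcheck-")
    try:
        os.write(fd, token)
        os.fsync(fd)  # ask the kernel to push the data down to the device
        if hasattr(os, "posix_fadvise"):
            try:
                # Hint that cached pages for this file are no longer needed,
                # so the next read is more likely to go to the device.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # the hint is best-effort; carry on without it
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(token)) == token
    finally:
        os.close(fd)
        os.unlink(path)
```

Called as write_survives_cache_drop("/mnt/flash"), a False result would be a strong sign that something is wrong, while True only means nothing was caught this time.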

Regards,
Bill Herrin

My favorite was a “Sony 480GB Flash Drive” which I purchased at an electronics market in Beijing in 2010, for around $5 USD. I knew that it couldn’t be real, but I figured it would be an entertaining…

It reported itself to the OS as having 480GB of capacity, but actually only had a 16Mb flash chip. Anything that you wrote past the end of the storage would wrap around to the start.

It actually turned out to be remarkably useful - I mounted it on /var/log/syslog on a server, and magically had a circular buffer of logs which would never fill up / run out of space….
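
The wrap-around behavior can be modeled in a few lines. This is a toy simulation, not a description of how that particular drive's controller worked: the class name WrappingFlash and the tiny sizes are invented for illustration, and the only assumption is that addresses are taken modulo the real chip size.

```python
class WrappingFlash:
    """Toy model of a drive that reports more capacity than it really has."""

    def __init__(self, reported_size: int, real_size: int) -> None:
        self.reported_size = reported_size
        self._cells = bytearray(real_size)

    def write(self, offset: int, data: bytes) -> None:
        if offset + len(data) > self.reported_size:
            raise ValueError("write past reported capacity")
        for i, byte in enumerate(data):
            # Addresses beyond the real chip silently wrap to the start.
            self._cells[(offset + i) % len(self._cells)] = byte

    def read(self, offset: int, length: int) -> bytes:
        size = len(self._cells)
        return bytes(self._cells[(offset + i) % size] for i in range(length))


drive = WrappingFlash(reported_size=64, real_size=16)
drive.write(0, b"old log line 001")  # exactly fills the 16 real bytes
drive.write(16, b"new")              # "past the end", lands on offset 0
print(drive.read(0, 16))             # b'new log line 001'
```

Every write reports success, and the oldest data is overwritten first, which is why it behaved like a circular log buffer.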

W

William Herrin wrote:

I've seen this before with MicroSD cards in a Raspberry Pi. The card stops accepting writes but continues to report write success to the OS. On the Pi, this eventually shows up as seeming filesystem corruption when blocks are flushed and then reloaded to the disk cache. Upon reboot, the Pi reverts to the state it was in when the writes actually stopped happening.

Well, I am glad it didn't boot up using the startup-config from 2022. That would've been an actual catastrophe.

Chatting with TAC about it. Yikes.
-Drew

We've had several SSD failures in Arista devices. I can only assume it was a bad batch of SSDs, because they were all in a batch of routers ordered and delivered together.

For us the SSDs dropped into RO mode.

When this happens there are syslog messages to let you know, and if you drop into a BASH shell you can see the issue, but none of the EOS show commands show the problem:

show file systems -> shows all FS as rw
show system health storage -> shows "OK"
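
For the read-only case, the check from the BASH shell amounts to looking at the mount options in /proc/mounts. Below is a minimal sketch of that check; the function name readonly_mounts and the sample lines (device name, filesystem type, options) are invented for illustration and are not output from a real switch.

```python
from typing import List


def readonly_mounts(mounts_text: str) -> List[str]:
    """Return the mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample, not taken from a real device.
SAMPLE = """\
/dev/sda1 /mnt/flash vfat ro,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid 0 0
"""
print(readonly_mounts(SAMPLE))  # ['/mnt/flash']
```

On a live Linux system the input would be the contents of /proc/mounts. This only catches a filesystem that has been remounted read-only; it would not notice a drive that still accepts writes and silently discards them.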

Cheers,
James.

James wrote:

We've had several SSD failures in Arista devices, I can only assume a bad batch of SSDs because they were all in a batch of routers ordered and delivered together.
For us the SSDs dropped into RO mode.
When this happens there are syslog messages to let you know, and if you drop into a BASH shell you can see the issue, but none of the EOS show commands show the problem:

I just went through 6 years of a syslog file and there wasn't anything mentioned about /dev/sda, /mnt/flash, or anything else that indicated that something was wrong with the disk or the filesystem.

When RANCID backed up the configs, it was even showing EOS image files that weren't actually there. The startup-config date showed 2/28/2025 until the system was reloaded, and then it began saying startup-config was actually last written on 3/11/2022.

RANCID shows that there have been 57 configuration updates since 3/11/2022, and every time 'write memory' was run it said: “Copy completed successfully.”

Anyway thanks for replying.
-Drew