Critical Information: DELL/Toshiba SSDs dying after 70,000 hours of operation - ceph-users

19 Jun 2023

Hello, 

This message does not concern Ceph itself but a hardware vulnerability which can lead to
permanent loss of data on a Ceph cluster equipped with the same hardware in separate fault
domains. 

The DELL / Toshiba PX02SMF020, PX02SMF040, PX02SMF080 and PX02SMB160 SSD drives of the 13G
generation of DELL servers are subject to a vulnerability which renders them unusable
after 70,000 hours of operation, i.e. approximately 7 years and 11 months of activity. 
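As a quick check of that conversion, here is a minimal Python sketch. The function name and the 365.25-day average year are assumptions made for illustration; only the 70,000-hour figure comes from this message.

```python
# Convert a power-on-hour count into years and months.
# Assumption: an average year of 365.25 days (leap days included).
HOURS_PER_YEAR = 24 * 365.25


def hours_to_years_months(hours):
    """Return (whole years, remaining months as a float) for `hours`."""
    total_months = hours / HOURS_PER_YEAR * 12
    years, months = divmod(total_months, 12)
    return int(years), months


years, months = hours_to_years_months(70_000)
print(f"{years} years and {months:.1f} months")  # prints: 7 years and 11.8 months
```

In other words, a drive that has been powered on continuously reaches the limit just short of its eighth year.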

This topic has been discussed here:
https://www.dell.com/community/PowerVault/TOSHIBA-PX02SMF080-has-lost-commu…

The risk is all the greater since these disks may die at the same time in the same server
leading to the loss of all data in the server. 

To date, DELL has not provided any firmware fixing this vulnerability, the latest firmware
version being "A3B3" released on Sept. 12, 2016:
https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=hhd9k

If you have servers running these drives, check how long the servers have been powered on.
If a server is close to the 70,000-hour limit, replace its drives immediately.

The smartctl tool does not report the power-on time for these SSDs, but if you have HDDs in
the same server, you can query their SMART data and read their power-on hours, which should
be about the same as for the SSDs.
The smartctl command is: smartctl -a -d megaraid,XX /dev/sdc (where XX is the device ID of
the disk on the MegaRAID controller).
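The check described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names, the device ID 0 and the sample output lines are hypothetical, and the two regular expressions assume the usual smartctl wording for SAS drives ("Accumulated power on time, hours:minutes") and for SATA drives (attribute 9, Power_On_Hours). Check them against the real output of your own drives.

```python
import re
import subprocess

LIMIT_HOURS = 70_000  # failure threshold reported in this message


def read_smartctl(device_id, dev="/dev/sdc"):
    """Run the smartctl command quoted above and return its text output."""
    cmd = ["smartctl", "-a", "-d", f"megaraid,{device_id}", dev]
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def power_on_hours(smartctl_output):
    """Return the power-on hours found in `smartctl -a` text, or None."""
    # SAS/SCSI wording (assumed): "Accumulated power on time, hours:minutes 68950:12"
    m = re.search(r"power on time, hours:minutes\s+(\d+):\d+", smartctl_output)
    if m:
        return int(m.group(1))
    # SATA wording (assumed): attribute row "9 Power_On_Hours ... - 68950"
    m = re.search(r"^\s*9\s+Power_On_Hours\b.*\s-\s+(\d+)",
                  smartctl_output, re.MULTILINE)
    return int(m.group(1)) if m else None


def hours_remaining(hours):
    """Hours left before the 70,000-hour limit (negative if already past it)."""
    return LIMIT_HOURS - hours


# Hypothetical sample lines, for illustration only (not taken from a real drive).
SAS_SAMPLE = "Accumulated power on time, hours:minutes 68950:12\n"
SATA_SAMPLE = (
    "  9 Power_On_Hours          0x0032   022   022   000"
    "    Old_age   Always       -       68950\n"
)

for sample in (SAS_SAMPLE, SATA_SAMPLE):
    hours = power_on_hours(sample)
    left = hours_remaining(hours)
    print(f"{hours} h powered on, {left} h ({left / 24:.0f} days) before the limit")
```

On a real server, power_on_hours(read_smartctl(0)) would replace the sample strings, run as root for each HDD behind the controller. The result is only an estimate for the SSDs, since it assumes they were installed at the same time as the HDDs.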

We have informed DELL about this but have no information yet on the arrival of a fix. 

We have lost 6 disks, in 3 different servers, in the last few weeks. In our experience,
affected drives do not survive a full shutdown and restart of the machine (power off then
power on in iDRAC), but they may also die during a simple reboot (init 6) or even while
the machine is running.

Fujitsu released a corrective firmware in June 2021 but this firmware is most certainly
not applicable to DELL drives: https://www.fujitsu.com/us/imagesgig5/PY-CIB070-00.pdf 

Regards, 
Frederic 

Sous-direction Infrastructure and Services 
Direction du Numérique 
Université de Lorraine