Thanks everyone.
We ran extensive power-failure tests on our virtualization platform and are
now confident that this is not BlueStore's fault. I think the best option
now is to switch to a JBOD configuration :)
Thanks again:)
On Wed, Apr 28, 2021, 20:51 Maged Mokhtar <mmokhtar(a)petasan.org> wrote:
On 27/04/2021 10:54, Xuehan Xu wrote:
Hi, everyone.
Recently, one of our online clusters experienced a whole-cluster power
outage, and after power was restored, many OSDs started to log the
following error:
2021-04-27 15:38:05.503 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.504 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.505 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.506 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:28.379 2b372c158700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x40000, got 0xce935e16, expected 0x9b502da7,
device location [0xa9a80000~1000], logical extent 0x80000~1000, object
#9:c2a6d9ae:::rbd_data.3b35df93038d.0000000000000696:head#
We are using Nautilus 14.2.10, with RocksDB on SSDs and the BlueStore
data on SATA disks. It seems that BlueStore didn't survive the power
outage. Is it supposed to behave this way? Is there any way to prevent
it?
Thanks:-)
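For context on what _verify_csum is reporting: BlueStore checksums each 4 KiB (0x1000-byte) chunk with crc32c when writing and recomputes it on read; a mismatch produces exactly the "bad crc32c/0x1000" lines above. A minimal sketch of that per-chunk check, using a bit-by-bit pure-Python crc32c (Castagnoli polynomial) for illustration only — BlueStore itself uses an optimized implementation:

```python
CSUM_CHUNK = 0x1000  # 4 KiB, matching "crc32c/0x1000" in the log

def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli), bit-by-bit, reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def verify_chunk(chunk: bytes, stored_csum: int) -> bool:
    """Return True if the chunk read from disk matches the stored checksum.
    On mismatch BlueStore logs the 'got'/'expected' pair seen above."""
    assert len(chunk) == CSUM_CHUNK
    return crc32c(chunk) == stored_csum
```

The key point is that the checksum only detects the corruption; the bad data was already on the platter (or in a volatile cache that was lost) by the time the read happened.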
_______________________________________________
Dev mailing list -- dev(a)ceph.io
To unsubscribe send an email to dev-leave(a)ceph.io
It could also happen due to low-cost consumer SSDs; the majority do not
support Power Loss Protection (PLP). PLP is different from support for
sync/flush, which nearly all SSDs provide. As I understand it, SSDs
perform a read/modify/erase/write cycle on larger blocks of 64-256 KB,
and without PLP they can lose data during the erase phase on power loss.
Always use enterprise-grade SSDs, for this as well as other reasons such
as DWPD, etc.
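The failure mode described above can be sketched with a toy model. This is a deliberate simplification for illustration (real flash translation layers remap pages rather than rewriting erase blocks in place), but it shows why a power cut during the erase phase can destroy sectors that were never being written:

```python
ERASE_BLOCK = 256 * 1024  # toy erase-block size, upper end of the 64-256 KB range
SECTOR = 4096

class ToyEraseBlock:
    """Toy NAND erase block: a 4 KiB write goes through a whole-block
    read/modify/erase/program cycle, as described in the text."""

    def __init__(self):
        self.cells = bytearray(ERASE_BLOCK)  # what is actually persisted

    def write_sector(self, idx: int, data: bytes, power_loss_after_erase=False):
        staged = bytearray(self.cells)                 # read: copy block to RAM
        staged[idx * SECTOR:(idx + 1) * SECTOR] = data # modify: patch one sector
        self.cells = bytearray(ERASE_BLOCK)            # erase: whole block wiped
        if power_loss_after_erase:
            return  # no PLP: staged copy in volatile RAM is gone
        self.cells = staged                            # program: write block back
```

With PLP, on-board capacitors hold the drive up long enough to finish the program step, so the staged copy is never lost.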
A while ago we spent about a month doing power-failure tests of
dm-writecache durability under high-throughput, high-IOPS writes. With
cheap SSDs we would get 1 or 2 failures out of 10 power cycles, ranging
from inconsistent PGs to unfound objects; with PLP drives we saw no
problems.
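A test loop of the kind described might be sketched as below. Both hooks are hypothetical, site-specific stand-ins (not anything from Ceph or PetaSAN): `power_cycle` would hard-cut and restore power (e.g. via a PDU or IPMI), and `check_health` would report cluster state after recovery (e.g. by parsing `ceph health detail` for inconsistent PGs or unfound objects):

```python
from typing import Callable, List, Tuple

def run_power_failure_test(cycles: int,
                           power_cycle: Callable[[], None],
                           check_health: Callable[[], str]
                           ) -> List[Tuple[int, str]]:
    """Hard power-cycle the cluster `cycles` times under write load and
    record every cycle whose post-recovery health check is not clean.
    Returns a list of (cycle index, health verdict) failures."""
    failures = []
    for i in range(cycles):
        power_cycle()              # hypothetical hook: cut and restore power
        verdict = check_health()   # hypothetical hook: cluster state string
        if verdict != "HEALTH_OK":
            failures.append((i, verdict))
    return failures
```

With cheap non-PLP SSDs a run like this would, per the numbers above, typically return one or two entries per ten cycles.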
/Maged