Thanks everyone.
We ran extensive power-failure tests on our virtualization platform and are
now confident that this is not BlueStore's fault. I think the best option
now is to switch to a JBOD configuration :)
Thanks again:)
On Wed, Apr 28, 2021, 20:51 Maged Mokhtar <mmokhtar(a)petasan.org> wrote:
On 27/04/2021 10:54, Xuehan Xu wrote:
Hi, everyone.
Recently, one of our online clusters experienced a whole-cluster power
outage, and after power was restored, many OSDs started to log the
following error:
2021-04-27 15:38:05.503 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.504 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.505 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:05.506 2b372b957700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x36000, got 0x41fe1397, expected 0x8d7f5975,
device location [0xa7e76000~1000], logical extent 0x1b6000~1000,
object #9:45a4e02a:::rbd_data.3b35df93038d.0000000000000095:head#
2021-04-27 15:38:28.379 2b372c158700 -1
bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
checksum at blob offset 0x40000, got 0xce935e16, expected 0x9b502da7,
device location [0xa9a80000~1000], logical extent 0x80000~1000, object
#9:c2a6d9ae:::rbd_data.3b35df93038d.0000000000000696:head#
We are using Nautilus 14.2.10, with RocksDB on SSDs and the BlueStore
data on SATA disks. It seems that BlueStore didn't survive the power
outage. Is it supposed to behave this way? Is there any way to prevent
it?
Thanks:-)
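For context on what _verify_csum is reporting: BlueStore checksums each 4 KiB (0x1000-byte) chunk with crc32c when writing and recomputes it on read; a mismatch produces exactly the "bad crc32c/0x1000" lines above. A minimal sketch of that per-chunk check, using a bit-by-bit pure-Python crc32c (Castagnoli polynomial) for illustration only — BlueStore itself uses an optimized implementation:

```python
CSUM_CHUNK = 0x1000  # 4 KiB, matching "crc32c/0x1000" in the log

def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli), bit-by-bit, reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def verify_chunk(chunk: bytes, stored_csum: int) -> bool:
    """Return True if the chunk read from disk matches the stored checksum.
    On mismatch BlueStore logs the 'got'/'expected' pair seen above."""
    assert len(chunk) == CSUM_CHUNK
    return crc32c(chunk) == stored_csum
```

The key point is that the checksum only detects the corruption; the bad data was already on the platter (or in a volatile cache that was lost) by the time the read happened.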
_______________________________________________
Dev mailing list -- dev(a)ceph.io
To unsubscribe send an email to dev-leave(a)ceph.io
It could also happen due to low-cost consumer SSDs; the majority do not
support Power Loss Protection (PLP). PLP is different from support for
sync/flush, which nearly all SSDs provide. As I understand it, SSDs
perform a read/modify/erase/write cycle on larger blocks of 64-256 KB,
and without PLP they can lose data during the erase phase on power loss.
Always use enterprise-grade SSDs, for this as well as other reasons such
as DWPD, etc.
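The failure mode described above can be sketched with a toy model. This is a deliberate simplification for illustration (real flash translation layers remap pages rather than rewriting erase blocks in place), but it shows why a power cut during the erase phase can destroy sectors that were never being written:

```python
ERASE_BLOCK = 256 * 1024  # toy erase-block size, upper end of the 64-256 KB range
SECTOR = 4096

class ToyEraseBlock:
    """Toy NAND erase block: a 4 KiB write goes through a whole-block
    read/modify/erase/program cycle, as described in the text."""

    def __init__(self):
        self.cells = bytearray(ERASE_BLOCK)  # what is actually persisted

    def write_sector(self, idx: int, data: bytes, power_loss_after_erase=False):
        staged = bytearray(self.cells)                 # read: copy block to RAM
        staged[idx * SECTOR:(idx + 1) * SECTOR] = data # modify: patch one sector
        self.cells = bytearray(ERASE_BLOCK)            # erase: whole block wiped
        if power_loss_after_erase:
            return  # no PLP: staged copy in volatile RAM is gone
        self.cells = staged                            # program: write block back
```

With PLP, on-board capacitors hold the drive up long enough to finish the program step, so the staged copy is never lost.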
A while ago we spent about a month doing power-failure tests of
dm-writecache durability under high-throughput, high-IOPS writes. With
cheap SSDs we would get 1 or 2 failures out of 10 power cycles, ranging
from inconsistent PGs to unfound objects; with PLP drives we saw no
problems.
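A test loop of the kind described might be sketched as below. Both hooks are hypothetical, site-specific stand-ins (not anything from Ceph or PetaSAN): `power_cycle` would hard-cut and restore power (e.g. via a PDU or IPMI), and `check_health` would report cluster state after recovery (e.g. by parsing `ceph health detail` for inconsistent PGs or unfound objects):

```python
from typing import Callable, List, Tuple

def run_power_failure_test(cycles: int,
                           power_cycle: Callable[[], None],
                           check_health: Callable[[], str]
                           ) -> List[Tuple[int, str]]:
    """Hard power-cycle the cluster `cycles` times under write load and
    record every cycle whose post-recovery health check is not clean.
    Returns a list of (cycle index, health verdict) failures."""
    failures = []
    for i in range(cycles):
        power_cycle()              # hypothetical hook: cut and restore power
        verdict = check_health()   # hypothetical hook: cluster state string
        if verdict != "HEALTH_OK":
            failures.append((i, verdict))
    return failures
```

With cheap non-PLP SSDs a run like this would, per the numbers above, typically return one or two entries per ten cycles.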
/Maged