Hi,
My Ceph cluster has 9 nodes and is used as a Ceph Object Store. Recently I have
experienced data loss: 's3cmd get xxx' returns 404 (NoSuchKey), although
's3cmd ls xxx' still shows the object's metadata. The affected RGW object is
larger than 1 GB and consists of many multipart objects. Running
'rados -p default.rgw.buckets.data stat <object>' shows that only the head
object is left; all of the multipart and shadow parts are gone. The bucket is
only used for write and read operations, never delete, and it has no lifecycle
policy.
I found a similar problem in https://tracker.ceph.com/issues/47866, which was
fixed in v16.0.0. Since we run a later release, this may be a new data loss
problem, and it is very serious for us.
ceph version: 16.2.5
#command info:
s3cmd ls s3://solr-scrapy.commoncrawl-warc/batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz
2024-03-13 09:27 1208269953 s3://solr-scrapy.commoncrawl-warc/batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz
s3cmd get s3://solr-scrapy.commoncrawl-warc/batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz
download: 's3://solr-scrapy.commoncrawl-warc/batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz' -> './CC-MAIN-20200118052321-20200118080321-00547.warc.gz' [1 of 1]
ERROR: Download of './CC-MAIN-20200118052321-20200118080321-00547.warc.gz' failed (Reason: 404 (NoSuchKey))
ERROR: S3 error: 404 (NoSuchKey)
# the head object exists with size 0; the multipart and shadow objects are lost
rados -p default.rgw.buckets.data stat df8c0fe6-01c8-4c07-b310-2d102356c004.76248.1__multipart_batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz.2~C2M72EJLHrNe_fnHnifS4N7pw70hVmE.1
error stat-ing eck6m2.rgw.buckets.data/df8c0fe6-01c8-4c07-b310-2d102356c004.76248.1__multipart_batch_2024031314/Scrapy/main/CC-MAIN-20200118052321-20200118080321-00547.warc.gz.2~C2M72EJLHrNe_fnHnifS4N7pw70hVmE.1: (2) No such file or directory
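To check more keys the same way, the output of 'rados -p <pool> ls' could be
sorted into head, multipart and shadow objects for one S3 key. The following is
only a rough sketch, not a verified tool. The function name
classify_rados_names is made up for illustration. The multipart pattern
(marker + "__multipart_" + key + ".") is taken from the object name above; the
head name (marker + "_" + key) and the "__shadow_" prefix are my assumptions
about how RGW names these objects.

```python
def classify_rados_names(names, marker, key):
    """Count the RADOS objects that belong to one S3 key.

    names  : iterable of object names, e.g. lines from 'rados -p <pool> ls'
    marker : bucket marker, e.g. 'df8c0fe6-...-2d102356c004.76248.1'
    key    : S3 object key inside the bucket
    """
    # Assumed naming patterns (see the note above the code).
    head_name = f"{marker}_{key}"
    multipart_prefix = f"{marker}__multipart_{key}."
    shadow_prefix = f"{marker}__shadow_{key}."

    counts = {"head": 0, "multipart": 0, "shadow": 0}
    for raw in names:
        name = raw.strip()  # drop the trailing newline from 'rados ls' output
        if name == head_name:
            counts["head"] += 1
        elif name.startswith(multipart_prefix):
            counts["multipart"] += 1
        elif name.startswith(shadow_prefix):
            counts["shadow"] += 1
    return counts
```

It could be fed from a saved listing, e.g. classify_rados_names(open("names.txt"),
marker, key). For the object above I would expect it to count the head only,
with multipart and shadow both at 0.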
Thanks.