You are correct: even though the repair reports an error, I was able to join the disk back
into the cluster, and it stopped reporting the legacy omap warning. I had assumed an
"error" was something that had to be rectified before anything could proceed,
but apparently it's more like "warning: there was an error on this one
non-critical task" :)
We'll probably just destroy and rebuild that OSD once we're back to HEALTH_OK.
Thank you!
________________________________
From: Igor Fedotov <ifedotov(a)suse.de>
Sent: Thursday, May 20, 2021 05:15
To: Pickett, Neale T; ceph-users(a)ceph.io
Subject: [EXTERNAL] [ceph-users] Re: fsck error: found stray omap data on omap_head
I think there is no way to fix that at the moment other than manually
identifying and removing the relevant record(s) in RocksDB with
ceph-kvstore-tool, which might be pretty tricky..
Looks like we should implement removal of these stray records when
repairing BlueStore...
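For the archives, a rough sketch of what that manual surgery might look like. Heavy caveats: the exact RocksDB key bytes for a given omap_head are internal to BlueStore, so `<stray-key>` below is a placeholder you would have to identify by inspecting the listing first; the prefix `m` (legacy omap) is an assumption about where the stray record lives, and the OSD must be stopped while the store is opened:

```shell
# Stop the OSD first; ceph-kvstore-tool needs exclusive access to the store.
systemctl stop ceph-osd@9

# Dump keys under the legacy omap prefix "m" (assumption: the stray record
# is a legacy omap entry) and search for ones belonging to omap_head 12256434.
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 list m > /tmp/omap-keys.txt

# Once the offending key is identified -- placeholder shown here -- remove it.
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 rm m <stray-key>

# Re-run fsck to confirm the stray-omap error is gone before restarting the OSD.
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-9
```

Getting the key identification wrong would delete live omap data, which is why destroy-and-rebuild of the OSD is the safer route in practice.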
On 5/19/2021 11:12 PM, Pickett, Neale T wrote:
We just upgraded to pacific, and I'm trying to
clear warnings about legacy bluestore omap usage stats by running `ceph-bluestore-tool
repair`, as instructed by the warning message. It's been going fine, but we are now
getting this error:
[root@vanilla bin]# ceph-bluestore-tool repair --path $osd_path
2021-05-19T19:25:26.485+0000 7f67ca3593c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck
error: found stray omap data on omap_head 12256434 0 0
repair status: remaining 1 error(s) and warning(s)
[root@vanilla bin]# ceph-bluestore-tool fsck --path $osd_path --deep
2021-05-19T20:03:17.002+0000 7f4d1d6603c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck
error: found stray omap data on omap_head 12256434 0 0
fsck status: remaining 1 error(s) and warning(s)
We're only 10% of the way through our OSDs, so I'd like to find some way to fix
this other than destroying and rebuilding the OSD, in case it happens again. Fixing this
error is especially attractive since we can't get out of HEALTH_WARN until we've
run repair on all OSDs.
One can silence the 'legacy omap' warning via the
"bluestore_warn_on_no_per_pool_omap" and
"bluestore_warn_on_no_per_pg_omap" config parameters.
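For anyone else landing here, muting those two warnings cluster-wide might look like this (a sketch assuming a Pacific-era `ceph config set`; scoping them to `osd` is an assumption, `global` should work as well):

```shell
# Suppress the legacy-omap health warnings until all OSDs have been converted.
ceph config set osd bluestore_warn_on_no_per_pool_omap false
ceph config set osd bluestore_warn_on_no_per_pg_omap false
```

Remember to flip them back to true afterwards, so the warnings can fire again if a future OSD misses conversion.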
And I'm not sure I understand why the above fsck error prevents you from
proceeding with the upgrade. Indeed, the repair leaves this stray omap
record as-is, but all the other omaps should be properly converted at
this point. I presume this should eliminate the "legacy omap" warning
for this specific OSD. Isn't that the case?
Any suggestions?
Neale Pickett <neale(a)lanl.gov>
A-4: Advanced Research in Cyber Systems
Los Alamos National Laboratory
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io