Hi,
We had a major crash that left about 1/3 of our OSDs down.
While trying to fix it, we reinstalled a few of the down OSDs (that was a
mistake, I agree), which destroyed the data on them.
In the end we were able to fix the problem (thanks to Igor Fedotov) and
restart almost all of our OSDs, except one whose RocksDB seems to be
corrupted (in at least one file).
Unfortunately, we now have 4 PGs down (all involving the dead OSD) and 8
PGs incomplete (some of them also involving the down OSD).
Before accepting data loss, we would like to try to restart the down
OSDs, hoping to recover the down PGs and maybe some of the incomplete ones.
Does anyone have an idea how to do that (maybe by removing the file
that corrupts the RocksDB, or by forcing it to ignore the bad data)?
If that is not possible, how can we fix (even with data loss) the down
and incomplete PGs?
Thanks for your advice.
F.
Francois,
I have never tried this myself, but I recall it is possible to
export/import a PG using ceph-objectstore-tool. There are probably some
examples in this mailing list...
Your broken OSD passes fsck, i.e. it works fine in read-only mode.
Unfortunately, AFAIK export does a regular mount (i.e. opens in R/W), but
this is definitely worth a try.
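
As a rough, untested sketch of what such an export/import could look
like: the small Python helper below only builds and prints the
ceph-objectstore-tool command lines, it does not run them. The OSD ids
(12 and 30), the PG id (5.1f), the export file path, and the
/var/lib/ceph/osd/ceph-<id> data path are placeholders and assumptions,
not values from this cluster. Both the source and the target OSD have to
be stopped while the tool runs.

```python
import shlex

TOOL = "ceph-objectstore-tool"


def data_path(osd_id):
    # Assumption: conventional data directory of a non-containerized OSD.
    return f"/var/lib/ceph/osd/ceph-{osd_id}"


def list_pgs_cmd(osd_id):
    # List the PGs present on a (stopped) OSD.
    return [TOOL, "--data-path", data_path(osd_id), "--op", "list-pgs"]


def export_cmd(osd_id, pgid, out_file):
    # Export one PG from the (stopped) broken OSD into a file.
    return [TOOL, "--data-path", data_path(osd_id),
            "--op", "export", "--pgid", pgid, "--file", out_file]


def import_cmd(osd_id, in_file):
    # Import a previously exported PG into another (stopped) OSD.
    return [TOOL, "--data-path", data_path(osd_id),
            "--op", "import", "--file", in_file]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    broken_osd, target_osd = 12, 30
    pgid, dump = "5.1f", "/mnt/backup/5.1f.export"
    for cmd in (list_pgs_cmd(broken_osd),
                export_cmd(broken_osd, pgid, dump),
                import_cmd(target_osd, dump)):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before running
anything against an OSD that is already damaged.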
Thanks,
Igor
On 5/2/2020 1:37 AM, Francois Legrand wrote: