I tried that (and just tried again by setting it in /etc/ceph/ceph.conf). OSD still won’t
start.
Dr. T.J. Ragan
Senior Research Computation Officer
Leicester Institute of Structural and Chemical Biology
University of Leicester, University Road, Leicester LE1 7RH, UK
t: +44 (0)116 223 1287
e: TJ.Ragan@leicester.ac.uk
w: www.le.ac.uk/liscb
On 31 Jan 2020, at 14:44, Paul Emmerich
<paul.emmerich@croit.io> wrote:
If you don't care about the data: set
osd_find_best_info_ignore_history_les = true on the affected OSDs
temporarily.
This means losing data.
For anyone else reading this: don't ever use this option. It's evil
and causes data loss (but gets your PG back and active, yay!)
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at
https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Fri, Jan 31, 2020 at 3:14 PM Ragan, Tj (Dr.)
<tj.ragan(a)leicester.ac.uk> wrote:
Hi All,
Long story short, we’re doing disaster recovery on a CephFS cluster, and are at a point
where we have 8 PGs stuck incomplete. Just before the disaster, I increased pg_num
on two of the pools, and they had not finished increasing pgp_num yet. I’ve since
forced pgp_num to the current values.
So far, I’ve tried mark_unfound_lost, but the PGs don’t report any unfound objects, and I’ve
tried force-create-pg, but that has no effect except on one of the PGs, which went to
creating+incomplete. During the disaster recovery, I had to re-create several OSDs (due
to unreadable superblocks), and now one of the new OSDs, as well as one of the existing
OSDs, won’t start. The log from the startup of osd.29 is here:
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpastebin.…p;reserved=0,
which seems to indicate that it won’t start because it’s supposed to have copies of the
incomplete placement groups.
ceph pg 5.38 query (one of the incomplete PGs) gives:
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpastebin.…
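For readers without access to the pastebin, here is a minimal sketch of how the query output can be summarised to see which OSDs peering is still waiting on. It assumes the JSON was saved with something like `ceph pg 5.38 query > pg-5.38.json` and that the `recovery_state` entries carry the `probing_osds`, `down_osds_we_would_probe` and `peering_blocked_by` keys; the function name `peering_blockers` is hypothetical, and the exact fields can vary between Ceph releases.

```python
import json


def peering_blockers(pg_query: dict) -> dict:
    """Summarise which OSDs a PG is probing or blocked on.

    `pg_query` is the parsed JSON from `ceph pg <pgid> query`.
    Missing keys are treated as empty, since not every
    recovery_state entry carries them.
    """
    summary = {
        "state": pg_query.get("state"),
        "probing": [],
        "down": [],
        "blocked_by": [],
    }
    for entry in pg_query.get("recovery_state", []):
        # Values are kept as reported (probing_osds may be strings
        # such as "29", or "29(0)" on erasure-coded pools).
        summary["probing"].extend(entry.get("probing_osds", []))
        summary["down"].extend(entry.get("down_osds_we_would_probe", []))
        summary["blocked_by"].extend(
            blocker.get("osd") for blocker in entry.get("peering_blocked_by", [])
        )
    return summary


# Example use (file name is an assumption):
#   with open("pg-5.38.json") as fh:
#       print(peering_blockers(json.load(fh)))
```

Any OSD that shows up under "down" or "blocked_by" is one the PG still expects to hear from before it can peer.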
I have hunted around in the OSDs listed for all the placement groups for any sign of a PG
that I could mark as complete with ceph-objectstore-tool, but can’t find any. I don’t
care about the data in the PGs, but I can’t abandon the filesystem.
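A minimal sketch of that search, for anyone repeating it: it runs `ceph-objectstore-tool --op list-pgs` against each OSD and reports which ones hold a copy of the wanted PGs. It assumes the OSDs are stopped (the tool needs exclusive access to the store) and that the data directories follow the default `/var/lib/ceph/osd/ceph-<id>` layout. The helper names `list_pgs_cmd` and `osds_holding` are hypothetical. On erasure-coded pools the listed ids carry a shard suffix such as `5.38s0`, which this exact-match sketch does not handle.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def list_pgs_cmd(osd_id: int) -> List[str]:
    """Build the list-pgs command for one OSD (default data path assumed)."""
    return [
        "ceph-objectstore-tool",
        "--data-path", f"/var/lib/ceph/osd/ceph-{osd_id}",
        "--op", "list-pgs",
    ]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if it exits non-zero."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def osds_holding(
    pgids: Iterable[str],
    osd_ids: Iterable[int],
    runner: Callable[[List[str]], str] = run,
) -> Dict[int, List[str]]:
    """Map each OSD id to the wanted PG ids found in its object store."""
    wanted = set(pgids)
    found: Dict[int, List[str]] = {}
    for osd_id in osd_ids:
        # list-pgs prints one PG id per line.
        present = set(runner(list_pgs_cmd(osd_id)).split())
        hits = sorted(wanted & present)
        if hits:
            found[osd_id] = hits
    return found


# Example use on a host with stopped OSDs 3 and 29:
#   print(osds_holding(["5.38"], [3, 29]))
```

If a copy does turn up, it is the candidate for `ceph-objectstore-tool --op mark-complete --pgid <pgid>`. An empty result matches what is described above: no surviving copy to mark.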
Any help would be greatly appreciated.
-TJ Ragan
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io