Hi Weiwen,
Yes, it is an EC 4+2 pool. Should I do "osd out" on the affected OSDs before
doing the procedure you mentioned? Do you mean taking the affected OSDs down
one by one, doing the procedure, and then bringing each one up again? If I
take all of them down at once, I'm afraid this will impact other PGs that
have the same OSD members. Would you mind giving me a safe step-by-step
procedure? I don't mind losing this PG, since that is the risk, but I need
to avoid I/O freezes while recovering the RBD images that have objects
inside this PG; this pool is an RBD data pool.
Best regards,
On Fri, May 7, 2021 at 10:17 PM 胡玮文 <huww98(a)outlook.com> wrote:
On 2021/5/7 at 6:46 PM, Lazuardi Nasution wrote:
Hi,
After recreating some related OSDs (3, 71 and 237), the acting set is now
normal, but the PG is now incomplete and there are slow ops on the primary
OSD (3). I have tried to make it normal with the
osd_find_best_info_ignore_history_les option, but the PG is still
incomplete. In this condition, client I/O sometimes freezes; I suspect that
the blocks inside this PG cause the freeze. How can I resolve this
incomplete PG, or at least keep client I/O from freezing so that I can
recover the rest of the normal blocks, like recovering a drive with bad
sectors?
Best regards,
Hi Lazuardi,
I assume this PG is in an EC 4+2 pool, so you can lose at most 2 OSDs? You
have now wiped the data of 3 OSDs, so this PG does not have enough
information to recover its content.
If that is the case, I guess that unless you can recover the data from some
of these recreated OSDs, you cannot recover the content of this PG. Your
best choice may be to delete all objects in it (with "ceph-objectstore-tool
--op remove", then "ceph osd force-create-pg", I believe). Be aware of
the data loss.
Weiwen Hu
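[Editor's note] The EC 4+2 reasoning in Weiwen's reply can be sketched as simple arithmetic: a k+m erasure-coded PG stores k data shards plus m coding shards, and needs at least k of them to rebuild its objects. The function name below is hypothetical and only illustrates the counting; it is not a Ceph API.

```python
def ec_can_recover(k: int, m: int, lost_shards: int) -> bool:
    """Return True if a k+m erasure-coded PG still has enough shards.

    Assumption: each lost OSD held exactly one shard of this PG, so
    losing an OSD means losing one shard.
    """
    total = k + m
    if not 0 <= lost_shards <= total:
        raise ValueError("lost_shards must be between 0 and k+m")
    surviving = total - lost_shards
    # Any k surviving shards are enough to reconstruct the data.
    return surviving >= k


if __name__ == "__main__":
    K, M = 4, 2  # the EC 4+2 profile discussed in this thread
    for lost in range(4):
        status = "recoverable" if ec_can_recover(K, M, lost) else "not recoverable"
        print(f"{lost} shard(s) lost: {status}")
```

With 4+2, losing two shards is still survivable, while wiping three OSDs that each held a shard of the same PG (3, 71 and 237 here) leaves only three shards, one fewer than the four needed.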
On Wed, May 5, 2021 at 12:29 AM Lazuardi Nasution <mrxlazuardin(a)gmail.com> wrote:
Hi,
Suddenly we have a recovery_unfound situation. I find that the PG acting set
is missing some OSDs that are up. Why can't OSDs 3 and 71 in the following
PG query result be members of the PG acting set? We currently use v15.2.8.
How can we recover from this situation?
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+forced_recovery+recovery_unfound+undersized+degraded+remapped",
    "epoch": 237505,
    "up": [
        3,
        237,
        71,
        132,
        115,
        56
    ],
    "acting": [
        2147483647,
        237,
        2147483647,
        132,
        115,
        56
    ],
    "backfill_targets": [
        "3(0)",
        "71(2)"
    ],
    "acting_recovery_backfill": [
        "3(0)",
        "56(5)",
        "71(2)",
        "115(4)",
        "132(3)",
        "237(1)"
    ],
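[Editor's note] As a rough aid for reading the query output above, the sketch below compares the "up" and "acting" lists by shard position. The value 2147483647 (0x7fffffff) is the placeholder Ceph prints when a slot has no OSD assigned. The helper name is hypothetical, and the lists are copied by hand from the fragment above because the quoted JSON is truncated and cannot be parsed as-is.

```python
NO_OSD = 2147483647  # 0x7fffffff: placeholder for "no OSD in this slot"


def missing_shards(up, acting):
    """Return (shard_index, up_osd) pairs whose acting slot is empty."""
    return [
        (shard, up_osd)
        for shard, (up_osd, acting_osd) in enumerate(zip(up, acting))
        if acting_osd == NO_OSD
    ]


if __name__ == "__main__":
    # Values copied from the "up" and "acting" arrays in the query result.
    up = [3, 237, 71, 132, 115, 56]
    acting = [NO_OSD, 237, NO_OSD, 132, 115, 56]

    for shard, osd in missing_shards(up, acting):
        print(f"shard {shard}: osd.{osd} is up but not in the acting set")
```

For this PG the empty slots are shards 0 and 2, which correspond to OSDs 3 and 71. These are the same entries listed under "backfill_targets" as "3(0)" and "71(2)".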
Best regards.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io