While trying to speed up the long tail of backfills that results from
marking a failing OSD out, I began looking at my PGs to see if I could
tune some settings, and noticed the following:
Scenario: on a 12.2.12 cluster, I am alerted to an inconsistent PG and to
SMART failures on one of its OSDs. I inspect that PG and see that the
inconsistency is a read_error from the SMART-failing OSD.
Steps I take: set the primary affinity of the failing OSD to 0 (my thinking
being that I don't want a failing drive to be responsible for backfilling
data), wait for peering to complete, then mark the OSD out. At this point
backfill begins.
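
For reference, the steps above map onto two CLI calls. Below is a minimal
Python sketch that only builds the argument lists and prints them; nothing is
run against a cluster. The function name `drain_commands` and the OSD id 12
are hypothetical, chosen for illustration.

```python
def drain_commands(osd_id):
    """Return the ceph CLI invocations for the steps above, in order.

    This only builds argument lists; nothing is executed here.
    """
    return [
        # Step 1: set primary affinity to 0 on the failing OSD.
        ["ceph", "osd", "primary-affinity", "osd.%d" % osd_id, "0"],
        # Step 2 is manual: wait for peering to complete before continuing.
        # Step 3: mark the OSD out, which is when backfill begins.
        ["ceph", "osd", "out", str(osd_id)],
    ]


# Hypothetical OSD id, for illustration only.
for cmd in drain_commands(12):
    print(" ".join(cmd))
```

The lists could be handed to `subprocess.run` one at a time, with the
peering wait done by hand in between.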
90% of the PGs complete backfill very quickly. Towards the tail end of the
backfill I have 20 PGs or so in backfill_wait and 1 backfilling (presumably
because osd_max_backfills = 1).
I run `ceph pg ls backfill_wait` and notice that for 100% of the tail-end
PGs, every OSD in the up_set differs from those in the acting_set, and the
acting_primary is the OSD that was set to primary affinity 0 and marked
out.
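
The check I did by eye can be sketched in Python as follows. This is a
minimal sketch, assuming the JSON form of the listing (`-f json`) is a list
of per-PG objects with `pgid`, `up`, `acting` and `acting_primary` fields;
those field names and the list shape are assumptions and may differ between
releases. The sample data and the function name `stuck_behind_osd` are
hypothetical.

```python
import json


def stuck_behind_osd(pg_stats, osd_id):
    """Return pgids whose up and acting sets share no OSD and whose
    acting primary is osd_id."""
    hits = []
    for pg in pg_stats:
        # True when no OSD appears in both the up set and the acting set.
        disjoint = set(pg["up"]).isdisjoint(pg["acting"])
        if disjoint and pg["acting_primary"] == osd_id:
            hits.append(pg["pgid"])
    return hits


# Hypothetical sample; field names are assumed, not taken from a real cluster.
sample = json.loads("""
[
  {"pgid": "1.a", "up": [3, 4, 5], "acting": [12, 7, 8], "acting_primary": 12},
  {"pgid": "1.b", "up": [3, 7, 8], "acting": [12, 7, 8], "acting_primary": 12},
  {"pgid": "1.c", "up": [1, 2, 6], "acting": [9, 10, 11], "acting_primary": 9}
]
""")

print(stuck_behind_osd(sample, 12))
```

In the sample only PG 1.a matches: 1.b shares OSDs between its up and acting
sets, and 1.c has a different acting primary.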
My questions are the following:
- Upon learning that a disk has failed SMART and has an inconsistent PG, I
want to prevent its potentially corrupt data from being replicated out to
other OSDs, even for PGs that have not yet been discovered to be
inconsistent, so I set its primary affinity to 0. At this step, shouldn't
the acting_primary become another OSD from the acting_set, and the backfill
data be copied from a different OSD?
- Should I additionally be marking the OSD down? That would cause the PGs to
go degraded until backfill finishes, but it would presumably finish faster,
as more OSDs would become the acting_primary and I wouldn't be throttled by
osd_max_backfills. My thinking is that it's best to avoid degraded PGs, as
I do not want to drop below min_size.
I recognize some of these things may be different in Nautilus, but I am
waiting on the 14.2.6 release as I am aware of some bugs I do not want to
contend with. Thanks.
Respectfully,
*Wes Dillingham*
wes(a)wesdillingham.com
LinkedIn <http://www.linkedin.com/in/wesleydillingham>