[ceph-users] Re: PGS INCONSISTENT - read_error - replace disk or pg repair then replace disk

23 May 2020

When I see this problem, this is what I usually do:

- I run pg repair
- I remove the OSD from the cluster
- I replace the disk
- I recreate the OSD on the new disk
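
As a rough sketch, the four steps above might map onto commands like the ones this small Python helper prints. It only builds the command strings and does not run them. The helper name `replacement_plan`, the PG id `2.5`, the OSD id `12` and the device `/dev/sdX` are all hypothetical placeholders, and the sequence assumes a non-containerized, systemd-managed cluster whose OSDs were created with `ceph-volume`; adapt it to your own deployment.

```python
from typing import List


def replacement_plan(pg_id: str, osd_id: int, new_device: str) -> List[str]:
    """Return, in order, the shell commands for a repair-then-replace sequence.

    Assumption: non-containerized OSDs managed by systemd and ceph-volume.
    """
    return [
        # 1. Ask Ceph to repair the inconsistent placement group.
        f"ceph pg repair {pg_id}",
        # 2. Mark the OSD out so its data is re-replicated elsewhere. Wait for
        #    recovery to finish, confirm the OSD is safe to destroy, then stop
        #    the daemon and remove the OSD from the cluster.
        f"ceph osd out {osd_id}",
        f"ceph osd safe-to-destroy osd.{osd_id}",
        f"systemctl stop ceph-osd@{osd_id}",
        f"ceph osd purge {osd_id} --yes-i-really-mean-it",
        # 3. The physical disk swap happens here; there is no command for it.
        # 4. Create a new OSD on the replacement device.
        f"ceph-volume lvm create --data {new_device}",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the real PG id, OSD id and device.
    for command in replacement_plan("2.5", 12, "/dev/sdX"):
        print(command)
```

Printing the plan instead of executing it leaves room to check cluster health (for example with `ceph -s`) between steps, which matters most between marking the OSD out and purging it.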

Cheers, Massimo

On Wed, May 20, 2020 at 9:41 PM Peter Lewis <plewis(a)kdinfotech.com> wrote:

  Hello,

 I came across a section of the documentation that I don't quite
 understand.  In the section about inconsistent PGs it says if one of the
 shards listed in `rados list-inconsistent-obj` has a read_error the disk is
 probably bad.

 Quote from documentation:

https://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/…
 `If read_error is listed in the errors attribute of a shard, the
 inconsistency is likely due to disk errors. You might want to check your
 disk used by that OSD.`
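
To illustrate the check the documentation describes, here is a minimal Python sketch that scans the JSON printed by `rados list-inconsistent-obj <pgid> --format=json-pretty` and reports which OSDs hold a shard with `read_error`. The function name `osds_with_read_error` is hypothetical, and the sample below is hand-written in the general shape of that output (an `inconsistents` list whose entries carry `object` and `shards`); it is not data from a real cluster.

```python
import json
from typing import Dict, List


def osds_with_read_error(report: dict) -> Dict[int, List[str]]:
    """Map each OSD id to the object names whose shard on it lists read_error."""
    bad: Dict[int, List[str]] = {}
    for item in report.get("inconsistents", []):
        name = item.get("object", {}).get("name", "?")
        for shard in item.get("shards", []):
            if "read_error" in shard.get("errors", []):
                bad.setdefault(shard["osd"], []).append(name)
    return bad


# Hand-written sample (assumption), trimmed to the fields used above.
sample = json.loads("""
{
  "epoch": 100,
  "inconsistents": [
    {
      "object": {"name": "obj_a", "snap": "head"},
      "errors": [],
      "union_shard_errors": ["read_error"],
      "shards": [
        {"osd": 0, "primary": true,  "errors": []},
        {"osd": 1, "primary": false, "errors": []},
        {"osd": 2, "primary": false, "errors": ["read_error"]}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    print(osds_with_read_error(sample))
```

An OSD that shows up repeatedly in this mapping is a natural candidate for a closer look with `smartctl`.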

 I determined that the disk is bad by looking at the output of smartctl.  I
 would think that replacing the disk by removing the OSD from the cluster
 and allowing the cluster to recover would fix this inconsistency error
 without having to run `ceph pg repair`.

 Can I just replace the OSD and the inconsistency will be resolved by the
 recovery?  Or would it be better to run `ceph pg repair` and then replace
 the OSD associated with that bad disk?

 Thanks!
 _______________________________________________
 ceph-users mailing list -- ceph-users(a)ceph.io
 To unsubscribe send an email to ceph-users-leave(a)ceph.io
