[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

2 Nov 2020

...
  But there can be an on-chip disk controller on the
motherboard, I'm not sure. 
There is always some kind of controller. Could be on-board. Usually, the cache settings
are accessible when booting into the BIOS set-up.

...
  If your worry is fsync persistence 
No, what I worry about is the volatile write cache, which is usually enabled by default.
This cache exists on the disk as well as on the controller. To avoid losing writes on
power failure, the controller needs to be in write-through mode and the disk write cache
disabled. The latter can be done with smartctl, the former in the BIOS setup.
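
As a minimal sketch of the smartctl step, assuming a reasonably recent smartmontools and
a SATA disk at the placeholder path /dev/sdX (the function name disable_write_cache is
made up for illustration, it is not part of any tool):

```shell
# Hypothetical helper: query, disable, and re-query the volatile write cache
# of one disk. Needs root. The device path is passed as the first argument.
disable_write_cache() {
    disk="$1"
    # Show the current write cache setting (read-only query).
    smartctl -g wcache "$disk"
    # Turn the on-disk volatile write cache off.
    smartctl -s wcache,off "$disk"
    # Query again to confirm the change took effect.
    smartctl -g wcache "$disk"
}

# Usage (placeholder device name, replace with the real OSD disk):
#   disable_write_cache /dev/sdX
```

Whether the setting survives a power cycle depends on the drive, so it is worth repeating
the -g query after a reboot.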

Did you test power failure? If so, how often? On how many hosts simultaneously? Pulling
network cables will not trigger cache-related problems. The problem with write cache is
that you rely on a lot of bells and whistles, some of which usually fail. With Ceph, this
will lead to exactly the problem you are observing now.

Your pool configuration looks OK. You need to find out where exactly the scrub errors are
situated. It looks like metadata damage and you might lose some data. Be careful to do
only read-only admin operations for now.
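
A sketch of read-only commands that can help locate the errors, assuming the affected PG
is 3.b as in the messages below and that the scrub results are still recent (the function
name show_scrub_errors is made up for illustration):

```shell
# Hypothetical helper: print cluster health and the per-object inconsistency
# report for one placement group. Both commands only read state.
show_scrub_errors() {
    pgid="$1"
    # Lists the inconsistent PGs and the number of scrub errors.
    ceph health detail
    # Per-object, per-shard details recorded by the last scrub of this PG.
    rados list-inconsistent-obj "$pgid" --format=json-pretty
}

# Usage:
#   show_scrub_errors 3.b
```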

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Sagara Wijetunga <sagarawmw(a)yahoo.com>
Sent: 02 November 2020 16:08:58
To: ceph-users(a)ceph.io; Frank Schilder
Subject: Re: [ceph-users] Re: How to recover from
active+clean+inconsistent+failed_repair?

...
  Hmm, I'm getting a bit confused. Could you also
send the output of "ceph osd pool ls detail". 
File ceph-osd-pool-ls-detail.txt attached.

...
  Did you look at the disk/controller cache settings?

I don't have disk controllers on the Ceph machines. The hard disk is directly attached to
the motherboard via a SATA cable. But there can be an on-chip disk controller on the
motherboard, I'm not sure.

If your worry is fsync persistence, I have thoroughly tested database fsync reliability on
Ceph RBD with hundreds of transactions per second, removing the network cable, restarting
the database machine, etc. while inserts were going on, and I did not lose a single
transaction. I simulated this many times and persistence on my Ceph cluster was perfect
(i.e. not a single loss).

...
  I think you should start a deep-scrub with "ceph
pg deep-scrub 3.b" and record the output of "ceph -w | grep '3\.b'"
(note the single quotes). 
...
  The error messages you included in one of your first
e-mails are only on 1 out of 3 scrub errors (3 lines for 1 error). We need to find all 3
errors. 
I ran "ceph pg deep-scrub 3.b" again; here is the whole output of ceph -w:

2020-11-02 22:33:48.224392 osd.0 [ERR] 3.b shard 2 soid
3:d577e975:::1000023675e.00000000:head : candidate had a missing snapset key, candidate
had a missing info key

2020-11-02 22:33:48.224396 osd.0 [ERR] 3.b soid 3:d577e975:::1000023675e.00000000:head :
failed to pick suitable object info

2020-11-02 22:35:30.087042 osd.0 [ERR] 3.b deep-scrub 3 errors

Btw, I'm very grateful for your perseverance on this.

Best regards

Sagara
