Hello Magnus,
On Thu, Sep 3, 2020 at 11:55 PM Magnus HAGDORN <Magnus.Hagdorn(a)ed.ac.uk> wrote:
Hi there,
we reconfigured our ceph cluster yesterday to remove the cluster
network and things didn't quite go to plan. I am trying to figure out
what went wrong and also what to do next.
We are running nautilus 14.2.10 on Scientific Linux 7.8.
So, we are using a mixture of RBDs and cephfs. For the transition we
switched off all machines that are using the RBDs and switched off the
cephfs using
ceph fs set one down true
Once no more MDS were running we reconfigured ceph to remove the
cluster network and set various flags
ceph osd set noout
ceph osd set nodown
ceph osd set pause
ceph osd set nobackfill
ceph osd set norebalance
ceph osd set norecover
We then restarted the OSDs one host at a time. During this process ceph
was mostly happy, except for two PGs. After all OSDs had been restarted
we switched off the cluster network switches to make sure it was
totally gone. ceph was still happy. The PG error also disappeared. We
then unset all those flags and re-enabled the cephfs.
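For reference, unsetting the flags mirrors the `ceph osd set` commands above; a minimal sketch (using "one" as the fs name, as earlier in this thread) would be:

```shell
# Clear the maintenance flags set before the OSD restarts
# (the reverse of the "ceph osd set" commands above).
ceph osd unset noout
ceph osd unset nodown
ceph osd unset pause
ceph osd unset nobackfill
ceph osd unset norebalance
ceph osd unset norecover

# Bring the filesystem back up ("one" is the fs name used above).
ceph fs set one down false
```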
We then switched on the servers using the RBDs with no issues. So far
so good.
We then started using the cephfs (we keep VM images on the cephfs). The
MDS were showing an error. I restarted the MDS but they didn't come
back. We then followed the instructions here:
https://docs.ceph.com/docs/nautilus/cephfs/disaster-recovery-experts/#disas…
up to truncating the journal. The MDS started again. However, as soon
as we started writing to the cephfs, the MDS crashed. A scrub of the
cephfs revealed backtrace damage.
I'm confused why you started the disaster recovery procedure, since
the procedure you followed should have resulted in no damage to the
PGs (and subsequently CephFS). It'd be helpful to know what the
original error was.
Backtrace damage is usually resolved with a scrub.
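On Nautilus a repairing scrub can be started through the admin socket of the active MDS; a sketch (the daemon name "mds.a" is illustrative, substitute your active MDS):

```shell
# Ask the active MDS to scrub the whole tree recursively and repair
# any damaged backtraces it finds along the way.
ceph daemon mds.a scrub_path / recursive repair

# Check overall cluster health and any remaining damage entries.
ceph status
ceph tell mds.a damage ls
```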
We have now followed the remaining steps of the disaster recovery
procedure and are waiting for the cephfs-data-scan scan_extents to
complete.
It would be really helpful if you could give an indication of how long
this process will take (we have ~40TB in our cephfs) and how many
workers to use.
I don't have any recent data on how long it could take but you might
try using at least 8 workers.
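To parallelize cephfs-data-scan, you run one instance per worker, each claiming its shard via the --worker_n/--worker_m flags; a sketch with 8 workers ("cephfs_data" is a placeholder for your data pool name):

```shell
# Launch 8 scan_extents workers in parallel; each instance processes
# shard n of m. All instances must use the same --worker_m value.
for n in $(seq 0 7); do
  cephfs-data-scan scan_extents --worker_n $n --worker_m 8 cephfs_data &
done
wait   # block until every worker has finished its shard
```

The same pattern applies to the subsequent scan_inodes phase; do not start it until every scan_extents worker has completed.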
The other missing bit of documentation is the cephfs scrubbing. Is
that something we should run routinely?
CephFS scrubbing is usually done when something goes wrong or backing
metadata needs to be updated as part of an upgrade (e.g. the Mimic
snapshot format change). It's not considered necessary to do it on
a routine basis. RADOS PG scrubbing is sufficient for ensuring that
the backing data is routinely checked for correctness/redundancy.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D