[ceph-users] mds readonly, mds all down

27 Feb 2023

Hello. We are trying to resolve an issue with Ceph. Our OpenShift cluster is blocked, and we
have tried almost everything.
The current state is:
MDS_ALL_DOWN: 1 filesystem is offline
MDS_DAMAGE: 1 mds daemon damaged
FS_DEGRADED: 1 filesystem is degraded
MON_DISK_LOW: mon be is low on available space
RECENT_CRASH: 1 daemons have recently crashed
We tried running the following commands:
cephfs-journal-tool --rank=gml-okd-cephfs:all event recover_dentries summary
cephfs-journal-tool --rank=gml-okd-cephfs:all journal reset
cephfs-table-tool gml-okd-cephfs:all reset session
ceph mds repaired 0
ceph config rm mds mds_verify_scatter
ceph config rm mds mds_debug_scatterstat
ceph tell gml-okd-cephfs scrub start / recursive repair force

After these commands the MDS comes up, but a new error appears:
MDS_READ_ONLY: 1 MDSs are read only

We also tried creating a new fs with a new metadata pool, and deleting and recreating the old fs
under the same name with the old and the new metadata pool.
That got rid of the errors, but the OpenShift cluster would not work with the old
persistent volumes. The pods reported that they could not find the volume, even though it was
present and, moreover, was associated with a PVC.

We have now rolled back the cluster and are trying to clear the MDS error. Any ideas what
to try?
Thanks
