Hi Sake,
I would start by decrementing max_mds by 1:
ceph fs set atlassian-prod max_mds 2
Does mds.1 no longer restart?
Can you share its logs?
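If the flag turns out to be stale, or once the underlying cause is fixed, the usual sequence is to inspect the damage and then clear the damaged flag so a standby can take over rank 1. A rough sketch with standard ceph CLI commands (the daemon name below is taken from your `ceph fs get` output; adjust to whichever daemon is up):

```shell
# List what the MDS has recorded as damaged
# (ask a daemon that currently holds a rank):
ceph tell mds.atlassian-prod.pwsoel13142.egsdfl damage ls

# Check recent cluster log entries around the time rank 1 was marked damaged:
ceph log last 100

# Clear the damaged flag on rank 1 so a standby can be assigned to it:
ceph mds repaired atlassian-prod:1

# Watch the ranks progress through resolve/rejoin back to active:
ceph fs status atlassian-prod
```

Only run `ceph mds repaired` after understanding what `damage ls` reports; if the metadata itself is corrupt, clearing the flag alone will just re-damage the rank.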
On Thu, Dec 21, 2023 at 08:11, Sake Ceph <ceph(a)paulusma.eu> wrote:
Starting a new thread; I forgot the subject in the
previous one.
So our FS is down. We got the following error, what can I do?
# ceph health detail
HEALTH_ERR 1 filesystem is degraded; 1 mds daemon damaged
[WRN] FS_DEGRADED: 1 filesystem is degraded
fs atlassian-prod is degraded
[ERR] MDS_DAMAGE: 1 mds daemon damaged
fs atlassian-prod mds.1 is damaged
# ceph fs get atlassian-prod
Filesystem 'atlassian-prod' (2)
fs_name atlassian-prod
epoch 43440
flags 32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
created 2023-05-10T08:45:46.911064+0000
modified 2023-12-21T06:47:19.291154+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 29480
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline
data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 3
in 0,1,2
up {0=1073573,2=1073583}
failed
damaged 1
stopped
data_pools [5]
metadata_pool 4
inline_data disabled
balancer
standby_count_wanted 1
[mds.atlassian-prod.pwsoel13142.egsdfl{0:1073573} state up:resolve seq 573 join_fscid=2 addr [v2:10.233.127.22:6800/61692284,v1:10.233.127.22:6801/61692284] compat {c=[1],r=[1],i=[7ff]}]
[mds.atlassian-prod.pwsoel13143.qlvypn{2:1073583} state up:resolve seq 571 join_fscid=2 addr [v2:10.233.127.18:6800/3627858294,v1:10.233.127.18:6801/3627858294] compat {c=[1],r=[1],i=[7ff]}]
Best regards,
Sake
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io