Hi Xiubo,
I seem to have gotten your e-mail twice.
It's a very old kclient. It was in that state when I came to work in the morning and I
looked at it in the afternoon; I was hoping the problem would clear by itself.
It was probably a compute job that crashed it; it's a compute node in our HPC cluster. I
didn't look at dmesg, but I can take a look at the MDS log when I'm back at work. I
guess you are interested in the time before the warning started appearing.
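When I'm at the node, this is roughly what I plan to look at on the client side; the debugfs paths are from memory and assume debugfs is mounted:

# kernel messages from the ceph modules
dmesg -T | grep -iE "ceph|libceph"
# in-flight MDS requests of the kernel client
cat /sys/kernel/debug/ceph/*/mdsc
# pending OSD requests, for completeness
cat /sys/kernel/debug/ceph/*/osdc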
The MDS was probably stuck for 5-10 minutes. This is very long on our cluster; an MDS
failover usually takes only about 30s. Also, I couldn't see any progress in the MDS log,
which is why I decided to fail the rank a second time.
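In case it matters, the fail itself was nothing special; schematically (the rank number is just the one from the log below):

# fail the stuck rank so a standby takes over
ceph mds fail 2
# then watch the replacement walk through the states
ceph -s
ceph health detail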
During the time it was stuck in rejoin it didn't log any messages other than the MDS
map update messages. I don't think it was doing anything at all.
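Next time I will try to confirm that via the admin socket on the MDS host; something along these lines should show whether the daemon is actually waiting on anything (the daemon name is ours, adjust as needed):

# any MDS-internal operations in flight?
ceph daemon mds.ceph-10 dump_ops_in_flight
# is it waiting on RADOS, e.g. for journal or cache reads?
ceph daemon mds.ceph-10 objecter_requests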
Some background: before I started recovery, I found an old post stating that the
MDS_CLIENT_OLDEST_TID warning is quite serious, because the affected MDS will experience
growing cache allocation and eventually fail in a practically irrecoverable state. I did
not observe unusual RAM consumption, and there were no MDS large-cache messages either.
It seems our situation was of a more harmless nature. Still, the fail did not go
entirely smoothly.
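For completeness, this is roughly how I checked for the cache-growth scenario (paraphrased from memory):

# cache usage of the active MDS versus its reservation
ceph daemon mds.ceph-10 cache status
# would have shown MDS_CACHE_OVERSIZED if the cache had run away
ceph health detail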
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Xiubo Li <xiubli@redhat.com>
Sent: Friday, July 21, 2023 1:32 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: MDS stuck in rejoin
On 7/20/23 22:09, Frank Schilder wrote:
> Hi all,
> we had a client with the warning "[WRN] MDS_CLIENT_OLDEST_TID: 1 clients failing to
> advance oldest client/flush tid". I looked at the client and there was nothing going
> on, so I rebooted it. After the client was back, the message was still there. To clean
> this up I failed the MDS. Unfortunately, the MDS that took over remained stuck in
> rejoin without doing anything. All that happened in the log was:
BTW, are you using the kclient or the user-space client? How long was the
MDS stuck in the rejoin state?
This means that on the client side the oldest client request has been stuck
too long; maybe under heavy load too many requests were generated in a
short time and the oldest request was stuck in the MDS for too long.
> [root@ceph-10 ceph]# tail -f ceph-mds.ceph-10.log
> 2023-07-20T15:54:29.147+0200 7fedb9c9f700 1 mds.2.896604 rejoin_start
> 2023-07-20T15:54:29.161+0200 7fedb9c9f700 1 mds.2.896604 rejoin_joint_start
> 2023-07-20T15:55:28.005+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896614 from mon.4
> 2023-07-20T15:56:00.278+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896615 from mon.4
> [...]
> 2023-07-20T16:02:54.935+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896653 from mon.4
> 2023-07-20T16:03:07.276+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896654 from mon.4
Did you see any slow request logs in the MDS log files? And any other
suspect logs in dmesg, if it's a kclient?
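Something like this should be enough to check, assuming the default log location:

# any slow request warnings in the MDS logs around that time?
grep -i "slow request" /var/log/ceph/ceph-mds.*.log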
> After some time I decided to give another fail a try
> and, this time, the replacement daemon went to active state really fast.
> If I have a message like the above, what is the clean way of getting the client clean
> again (version: 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable))?
I think your steps are correct.
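If rebooting the node is inconvenient next time, evicting the session on the MDS side is an alternative, roughly like this (the client id is the one reported by 'ceph health detail'; note that eviction blocklists the client by default):

# list sessions to find the reported client
ceph tell mds.0 client ls
# drop the stale session state on the MDS
ceph tell mds.0 client evict id=<client-id>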
Thanks
- Xiubo
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io