[ceph-users] Re: Upgrade from 17.2.5 to 17.2.6 stuck at MDS

10 Apr 2023

I did what you told me.

I also see in the log, that the command went through:

2023-04-10T19:58:46.522477+0000 mgr.ceph04.qaexpv [INF] Schedule redeploy daemon mds.mds01.ceph06.rrxmks
2023-04-10T20:01:03.360559+0000 mgr.ceph04.qaexpv [INF] Schedule redeploy daemon mds.mds01.ceph05.pqxmvt
2023-04-10T20:01:21.787635+0000 mgr.ceph04.qaexpv [INF] Schedule redeploy daemon mds.mds01.ceph07.omdisd

But the MDS daemons never start; they stay in error state. I tried to
redeploy and start them a few times, and even restarted one host where
an MDS should run.

mds.mds01.ceph03.xqwdjy  ceph03  error  32m ago  2M   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph04.hcmvae  ceph04  error  31m ago  2h   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph05.pqxmvt  ceph05  error  32m ago  9M   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph06.rrxmks  ceph06  error  32m ago  10w  -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph07.omdisd  ceph07  error  32m ago  2M   -  -  <unknown>  <unknown>  <unknown>
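
For anyone repeating this check, a minimal sketch of how the failed daemons could be picked out of the listing above. It assumes the row layout shown in this mail (name, host, status as the first three whitespace-separated fields, with an empty ports column); the function name is made up for illustration and is not part of any Ceph tooling.

```python
def daemons_in_error(orch_ps_rows):
    """Return (daemon_name, host) pairs for rows whose status field is 'error'.

    Assumption: rows look like the ones quoted in this thread, i.e. the
    ports column is empty, so the status is the third field.
    """
    failed = []
    for row in orch_ps_rows.splitlines():
        fields = row.split()
        # Skip blank lines, short lines, and a header line if one is present.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, host, status = fields[0], fields[1], fields[2]
        if status == "error":
            failed.append((name, host))
    return failed


# Rows copied from the listing in this mail.
ROWS = """\
mds.mds01.ceph03.xqwdjy  ceph03  error  32m ago  2M   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph04.hcmvae  ceph04  error  31m ago  2h   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph05.pqxmvt  ceph05  error  32m ago  9M   -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph06.rrxmks  ceph06  error  32m ago  10w  -  -  <unknown>  <unknown>  <unknown>
mds.mds01.ceph07.omdisd  ceph07  error  32m ago  2M   -  -  <unknown>  <unknown>  <unknown>
"""

for name, host in daemons_in_error(ROWS):
    print(f"{name} on {host} is in error state")
```

If a daemon publishes ports, the status would no longer be the third field, so this positional parsing would need adjusting.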

Any other ideas? Or am I missing something?

Cheers,
Thomas

On 10.04.23 21:53, Adam King wrote:
...
 Will also note that the normal upgrade process scales down the mds
 service to have only 1 mds per fs before upgrading it, so maybe
 something you'd want to do as well if the upgrade didn't do it already.
 It does so by setting the max_mds to 1 for the fs.
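
A small sketch of the scale-down step described above, building the `ceph fs set <fs_name> max_mds 1` command line without running it. The file system name `cephfs` is a placeholder (the real name is not given in this thread and can be checked with `ceph fs ls`), and the helper name is hypothetical.

```python
import shlex


def scale_down_mds_command(fs_name, max_mds=1):
    """Build the argv for setting max_mds on one file system."""
    if max_mds < 1:
        raise ValueError("max_mds must be at least 1")
    return ["ceph", "fs", "set", fs_name, "max_mds", str(max_mds)]


# "cephfs" is an assumed name used only for illustration.
print(shlex.join(scale_down_mds_command("cephfs")))
```

Building the argument list first makes it easy to review the command before passing it to something like `subprocess.run`.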

 On Mon, Apr 10, 2023 at 3:51 PM Adam King <adking(a)redhat.com> wrote:

     You could try pausing the upgrade and manually "upgrading" the mds
     daemons by redeploying them on the new image. Something like "ceph
     orch daemon redeploy <mds-daemon-name> --image <17.2.6 image>"
     (daemon names should match those in "ceph orch ps" output). If you
     do that for all of them and then get them into an up state you
     should be able to resume the upgrade and have it complete.
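
The procedure above could be sketched roughly as follows. This only builds the command lines and prints them; nothing is executed. The daemon names are the ones listed elsewhere in this thread, the image is the `target_image` digest from the upgrade status quoted in this thread, and the function and constant names are made up for illustration.

```python
import shlex

# Digest taken from the "target_image" field of the upgrade status in this thread.
TARGET_IMAGE = (
    "quay.io/ceph/ceph@sha256:"
    "1161e35e4e02cf377c93b913ce78773f8413f5a8d7c5eaee4b4773a4f9dd6635"
)

# MDS daemon names as they appear in the "ceph orch ps" output in this thread.
MDS_DAEMONS = [
    "mds.mds01.ceph03.xqwdjy",
    "mds.mds01.ceph04.hcmvae",
    "mds.mds01.ceph05.pqxmvt",
    "mds.mds01.ceph06.rrxmks",
    "mds.mds01.ceph07.omdisd",
]

# To be run only once the redeployed MDS daemons are up again.
RESUME_UPGRADE = ["ceph", "orch", "upgrade", "resume"]


def manual_mds_redeploy_plan(daemon_names, image):
    """Return the pause command followed by one redeploy command per daemon."""
    plan = [["ceph", "orch", "upgrade", "pause"]]
    for name in daemon_names:
        plan.append(["ceph", "orch", "daemon", "redeploy", name, "--image", image])
    return plan


for argv in manual_mds_redeploy_plan(MDS_DAEMONS, TARGET_IMAGE):
    print(shlex.join(argv))
print("# after the MDS daemons are up:")
print(shlex.join(RESUME_UPGRADE))
```

The resume command is kept separate on purpose, since the advice is to resume only after the daemons have reached an up state.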

     On Mon, Apr 10, 2023 at 3:25 PM Thomas Widhalm
     <widhalmt(a)widhalm.or.at> wrote:

         Hi,

         If you remember, I hit bug https://tracker.ceph.com/issues/58489
         so I was very relieved when 17.2.6 was released and started to
         update immediately.

         But now I'm stuck again with my broken MDS. The MDS daemons won't
         get into up:active without the update, but the update waits for
         them to get into up:active state. Seems like a deadlock /
         chicken-and-egg problem to me.

         Since I'm still relatively new to Ceph, could you help me?

         What I see when watching the update status:

         {
             "target_image": "quay.io/ceph/ceph@sha256:1161e35e4e02cf377c93b913ce78773f8413f5a8d7c5eaee4b4773a4f9dd6635",
             "in_progress": true,
             "which": "Upgrading all daemon types on all hosts",
             "services_complete": [
                 "crash",
                 "mgr",
                 "mon",
                 "osd"
             ],
             "progress": "18/40 daemons upgraded",
             "message": "Error: UPGRADE_OFFLINE_HOST: Upgrade: Failed to connect to host ceph01 at addr (192.168.23.61)",
             "is_paused": false
         }

         (The offline host was one host that broke during the upgrade. I
         fixed that in the meantime and the update went on.)
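
As an illustration of reading the status output above, here is a minimal sketch that summarizes it. It assumes only the field names visible in the JSON quoted in this mail ("progress", "message", "is_paused", "services_complete"); the function name is hypothetical.

```python
import json
import re


def summarize_upgrade_status(status_json):
    """Extract progress counts and error/pause flags from the status JSON."""
    status = json.loads(status_json)
    match = re.match(r"(\d+)/(\d+) daemons upgraded", status.get("progress") or "")
    done = int(match.group(1)) if match else None
    total = int(match.group(2)) if match else None
    message = status.get("message") or ""
    return {
        "done": done,
        "total": total,
        "remaining": total - done if match else None,
        "has_error": message.startswith("Error:"),
        "paused": bool(status.get("is_paused")),
        "services_complete": status.get("services_complete") or [],
    }


# Status copied from this mail, with the wrapped lines joined.
SAMPLE_STATUS = """
{
    "target_image": "quay.io/ceph/ceph@sha256:1161e35e4e02cf377c93b913ce78773f8413f5a8d7c5eaee4b4773a4f9dd6635",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": ["crash", "mgr", "mon", "osd"],
    "progress": "18/40 daemons upgraded",
    "message": "Error: UPGRADE_OFFLINE_HOST: Upgrade: Failed to connect to host ceph01 at addr (192.168.23.61)",
    "is_paused": false
}
"""

print(summarize_upgrade_status(SAMPLE_STATUS))
```

Note that "mds" is absent from "services_complete", which matches the upgrade being stuck at the MDS stage.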

         And in the log:

         2023-04-10T19:23:48.750129+0000 mgr.ceph04.qaexpv [INF] Upgrade: Waiting for mds.mds01.ceph04.hcmvae to be up:active (currently up:replay)
         2023-04-10T19:23:58.758141+0000 mgr.ceph04.qaexpv [WRN] Upgrade: No mds is up; continuing upgrade procedure to poke things in the right direction

         Please give me a hint what I can do.

         Cheers,
         Thomas
         -- 
         http://www.widhalm.or.at
         GnuPG : 6265BAE6 , A84CB603
         Threema: H7AV7D33
         Telegram, Signal: widhalmt(a)widhalm.or.at
