Hi,
What I'm seeing a lot of is this: "[stats WARNING root] cmdtag not found
in client metadata". I can't make anything of it, but I guess it doesn't
point to the initial issue.
Now that I think of it: I started the cluster with 3 nodes, which are
now only used as OSD nodes. Could it be that something is missing on the
new nodes that now run the mgr/mon daemons?
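If it helps with debugging: from what I found in the cephadm docs, the
host prerequisites (container engine, time sync, etc.) can be verified
by running this directly on each of the new nodes - assuming cephadm is
installed there:

  cephadm check-host

I haven't ruled out yet that something like podman or chrony is missing
on them.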
Cheers,
Thomas
On 04.05.23 14:48, Eugen Block wrote:
Hi,
try setting debug logs for the mgr:
ceph config set mgr mgr/cephadm/log_level debug
This should hopefully provide more details about what the mgr is trying
and where it's failing. Last week this helped me identify an issue with
a lower Pacific release.
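If you want to watch the output live, you can additionally send the
cephadm log to the cluster log and follow it - that's the procedure
from the cephadm troubleshooting docs, if I remember correctly:

  ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  ceph -W cephadm --watch-debug

and afterwards dump the recent entries with:

  ceph log last cephadm

Just remember to set the level back to info when you're done.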
Do you see anything in the cephadm.log pointing to the mgr actually
trying something?
Quoting Thomas Widhalm <widhalmt(a)widhalm.or.at>:
Hi,
I'm in the process of upgrading my cluster from 17.2.5 to 17.2.6, but
the following problem already existed when I was still on 17.2.5
everywhere.
I had a major issue in my cluster which could be solved with a lot of
your help and even more trial and error. Right now it seems that most of
it is already fixed, but I can't rule out that there's still some hidden
problem. The very issue I'm asking about started during that repair.
When I try to orchestrate the cluster, it logs the command but doesn't
do anything, no matter whether I use the Ceph dashboard or "ceph orch"
in "cephadm shell". I don't get any error message when I try to deploy
new services, redeploy them, etc. The log only says "scheduled" and
that's it. The same happens when I change placement rules. Usually I use
tags, but since they don't work anymore either, I tried host placement
and unmanaged. No success. The only way I can actually start and stop
containers is via systemctl on the host itself.
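For illustration, the kind of commands I mean (service and host names
are just placeholders):

  ceph orch restart mds.myfs
  ceph orch apply mgr --placement="node4 node5 node6"
  ceph orch daemon redeploy mds.myfs.node4.xyzabc

Each of them just returns a "Scheduled ..." line and then nothing
happens on the hosts.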
When I run "ceph orch ls" or "ceph orch ps" I see services I deployed
for testing being deleted (for weeks now). Ans especially a lot of old
MDS are listed as "error" or "starting". The list doesn't match
reality at all because I had to start them by hand.
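From what I've read, this listing comes from a cached inventory that can
be refreshed explicitly - if I understood that correctly, it would be:

  ceph orch ps --refresh
  ceph orch ls --refresh

but I'm not sure the cache is the problem here.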
I tried "ceph mgr fail" and even a complete shutdown of the whole
cluster with all nodes including all mgs, mds even osd - everything
during a maintenance window. Didn't change anything.
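In case the exact invocation matters, I used the plain form

  ceph mgr fail

which, as far as I understand, fails over the currently active mgr on
Quincy (a specific daemon can also be given, e.g. "ceph mgr fail
<name>").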
Could you help me? To be honest, I'm still rather new to Ceph, and since
I didn't find anything in the logs that caught my eye, I would be
thankful for hints on how to debug this.
Cheers,
Thomas
--
http://www.widhalm.or.at
GnuPG : 6265BAE6 , A84CB603
Threema: H7AV7D33
Telegram, Signal: widhalmt(a)widhalm.or.at
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io