Hi Eugen,
Here it is:
# ceph mgr module ls | jq -r '.enabled_modules[]'
cephadm
dashboard
diskprediction_local
iostat
prometheus
restful
Should "crash" and "orchestrator" be part of the list? Why would they
have disappeared in the first place?
Best regards,
Sebastian
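[On Sebastian's question above: in Octopus, "crash" and "orchestrator" are, if I recall the release correctly, "always on" mgr modules, so they are expected to be absent from `enabled_modules`; they are reported in a separate list instead. A quick way to check (a sketch; assumes jq is installed, as in the command already used above):

```shell
# Always-on modules are listed under .always_on_modules,
# not .enabled_modules, in `ceph mgr module ls` JSON output:
ceph mgr module ls | jq -r '.always_on_modules[]'

# A regular module that has genuinely been disabled could be re-enabled with:
# ceph mgr module enable <module-name>
```

If "crash" and "orchestrator" appear in that list, nothing has disappeared at all.]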
On Thu, 20 May 2021 at 15:54, Eugen Block <eblock(a)nde.ag> wrote:
Which mgr modules are enabled? Can you share (if it
responds):
ceph mgr module ls | jq -r '.enabled_modules[]'
We have checked the call made from the container by checking the
DEBUG logs, and I see that it is correct; some commands work but
others hang:
Do you see those shell sessions on the host(s)? I'm playing with a
pacific cluster and due to failing MONs I see a couple of lines like
these:
8684b2372083  docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
"osd tree"  20 minutes ago  Up 20 minutes ago  adoring_carver
Here the 'ceph osd tree' command didn't finish, so I stopped that pod.
Maybe that could help, at least worth a try.
Quoting ManuParra <mparra(a)iaa.es>:
> Hi Eugen thank you very much for your reply. I'm Manuel, a colleague
> of Sebastián.
>
> I complete what you ask us.
>
> We have checked more ceph commands, not only ceph crash and ceph
> orch; many other commands hang in the same way:
>
> [spsrc-mon-1 ~]# cephadm shell -- ceph pg stat
> hangs forever
> [spsrc-mon-1 ~]# cephadm shell -- ceph status
> Works
> [spsrc-mon-1 ~]# cephadm shell -- ceph progress
> hangs forever
> [spsrc-mon-1 ~]# cephadm shell -- ceph balancer status
> hangs forever
> [spsrc-mon-1 ~]# cephadm shell -- ceph crash ls
> hangs forever
> [spsrc-mon-1 ~]# cephadm shell -- ceph crash stat
> hangs forever
> [spsrc-mon-1 ~]# cephadm shell -- ceph telemetry status
> hangs forever
>
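[The pattern above could be probed without leaving hung shell sessions behind by wrapping each command in coreutils `timeout` (a sketch; the 30-second cap is an arbitrary assumption):

```shell
# Run each suspect command with a time limit so hung ones get killed
# instead of blocking forever; timeout reports exit status 124 for those.
for cmd in "pg stat" "crash ls" "balancer status" "progress"; do
    if timeout 30 cephadm shell -- ceph $cmd >/dev/null 2>&1; then
        echo "ok:             ceph $cmd"
    else
        echo "hung or failed: ceph $cmd"
    fi
done
```
]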
> We have checked the call made from the container by checking the
> DEBUG logs, and I see that it is correct; some commands work but
> others hang:
>
> 2021-05-20 09:56:02,903 DEBUG Running command (timeout=None):
> /bin/docker run --rm --ipc=host --net=host --privileged
> --group-add=disk -e
> CONTAINER_IMAGE=172.16.3.146:4000/ceph/ceph:v15.2.9 -e
> NODE_NAME=spsrc-mon-1 -v
> /var/run/ceph/3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c:/var/run/ceph:z
> -v
> /var/log/ceph/3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c:/var/log/ceph:z
> -v
>
> /var/lib/ceph/3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c/crash:/var/lib/ceph/crash:z
> -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
> /run/lock/lvm:/run/lock/lvm -v
> /var/lib/ceph/3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c/mon.spsrc-mon-1/config:/etc/ceph/ceph.conf:z
> -v /etc/ceph/ceph.client.admin.keyring:/etc/ceph/ceph.keyring:z
> --entrypoint ceph 172.16.3.146:4000/ceph/ceph:v15.2.9 pg stat
> We have 3 monitor nodes and these are the containers that are
> running (on all monitor nodes):
>
> acf8870fc788  172.16.3.146:4000/ceph/ceph:v15.2.9  "/usr/bin/ceph-mds -…"
> 7 days ago  Up 7 days
> ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c-mds.manila.spsrc-mon-1.gpulzs
> cfac86f29db4  172.16.3.146:4000/ceph/ceph:v15.2.9  "/usr/bin/ceph-mon -…"
> 7 days ago  Up 7 days
> ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c-mon.spsrc-mon-1
> 4e6e600fa915  172.16.3.146:4000/ceph/ceph:v15.2.9  "/usr/bin/ceph-crash…"
> 7 days ago  Up 7 days
> ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c-crash.spsrc-mon-1
> dae36c48568e  172.16.3.146:4000/ceph/ceph:v15.2.9  "/usr/bin/ceph-mgr -…"
> 7 days ago  Up 7 days
> ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c-mgr.spsrc-mon-1.eziiam
> All of them have running status on all 3 monitor nodes. As you can
> see on this monitor node, we have MDS, MON, CRASH and MGR.
>
> Any ideas what we can check?
>
> Best regards,
> Manu
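[On "what else to check": the commands that hang here (pg stat, progress, balancer status, crash ls, telemetry status) are, as far as I can tell, all serviced by the active mgr, while `ceph status` is answered by the mons, so a wedged mgr daemon would fit the symptoms. A cheap thing to try is failing over to a standby mgr (a sketch; the daemon name is taken from the container listing above and may differ on the node where the mgr is actually active):

```shell
# Ask the cluster to fail the current active mgr so a standby takes over
# (in Octopus the daemon name argument is, as far as I know, required):
ceph mgr fail spsrc-mon-1.eziiam

# Then check which mgr is active now:
ceph -s | grep -i mgr
```
]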
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io