We have a 5-node cluster, all monitors, installed with cephadm. Recently the hosts needed
to be rebooted for upgrades, but after we rebooted them, hosts began failing their cephadm
check. As you can see, ceph1 is in quorum and is the host the commands were run from; the
output of ceph -s and ceph orch host ls follows further down. Running ceph orch pause and
then ceph orch resume only removed the "Offline" status of cephmon-temp, which really is
offline. How do we fix ceph orch's confusion?
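
For reference, the pause/resume cycle was nothing more than this, run from ceph1:

ceph1:~# ceph orch pause
ceph1:~# ceph orch resume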
cephmon-temp, the third offline host in the listing, is a temporary node we had that
ceph orch host rm couldn't get rid of.
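
The removal attempts looked roughly like this; I'm reconstructing from memory, and the
--offline/--force variant assumes a release new enough to support those flags:

ceph1:~# ceph orch host rm cephmon-temp
ceph1:~# ceph orch host rm cephmon-temp --offline --force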
ceph1:~# ceph -s
  health: HEALTH_WARN
          3 hosts fail cephadm check

  services:
    mon: 5 daemons, quorum ceph5,ceph4,ceph3,ceph2,ceph1 (age 2d)
    mgr: ceph3.dmpmih(active, since 3w), standbys: ceph5.pwseyi
    osd: 30 osds: 30 up (since 2d), 30 in (since 3w)

  data:
    pools:   2 pools, 129 pgs
    objects: 2.29M objects, 8.7 TiB
    usage:   22 TiB used, 87 TiB / 109 TiB avail
    pgs:     129 active+clean

  io:
    client: 149 KiB/s wr, 0 op/s rd, 14 op/s wr
ceph1:~# ceph orch host ls
HOST          ADDR          LABELS  STATUS
ceph1         ceph1         mon     Offline
ceph2         ceph2         mon     Offline
ceph3         ceph3         mon
ceph4         ceph4         mon
ceph5         ceph5         mon
cephmon-temp  cephmon-temp          Offline
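
Is re-running the host check by hand the right way to debug this? I assume that is this
command, run against one of the supposedly offline hosts:

ceph1:~# ceph cephadm check-host ceph2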