Hi,

I am facing an issue with a cephadm cluster setup. Whenever I try to add a remote device as an OSD, the command just hangs.

The steps I have followed:

sudo ceph orch daemon add osd node1:device

 

  1. For the setup, I followed the steps described in:
    https://ralph.blog.imixs.com/2020/04/14/ceph-octopus-running-on-debian-buster/

 

  2. To rule out SSH errors and confirm that the host is reachable, I ran the following commands:
    cephadm shell -- ceph config-key get mgr/cephadm/ssh_identity_key > key
    cephadm shell -- ceph cephadm get-ssh-config > config
    ssh -F config -i key root@hostname

    I am able to connect to the host as root.

 

  3. Then I tried collecting log information:
    1. Command: sudo cephadm logs --fsid e236062e-96ad-11ea-bedb-5254002e4127 --name osd
      Result:
      Traceback (most recent call last):
      File "/usr/sbin/cephadm", line 4282, in <module>
      r = args.func()
      File "/usr/sbin/cephadm", line 921, in _infer_fsid
      return func()
      File "/usr/sbin/cephadm", line 2689, in command_logs
      (daemon_type, daemon_id) = args.name.split('.', 1)
      ValueError: not enough values to unpack (expected 2, got 1)
    2. Command: sudo ceph log last cephadm
      Result:
      INFO:cephadm:Verifying port 9100 ...
      WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
      ERROR: TCP Port(s) '9100' required for node-exporter is already in use
      Traceback (most recent call last):
      File "/usr/share/ceph/mgr/cephadm/module.py", line 1638, in _run_cephadm
      code, '\n'.join(err)))
      RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying daemon node-exporter.ceph-mon ...
      INFO:cephadm:Verifying port 9100 ...
      WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
      ERROR: TCP Port(s) '9100' required for node-exporter is already in use
      2020-05-15T13:33:46.966159+0000 mgr.ceph-mgr.dixgvy (mgr.14161) 678 : cephadm [WRN] Failed to apply node-exporter spec ServiceSpec({'placement': PlacementSpec(host_pattern='*'), 'service_type': 'node-exporter', 'service_id': None, 'unmanaged': False}): cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying daemon node-exporter.ceph-mon ...
      INFO:cephadm:Verifying port 9100 ...
      WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
      ERROR: TCP Port(s) '9100' required for node-exporter is already in use
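
The node-exporter messages in this log say that something on the host is already bound to port 9100. A rough way to confirm that from the affected host is to attempt the same bind by hand. The sketch below assumes, based only on the "Cannot bind to IP 0.0.0.0 port 9100" wording, that the check amounts to a plain TCP bind; it is not a copy of cephadm's code, and `port_in_use` is a hypothetical helper name.

```python
import errno
import socket


def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding host:port fails with EADDRINUSE (Errno 98 on Linux)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other error (e.g. permissions) is a different problem
        return False


if __name__ == "__main__":
    # 9100 is the port named in the log above.
    print("port 9100 in use:", port_in_use(9100))
```

This only reports whether the port is taken, not which process holds it, and the log alone does not show whether the node-exporter failure is related to the hanging OSD command.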

 

 

However, I cannot work out the cause from these logs. Could you please help me with this issue?
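
A note on the ValueError in the first traceback: the failing line splits the `--name` value on its first dot, so a bare `osd` produces one field where two are expected. The sketch below reproduces that split in isolation. `split_daemon_name` is a hypothetical helper written for illustration (it is not a cephadm function), and `osd.0` is only an example of a `type.id` name, not a daemon known to exist on this cluster.

```python
from typing import Tuple


def split_daemon_name(name: str) -> Tuple[str, str]:
    """Split a daemon name of the form '<type>.<id>' on its first dot."""
    if "." not in name:
        # Same failure the traceback shows, with a clearer message.
        raise ValueError(f"expected '<type>.<id>' (for example 'osd.0'), got {name!r}")
    daemon_type, daemon_id = name.split(".", 1)
    return daemon_type, daemon_id


if __name__ == "__main__":
    print(split_daemon_name("osd.0"))
    try:
        split_daemon_name("osd")
    except ValueError as exc:
        print("rejected:", exc)
```

If this reading is right, passing a full daemon name (as listed by `cephadm ls` on the host) to `--name` should get past this particular error, though it does not by itself explain the hang.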