Hi Adam.
Thanks a lot for your feedback. I think I found the reason:
https://stackoverflow.com/a/75456529/10186921
It looks like some ports needed by Ceph were blocked by iptables.
The only strange thing is that there was no error that could help me understand
what was wrong - it just hung...
Anyhow, now it is resolved for me.
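For anyone hitting the same hang, a minimal sketch of the firewall rules involved, assuming the cluster uses Ceph's default ports (3300/6789 for the monitors, 6800-7300 for OSDs/MGR) - verify against your own deployment before applying:

```shell
# Print (not apply) iptables rules that open the default Ceph ports:
# 3300 (msgr2) and 6789 (msgr1) for the monitors, 6800-7300 for
# OSDs/MGR daemons. Pipe the output to `sudo sh` on each cluster host
# to apply. Port numbers are the Ceph defaults - an assumption here.
for spec in 3300 6789 6800:7300; do
  echo "iptables -I INPUT -p tcp --dport $spec -j ACCEPT"
done
```

On hosts running firewalld, enabling the predefined `ceph` and `ceph-mon` services should achieve the same effect.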
BR/Anton
________________________________
From: Adam King <adking(a)redhat.com>
Sent: Wednesday, February 15, 2023 9:32 PM
To: Anton Chivkunov <anton(a)conversant.com.sg>
Cc: ceph-users(a)ceph.io <ceph-users(a)ceph.io>
Subject: Re: [ceph-users] Ceph (cepadm) quincy: can't add osd from remote nodes.
If it got as far as running that ceph-volume command on the remote host, I wouldn't
think it was anything with the ssh connection. Do ceph commands generally hang on that
host when you run them manually there as well?
On Wed, Feb 15, 2023 at 11:19 AM Anton Chivkunov
<anton@conversant.com.sg> wrote:
Hello!
I'm stuck with a problem while trying to create a cluster of 3 nodes (AWS EC2 instances):
fa11 ~ # ceph orch host ls
HOST  ADDR           LABELS  STATUS
fa11  172.16.24.67   _admin
fa12  172.16.23.159  _admin
fa13  172.16.25.119  _admin
3 hosts in cluster
Each of them has 2 disks (all accepted by Ceph):
fa11 ~ # ceph orch device ls
HOST  PATH          TYPE  DEVICE ID                                        SIZE   AVAILABLE  REFRESHED  REJECT REASONS
fa11  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol016651cf7f3b9c9dd  8589M  Yes        7m ago
fa11  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol034082d7d364dfbdb  5368M  Yes        7m ago
fa12  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol0ec193fa3f77fee66  8589M  Yes        3m ago
fa12  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol018736f7eeab725f5  5368M  Yes        3m ago
fa13  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol0443a031550be1024  8589M  Yes        84s ago
fa13  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol0870412d37717dc2c  5368M  Yes        84s ago
fa11 is the first host, from which I manage the cluster.
Adding an OSD on fa11 itself works fine:
fa11 ~ # ceph orch daemon add osd fa11:/dev/nvme1n1
Created osd(s) 0 on host 'fa11'
But it doesn't work for the other 2 hosts (it hangs forever):
fa11 ~ # ceph orch daemon add osd fa12:/dev/nvme1n1
^CInterrupted
Logs on fa12 show that it hangs at the following step:
fa12 ~ # tail /var/log/ceph/a9ef6c26-ac38-11ed-9429-06e6bc29c1db/ceph-volume.log
...
[2023-02-14 07:38:20,942][ceph_volume.process][INFO ] Running command: /usr/bin/ceph-authtool --gen-print-key
[2023-02-14 07:38:20,964][ceph_volume.process][INFO ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new a51506c2-e910-4763-9a0c-f6c2194944e2
I'm not sure what the reason for this hang might be.
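Since the hanging `osd new` step has to contact the monitors, one quick check from the remote host is whether the mon ports on fa11 are reachable at all. A sketch using bash's built-in /dev/tcp (the mon IP is taken from the `host ls` output above; 3300/6789 as the default msgr2/msgr1 mon ports is an assumption):

```shell
# Probe a TCP port and report whether it is reachable. Uses bash's
# /dev/tcp pseudo-device, so no extra tools are needed on the host.
check_port() {  # usage: check_port HOST PORT
  if timeout 3 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null; then
    echo "port $2 open on $1"
  else
    echo "port $2 blocked on $1"
  fi
}

# Run from fa12: probe the monitor ports on fa11 (default mon ports
# assumed - adjust if your cluster uses non-default ports).
check_port 172.16.24.67 3300
check_port 172.16.24.67 6789
```

If these report "blocked", a daemon on the remote host cannot reach the mons and commands like the one in the log above will hang silently rather than fail.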
Additional details:
1) cephadm was installed using curl
(https://docs.ceph.com/en/quincy/cephadm/install/#curl-based-installation)
2) I use the user "ceph" instead of "root", and port 2222 instead of 22.
The first node was bootstrapped using the command below:
cephadm bootstrap --mon-ip 172.16.24.67 --allow-fqdn-hostname --ssh-user ceph
--ssh-config /home/anton/ceph/ssh_config --cluster-network 172.16.16.0/20
--skip-monitoring-stack
Content of /home/anton/ceph/ssh_config:
fa11 ~ # cat /home/anton/ceph/ssh_config
Host *
User ceph
Port 2222
IdentityFile /home/ceph/.ssh/id_rsa
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
3) Hosts fa12 and fa13 were added using the commands:
ceph orch host add fa12.testing.swiftserve.com 172.16.23.159 --labels _admin
ceph orch host add fa13.testing.swiftserve.com 172.16.25.119 --labels _admin
Thanks in advance!
BR/Anton
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io