Hello,
I am trying to set up a test cluster with the cephadm tool on Ubuntu 20.04 nodes.
Following the directions at
https://docs.ceph.com/en/octopus/cephadm/install/, I have set
up the monitor and manager on a management node, and added two hosts that I want to use
for storage. All storage devices present on those nodes are included in the output of
`ceph orch device ls`, and all are marked “available”. However, when I try to deploy OSDs
with `ceph orch apply osd -i spec.yml`, following the HDD+SSD storage spec example at
https://docs.ceph.com/en/latest/cephadm/drivegroups/#the-simple-case, I see the new
service in the output of `ceph orch ls`, but it is not running anywhere (“0/2”), and no
OSDs get created. I am not sure how to debug this, and any pointers would be much
appreciated.
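So far, the only debugging commands I have found in the cephadm docs are the ones below (I am not sure whether `--dry-run` is supported by my Octopus release, so treat that one as a guess on my part):

```shell
# Watch cephadm/orchestrator log messages as they are emitted
ceph -W cephadm

# Show recent log entries from the cephadm module
ceph log last cephadm

# Preview which OSDs the spec would create, without applying it
# (assuming --dry-run exists in this release)
ceph orch apply osd -i spec.yml --dry-run
```

Is this the right place to look, or is there something more specific for OSD specs?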
Thank you,
Davor
Output:
```
# ceph orch host ls
INFO:cephadm:Inferring fsid 150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3
INFO:cephadm:Inferring config /var/lib/ceph/150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3/mon.sps-head/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
HOST      ADDR      LABELS  STATUS
sps-head  sps-head  mon
sps-st1   sps-st1   mon
sps-st2   sps-st2
# ceph orch device ls
INFO:cephadm:Inferring fsid 150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3
INFO:cephadm:Inferring config /var/lib/ceph/150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3/mon.sps-head/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Hostname  Path          Type  Serial            Size   Health   Ident  Fault  Available
sps-head  /dev/nvme0n1  ssd   S5JXNS0N504446R   1024G  Unknown  N/A    N/A    Yes
sps-st1   /dev/nvme0n1  ssd   S5JXNS0N504948D   1024G  Unknown  N/A    N/A    Yes
sps-st1   /dev/nvme1n1  ssd   S5JXNS0N504958T   1024G  Unknown  N/A    N/A    Yes
sps-st1   /dev/sdb      hdd   5000cca28ed36018  14.0T  Unknown  N/A    N/A    Yes
sps-st1   /dev/sdc      hdd   5000cca28ed353e5  14.0T  Unknown  N/A    N/A    Yes
[…]
# cat /mnt/osd_spec.yml
service_type: osd
service_id: default_drive_group
placement:
  host_pattern: 'sps-st[1-6]'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
[**After running `ceph orch apply osd -i spec.yml`:**]
# ceph orch ls
NAME                     RUNNING  REFRESHED  AGE  PLACEMENT    IMAGE NAME                           IMAGE ID
alertmanager             1/1      9m ago     6h   count:1      docker.io/prom/alertmanager:v0.20.0  0881eb8f169f
crash                    3/3      9m ago     6h   *            docker.io/ceph/ceph:v15              5553b0cb212c
grafana                  1/1      9m ago     6h   count:1      docker.io/ceph/ceph-grafana:6.6.2    a0dce381714a
mgr                      2/2      9m ago     6h   count:2      docker.io/ceph/ceph:v15              5553b0cb212c
mon                      1/2      9m ago     3h   label:mon    docker.io/ceph/ceph:v15              5553b0cb212c
node-exporter            0/3      -          -    *            <unknown>                            <unknown>
osd.default_drive_group  0/2      -          -    sps-st[1-6]  <unknown>                            <unknown>
prometheus               1/1      9m ago     6h   count:1      docker.io/prom/prometheus:v2.18.1    de242295e225
[** I am not sure why neither “osd.default_drive_group” nor “node-exporter” is running
anywhere. How do I check that? **]
# ceph osd tree
INFO:cephadm:Inferring fsid 150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3
INFO:cephadm:Inferring config /var/lib/ceph/150b5f1a-64bf-11eb-a7e9-d96bd5ac4db3/mon.sps-head/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0 root default
# ceph orch --version
ceph version 15.2.8 (bdf3eebcd22d7d0b3dd4d5501bee5bac354d5b55) octopus (stable)
```