Hi,
I stumbled across an issue where an OSD that gets redeployed has a CRUSH weight
of 0 after cephadm finishes.
I have created a service definition for the orchestrator to automatically deploy OSDs on
SSDs:
service_type: osd
service_id: SSD_OSDs
placement:
  label: 'osd'
data_devices:
  rotational: 0
  size: '100G'
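For completeness, I load the spec with the orchestrator roughly like this (the
file name is just an example):

root@ceph01:~# ceph orch apply osd -i ssd_osds.yaml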
These are my steps to reproduce this in a small test cluster running Ceph 15.2.4:
root@ceph01:~# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
 -1       1.63994 root default
-18       0.81995     rack rack10
 -3       0.40996         host ceph01
  8   hdd 0.10699             osd.8       up  1.00000 1.00000
  9   hdd 0.10699             osd.9       up  1.00000 1.00000
  0   ssd 0.09799             osd.0       up  1.00000 1.00000
  1   ssd 0.09798             osd.1       up  1.00000 1.00000
 -5       0.40999         host ceph02
 10   hdd 0.10699             osd.10      up  1.00000 1.00000
 11   hdd 0.10699             osd.11      up  1.00000 1.00000
  2   ssd 0.09799             osd.2       up  1.00000 1.00000
  3   ssd 0.09799             osd.3       up  1.00000 1.00000
-17       0.81999     rack rack11
 -7       0.40999         host ceph03
 12   hdd 0.10699             osd.12      up  1.00000 1.00000
 13   hdd 0.10699             osd.13      up  1.00000 1.00000
  4   ssd 0.09799             osd.4       up  1.00000 1.00000
  5   ssd 0.09799             osd.5       up  1.00000 1.00000
 -9       0.40999         host ceph04
 14   hdd 0.10699             osd.14      up  1.00000 1.00000
 15   hdd 0.10699             osd.15      up  1.00000 1.00000
  6   ssd 0.09799             osd.6       up  1.00000 1.00000
  7   ssd 0.09799             osd.7       up  1.00000 1.00000
root@ceph01:~# ceph osd out 1
marked out osd.1.
root@ceph01:~# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
 -1       1.63994 root default
-18       0.81995     rack rack10
 -3       0.40996         host ceph01
  8   hdd 0.10699             osd.8       up  1.00000 1.00000
  9   hdd 0.10699             osd.9       up  1.00000 1.00000
  0   ssd 0.09799             osd.0       up  1.00000 1.00000
  1   ssd 0.09798             osd.1       up        0 1.00000
 -5       0.40999         host ceph02
 10   hdd 0.10699             osd.10      up  1.00000 1.00000
 11   hdd 0.10699             osd.11      up  1.00000 1.00000
  2   ssd 0.09799             osd.2       up  1.00000 1.00000
  3   ssd 0.09799             osd.3       up  1.00000 1.00000
-17       0.81999     rack rack11
 -7       0.40999         host ceph03
 12   hdd 0.10699             osd.12      up  1.00000 1.00000
 13   hdd 0.10699             osd.13      up  1.00000 1.00000
  4   ssd 0.09799             osd.4       up  1.00000 1.00000
  5   ssd 0.09799             osd.5       up  1.00000 1.00000
 -9       0.40999         host ceph04
 14   hdd 0.10699             osd.14      up  1.00000 1.00000
 15   hdd 0.10699             osd.15      up  1.00000 1.00000
  6   ssd 0.09799             osd.6       up  1.00000 1.00000
  7   ssd 0.09799             osd.7       up  1.00000 1.00000
root@ceph01:~# ceph orch osd rm 1
Scheduled OSD(s) for removal
2020-09-10T16:29:58.176991+0200 mgr.ceph02.ouelws [INF] Removing daemon osd.1 from ceph01
2020-09-10T16:30:00.148659+0200 mgr.ceph02.ouelws [INF] Successfully removed OSD <1>
on ceph01
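As a side note, a pending removal can be tracked with:

root@ceph01:~# ceph orch osd rm status

Here it completed within seconds, as the log above shows.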
root@ceph01:~# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
 -1       1.54196 root default
-18       0.72197     rack rack10
 -3       0.31198         host ceph01
  8   hdd 0.10699             osd.8       up  1.00000 1.00000
  9   hdd 0.10699             osd.9       up  1.00000 1.00000
  0   ssd 0.09799             osd.0       up  1.00000 1.00000
 -5       0.40999         host ceph02
 10   hdd 0.10699             osd.10      up  1.00000 1.00000
 11   hdd 0.10699             osd.11      up  1.00000 1.00000
  2   ssd 0.09799             osd.2       up  1.00000 1.00000
  3   ssd 0.09799             osd.3       up  1.00000 1.00000
-17       0.81999     rack rack11
 -7       0.40999         host ceph03
 12   hdd 0.10699             osd.12      up  1.00000 1.00000
 13   hdd 0.10699             osd.13      up  1.00000 1.00000
  4   ssd 0.09799             osd.4       up  1.00000 1.00000
  5   ssd 0.09799             osd.5       up  1.00000 1.00000
 -9       0.40999         host ceph04
 14   hdd 0.10699             osd.14      up  1.00000 1.00000
 15   hdd 0.10699             osd.15      up  1.00000 1.00000
  6   ssd 0.09799             osd.6       up  1.00000 1.00000
  7   ssd 0.09799             osd.7       up  1.00000 1.00000
root@ceph01:~# ceph orch device zap ceph01 /dev/sdc --force
INFO:cephadm:/usr/bin/docker:stderr --> Zapping: /dev/sdc
INFO:cephadm:/usr/bin/docker:stderr --> Zapping lvm member /dev/sdc. lv_path is
/dev/ceph-0d19a151-30b6-459e-936a-488f143e11f6/osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/bin/dd if=/dev/zero
of=/dev/ceph-0d19a151-30b6-459e-936a-488f143e11f6/osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb
bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/docker:stderr stderr: 10+0 records in
INFO:cephadm:/usr/bin/docker:stderr 10+0 records out
INFO:cephadm:/usr/bin/docker:stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied,
0.0583658 s, 180 MB/s
INFO:cephadm:/usr/bin/docker:stderr --> Only 1 LV left in VG, will proceed to destroy
volume group ceph-0d19a151-30b6-459e-936a-488f143e11f6
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/sbin/vgremove -v -f
ceph-0d19a151-30b6-459e-936a-488f143e11f6
INFO:cephadm:/usr/bin/docker:stderr stderr: Removing
ceph--0d19a151--30b6--459e--936a--488f143e11f6-osd--block--d5062900--abe7--413a--9d9a--d1cdda2948eb
(253:3)
INFO:cephadm:/usr/bin/docker:stderr stderr: Archiving volume group
"ceph-0d19a151-30b6-459e-936a-488f143e11f6" metadata (seqno 5).
INFO:cephadm:/usr/bin/docker:stderr Releasing logical volume
"osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb"
INFO:cephadm:/usr/bin/docker:stderr stderr: Creating volume group backup
"/etc/lvm/backup/ceph-0d19a151-30b6-459e-936a-488f143e11f6" (seqno 6).
INFO:cephadm:/usr/bin/docker:stderr stdout: Logical volume
"osd-block-d5062900-abe7-413a-9d9a-d1cdda2948eb" successfully removed
INFO:cephadm:/usr/bin/docker:stderr stderr: Removing physical volume "/dev/sdc"
from volume group "ceph-0d19a151-30b6-459e-936a-488f143e11f6"
INFO:cephadm:/usr/bin/docker:stderr stdout: Volume group
"ceph-0d19a151-30b6-459e-936a-488f143e11f6" successfully removed
INFO:cephadm:/usr/bin/docker:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/sdc
bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/docker:stderr stderr: 10+0 records in
INFO:cephadm:/usr/bin/docker:stderr 10+0 records out
INFO:cephadm:/usr/bin/docker:stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied,
0.016043 s, 654 MB/s
INFO:cephadm:/usr/bin/docker:stderr --> Zapping successful for: <Raw Device:
/dev/sdc>
2020-09-10T16:31:15.951617+0200 mgr.ceph02.ouelws [INF] Zap device ceph01:/dev/sdc
2020-09-10T16:31:24.738974+0200 mgr.ceph02.ouelws [INF] Found osd claims for drivegroup
SSD_OSDs -> {}
2020-09-10T16:31:24.740489+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host
ceph01...
2020-09-10T16:31:31.549897+0200 mgr.ceph02.ouelws [INF] Deploying daemon osd.1 on ceph01
2020-09-10T16:31:33.057061+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host
ceph02...
2020-09-10T16:31:33.057373+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host
ceph03...
2020-09-10T16:31:33.057519+0200 mgr.ceph02.ouelws [INF] Applying SSD_OSDs on host
ceph04...
2020-09-10T16:31:37.569914+0200 mon.ceph01 [INF] osd.1
[v2:10.24.4.128:6810/4173467371,v1:10.24.4.128:6811/4173467371] boot
2020-09-10T16:31:46.531544+0200 mon.ceph01 [INF] Cluster is now healthy
root@ceph01:~# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
 -1       1.54196 root default
-18       0.72197     rack rack10
 -3       0.31198         host ceph01
  8   hdd 0.10699             osd.8       up  1.00000 1.00000
  9   hdd 0.10699             osd.9       up  1.00000 1.00000
  0   ssd 0.09799             osd.0       up  1.00000 1.00000
  1   ssd       0             osd.1       up  1.00000 1.00000
 -5       0.40999         host ceph02
 10   hdd 0.10699             osd.10      up  1.00000 1.00000
 11   hdd 0.10699             osd.11      up  1.00000 1.00000
  2   ssd 0.09799             osd.2       up  1.00000 1.00000
  3   ssd 0.09799             osd.3       up  1.00000 1.00000
-17       0.81999     rack rack11
 -7       0.40999         host ceph03
 12   hdd 0.10699             osd.12      up  1.00000 1.00000
 13   hdd 0.10699             osd.13      up  1.00000 1.00000
  4   ssd 0.09799             osd.4       up  1.00000 1.00000
  5   ssd 0.09799             osd.5       up  1.00000 1.00000
 -9       0.40999         host ceph04
 14   hdd 0.10699             osd.14      up  1.00000 1.00000
 15   hdd 0.10699             osd.15      up  1.00000 1.00000
  6   ssd 0.09799             osd.6       up  1.00000 1.00000
  7   ssd 0.09799             osd.7       up  1.00000 1.00000
Why does osd.1 have a weight of 0 now?
When the OSDs were initially deployed with the first ceph orch apply command,
their weights were set correctly according to their sizes.
Why is there a difference between that initial deployment and an OSD that gets
(re-)deployed later on?
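As a workaround I can set the weight by hand (0.09798 being the value osd.1 had
before the removal):

root@ceph01:~# ceph osd crush reweight osd.1 0.09798

but I would expect the redeployment to take care of that, just like the initial
deployment did.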
Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Mandatory disclosures per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered Office: Berlin