I ended up in the same situation while playing around with a test cluster. The SUSE team
has an article [1] for this case; the following steps resolved the issue for me. I had three
different OSD specs in place for the same three nodes:
NAME                 PORTS  RUNNING  REFRESHED   AGE  PLACEMENT
osd                              3  <deleting>   3w   nautilus2;nautilus3
osd.osd-hdd-ssd                  3  2m ago       2w   nautilus;nautilus2;nautilus3
osd.osd-hdd-ssd-mix              3  2m ago       -    <unmanaged>
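To find every OSD daemon still pointing at a stale spec, a grep across the local OSD
directories should do (just a sketch; the path assumes the cluster fsid shown in the
output further below):

grep service_name /var/lib/ceph/201a2fbc-ce7b-44a3-9ed7-39427972083b/osd.*/unit.meta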
I replaced the "service_name" with the more suitable value
("osd.osd-hdd-ssd") in the unit.meta file of each OSD containing the invalid
spec, then restarted each affected OSD. It probably wasn't necessary, but since I
wanted to see the effect immediately, I also failed over the mgr (ceph mgr fail). Now
only one valid OSD spec is left.
# before
nautilus3:~ # grep service_name /var/lib/ceph/201a2fbc-ce7b-44a3-9ed7-39427972083b/osd.3/unit.meta
"service_name": "osd",

# after
nautilus3:~ # grep service_name /var/lib/ceph/201a2fbc-ce7b-44a3-9ed7-39427972083b/osd.3/unit.meta
"service_name": "osd.osd-hdd-ssd",
nautilus3:~ # ceph orch ls osd
NAME             PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
osd.osd-hdd-ssd               9  10m ago    2w   nautilus;nautilus2;nautilus3
Regards,
Eugen
[1] https://www.suse.com/support/kb/doc/?id=000020667