On Wed, Sep 11, 2019 at 11:17:47AM +0100, Matthew Vernon wrote:
Hi,
We keep finding part-made OSDs (they appear not attached to any host,
and down and out; but still counting towards the number of OSDs); we
never saw this with ceph-disk. On investigation, this is because
ceph-volume lvm create makes the OSD (ID and auth at least) too early in
the process and is then unable to roll-back cleanly (because the
bootstrap-osd credential isn't allowed to remove OSDs).
As an example (very truncated):
Running command: /usr/bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
-i - osd new 20cea174-4c1b-4330-ad33-505a03156c33
Running command: vgcreate --force --yes
ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e /dev/sdbh
stderr: Device /dev/sdbh not found (or ignored by filtering).
Unable to add physical volume '/dev/sdbh' to volume group
'ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e'.
--> Was unable to complete a new OSD, will rollback changes
--> OSD will be fully purged from the cluster, because the ID was generated
Running command: ceph osd purge osd.828 --yes-i-really-mean-it
stderr: 2019-09-10 15:07:53.396528 7fbca2caf700 -1 auth: unable to find
a keyring on
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:
(2) No such file or directory
stderr: 2019-09-10 15:07:53.397318 7fbca2caf700 -1 monclient:
authenticate NOTE: no keyring found; disabled cephx authentication
2019-09-10 15:07:53.397334 7fbca2caf700 0 librados: client.admin
authentication error (95) Operation not supported
This is annoying to have to clear up, and it seems to me could be
avoided by either:
i) ceph-volume should (attempt to) set up the LVM volumes &c before
making the new OSD id
or
ii) allow the bootstrap-osd credential to purge OSDs
i) seems like clearly the better answer...?
Agreed. Would you mind opening a bug report at
https://tracker.ceph.com/projects/ceph-volume?
I have found other situations where a roll-back is not working as it should,
though not with as much impact as this.
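
For the tracker report, the ordering in option i) could be sketched roughly as
below. This is only an illustration of the idea, not ceph-volume's actual code:
create_osd and the callables prepare_lvm, remove_lvm and register_osd are
hypothetical stand-ins for the vgcreate/lvcreate steps, their cleanup, and the
"osd new" call. The point is that every step done before the id is allocated
can be undone locally, without any rights on the cluster.

```python
from typing import Callable, List


def create_osd(device: str,
               prepare_lvm: Callable[[str], str],
               remove_lvm: Callable[[str], None],
               register_osd: Callable[[], int]) -> int:
    """Do the local LVM work first and allocate the OSD id last.

    All callables are hypothetical stand-ins, not ceph-volume functions.
    """
    undo: List[Callable[[], None]] = []
    try:
        # Step 1: local, reversible work (vgcreate/lvcreate in the real tool).
        vg_name = prepare_lvm(device)
        undo.append(lambda: remove_lvm(vg_name))
        # Step 2: only now ask the monitors for an id ("osd new").
        return register_osd()
    except Exception:
        # Undo local steps in reverse order. No "osd purge" is needed here,
        # because no id exists unless register_osd() itself succeeded.
        for action in reversed(undo):
            action()
        raise


if __name__ == "__main__":
    registered = []

    def prepare_lvm(device):
        # Simulate the vgcreate failure from the log above.
        raise RuntimeError(f"Device {device} not found (or ignored by filtering).")

    def register_osd():
        registered.append(828)
        return 828

    try:
        create_osd("/dev/sdbh", prepare_lvm, lambda vg: None, register_osd)
    except RuntimeError as err:
        print("create failed:", err)
    print("OSD ids allocated:", registered)  # stays empty
```

Steps that genuinely need the id (anything that runs after "osd new") can
still fail and leave an OSD behind, so this narrows the window rather than
closing it.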
Regards,
Matthew
--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
(HRB 247165, AG München)
Geschäftsführer: Felix Imendörffer