Hi Eugen,
You will find attached cephadm.log and ceph-volume.log. Each contains
the output for both versions. Either v16.2.10-20220920 is much more
verbose, or v16.2.11-20230125 does not execute the whole detection
process.
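
If it helps the comparison, something like the following should show which
detection commands v16.2.11 skips. The file names below are just placeholders
for the two attached ceph-volume.log versions; ceph-volume logs each probe as
a "Running command" line, so diffing those gives a quick overview:

   grep -c 'Running command' ceph-volume-v16.2.10.log ceph-volume-v16.2.11.log
   diff <(grep 'Running command' ceph-volume-v16.2.10.log) \
        <(grep 'Running command' ceph-volume-v16.2.11.log)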
Patrick
On 12/10/2023 at 09:34, Eugen Block wrote:
Good catch, and I found the thread I had in mind, it was this
exact one. :-D Anyway, can you share the ceph-volume.log from the
working and the non-working attempts?
I tried to look for something significant in the pacific release
notes for 16.2.11, and there were some changes to ceph-volume, but
I'm not sure what it could be.
Quoting Patrick Begou <Patrick.Begou(a)univ-grenoble-alpes.fr>:
> I've run additional tests with the Pacific releases, and with
> "ceph-volume inventory" things went wrong starting with the first
> v16.2.11 build (v16.2.11-20230125)
>
> =================== Ceph v16.2.10-20220920 =======================
>
> Device Path Size rotates available Model name
> /dev/sdc 232.83 GB True True SAMSUNG HE253GJ
> /dev/sda 232.83 GB True False SAMSUNG HE253GJ
> /dev/sdb 465.76 GB True False WDC WD5003ABYX-1
>
> =================== Ceph v16.2.11-20230125 =======================
>
> Device Path Size Device nodes rotates available Model name
>
>
> Maybe this could help to see what has changed?
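>
> In case someone wants to reproduce it quickly: I believe pinning cephadm to
> a specific image should show the same difference on a single host. The exact
> dated tags below are an assumption on my side, use whatever v16.2.x images
> you have pulled locally:
>
> cephadm --image quay.io/ceph/ceph:v16.2.10-20220920 ceph-volume inventory
> cephadm --image quay.io/ceph/ceph:v16.2.11-20230125 ceph-volume inventory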
>
> Patrick
>
> On 11/10/2023 at 17:38, Eugen Block wrote:
>> That's really strange. Just out of curiosity, have you tried
>> Quincy (and/or Reef) as well? I don't recall what inventory does
>> in the background exactly, I believe Adam King mentioned that in
>> some thread, maybe that can help here. I'll search for that
>> thread tomorrow.
>>
>> Quoting Patrick Begou <Patrick.Begou(a)univ-grenoble-alpes.fr>:
>>
>>> Hi Eugen,
>>>
>>> [root@mostha1 ~]# rpm -q cephadm
>>> cephadm-16.2.14-0.el8.noarch
>>>
>>> Log associated with the inventory run:
>>>
>>> 2023-10-11 16:16:02,167 7f820515fb80 DEBUG
>>>
--------------------------------------------------------------------------------
>>> cephadm ['gather-facts']
>>> 2023-10-11 16:16:02,208 7f820515fb80 DEBUG /bin/podman: 4.4.1
>>> 2023-10-11 16:16:02,313 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:02,317 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:02,322 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:02,326 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:02,329 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:02,333 7f820515fb80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:04,474 7ff2a5c08b80 DEBUG
>>>
--------------------------------------------------------------------------------
>>> cephadm ['ceph-volume', 'inventory']
>>> 2023-10-11 16:16:04,516 7ff2a5c08b80 DEBUG /usr/bin/podman: 4.4.1
>>> 2023-10-11 16:16:04,520 7ff2a5c08b80 DEBUG Using default config:
>>> /etc/ceph/ceph.conf
>>> 2023-10-11 16:16:04,573 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 0d28d71358d7,445.8MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 2084faaf4d54,13.27MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 61073c53805d,512.7MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 6b9f0b72d668,361.1MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 7493a28808ad,163.7MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> a89672a3accf,59.22MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> b45271cc9726,54.24MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> e00ec13ab138,707.3MB / 50.32GB
>>> 2023-10-11 16:16:04,574 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> fcb1e1a6b08d,35.55MB / 50.32GB
>>> 2023-10-11 16:16:04,630 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 0d28d71358d7,1.28%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 2084faaf4d54,0.00%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 61073c53805d,1.19%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 6b9f0b72d668,1.03%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 7493a28808ad,0.78%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> a89672a3accf,0.11%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> b45271cc9726,1.35%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> e00ec13ab138,0.43%
>>> 2023-10-11 16:16:04,631 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> fcb1e1a6b08d,0.02%
>>> 2023-10-11 16:16:04,634 7ff2a5c08b80 INFO Inferring fsid
>>> 250f9864-0142-11ee-8e5f-00266cf8869c
>>> 2023-10-11 16:16:04,691 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>> 2023-10-11 16:16:04,692 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>>
quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>> 2023-10-11 16:16:04,692 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>>
docker.io/ceph/ceph@sha256:056637972a107df4096f10951e4216b21fcd8ae0b9fb4552e628d35df3f61139
>>> 2023-10-11 16:16:04,694 7ff2a5c08b80 INFO Using recent ceph
>>> image
>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>> 2023-10-11 16:16:05,094 7ff2a5c08b80 DEBUG stat: 167 167
>>> 2023-10-11 16:16:05,903 7ff2a5c08b80 DEBUG Acquiring lock
>>> 140679815723776 on
>>> /run/cephadm/250f9864-0142-11ee-8e5f-00266cf8869c.lock
>>> 2023-10-11 16:16:05,903 7ff2a5c08b80 DEBUG Lock 140679815723776
>>> acquired on /run/cephadm/250f9864-0142-11ee-8e5f-00266cf8869c.lock
>>> 2023-10-11 16:16:05,929 7ff2a5c08b80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:05,933 7ff2a5c08b80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:16:06,700 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> 2023-10-11 16:16:06,701 7ff2a5c08b80 DEBUG /usr/bin/podman:
>>> Device Path Size Device nodes rotates available Model name
>>>
>>>
>>> I have only one version of cephadm in /var/lib/ceph/{fsid} :
>>> [root@mostha1 ~]# ls -lrt
>>> /var/lib/ceph/250f9864-0142-11ee-8e5f-00266cf8869c/cephadm*
>>> -rw-r--r-- 1 root root 350889 28 sept. 16:39
>>>
/var/lib/ceph/250f9864-0142-11ee-8e5f-00266cf8869c/cephadm.f6868821c084cd9740b59c7c5eb59f0dd47f6e3b1e6fecb542cb44134ace8d78
>>>
>>>
>>> Running " python3
>>>
/var/lib/ceph/250f9864-0142-11ee-8e5f-00266cf8869c/cephadm.f6868821c084cd9740b59c7c5eb59f0dd47f6e3b1e6fecb542cb44134ace8d78
>>> ceph-volume inventory" gives the same output and the same log
>>> (except the value of the lock):
>>>
>>> 2023-10-11 16:21:35,965 7f467cf31b80 DEBUG
>>>
--------------------------------------------------------------------------------
>>> cephadm ['ceph-volume', 'inventory']
>>> 2023-10-11 16:21:36,009 7f467cf31b80 DEBUG /usr/bin/podman: 4.4.1
>>> 2023-10-11 16:21:36,012 7f467cf31b80 DEBUG Using default config:
>>> /etc/ceph/ceph.conf
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 0d28d71358d7,452.1MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 2084faaf4d54,13.27MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 61073c53805d,513.6MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 6b9f0b72d668,322.4MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 7493a28808ad,164MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> a89672a3accf,58.5MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> b45271cc9726,54.69MB / 50.32GB
>>> 2023-10-11 16:21:36,067 7f467cf31b80 DEBUG /usr/bin/podman:
>>> e00ec13ab138,707.1MB / 50.32GB
>>> 2023-10-11 16:21:36,068 7f467cf31b80 DEBUG /usr/bin/podman:
>>> fcb1e1a6b08d,36.28MB / 50.32GB
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 0d28d71358d7,1.27%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 2084faaf4d54,0.00%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 61073c53805d,1.16%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 6b9f0b72d668,1.02%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 7493a28808ad,0.78%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> a89672a3accf,0.11%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> b45271cc9726,1.35%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> e00ec13ab138,0.41%
>>> 2023-10-11 16:21:36,125 7f467cf31b80 DEBUG /usr/bin/podman:
>>> fcb1e1a6b08d,0.02%
>>> 2023-10-11 16:21:36,128 7f467cf31b80 INFO Inferring fsid
>>> 250f9864-0142-11ee-8e5f-00266cf8869c
>>> 2023-10-11 16:21:36,186 7f467cf31b80 DEBUG /usr/bin/podman:
>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>> 2023-10-11 16:21:36,187 7f467cf31b80 DEBUG /usr/bin/podman:
>>>
quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>> 2023-10-11 16:21:36,187 7f467cf31b80 DEBUG /usr/bin/podman:
>>>
docker.io/ceph/ceph@sha256:056637972a107df4096f10951e4216b21fcd8ae0b9fb4552e628d35df3f61139
>>> 2023-10-11 16:21:36,189 7f467cf31b80 INFO Using recent ceph
>>> image
>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>> 2023-10-11 16:21:36,549 7f467cf31b80 DEBUG stat: 167 167
>>> 2023-10-11 16:21:36,942 7f467cf31b80 DEBUG Acquiring lock
>>> 139940396923424 on
>>> /run/cephadm/250f9864-0142-11ee-8e5f-00266cf8869c.lock
>>> 2023-10-11 16:21:36,942 7f467cf31b80 DEBUG Lock 139940396923424
>>> acquired on /run/cephadm/250f9864-0142-11ee-8e5f-00266cf8869c.lock
>>> 2023-10-11 16:21:36,969 7f467cf31b80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:21:36,972 7f467cf31b80 DEBUG sestatus: SELinux
>>> status: disabled
>>> 2023-10-11 16:21:37,749 7f467cf31b80 DEBUG /usr/bin/podman:
>>> 2023-10-11 16:21:37,750 7f467cf31b80 DEBUG /usr/bin/podman:
>>> Device Path Size Device nodes rotates available Model name
>>>
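>>> One more check that might narrow it down (assuming the --format flag from
>>> the ceph-volume docs behaves the same in this image): dumping the inventory
>>> as JSON should show whether the disks are detected but filtered out of the
>>> plain report, or not detected at all:
>>>
>>> [root@mostha1 ~]# cephadm shell
>>> [ceph: root@mostha1 /]# ceph-volume inventory --format json-pretty
>>>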
>>> Patrick
>>>
>>> On 11/10/2023 at 15:59, Eugen Block wrote:
>>>> Can you check which cephadm version is installed on the host?
>>>> And then please add (only the relevant) output from the
>>>> cephadm.log when you run the inventory (without the --image
>>>> <octopus>). Sometimes the version mismatch on the host and the
>>>> one the orchestrator uses can cause some disruptions. You could
>>>> try the same with the latest cephadm you have in
>>>> /var/lib/ceph/${fsid}/ (ls -lrt
>>>> /var/lib/ceph/${fsid}/cephadm.*). I mentioned that in this
>>>> thread [1]. So you could try the following:
>>>>
>>>> $ chmod +x /var/lib/ceph/{fsid}/cephadm.{latest}
>>>>
>>>> $ python3 /var/lib/ceph/{fsid}/cephadm.{latest} ceph-volume
>>>> inventory
>>>>
>>>> Does the output differ? Paste the relevant cephadm.log from
>>>> that attempt as well.
>>>>
>>>> [1]
>>>>
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LASBJCSPFGD…
>>>>
>>>> Quoting Patrick Begou <Patrick.Begou(a)univ-grenoble-alpes.fr>:
>>>>
>>>>> Hi Eugen,
>>>>>
>>>>> first many thanks for the time spent on this problem.
>>>>>
>>>>> "ceph osd purge 2 --force --yes-i-really-mean-it" works and
>>>>> clean all the bas status.
>>>>>
>>>>> [root@mostha1 ~]# cephadm shell
>>>>> Inferring fsid 250f9864-0142-11ee-8e5f-00266cf8869c
>>>>> Using recent ceph image
>>>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>>>>
>>>>> [ceph: root@mostha1 /]# ceph osd purge 2 --force --yes-i-really-mean-it
>>>>> purged osd.2
>>>>>
>>>>> [ceph: root@mostha1 /]# ceph osd tree
>>>>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>>>>> -1 1.72823 root default
>>>>> -5 0.45477 host dean
>>>>> 0 hdd 0.22739 osd.0 up 1.00000 1.00000
>>>>> 4 hdd 0.22739 osd.4 up 1.00000 1.00000
>>>>> -9 0.22739 host ekman
>>>>> 6 hdd 0.22739 osd.6 up 1.00000 1.00000
>>>>> -7 0.45479 host mostha1
>>>>> 5 hdd 0.45479 osd.5 up 1.00000 1.00000
>>>>> -3 0.59128 host mostha2
>>>>> 1 hdd 0.22739 osd.1 up 1.00000 1.00000
>>>>> 3 hdd 0.36389 osd.3 up 1.00000 1.00000
>>>>>
>>>>> [ceph: root@mostha1 /]# lsblk
>>>>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>>>>> sda 8:0 1 232.9G 0 disk
>>>>> |-sda1 8:1 1 3.9G 0 part /rootfs/boot
>>>>> |-sda2 8:2 1 3.9G 0 part [SWAP]
>>>>> `-sda3 8:3 1 225G 0 part
>>>>> |-al8vg-rootvol 253:0 0 48.8G 0 lvm /rootfs
>>>>> |-al8vg-homevol 253:2 0 9.8G 0 lvm /rootfs/home
>>>>> |-al8vg-tmpvol 253:3 0 9.8G 0 lvm /rootfs/tmp
>>>>> `-al8vg-varvol 253:4 0 19.8G 0 lvm /rootfs/var
>>>>> sdb 8:16 1 465.8G 0 disk
>>>>>
`-ceph--08827fdc--136e--4070--97e9--e5e8b3970766-osd--block--7dec1808--d6f4--4f90--ac74--75a4346e1df5
>>>>> 253:1 0 465.8G 0 lvm
>>>>> sdc 8:32 1 232.9G 0 disk
>>>>>
>>>>> "cephadm ceph-volume inventory" returns nothing:
>>>>>
>>>>> [root@mostha1 ~]# cephadm ceph-volume inventory
>>>>> Inferring fsid 250f9864-0142-11ee-8e5f-00266cf8869c
>>>>> Using recent ceph image
>>>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>>>>
>>>>> Device Path Size Device nodes rotates available Model name
>>>>>
>>>>> [root@mostha1 ~]#
>>>>>
>>>>> But running the same command within cephadm 15.2.17 works:
>>>>>
>>>>> [root@mostha1 ~]# cephadm --image 93146564743f ceph-volume inventory
>>>>> Inferring fsid 250f9864-0142-11ee-8e5f-00266cf8869c
>>>>>
>>>>> Device Path Size rotates available Model name
>>>>> /dev/sdc 232.83 GB True True SAMSUNG HE253GJ
>>>>> /dev/sda 232.83 GB True False SAMSUNG HE253GJ
>>>>> /dev/sdb 465.76 GB True False WDC WD5003ABYX-1
>>>>>
>>>>> [root@mostha1 ~]#
>>>>>
>>>>> [root@mostha1 ~]# podman images -a
>>>>> REPOSITORY           TAG        IMAGE ID      CREATED         SIZE
>>>>> quay.io/ceph/ceph    v16.2.14   f13d80acdbb5  2 weeks ago     1.21 GB
>>>>> quay.io/ceph/ceph    v15.2.17   93146564743f  14 months ago   1.24 GB
>>>>> ....
>>>>>
>>>>>
>>>>> Patrick
>>>>>
>>>>> On 11/10/2023 at 15:14, Eugen Block wrote:
>>>>>> Your response is a bit confusing since it seems to be mixed
>>>>>> up with the previous answer. You still need to remove the
>>>>>> OSD properly, so purge it from the crush tree:
>>>>>>
>>>>>> ceph osd purge 2 --force --yes-i-really-mean-it (only in a
>>>>>> test cluster!)
>>>>>>
>>>>>> If everything is clean (OSD has been removed, disk has been
>>>>>> zapped, lsblk shows no LVs for that disk) you can check the
>>>>>> inventory:
>>>>>>
>>>>>> cephadm ceph-volume inventory
>>>>>>
>>>>>> Please also add the output of 'ceph orch ls osd --export'.
>>>>>>
>>>>>> Quoting Patrick Begou <Patrick.Begou(a)univ-grenoble-alpes.fr>:
>>>>>>
>>>>>>> Hi Eugen,
>>>>>>>
>>>>>>> - the OS is AlmaLinux 8 with the latest updates.
>>>>>>>
>>>>>>> - this morning I worked with ceph-volume but it ended with
>>>>>>> a strange final state. I was connected to host mostha1, where
>>>>>>> /dev/sdc was not recognized. These are the steps I followed,
>>>>>>> based on the ceph-volume documentation I've read:
>>>>>>> [root@mostha1 ~]# cephadm shell
>>>>>>> [ceph: root@mostha1 /]# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm prepare --bluestore
>>>>>>> --data /dev/sdc
>>>>>>>
>>>>>>> Now the lsblk command shows sdc as an OSD:
>>>>>>> ....
>>>>>>> sdb 8:16 1 465.8G 0 disk
>>>>>>>
`-ceph--08827fdc--136e--4070--97e9--e5e8b3970766-osd--block--7dec1808--d6f4--4f90--ac74--75a4346e1df5
>>>>>>> 253:1 0 465.8G 0 lvm
>>>>>>> sdc 8:32 1 232.9G 0 disk
>>>>>>>
`-ceph--b27d7a07--278d--4ee2--b84e--53256ef8de4c-osd--block--45c8e92c--caf9--4fe7--9a42--7b45a0794632
>>>>>>> 253:5 0 232.8G 0 lvm
>>>>>>>
>>>>>>> Then I tried to activate this OSD but it fails, as inside the
>>>>>>> podman container I have no access to systemctl:
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm activate 2
>>>>>>> 45c8e92c-caf9-4fe7-9a42-7b45a0794632
>>>>>>> .....
>>>>>>> Running command: /usr/bin/systemctl start ceph-osd@2
>>>>>>> stderr: Failed to connect to bus: No such file or directory
>>>>>>> --> RuntimeError: command returned non-zero exit status: 1
>>>>>>> [ceph: root@mostha1 /]# ceph osd tree
>>>>>>>
>>>>>>> And now I have a strange status for this osd.2:
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# ceph osd tree
>>>>>>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>>>>>>> -1 1.72823 root default
>>>>>>> -5 0.45477 host dean
>>>>>>> 0 hdd 0.22739 osd.0 up 1.00000 1.00000
>>>>>>> 4 hdd 0.22739 osd.4 up 1.00000 1.00000
>>>>>>> -9 0.22739 host ekman
>>>>>>> 6 hdd 0.22739 osd.6 up 1.00000 1.00000
>>>>>>> -7 0.45479 host mostha1
>>>>>>> 5 hdd 0.45479 osd.5 up 1.00000 1.00000
>>>>>>> -3 0.59128 host mostha2
>>>>>>> 1 hdd 0.22739 osd.1 up 1.00000 1.00000
>>>>>>> 3 hdd 0.36389 osd.3 up 1.00000 1.00000
>>>>>>> 2 0 osd.2 down 0 1.00000
>>>>>>>
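>>>>>>> (Side note: since systemd is not reachable from inside the cephadm
>>>>>>> shell, I suppose the orchestrator should create the OSD itself rather
>>>>>>> than running ceph-volume activate by hand, something like the command
>>>>>>> below, although I have not verified it on this cluster:)
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# ceph orch daemon add osd mostha1.legi.grenoble-inp.fr:/dev/sdc
>>>>>>>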
>>>>>>> I've tried to destroy the OSD as you suggested, but even though
>>>>>>> the command returns no error I still have this OSD, even if
>>>>>>> "lsblk" no longer shows /dev/sdc as a Ceph OSD device.
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm zap --destroy /dev/sdc
>>>>>>> --> Zapping: /dev/sdc
>>>>>>> --> Zapping lvm member /dev/sdc. lv_path is
>>>>>>>
/dev/ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c/osd-block-45c8e92c-caf9-4fe7-9a42-7b45a0794632
>>>>>>>
>>>>>>> --> Unmounting /var/lib/ceph/osd/ceph-2
>>>>>>> Running command: /usr/bin/umount -v /var/lib/ceph/osd/ceph-2
>>>>>>> stderr: umount: /var/lib/ceph/osd/ceph-2 unmounted
>>>>>>> Running command: /usr/bin/dd if=/dev/zero
>>>>>>>
of=/dev/ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c/osd-block-45c8e92c-caf9-4fe7-9a42-7b45a0794632
>>>>>>> bs=1M count=10 conv=fsync
>>>>>>> stderr: 10+0 records in
>>>>>>> 10+0 records out
>>>>>>> 10485760 bytes (10 MB, 10 MiB) copied, 0.575633 s, 18.2 MB/s
>>>>>>> --> Only 1 LV left in VG, will proceed to destroy volume
>>>>>>> group ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c
>>>>>>> Running command: nsenter --mount=/rootfs/proc/1/ns/mnt
>>>>>>> --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net
>>>>>>> --uts=/rootfs/proc/1/ns/uts /sbin/vgremove -v -f
>>>>>>> ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c
>>>>>>> stderr: Removing
>>>>>>>
ceph--b27d7a07--278d--4ee2--b84e--53256ef8de4c-osd--block--45c8e92c--caf9--4fe7--9a42--7b45a0794632
>>>>>>> (253:1)
>>>>>>> stderr: Releasing logical volume
>>>>>>> "osd-block-45c8e92c-caf9-4fe7-9a42-7b45a0794632"
>>>>>>> stderr: Archiving volume group
>>>>>>> "ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c" metadata (seqno 5).
>>>>>>> stdout: Logical volume
>>>>>>> "osd-block-45c8e92c-caf9-4fe7-9a42-7b45a0794632"
>>>>>>> successfully removed.
>>>>>>> stderr: Removing physical volume "/dev/sdc" from volume
>>>>>>> group "ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c"
>>>>>>> stdout: Volume group
>>>>>>> "ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c" successfully removed
>>>>>>> stderr: Creating volume group backup
>>>>>>>
"/etc/lvm/backup/ceph-b27d7a07-278d-4ee2-b84e-53256ef8de4c"
>>>>>>> (seqno 6).
>>>>>>> Running command: nsenter --mount=/rootfs/proc/1/ns/mnt
>>>>>>> --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net
>>>>>>> --uts=/rootfs/proc/1/ns/uts /sbin/pvremove -v -f -f /dev/sdc
>>>>>>> stdout: Labels on physical volume "/dev/sdc" successfully
>>>>>>> wiped.
>>>>>>> Running command: /usr/bin/dd if=/dev/zero of=/dev/sdc bs=1M
>>>>>>> count=10 conv=fsync
>>>>>>> stderr: 10+0 records in
>>>>>>> 10+0 records out
>>>>>>> 10485760 bytes (10 MB, 10 MiB) copied, 0.590652 s, 17.8 MB/s
>>>>>>> --> Zapping successful for: <Raw Device: /dev/sdc>
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# ceph osd tree
>>>>>>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>>>>>>> -1 1.72823 root default
>>>>>>> -5 0.45477 host dean
>>>>>>> 0 hdd 0.22739 osd.0 up 1.00000 1.00000
>>>>>>> 4 hdd 0.22739 osd.4 up 1.00000 1.00000
>>>>>>> -9 0.22739 host ekman
>>>>>>> 6 hdd 0.22739 osd.6 up 1.00000 1.00000
>>>>>>> -7 0.45479 host mostha1
>>>>>>> 5 hdd 0.45479 osd.5 up 1.00000 1.00000
>>>>>>> -3 0.59128 host mostha2
>>>>>>> 1 hdd 0.22739 osd.1 up 1.00000 1.00000
>>>>>>> 3 hdd 0.36389 osd.3 up 1.00000 1.00000
>>>>>>> 2 0 osd.2 down 0 1.00000
>>>>>>>
>>>>>>> [ceph: root@mostha1 /]# lsblk
>>>>>>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>>>>>>> sda 8:0 1 232.9G 0 disk
>>>>>>> |-sda1 8:1 1 3.9G 0 part /rootfs/boot
>>>>>>> |-sda2 8:2 1 3.9G 0 part [SWAP]
>>>>>>> `-sda3 8:3 1 225G 0 part
>>>>>>> |-al8vg-rootvol 253:0 0 48.8G 0 lvm /rootfs
>>>>>>> |-al8vg-homevol 253:3 0 9.8G 0 lvm /rootfs/home
>>>>>>> |-al8vg-tmpvol 253:4 0 9.8G 0 lvm /rootfs/tmp
>>>>>>> `-al8vg-varvol 253:5 0 19.8G 0 lvm /rootfs/var
>>>>>>> sdb 8:16 1 465.8G 0 disk
>>>>>>>
`-ceph--08827fdc--136e--4070--97e9--e5e8b3970766-osd--block--7dec1808--d6f4--4f90--ac74--75a4346e1df5
>>>>>>> 253:2 0 465.8G 0 lvm
>>>>>>> sdc
>>>>>>>
>>>>>>> Patrick
>>>>>>> On 11/10/2023 at 11:00, Eugen Block wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> just wondering if 'ceph-volume lvm zap --destroy /dev/sdc'
>>>>>>>> would help here. From your previous output you didn't
>>>>>>>> specify the --destroy flag.
>>>>>>>> Which cephadm version is installed on the host? Did you
>>>>>>>> also upgrade the OS when moving to Pacific? (Sorry if I
>>>>>>>> missed that.)
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Patrick Begou
>>>>>>>> <Patrick.Begou(a)univ-grenoble-alpes.fr>:
>>>>>>>>
>>>>>>>>> On 02/10/2023 at 18:22, Patrick Bégou wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> still stuck with this problem.
>>>>>>>>>>
>>>>>>>>>> I've deployed Octopus and all my HDDs have been set up as
>>>>>>>>>> OSDs. Fine.
>>>>>>>>>> I've upgraded to Pacific and 2 OSDs have failed. They have
>>>>>>>>>> been automatically removed and the upgrade finished. Cluster
>>>>>>>>>> health is finally OK, no data loss.
>>>>>>>>>>
>>>>>>>>>> But now I cannot re-add these OSDs with Pacific (I had
>>>>>>>>>> previous trouble with these old HDDs, lost one OSD in
>>>>>>>>>> Octopus and was able to reset and re-add it).
>>>>>>>>>>
>>>>>>>>>> I've tried manually to add the first OSD on the node
>>>>>>>>>> where it is located, following
>>>>>>>>>> https://docs.ceph.com/en/pacific/rados/operations/bluestore-migration/
>>>>>>>>>> (not sure it's the best idea...) but it fails too. This
>>>>>>>>>> node was the one used for deploying the cluster.
>>>>>>>>>>
>>>>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm zap /dev/sdc
>>>>>>>>>> --> Zapping: /dev/sdc
>>>>>>>>>> --> --destroy was not specified, but zapping a whole
>>>>>>>>>> device will remove the partition table
>>>>>>>>>> Running command: /usr/bin/dd if=/dev/zero of=/dev/sdc
>>>>>>>>>> bs=1M count=10 conv=fsync
>>>>>>>>>> stderr: 10+0 records in
>>>>>>>>>> 10+0 records out
>>>>>>>>>> 10485760 bytes (10 MB, 10 MiB) copied, 0.663425 s, 15.8 MB/s
>>>>>>>>>> --> Zapping successful for: <Raw Device: /dev/sdc>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm create
>>>>>>>>>> --bluestore --data /dev/sdc
>>>>>>>>>> Running command: /usr/bin/ceph-authtool --gen-print-key
>>>>>>>>>> Running command: /usr/bin/ceph --cluster ceph --name
>>>>>>>>>> client.bootstrap-osd --keyring
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>>>>>>>>>> 9f1eb8ee-41e6-4350-ad73-1be21234ec7c
>>>>>>>>>> stderr: 2023-10-02T16:09:29.855+0000 7fb4eb8c0700 -1
>>>>>>>>>> auth: unable to find a keyring on
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
>>>>>>>>>> file or directory
>>>>>>>>>> stderr: 2023-10-02T16:09:29.855+0000 7fb4eb8c0700 -1
>>>>>>>>>> AuthRegistry(0x7fb4e405c4d8) no keyring found at
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>>>>>>>>> stderr: 2023-10-02T16:09:29.856+0000 7fb4eb8c0700 -1
>>>>>>>>>> auth: unable to find a keyring on
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
>>>>>>>>>> file or directory
>>>>>>>>>> stderr: 2023-10-02T16:09:29.856+0000 7fb4eb8c0700 -1
>>>>>>>>>> AuthRegistry(0x7fb4e40601d0) no keyring found at
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>>>>>>>>> stderr: 2023-10-02T16:09:29.857+0000 7fb4eb8c0700 -1
>>>>>>>>>> auth: unable to find a keyring on
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such
>>>>>>>>>> file or directory
>>>>>>>>>> stderr: 2023-10-02T16:09:29.857+0000 7fb4eb8c0700 -1
>>>>>>>>>> AuthRegistry(0x7fb4eb8bee90) no keyring found at
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
>>>>>>>>>> stderr: 2023-10-02T16:09:29.858+0000 7fb4e965c700 -1
>>>>>>>>>> monclient(hunting): handle_auth_bad_method server
>>>>>>>>>> allowed_methods [2] but i only support [1]
>>>>>>>>>> stderr: 2023-10-02T16:09:29.858+0000 7fb4e9e5d700 -1
>>>>>>>>>> monclient(hunting): handle_auth_bad_method server
>>>>>>>>>> allowed_methods [2] but i only support [1]
>>>>>>>>>> stderr: 2023-10-02T16:09:29.858+0000 7fb4e8e5b700 -1
>>>>>>>>>> monclient(hunting): handle_auth_bad_method server
>>>>>>>>>> allowed_methods [2] but i only support [1]
>>>>>>>>>> stderr: 2023-10-02T16:09:29.858+0000 7fb4eb8c0700 -1
>>>>>>>>>> monclient: authenticate NOTE: no keyring found; disabled
>>>>>>>>>> cephx authentication
>>>>>>>>>> stderr: [errno 13] RADOS permission denied (error
>>>>>>>>>> connecting to the cluster)
>>>>>>>>>> --> RuntimeError: Unable to create a new OSD id
>>>>>>>>>>
>>>>>>>>>> Any idea of what is wrong?
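>>>>>>>>>>
>>>>>>>>>> (Rereading the errors above, they all complain about a missing
>>>>>>>>>> /var/lib/ceph/bootstrap-osd/ceph.keyring inside the shell, so
>>>>>>>>>> exporting it first should at least get past the RADOS permission
>>>>>>>>>> error, e.g.:
>>>>>>>>>>
>>>>>>>>>> [ceph: root@mostha1 /]# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
>>>>>>>>>> [ceph: root@mostha1 /]# ceph-volume lvm create --bluestore --data /dev/sdc
>>>>>>>>>>
>>>>>>>>>> but I have not confirmed whether that alone fixes the create step.)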
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> Patrick
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>>>> To unsubscribe send an email to
ceph-users-leave(a)ceph.io
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm still trying to understand what can be wrong or how to
>>>>>>>>> debug this situation where Ceph cannot see the devices.
>>>>>>>>>
>>>>>>>>> The device /dev/sdc exists:
>>>>>>>>>
>>>>>>>>> [root@mostha1 ~]# cephadm shell lsmcli ldl
>>>>>>>>> Inferring fsid 250f9864-0142-11ee-8e5f-00266cf8869c
>>>>>>>>> Using recent ceph image
>>>>>>>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>>>>>>>> Path     | SCSI VPD 0x83    | Link Type | Serial Number   | Health Status
>>>>>>>>> -------------------------------------------------------------------------
>>>>>>>>> /dev/sda | 50024e92039e4f1c | PATA/SATA | S2B5J90ZA10142  | Good
>>>>>>>>> /dev/sdb | 50014ee0ad5953c9 | PATA/SATA | WD-WMAYP0982329 | Good
>>>>>>>>> /dev/sdc | 50024e920387fa2c | PATA/SATA | S2B5J90ZA02494  | Good
>>>>>>>>>
>>>>>>>>> But I cannot do anything with it:
>>>>>>>>>
>>>>>>>>> [root@mostha1 ~]# cephadm shell ceph orch device zap
>>>>>>>>> mostha1.legi.grenoble-inp.fr /dev/sdc --force
>>>>>>>>> Inferring fsid 250f9864-0142-11ee-8e5f-00266cf8869c
>>>>>>>>> Using recent ceph image
>>>>>>>>>
quay.io/ceph/ceph@sha256:f30bf50755d7087f47c6223e6a921caf5b12e86401b3d49220230c84a8302a1e
>>>>>>>>> Error EINVAL: Device path '/dev/sdc' not found on host
>>>>>>>>> 'mostha1.legi.grenoble-inp.fr'
>>>>>>>>>
>>>>>>>>> This has been the case since I moved from Octopus to Pacific.
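>>>>>>>>>
>>>>>>>>> (If the orchestrator's view of the host is simply stale, refreshing
>>>>>>>>> the device cache before zapping might be worth a try; noting it here
>>>>>>>>> for completeness, I have not checked whether it changes anything:)
>>>>>>>>>
>>>>>>>>> [root@mostha1 ~]# cephadm shell ceph orch device ls mostha1.legi.grenoble-inp.fr --refresh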
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>> _______________________________________________
>>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>>> To unsubscribe send an email to
ceph-users-leave(a)ceph.io
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io