Hi,
Yes, we're trying to remove osd.3. Here is the result of `ceph osd df` :
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 3  hdd    1.81879  1.00000   1.8 TiB  443 GiB  441 GiB  6.8 MiB  1.5 GiB  1.4 TiB  23.78  2.37   16  up
 6  hdd    1.81879  1.00000   1.8 TiB  114 GiB  114 GiB  981 KiB  343 MiB  1.7 TiB   6.14  0.61    8  up
12  hdd    1.81879  1.00000   1.8 TiB  359 GiB  358 GiB  5.8 MiB  1.0 GiB  1.5 TiB  19.27  1.92   15  up
13  hdd    1.81879  1.00000   1.8 TiB  331 GiB  330 GiB  3.9 MiB  1.5 GiB  1.5 TiB  17.77  1.77   15  up
15  hdd    1.81879  1.00000   1.8 TiB  217 GiB  216 GiB  2.0 MiB  1.1 GiB  1.6 TiB  11.64  1.16   13  up
16  hdd    9.09520  1.00000   9.1 TiB  785 GiB  783 GiB  8.8 MiB  1.9 GiB  8.3 TiB   8.43  0.84   51  up
17  hdd    1.81879  1.00000   1.8 TiB  204 GiB  203 GiB  2.9 MiB  1.2 GiB  1.6 TiB  10.95  1.09   11  up
 1  hdd    5.45749  1.00000   5.5 TiB  428 GiB  427 GiB  4.9 MiB  876 MiB  5.0 TiB   7.66  0.76   24  up
 4  hdd    5.45749  1.00000   5.5 TiB  638 GiB  636 GiB  6.8 MiB  2.2 GiB  4.8 TiB  11.42  1.14   36  up
 8  hdd    5.45749  1.00000   5.5 TiB  594 GiB  591 GiB  8.7 MiB  2.2 GiB  4.9 TiB  10.62  1.06   30  up
11  hdd    5.45749  1.00000   5.5 TiB  567 GiB  565 GiB  7.8 MiB  2.1 GiB  4.9 TiB  10.15  1.01   29  up
14  hdd    1.81879  1.00000   1.8 TiB  197 GiB  195 GiB  2.9 MiB  1.2 GiB  1.6 TiB  10.55  1.05   10  up
 0  hdd    9.09520  1.00000   9.1 TiB  764 GiB  763 GiB  9.6 MiB  1.8 GiB  8.3 TiB   8.21  0.82   47  up
 5  hdd    9.09520  1.00000   9.1 TiB  791 GiB  789 GiB   11 MiB  2.6 GiB  8.3 TiB   8.50  0.85   38  up
 9  hdd    9.09520  1.00000   9.1 TiB  858 GiB  856 GiB   11 MiB  2.4 GiB  8.3 TiB   9.21  0.92   44  up
                    TOTAL     71 TiB   7.1 TiB  7.1 TiB   93 MiB   24 GiB   64 TiB  10.03
MIN/MAX VAR: 0.61/2.37  STDDEV: 4.97
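(To read the table: VAR is each OSD's %USE divided by the cluster average, so osd.3's 2.37 is 23.78 / 10.03. It's by far the most-utilized OSD relative to its size.)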
And here is `ceph osd pool ls detail` (and yes, our replicated size is 3) :
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 32 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9327 lfor 0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9018 lfor 0/0/104 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9149 lfor 0/0/106 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 5 'polyphoto_backup' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 372 lfor 0/0/362 flags hashpspool,selfmanaged_snaps stripe_width 0 compression_algorithm snappy compression_mode aggressive application rbd
And we're using Quincy :
romain:step@alpha-cen ~ $ sudo ceph --version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
All our physical disks are each on their own RAID 0 using the built-in
RAID controller (PERC H730 Mini).
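In case it's useful for checking the failing drive behind the
controller, its SMART data should still be readable from the OS with
smartmontools. A sketch, assuming the iDRAC's "Disk 4 in Backplane 1"
corresponds to megaraid device ID 4 (the ID and block device may
differ on our hosts):

# list the drives smartmontools can see behind the RAID controller
sudo smartctl --scan
# read SMART data for the suspect drive (megaraid ID 4 is an assumption)
sudo smartctl -a -d megaraid,4 /dev/sda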
For the logs, journalctl has rotated and we don't have them anymore. I
recreated the situation where the three OSDs crash (by shutting down
osd.3 and marking it out), and here are the logs :
ceph -w :
https://pastebin.com/A7gJ3ss2

osd.0, osd.3 and osd.11 :
https://gitlab.com/RomainL456/ceph-incident-logs/
I put the full logs (output of `sudo journalctl --since "23:00" -u
ceph-9b4b12fe-4dc6-11ed-9ed9-d18a342d7c2b@osd.*`) in a public Git repo,
and I also added a file with the logs from right before osd.0 crashed.
Here is the timeline of events (local time) :
23:27 : I manually shut down osd.3
23:46 : osd.0 crashes
23:46 : osd.11 crashes
23:48 : I start osd.3, it crashes in less than a minute
23:49 : After I mark osd.3 "in" and start it again, it comes back online
with osd.0 and osd.11 soon after
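In case the backtraces are useful too, the crash events should also be
retrievable through the crash module (assuming it's enabled, which is
the default on Quincy):

sudo ceph crash ls
sudo ceph crash info <crash-id>

And to avoid losing the journal to rotation again, we'll probably set
Storage=persistent (plus a SystemMaxUse= limit) in
/etc/systemd/journald.conf on the OSD hosts.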
Best regards,
Romain Lebbadi-Breteau
On 2024-03-08 3:17 a.m., Eugen Block wrote:
> Hi,
>
> can you share more details? Which OSD are you trying to get out, the
> primary osd.3?
> Can you also share 'ceph osd df'?
> It looks like a replicated pool with size 3, can you confirm with
> 'ceph osd pool ls detail'?
> Do you have logs from the crashing OSDs when you take out osd.3?
> Which ceph version is this?
>
> Thanks,
> Eugen
>
> Quoting Romain Lebbadi-Breteau <romain.lebbadi-breteau@polymtl.ca>:
>
>> Hi,
>>
>> We're a student club from Montréal where we host an OpenStack cloud
>> with a Ceph backend for storage of virtual machines and volumes using
>> rbd.
>>
>> Two weeks ago we received an email from our Ceph cluster saying that
>> some PGs were damaged. We ran "sudo ceph pg repair <pg-id>" but
>> then there was an I/O error on the disk during the recovery ("An
>> unrecoverable disk media error occurred on Disk 4 in Backplane 1 of
>> Integrated RAID Controller 1." and "Bad block medium error is
>> detected at block 0x1377e2ad on Virtual Disk 3 on Integrated RAID
>> Controller 1." messages on iDRAC).
>>
>> After that, the PG we tried to repair was in the state
>> "active+recovery_unfound+degraded". After a week, we ran the command
>> "sudo ceph pg 2.1b mark_unfound_lost revert" to try to recover the
>> damaged PG. We tried to boot the virtual machine that had crashed
>> because of this incident, but the volume seemed to have been
>> completely erased: the "mount" command said there was no filesystem
>> on it, so we recreated the VM from a backup.
>>
>> A few days later, the same PG was once again damaged, and since we
>> knew the physical disk on the OSD hosting one part of the PG had
>> problems, we tried to "out" the OSD from the cluster. That caused the
>> two other OSDs hosting copies of the problematic PG to go down, which
>> led to timeouts on our virtual machines, so we put the OSD back in.
>>
>> We then tried to repair the PG again, but that failed and the PG is
>> now "active+clean+inconsistent+failed_repair". Whenever that OSD goes
>> down, two other OSDs from two other hosts go down too after a few
>> minutes, so it's impossible to replace the disk right now, even though
>> we have new ones available.
>>
>> We have backups for most of our services, but it would be very
>> disruptive to delete the whole cluster, and we don't know what to do
>> with the broken PG and the OSD that can't be shut down.
>>
>> Any help would be really appreciated. We're not experts with Ceph and
>> OpenStack, and it's likely we handled things wrong at some point, but
>> we really want to get back to a healthy Ceph.
>>
>> Here is some information about our cluster :
>>
>> romain:step@alpha-cen ~ $ sudo ceph health detail
>> HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
>> [ERR] OSD_SCRUB_ERRORS: 1 scrub errors
>> [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
>> pg 2.1b is active+clean+inconsistent+failed_repair, acting [3,11,0]
>>
>> romain:step@alpha-cen ~ $ sudo ceph osd tree
>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>> -1 70.94226 root default
>> -7 20.00792 host alpha-cen
>> 3 hdd 1.81879 osd.3 up 1.00000 1.00000
>> 6 hdd 1.81879 osd.6 up 1.00000 1.00000
>> 12 hdd 1.81879 osd.12 up 1.00000 1.00000
>> 13 hdd 1.81879 osd.13 up 1.00000 1.00000
>> 15 hdd 1.81879 osd.15 up 1.00000 1.00000
>> 16 hdd 9.09520 osd.16 up 1.00000 1.00000
>> 17 hdd 1.81879 osd.17 up 1.00000 1.00000
>> -5 23.64874 host beta-cen
>> 1 hdd 5.45749 osd.1 up 1.00000 1.00000
>> 4 hdd 5.45749 osd.4 up 1.00000 1.00000
>> 8 hdd 5.45749 osd.8 up 1.00000 1.00000
>> 11 hdd 5.45749 osd.11 up 1.00000 1.00000
>> 14 hdd 1.81879 osd.14 up 1.00000 1.00000
>> -3 27.28560 host gamma-cen
>> 0 hdd 9.09520 osd.0 up 1.00000 1.00000
>> 5 hdd 9.09520 osd.5 up 1.00000 1.00000
>> 9 hdd 9.09520 osd.9 up 1.00000 1.00000
>>
>> romain:step@alpha-cen ~ $ sudo rados list-inconsistent-obj 2.1b
>> {"epoch":9787,"inconsistents":[]}
>>
>> romain:step@alpha-cen ~ $ sudo ceph pg 2.1b query
>>
>> https://pastebin.com/gsKCPCjr
>>
>> Best regards,
>>
>> Romain Lebbadi-Breteau