[ceph-users] Re: Ceph cluster not recover after OSD down

5 May 2021

Create a new CRUSH rule with the correct failure domain (host instead
of OSD), test it properly, and assign it to the pool(s).
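
One way to sanity-check a placement is to ask, for each host, which PGs
would drop below min_size if that host failed. Below is a minimal Python
sketch of that check. It is not a Ceph tool: the function name
pgs_at_risk, the OSD-to-host map, the PG IDs and the min_size value are
all hypothetical, and on a real cluster the mappings would have to be
taken from the cluster itself.

```python
from collections import Counter


def pgs_at_risk(pg_to_osds, osd_to_host, min_size):
    """Return {host: [pg, ...]} for PGs that would fall below min_size
    if that single host went down.

    pg_to_osds  : dict mapping a PG id to the list of OSD ids holding it
    osd_to_host : dict mapping an OSD id to the name of its host
    min_size    : chunks/copies that must survive for the PG to stay usable
    """
    at_risk = {}
    for pg, osds in pg_to_osds.items():
        # Count how many chunks of this PG sit on each host.
        per_host = Counter(osd_to_host[osd] for osd in osds)
        for host, lost in per_host.items():
            if len(osds) - lost < min_size:
                at_risk.setdefault(host, []).append(pg)
    return at_risk


# Hypothetical layout: three hosts with two OSDs each.
osd_to_host = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c", 5: "c"}

# Placement that ignores hosts: PG 1.0 has two of its three chunks on host "a".
osd_level = {"1.0": [0, 1, 2], "1.1": [2, 4, 0]}

# Placement that puts every chunk on a different host.
host_level = {"1.0": [0, 2, 4], "1.1": [3, 5, 1]}

# min_size=2 is an assumed value for this example only.
risky_osd_level = pgs_at_risk(osd_level, osd_to_host, min_size=2)
risky_host_level = pgs_at_risk(host_level, osd_to_host, min_size=2)
print(risky_osd_level)
print(risky_host_level)
```

With the OSD-level placement, losing host "a" takes PG 1.0 below
min_size; with one chunk per host, no single host failure does.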

-- 
Best regards, Joachim Kraftmayer
___________________________________

Clyso GmbH

On 05.05.2021 at 15:11, Andres Rojas Guerrero wrote:
>
> Nice observation. How can I avoid this problem?
>
>
> On 5/5/21 at 14:54, Robert Sander wrote:
>> Hi,
>>
>> On 05.05.21 at 13:39, Joachim Kraftmayer wrote:
>>
>>> The CRUSH rule with ID 1 distributes your EC chunks across the OSDs
>>> without considering the Ceph host, as Robert already suspected.
>>
>> Yes, the "nxtcloudAF" rule is not fault tolerant enough. Having the
>> OSD as the failure domain will lead to data loss or at least
>> temporary unavailability.
>>
>> The situation now is that all copies (or EC chunks) of a PG may be
>> stored on OSDs of the same host. Those PGs will be unavailable if
>> that host is down.
>>
>> Regards
>>
>>
>
