[ceph-users] Re: Worst thing that can happen if I have size= 2

4 Feb 2021

On Thu, 4 Feb 2021 at 00:33, Simon Ironside <sironside(a)caffetine.org> wrote:

...

 On 03/02/2021 19:48, Mario Giammarco wrote:

 To labour Dan's point a bit further, maybe a RAID5/6 analogy is better
 than RAID1. Yes, I know we're not talking erasure coding pools here but
 this is similar to the reasons why people moved from RAID5 (size=2, kind
 of) to RAID6 (size=3, kind of). I.e. the more disks you have in an array
 (cluster, in our case) and the bigger those disks are, the greater the
 chance you have of encountering a second problem during a recovery.
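
 A rough way to see the "more disks, bigger disks" effect described above is
 the sketch below. It is only an illustration under loud assumptions: disk
 failures are independent, each disk has a constant hazard rate derived from
 an annual failure rate (afr), and bigger disks simply mean a longer recovery
 window. The function name and parameters are hypothetical and not part of
 any Ceph tooling.

```python
import math

HOURS_PER_YEAR = 24 * 365


def p_additional_failure(remaining_disks, afr, recovery_hours):
    """Chance that at least one remaining disk fails during recovery.

    Assumption: failures are independent and follow an exponential model
    whose rate is derived from the annual failure rate (afr, between 0 and 1).
    """
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-remaining_disks * rate_per_hour * recovery_hours)


# More disks in the cluster, or a longer recovery (bigger disks),
# both raise the chance of a second problem while recovering.
small_cluster = p_additional_failure(remaining_disks=5, afr=0.02, recovery_hours=24)
large_cluster = p_additional_failure(remaining_disks=50, afr=0.02, recovery_hours=24)
slow_recovery = p_additional_failure(remaining_disks=5, afr=0.02, recovery_hours=240)
```

 The absolute numbers depend entirely on the assumed afr and recovery time;
 only the direction of the trend is the point here.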

 Yes, I know the motivation for RAID6, but to simplify the use case I am
 comparing Ceph size=2 to RAID1.

...
 What I am asking is this: what happens with min_size=1 and a split brain,
 a network outage or similar events? Does Ceph block writes because it has
 no quorum on the monitors? Are there failure scenarios that I have not
 considered?

 It sounds like in your example you would have 3 physical servers in
 total. So would you have both a monitor and OSD processes on each server?

 Yes, sorry if it was not clear:
- three servers
- three monitors
- three managers
- six OSDs (two disks per server)

...
 If so, it's not really related to min_size=1 but, to answer your question,
 you could lose one monitor and the cluster would continue. Losing a
 second monitor will stop your cluster until this is resolved. In your
 example setup (with colocated mons & OSDs) this would presumably also
 mean you'd have lost two OSD servers too, so you'd have bigger problems.
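
 The monitor arithmetic above follows from the majority rule: the monitors
 need a strict majority of the configured set to form a quorum. A minimal
 sketch of that rule, with hypothetical helper names that are not Ceph APIs:

```python
def has_quorum(total_monitors, reachable_monitors):
    """True if the reachable monitors form a strict majority of the set."""
    return reachable_monitors > total_monitors // 2


def tolerated_monitor_failures(total_monitors):
    """How many monitors can be lost while a majority still remains."""
    return (total_monitors - 1) // 2


# The three-monitor setup from this thread:
one_lost = has_quorum(total_monitors=3, reachable_monitors=2)   # cluster continues
two_lost = has_quorum(total_monitors=3, reachable_monitors=1)   # cluster stops
spare = tolerated_monitor_failures(3)
```

 With three monitors this gives room for exactly one failure, which matches
 the "lose one and continue, lose two and stop" description.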

 Losing the switch means the monitors are up but cannot communicate, so
 should they stop?
