On 03/02/2021 19:48, Mario Giammarco wrote:
> It is obvious and a bit paranoid, because many servers at many customer
> sites run on RAID1, and so you are saying: yes, you have two copies of
> the data, but you can break both. Consider that in Ceph recovery is
> automatic, while with RAID1 someone must physically go to the customer
> and change disks. So Ceph is already an improvement in this case, even
> with size=2. With size=3 and min_size=2 it is a bigger improvement,
> I know.
To labour Dan's point a bit further, maybe a RAID5/6 analogy is better
than RAID1. Yes, I know we're not talking about erasure-coded pools here,
but this is similar to the reason people moved from RAID5 (size=2, kind
of) to RAID6 (size=3, kind of): the more disks you have in an array
(cluster, in our case) and the bigger those disks are, the greater the
chance of encountering a second problem during a recovery.
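To put a rough number on that intuition, here is a toy model, not real Ceph failure maths: the per-disk failure probability during a recovery window is an assumed figure, but it shows how quickly the chance of at least one additional failure grows with disk count:

```python
# Toy model: probability of at least one *additional* disk failure
# while a recovery is in flight. p is an assumed per-disk probability
# of failing during the recovery window; n_disks is the number of
# other disks still in service.
def prob_second_failure(n_disks: int, p: float = 0.01) -> float:
    # P(at least one of n disks fails) = 1 - P(none of them fail)
    return 1 - (1 - p) ** n_disks

for n in (3, 12, 48, 200):
    print(f"{n:>3} disks: {prob_second_failure(n):.1%}")
```

With the assumed 1% per-disk figure, three disks give roughly a 3% chance of a second hit, but a 200-disk cluster is more likely than not to see one, which is the whole argument for size=3 / min_size=2.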
> What I ask is this: what happens with min_size=1 and a split brain,
> a network outage, or similar events: does Ceph block writes because it
> has no quorum on the monitors? Are there failure scenarios that I have
> not considered?
It sounds like in your example you would have 3 physical servers in
total. So would you have both a monitor and OSD processes on each server?
If so, it's not really related to min_size=1, but to answer your question:
you could lose one monitor and the cluster would continue. Losing a
second monitor will stop your cluster until this is resolved. In your
example setup (with colocated mons & OSDs) this would presumably also
mean you'd have lost two OSD servers too, so you'd have bigger problems.
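For reference, monitor quorum is a strict majority of the monitors, so the "lose one mon, survive; lose two, stop" behaviour above falls straight out of the arithmetic. A sketch (not Ceph code):

```python
# Sketch of majority quorum: a Paxos-style monitor cluster can only
# make progress while a strict majority of its mons is reachable.
def has_quorum(total_mons: int, alive_mons: int) -> bool:
    return alive_mons > total_mons // 2

print(has_quorum(3, 2))  # True  -- one mon down: still quorate
print(has_quorum(3, 1))  # False -- two mons down: cluster blocks
```

This is also why an even number of mons buys you nothing: 4 mons still only tolerate one failure, same as 3.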
HTH,
Simon