On 05/02/2021 20:10, Mario Giammarco wrote:
It is not that one morning I woke up and put in some random configuration;
I followed the guidelines.
The result should be:
- if a disk (or more) breaks, work goes on
- if a server breaks, the VMs on that server start on another server and
work goes on.
The actual result: one disk breaks, Ceph fills the other disk in the same
server, it reaches 90% and EVERYTHING stops, including all VMs; the
customer has lost unsaved data and cannot run the VMs it needs to keep
working. Not very "HA" as hoped.
With three OSD hosts, each with two disks, size=3 and the default CRUSH
rules (i.e. each replica goes to a different host), each OSD host would
expect to get roughly 1/3 of the total data. Under normal running this
means each disk sees 1/6 of the total data.
When a single disk failed in your scenario above, all three hosts were
still available and still received 1/3 of the total data. Because one
disk had failed, the surviving disk had to store the replicas that were
on the failed disk as well as its own (so 2/6 of the total data - double
what it held before). Reaching 90% full on the surviving disk suggests
it was at least 45% full under normal running.
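The arithmetic above can be checked with a short back-of-the-envelope
script. This is only a sketch: it assumes equal-sized disks, perfectly
uniform CRUSH distribution, and the 3-host/2-disk/size=3 layout from the
thread; the disk size and data amount below are hypothetical numbers,
not taken from the original cluster.

```python
def disk_utilization(user_data_gb, disk_gb, disks_per_host=2, failed=0):
    """Fraction of each surviving disk used on one host.

    With size=3 across 3 hosts, every host stores one full replica of
    the user data, so the host must fit user_data_gb no matter how many
    of its disks survive.
    """
    surviving = disks_per_host - failed
    per_host_gb = user_data_gb          # one full replica per host
    return per_host_gb / surviving / disk_gb

disk = 1000.0  # hypothetical 1 TB disks
data = 900.0   # hypothetical user data (one replica's worth)

print(disk_utilization(data, disk))             # normal: 0.45 (45% per disk)
print(disk_utilization(data, disk, failed=1))   # one disk dead: 0.90 (90%)
```

So a disk that sits at 45% in normal operation jumps to 90% as soon as
its neighbour in the same host dies, which matches the numbers reported.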
Ceph is doing what it's supposed to in this case; the issue is that the
disks weren't sized large enough to allow for this failure.
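For reference, recent Ceph releases ship with default capacity
thresholds of nearfull 0.85, backfillfull 0.90 and full 0.95, and at the
full ratio the cluster blocks writes, which is consistent with
everything stopping around 90-95%. The ratios can be inspected and, as a
stopgap only, raised slightly; the real fix is more capacity. A sketch
of the relevant commands (values here are illustrative):

```shell
# Show current utilization per pool and overall.
ceph df

# Temporarily raise the thresholds to regain write access while you
# add capacity or remove data. Do NOT leave these raised permanently.
ceph osd set-nearfull-ratio 0.90
ceph osd set-full-ratio 0.97
```

Raising the full ratio buys time but a disk that actually fills to 100%
can corrupt its OSD, so treat this purely as an emergency measure.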