On Tue, Sep 3, 2019 at 5:03 AM Yoann Moulin <yoann.moulin@epfl.ch> wrote:
> As for your EC 7+5, I would have gone for something like 8+3, as then you have a spare node active in the cluster and can still provide full protection in the event of a node failure.

Makes sense! On another cluster, I have an EC 7+5 pool for CephFS, but there are 4 servers per chassis. If I lose one chassis, I still need
to access the data. But for that cluster, you are right, 8+3 may be enough for redundancy.
 
Another configuration to consider is to leverage the bucket types in CRUSH. We set up rows, racks, switches, chassis, etc. in our CRUSH map and then have the CRUSH rules select only one OSD per fault domain that we want to survive. In your case you 'could' put your hosts into chassis buckets and have the rule chooseleaf from the chassis. Then you 'could' go as low as 8+2, for instance, and still be protected if all 4 hosts in a chassis went down, since each chassis would hold only one chunk of any given object.
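
To make the arithmetic behind these layouts concrete, here is a rough Python sketch (not anything Ceph ships). The function name and parameters are made up for illustration. It assumes chunks are spread evenly across failure domains, and that there are enough domains for the rule to place every chunk.

```python
def survives(k, m, chunks_per_domain, failed_domains):
    """Return True if an EC k+m pool stays readable after the failures.

    Hypothetical helper for illustration only. An EC k+m pool needs any k
    of its k+m chunks, so it tolerates losing at most m chunks.
    chunks_per_domain is how many chunks of one object the CRUSH rule
    places in a single failure domain (host, chassis, ...).
    """
    chunks_lost = chunks_per_domain * failed_domains
    return chunks_lost <= m


# EC 7+5, one chunk per host, 4 hosts per chassis: losing a whole
# chassis costs 4 chunks, which the 5 coding chunks can absorb.
print(survives(k=7, m=5, chunks_per_domain=4, failed_domains=1))

# EC 8+2 with one chunk per chassis: losing a chassis (all 4 hosts)
# costs only 1 chunk.
print(survives(k=8, m=2, chunks_per_domain=1, failed_domains=1))
```

Both calls print True. With 8+2 and one chunk per chassis, the pool would stop being readable once a third chassis is lost.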

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1