On Tue, Nov 26, 2019 at 7:45 PM majia xiao <xiaomajia.st(a)gmail.com> wrote:
We have a Ceph (version 12.2.4) cluster that uses EC pools, and it
consists of 10 hosts for OSDs.
The corresponding commands to create the EC pool are listed as follows:
ceph osd erasure-code-profile set profile_jerasure_4_3_reed_sol_van \
ceph osd pool create pool_jerasure_4_3_reed_sol_van 2048 2048 erasure
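(The options to "erasure-code-profile set" were trimmed above; judging by the profile and pool names, a plausible full invocation — my reconstruction, not necessarily the poster's exact command — would be:)

```shell
# Assumed reconstruction: k=4 data chunks, m=3 coding chunks,
# Reed-Solomon via the jerasure plugin, failure domain = host.
ceph osd erasure-code-profile set profile_jerasure_4_3_reed_sol_van \
    plugin=jerasure k=4 m=3 technique=reed_sol_van \
    crush-failure-domain=host
ceph osd pool create pool_jerasure_4_3_reed_sol_van 2048 2048 erasure \
    profile_jerasure_4_3_reed_sol_van
```

With k=4 and m=3 spread one-chunk-per-host across 10 hosts, each PG can lose up to 3 chunks (hosts) and still serve data.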
Since the EC pool's crush-failure-domain is configured to be "host",
we simply disable the network interfaces of some hosts (using the
"ifdown" command) to verify the functionality of the EC pool.
And here are the phenomena we have observed:
First, the IO rate (of "rados bench", which we used for benchmarking)
drops immediately to 0 when one host goes offline.
Second, it takes a long time (around 100 seconds) for Ceph to detect
that the corresponding OSDs on that host are down.
Finally, once Ceph has detected all offline OSDs, the EC pool seems to
behave normally and is ready for IO operations again.
So, here are my questions:
1. Is it normal that the IO rate drops to 0 immediately even though
only one host goes offline?
2. How can we make Ceph reduce the time needed to detect failed OSDs?
This is intended: there is no communication from the host that it's going
down (basically the cable was yanked), so the other nodes assume it is
still alive until its heartbeats time out.
I would recommend not setting this value too low, as it would introduce
false positives that could cause a death spiral in Ceph. If a host takes
longer to respond to heartbeats than your new, shorter timeout, it will get
kicked out of the cluster, causing peering on all the other nodes. Then,
when it comes back a short time later, it will cause peering again. All
this peering could cause other OSDs to miss their heartbeats, and the
problem only gets worse as it compounds. Server failure events should be
infrequent enough that 100 seconds is a good compromise. You are able to
adjust the timeout, but I highly recommend you don't go shorter.
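For reference, detection time is governed by the standard heartbeat
options. A sketch of the relevant knobs and how to inspect them (defaults
are from my recollection of the Luminous docs, so verify against your
release before changing anything):

```shell
# Inspect the current heartbeat settings on a running OSD via its
# admin socket:
ceph daemon osd.0 config show | grep heartbeat

# The main knobs (12.x defaults shown as comments):
#   osd_heartbeat_interval     = 6   # seconds between peer-to-peer pings
#   osd_heartbeat_grace        = 20  # seconds of silence before an OSD
#                                    # is reported down to the monitors
#   mon_osd_min_down_reporters = 2   # distinct reporters required before
#                                    # the monitors mark an OSD down

# If you must change them at runtime (again, not recommended to go lower):
ceph tell osd.* injectargs '--osd-heartbeat-grace 20'
```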