For assessing the criticality of the MGR beacon loop of doom outage, during my somewhat
desperate attempts to get this under control, I saw this here:
-------------------------------------
[root@gnosis ~]# ceph status
  cluster:
    id:     ---
    health: HEALTH_WARN
            no active mgr
            Reduced data availability: 2545 pgs inactive
            too few PGs per OSD (9 < min 30)

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: no daemons active
    mds: con-fs2-1/1/1 up {0=ceph-08=up:active}, 1 up:standby-replay
    osd: 288 osds: 268 up, 268 in

  data:
    pools:   10 pools, 2545 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             2545 unknown
-------------------------------------
Is an active MGR actually required for I/O, or is this output just a consequence of not
having an MGR to deliver the numbers? Well, it says HEALTH_WARN, so I really hope this is
just missing stats and not a complete service outage.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Frank Schilder <frans(a)dtu.dk>
Sent: 11 May 2020 19:43:24
To: Lenz Grimmer; ceph-users(a)ceph.io
Subject: [ceph-users] Re: Yet another meltdown starting
For everyone who does not want to read the details below: I now run with (dramatically?)
increased beacon grace periods for OSD (3600s) and MGR (90s) beacons, and I'm wondering
what the downside of this is and whether there are better tuning parameters for my issues.
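For reference, this is roughly how I applied the increased grace periods via the
centralized config. A sketch only: `mon_mgr_beacon_grace` and `mon_osd_report_timeout`
are the option names I believe govern these two beacon timeouts (defaults 30s and 900s,
respectively); the values are simply the ones from this post, not recommendations.

```shell
# Sketch: raise the beacon grace periods described above.
# mon_mgr_beacon_grace (default 30s) is how long the MONs wait for an
# MGR beacon before failing over to a standby; mon_osd_report_timeout
# (default 900s) plays the analogous role for OSD beacons.
ceph config set mon mon_mgr_beacon_grace 90
ceph config set mon mon_osd_report_timeout 3600

# Verify the values took effect:
ceph config get mon mon_mgr_beacon_grace
ceph config get mon mon_osd_report_timeout
```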
---
Hi Lenz,
I'm wondering about this as well. I was following this and other MGR threads about
dashboard crashes with great interest.
It is not exactly the same issue; we are still on mimic, and under normal circumstances
all queues are empty. However, I also have the feeling we are hitting a very specific
piece of code that has a comparably large execution time for little input. I was actually
surprised to read that Python code is involved in processing high-frequency events in a
non-distributed way.
It is said that the MGR is not a single point of failure, but this does not seem to be
true in the full sense. If some workload is not distributed but processed by only one
instance (in an active-passive way), then
- it does not scale,
- it becomes an effective single point of failure, since every instance suffers from the
same restriction, and
- fail-over will not help, as we see a healthy instance fail due to load.
This seems to be exactly what I'm observing. What started was a loop of doom:
- the active MGR fails to send its beacon in time,
- the MON marks the MGR out and elects the next available standby,
- the new MGR takes over, is hit by the same problem and does not send its beacon in time
either,
- the previous MGR reconnects in the meantime, but not before the next MGR is marked out.
After a while, the MONs were cyclically kicking out the active MGR. Each MGR stayed
active only for the beacon grace period and was then thrown out. Note that none of the
MGR processes crashed or died; everything was up and running. I observed a client-load
induced evict-reconnect cycle.
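The cycle above can be illustrated with a toy model (illustrative only, not Ceph code):
if every MGR instance suffers the same per-beacon processing stall, failing over to a
standby cannot break the loop once that stall exceeds the MON's beacon grace period.

```python
# Toy model of the failover loop: every mgr has the same processing
# stall, so promotion to the next standby never fixes anything while
# the stall exceeds the beacon grace period.

def failover_cycle(stall, beacon_grace, standbys=3, ticks=10):
    """Return the sequence of active mgr indices over `ticks` grace periods."""
    active = 0
    history = []
    for _ in range(ticks):
        history.append(active)
        if stall > beacon_grace:
            # Beacon arrives late -> MON marks the mgr out and
            # promotes the next available standby.
            active = (active + 1) % standbys
    return history

# With a 30s grace and a 45s stall, the active role cycles forever:
print(failover_cycle(45, 30))   # [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
# Raising the grace above the stall time stops the cycle:
print(failover_cycle(45, 90))   # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

This matches what increasing the grace period did for us: nothing about the load changed,
but the eviction loop stopped.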
What is suspicious is that I cannot see any significant increase of load or network
traffic on the MGR node during the critical time before the incident (there was, of
course, a huge increase of client traffic, up to the limit of the hardware). It looks
like something completely under the radar: a tiny bit of some very specific processing
has a huge impact under high client load, like the difference between compiled and
interpreted code in the issue you mentioned.
There also seem to be cumulative issues like memory leaks. I observe regular crashes of
the dashboard, and the dashboard also creates quite a large load for what little it does.
I will add this case to the list of references for a future thread, "Cluster outage due
to client IO", that I'm preparing. All the major issues I have observed lately relate
specifically to beacons sent to the MONs. I did not see any heartbeat failures. The
cluster was physically healthy (everything up and running) but not logically (the MONs
did not get the required information in time), and increasing the beacon grace periods
immediately restored logical cluster health. It looks like beacons are processed in a
different way than heartbeats and that there is a critical bottleneck somewhere.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Lenz Grimmer <lgrimmer(a)suse.com>
Sent: 11 May 2020 17:50:34
To: ceph-users(a)ceph.io
Subject: [ceph-users] Re: Yet another meltdown starting
Hi Frank,
On 5/11/20 3:03 PM, Frank Schilder wrote:
> OK, the command finally executed and it looks like the cluster is
> running stable for now. However, I'm afraid that 90s might not be
> sustainable.
> Questions: Can I leave the beacon_grace at 90s? Is there a better
> parameter to set? Why is the MGR getting overloaded on a rather small
> cluster with 160 OSDs? How does this scale?
I wonder if
https://tracker.ceph.com/issues/45439 might be related to
what you're observing here?
In this issue, Andras suggests: "Increasing mgr_stats_period to 15
seconds reduces the load and brings ceph-mgr back to responsive again."
Maybe that helps?
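That suggestion translates to something like the following (a sketch; as far as I know,
`mgr_stats_period` defaults to 5 seconds):

```shell
# Sketch of the suggested change from the tracker issue: raise the mgr
# stats period from its default (5 seconds) to 15 seconds to reduce
# the stats-processing load on the active mgr.
ceph config set mgr mgr_stats_period 15
```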
Lenz
--
SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io