Really? This is the first time I've read that here; AFAIK you can get a split brain like this.
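
For reference, a minimal sketch of the monitor quorum arithmetic involved here, assuming nothing beyond Ceph's standard strict-majority rule (floor(n/2) + 1):

def quorum_size(num_mons):
    # A monitor quorum requires a strict majority of all monitors in the map.
    return num_mons // 2 + 1

for n in (2, 3, 4, 5):
    print(f"{n} mons: quorum needs {quorum_size(n)}, tolerates {n - quorum_size(n)} down")

# 2 mons: quorum needs 2, tolerates 0 down
# 3 mons: quorum needs 2, tolerates 1 down
# 4 mons: quorum needs 3, tolerates 1 down
# 5 mons: quorum needs 3, tolerates 2 down

Under that rule a 4-mon cluster tolerates the same single failure as a 3-mon one.
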
-----Original Message-----
Sent: Thursday, October 29, 2020 12:16 AM
To: Eugen Block
Cc: ceph-users
Subject: [ceph-users] Re: frequent Monitor down
Eugen, I've got four physical servers and I've installed a mon on all of them. I've discussed it with Wido and a few other chaps from Ceph and there is no issue in doing it. The quorum issues would happen if you have 2 mons; if you've got more than 2 you should be fine.
Andrei
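
Either way, it may help to see which mon is actually dropping out in the MON_DOWN events quoted further down. A minimal sketch that tallies them from the cluster log, assuming the default /var/log/ceph/ceph.log path and the four mon names taken from that excerpt:

from collections import Counter
import re

# Mon names as they appear in the log excerpt below.
ALL_MONS = {"arh-ibstorage1-ib", "arh-ibstorage2-ib",
            "arh-ibstorage3-ib", "arh-ibstorage4-ib"}

down_counts = Counter()
with open("/var/log/ceph/ceph.log") as log:
    for line in log:
        if "Health check failed" not in line or "MON_DOWN" not in line:
            continue
        match = re.search(r"quorum ([\w,-]+)", line)
        if not match:
            continue
        in_quorum = set(match.group(1).split(","))
        for mon in ALL_MONS - in_quorum:  # whoever is missing from quorum was down
            down_counts[mon] += 1

for mon, count in down_counts.most_common():
    print(f"{mon}: flagged down {count} times")

That at least narrows down whose syslog and mon log to dig through for the actual cause.
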
----- Original Message -----
From: "Eugen Block" <eblock(a)nde.ag>
To: "Andrei Mikhailovsky" <andrei(a)arhont.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>
Sent: Wednesday, 28 October, 2020 20:19:15
Subject: Re: [ceph-users] Re: frequent Monitor down
Why do you have 4 MONs in the first place? That way a quorum is difficult to achieve; could it be related to that?
Quoting Andrei Mikhailovsky <andrei(a)arhont.com>:
> Yes, I have, Eugen, I see no obvious reason / error / etc. I see a
> lot of entries relating to Compressing as well as monitor going down.
>
> Andrei
>
>
>
> ----- Original Message -----
>> From: "Eugen Block" <eblock(a)nde.ag>
>> To: "ceph-users" <ceph-users(a)ceph.io>
>> Sent: Wednesday, 28 October, 2020 11:51:20
>> Subject: [ceph-users] Re: frequent Monitor down
>
>> Have you looked into syslog and mon logs?
>>
>>
>> Quoting Andrei Mikhailovsky <andrei(a)arhont.com>:
>>
>>> Hello everyone,
>>>
>>> I am seeing regular messages that the Monitors are going down and up:
>>>
>>> 2020-10-27T09:50:49.032431+0000 mon.arh-ibstorage2-ib (mon.1) 2248 : cluster [WRN] Health check failed: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib (MON_DOWN)
>>> 2020-10-27T09:50:49.123511+0000 mon.arh-ibstorage2-ib (mon.1) 2250 : cluster [WRN] overall HEALTH_WARN 23 OSD(s) experiencing BlueFS spillover; 3 large omap objects; 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib; noout flag(s) set; 43 pgs not deep-scrubbed in time; 12 pgs not scrubbed in time
>>> 2020-10-27T09:50:52.735457+0000 mon.arh-ibstorage1-ib (mon.0) 31287 : cluster [INF] Health check cleared: MON_DOWN (was: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib)
>>> 2020-10-27T12:35:20.556458+0000 mon.arh-ibstorage2-ib (mon.1) 2260 : cluster [WRN] Health check failed: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib (MON_DOWN)
>>> 2020-10-27T12:35:20.643282+0000 mon.arh-ibstorage2-ib (mon.1) 2262 : cluster [WRN] overall HEALTH_WARN 23 OSD(s) experiencing BlueFS spillover; 3 large omap objects; 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib; noout flag(s) set; 47 pgs not deep-scrubbed in time; 14 pgs not scrubbed in time
>>>
>>>
>>> This happens on a daily basis several times a day.
>>>
>>> Could you please let me know how to fix this annoying problem?
>>>
>>> I am running ceph version 15.2.4
>>> (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable) on
>>> Ubuntu 18.04 LTS with latest updates.
>>>
>>> Thanks
>>>
>>> Andrei
>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io