Really? This is the first time I've read that here; AFAIK you can get a split brain like this.
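
For reference, a minimal sketch of the monitor quorum arithmetic involved here, assuming nothing beyond Ceph's standard strict-majority rule (floor(n/2) + 1):

def quorum_size(num_mons):
    # A monitor quorum requires a strict majority of all monitors in the map.
    return num_mons // 2 + 1

for n in (2, 3, 4, 5):
    print(f"{n} mons: quorum needs {quorum_size(n)}, tolerates {n - quorum_size(n)} down")

# 2 mons: quorum needs 2, tolerates 0 down
# 3 mons: quorum needs 2, tolerates 1 down
# 4 mons: quorum needs 3, tolerates 1 down
# 5 mons: quorum needs 3, tolerates 2 down

Under that rule a 4-mon cluster tolerates the same single failure as a 3-mon one.
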
-----Original Message-----
Sent: Thursday, October 29, 2020 12:16 AM
To: Eugen Block
Cc: ceph-users
Subject: [ceph-users] Re: frequent Monitor down
Eugen, I've got four physical servers and I've installed a mon on all of them. I've discussed it with Wido and a few other chaps from Ceph and there is no issue in doing it. The quorum issues would happen if you have 2 mons; if you've got more than 2 you should be fine.
Andrei
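
Either way, it may help to see which mon is actually dropping out in the MON_DOWN events quoted further down. A minimal sketch that tallies them from the cluster log, assuming the default /var/log/ceph/ceph.log path and the four mon names taken from that excerpt:

from collections import Counter
import re

# Mon names as they appear in the log excerpt below.
ALL_MONS = {"arh-ibstorage1-ib", "arh-ibstorage2-ib",
            "arh-ibstorage3-ib", "arh-ibstorage4-ib"}

down_counts = Counter()
with open("/var/log/ceph/ceph.log") as log:
    for line in log:
        if "Health check failed" not in line or "MON_DOWN" not in line:
            continue
        match = re.search(r"quorum ([\w,-]+)", line)
        if not match:
            continue
        in_quorum = set(match.group(1).split(","))
        for mon in ALL_MONS - in_quorum:  # whoever is missing from quorum was down
            down_counts[mon] += 1

for mon, count in down_counts.most_common():
    print(f"{mon}: flagged down {count} times")

That at least narrows down whose syslog and mon log to dig through for the actual cause.
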
----- Original Message -----
From: "Eugen Block" <eblock(a)nde.ag>
To: "Andrei Mikhailovsky" <andrei(a)arhont.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>
Sent: Wednesday, 28 October, 2020 20:19:15
Subject: Re: [ceph-users] Re: frequent Monitor down
Why do you have 4 MONs in the first place? That way a quorum is difficult to achieve; could it be related to that?
Quoting Andrei Mikhailovsky <andrei(a)arhont.com>:
> Yes, I have, Eugen, I see no obvious reason / error / etc. I see a
> lot of entries relating to Compressing as well as monitor going down.
>
> Andrei
>
>
>
> ----- Original Message -----
>> From: "Eugen Block" <eblock(a)nde.ag>
>> To: "ceph-users" <ceph-users(a)ceph.io>
>> Sent: Wednesday, 28 October, 2020 11:51:20
>> Subject: [ceph-users] Re: frequent Monitor down
>
>> Have you looked into syslog and mon logs?
>>
>>
>> Quoting Andrei Mikhailovsky <andrei(a)arhont.com>:
>>
>>> Hello everyone,
>>>
>>> I am seeing regular messages that the Monitors are going down and up:
>>>
>>> 2020-10-27T09:50:49.032431+0000 mon.arh-ibstorage2-ib (mon.1) 2248 : cluster [WRN] Health check failed: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib (MON_DOWN)
>>> 2020-10-27T09:50:49.123511+0000 mon.arh-ibstorage2-ib (mon.1) 2250 : cluster [WRN] overall HEALTH_WARN 23 OSD(s) experiencing BlueFS spillover; 3 large omap objects; 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib; noout flag(s) set; 43 pgs not deep-scrubbed in time; 12 pgs not scrubbed in time
>>> 2020-10-27T09:50:52.735457+0000 mon.arh-ibstorage1-ib (mon.0) 31287 : cluster [INF] Health check cleared: MON_DOWN (was: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib)
>>> 2020-10-27T12:35:20.556458+0000 mon.arh-ibstorage2-ib (mon.1) 2260 : cluster [WRN] Health check failed: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib (MON_DOWN)
>>> 2020-10-27T12:35:20.643282+0000 mon.arh-ibstorage2-ib (mon.1) 2262 : cluster [WRN] overall HEALTH_WARN 23 OSD(s) experiencing BlueFS spillover; 3 large omap objects; 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib; noout flag(s) set; 47 pgs not deep-scrubbed in time; 14 pgs not scrubbed in time
>>>
>>>
>>> This happens on a daily basis several times a day.
>>>
>>> Could you please let me know how to fix this annoying problem?
>>>
>>> I am running ceph version 15.2.4
>>> (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable) on
>>> Ubuntu 18.04 LTS with latest updates.
>>>
>>> Thanks
>>>
>>> Andrei
>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io