So if you are
doing maintenance on a mon host in a 5 mon cluster you will still have 3 in the quorum.
Exactly. I was in exactly this situation, doing maintenance on 1 and screwing up number
2. Service outage.
Been there. I had a cluster that nominally had 5 mons. Two suffered hardware issues, so
it ran on 3 for a distressingly extended period, a function of sparing policies, a
dilatory CSO, and a bizarre chassis selection that was not used for any other purpose.
The cluster ran fine, though there was a client side bug (Kefu might remember this).
The moral of the story is not that 5 doesn’t protect you, but rather that with 3 it could
have been much, much worse. Especially with overseas unstaffed DCs. Mons are lightweight
and inexpensive compared to compute or OSD nodes, the burgeoning constellation of ceph-mgr
plugins notwithstanding.
Double failures happen. They happen to OSD nodes, which is one reason why 2R is a bad
idea. They happen to mons too.
Back …. I think it was the 2015 OpenStack Summit in Vancouver, there was a Ceph operators
BoF of sorts where the question was raised whether anyone found going to 7 to be advantageous.
The consensus seemed to be that any RAS benefits were down the tail of diminishing
returns, but that one would be climbing the traffic curve of inter-mon communication.
imho, ymmv, aad
I will update to 5 as soon as I can.
Secondly: I actually do not believe a MON backup has any meaning. It will be behind the
current term the moment the down MON rejoins quorum. If you lose all MONs, you would try
to bring the cluster up on an outdated backup. Did you ever try this out? I doubt it
works. I would expect this MON to wait for an up-to-date MON to show up as a sync source.
I suspect it will neither form nor join quorum. If you force it into quorum you will
likely lose data, if not everything.
You don't need a backup of the MON store. It can be rebuilt from the OSDs in the
cluster as a last resort; the procedure has been added to the Ceph documentation.
Otherwise, just make sure you always have a quorum up. If you really need to refresh a MON
from scratch, shut it down, wipe the store and bring it up again.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Freddy Andersen <freddy(a)cfandersen.com>
Sent: 12 February 2021 16:47:08
To: huxiaoyu(a)horebdata.cn; Marc; Michal Strnad; ceph-users
Subject: [ceph-users] Re: Backups of monitor
I would say everyone recommends at least 3 monitors, and since the count should be odd
(1, 3, 5 or 7), I always read that as 5 being the best number (if you have 5 servers in
your cluster). The other reason is high availability: the MONs use Paxos for the quorum,
and since I like to have 3 in the quorum, you need 5 to be able to do maintenance
(2 out of 3, 3 out of 5…).
So if you are doing maintenance on a mon host in a 5 mon cluster you will still have 3 in
the quorum.
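
The arithmetic behind "2 out of 3, 3 out of 5" can be sketched in a few lines of Python. This is only an illustration of majority-quorum math, not anything Ceph ships; the function names (quorum_size, tolerated_failures, has_quorum) are made up for this example.

```python
# Illustrative sketch of majority-quorum arithmetic for monitors.
# All function names here are hypothetical; none of this is a Ceph API.

def quorum_size(n_mons: int) -> int:
    """Smallest strict majority of n_mons monitors."""
    return n_mons // 2 + 1


def tolerated_failures(n_mons: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return n_mons - quorum_size(n_mons)


def has_quorum(n_mons: int, n_down: int) -> bool:
    """True if the monitors that are still up form a strict majority."""
    return (n_mons - n_down) >= quorum_size(n_mons)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(f"{n} mons: quorum needs {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} down")
    # One mon in maintenance plus one unexpected failure:
    print("3 mons, 2 down:", has_quorum(3, 2))
    print("5 mons, 2 down:", has_quorum(5, 2))
```

With 3 mons, one host in maintenance plus one unexpected failure leaves no majority; with 5 mons the same double failure still leaves 3 up, which is a majority.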
From: huxiaoyu(a)horebdata.cn <huxiaoyu(a)horebdata.cn>
Date: Friday, February 12, 2021 at 8:42 AM
To: Freddy Andersen <freddy(a)cfandersen.com>, Marc <Marc(a)f1-outsourcing.eu>,
Michal Strnad <michal.strnad(a)cesnet.cz>, ceph-users <ceph-users(a)ceph.io>
Subject: [ceph-users] Re: Backups of monitor
Why are 5 MONs required instead of 3?
huxiaoyu(a)horebdata.cn
From: Freddy Andersen
Date: 2021-02-12 16:05
To: huxiaoyu(a)horebdata.cn; Marc; Michal Strnad; ceph-users
Subject: Re: [ceph-users] Re: Backups of monitor
I would say production should have 5 MON servers
From: huxiaoyu(a)horebdata.cn <huxiaoyu(a)horebdata.cn>
Date: Friday, February 12, 2021 at 7:59 AM
To: Marc <Marc(a)f1-outsourcing.eu>, Michal Strnad <michal.strnad(a)cesnet.cz>,
ceph-users <ceph-users(a)ceph.io>
Subject: [ceph-users] Re: Backups of monitor
Normally any production Ceph cluster will have at least 3 MONs; does it really need a
backup of the MONs?
samuel
huxiaoyu(a)horebdata.cn
From: Marc
Date: 2021-02-12 14:36
To: Michal Strnad; ceph-users(a)ceph.io
Subject: [ceph-users] Re: Backups of monitor
So why not create an extra monitor, start it only when you want to make a backup, wait
until it is up to date, and then stop it to back it up?
-----Original Message-----
From: Michal Strnad <michal.strnad(a)cesnet.cz>
Sent: 11 February 2021 21:15
To: ceph-users(a)ceph.io
Subject: [ceph-users] Backups of monitor
Hi all,
We are looking for a proper solution for backing up the monitors (all the maps
that they hold). On the internet we found advice that we have to stop
one of the monitors, back it up (dump), and start the daemon again. But this is
not the right approach, due to the risk of losing quorum and the need for
synchronization after the monitor is back online.
Our goal is to have at least some (recent) metadata of the objects in the
cluster as a last resort, for when all monitors are in very bad
shape/state and we could not start any of them. Maybe there is another
approach but we are not aware of it.
We are running the latest Nautilus and three monitors on every cluster.
Ad. We don't want to use more monitors than three.
Thank you
Cheers
Michal
--
Michal Strnad
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io