On Wed, Mar 11, 2020 at 10:41 PM Robert LeBlanc <robert(a)leblancnet.us> wrote:
This is the second time this has happened in a couple of weeks. The MDS locks
up and the standby can't take over, so the Monitors blacklist them. I try
to unblacklist them, but they still say this in the logs:
mds.0.1184394 waiting for osdmap 234947 (which blacklists prior instance)
Do not *ever* unblacklist an MDS. Restart the daemon.
Looking at a pg dump, it looks like the epoch is
past that.
$ ceph pg map 3.756
osdmap e234953 pg 3.756 (3.756) -> up [113,180,115] acting [113,180,115]
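The comparison being made here can be sketched in Python: pull the epoch the MDS is waiting for out of its log line, pull the current epoch out of the `ceph pg map` output, and check whether the second has reached the first. The function names and regular expressions are illustrative assumptions, not part of Ceph; the two input strings are copied from this message. This only compares the epoch the cluster reports with the one in the log, and may not reflect which osdmap epoch the MDS daemon itself has received.

```python
import re


def waiting_epoch(log_line: str) -> int:
    """Return the osdmap epoch an MDS log line says it is waiting for."""
    match = re.search(r"waiting for osdmap (\d+)", log_line)
    if match is None:
        raise ValueError("no 'waiting for osdmap' epoch found in log line")
    return int(match.group(1))


def current_epoch(pg_map_output: str) -> int:
    """Return the osdmap epoch printed at the start of `ceph pg map` output."""
    match = re.search(r"osdmap e(\d+)", pg_map_output)
    if match is None:
        raise ValueError("no osdmap epoch found in pg map output")
    return int(match.group(1))


def epoch_reached(log_line: str, pg_map_output: str) -> bool:
    """True if the cluster's reported epoch is at or past the awaited one."""
    return current_epoch(pg_map_output) >= waiting_epoch(log_line)


# Strings copied from the message above.
LOG_LINE = "mds.0.1184394 waiting for osdmap 234947 (which blacklists prior instance)"
PG_MAP = "osdmap e234953 pg 3.756 (3.756) -> up [113,180,115] acting [113,180,115]"

if __name__ == "__main__":
    print("awaited epoch:", waiting_epoch(LOG_LINE))
    print("current epoch:", current_epoch(PG_MAP))
    print("reached:", epoch_reached(LOG_LINE, PG_MAP))
```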
Last time, it seemed to just recover after about an hour all by itself.
Any way to speed this up?
We need more cluster information, error messages, client
versions/types, etc. to help.
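As a rough sketch of how some of that information might be collected in one pass, the Python script below runs a handful of standard `ceph` status commands and prints their output. The choice of commands and the helper names (`run_command`, `collect`) are assumptions made for illustration, not a required or complete list; it also assumes the `ceph` CLI is on the PATH with a usable keyring. MDS log excerpts and client details would still have to be gathered separately.

```python
import subprocess

# Assumed selection of read-only commands; adjust to what is relevant.
DIAGNOSTIC_COMMANDS = [
    ["ceph", "status"],
    ["ceph", "health", "detail"],
    ["ceph", "versions"],
    ["ceph", "fs", "status"],
    ["ceph", "features"],
]


def run_command(cmd):
    """Run one command and return its combined output, or a failure note."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed to run: {exc}"
    return result.stdout + result.stderr


def collect(commands=DIAGNOSTIC_COMMANDS, runner=run_command):
    """Map each command line (as a string) to the text its runner returned."""
    return {" ".join(cmd): runner(cmd) for cmd in commands}


if __name__ == "__main__":
    for name, output in collect().items():
        print(f"===== {name} =====")
        print(output)
```

Redirecting the script's output to a file gives a single text attachment to include with a follow-up message.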
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D