On Fri, Nov 15, 2019 at 4:45 PM Joao Eduardo Luis <joao@suse.de> wrote:
On 19/11/14 11:04AM, Gregory Farnum wrote:
On Thu, Nov 14, 2019 at 8:14 AM Dan van der Ster <dan@vanderster.com> wrote:
Hi Joao,
I might have found the reason why several of our clusters (and maybe
Bryan's too) get stuck and stop trimming osdmaps.
It seems that when an osd fails, the min_last_epoch_clean gets stuck
forever (even long after HEALTH_OK), until the ceph-mons are
restarted.
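To illustrate what I think is happening (hypothetical sketch, names made
up): the mon can only trim osdmaps older than the minimum
last_epoch_clean across every OSD it still holds a report for, so a dead
OSD's stale report pins that minimum:

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <limits>
    #include <map>

    using epoch_t = uint32_t;

    // Hypothetical sketch: the min over all stored reports is the trim
    // floor; an OSD that died stops reporting, but its old entry remains.
    epoch_t min_last_epoch_clean(const std::map<int, epoch_t>& lec_by_osd) {
      epoch_t floor = std::numeric_limits<epoch_t>::max();
      for (const auto& [osd, lec] : lec_by_osd)
        floor = std::min(floor, lec);  // a stale entry pins this forever
      return floor;
    }

    int main() {
      // osd.2 failed at epoch 1200 and never reports again; even long
      // after HEALTH_OK the mon cannot trim past 1200. Restarting the
      // mons rebuilds the table from the OSDs that actually report.
      std::map<int, epoch_t> lec = {{0, 5000}, {1, 5001}, {2, 1200}};
      std::cout << min_last_epoch_clean(lec) << "\n";  // prints 1200
    }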
I've updated the ticket:
https://tracker.ceph.com/issues/41154
Wrong ticket -- I think you meant
https://tracker.ceph.com/issues/37875#note-7
I saw this behavior a long, long time ago, but stopped being able to
reproduce it consistently enough to ensure the patch was working properly.
I think I have a patch here:
https://github.com/ceph/ceph/pull/19076/commits
If you are feeling adventurous and want to give it a try, let me know. I'll
be happy to forward-port it to whatever you are running.
Thanks Joao, this patch is what I had in mind.
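If I understand the commits right (my hypothetical reading of the idea,
not the actual patch), it's roughly to skip OSDs that are not up when
computing the floor, so a dead OSD can no longer pin trimming:

    #include <algorithm>
    #include <cstdint>
    #include <limits>
    #include <map>
    #include <set>

    using epoch_t = uint32_t;

    // Hypothetical sketch of the idea, invented signature: only up OSDs
    // contribute to the min last_epoch_clean used as the trim floor.
    epoch_t get_min_last_epoch_clean(const std::map<int, epoch_t>& lec_by_osd,
                                     const std::set<int>& up_osds) {
      epoch_t floor = std::numeric_limits<epoch_t>::max();
      for (const auto& [osd, lec] : lec_by_osd) {
        if (up_osds.count(osd) == 0)
          continue;  // down OSD: ignore its stale report
        floor = std::min(floor, lec);
      }
      return floor;
    }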
I'm trying to evaluate how adventurous this would be -- is there any
risk that, if a huge number of OSDs are down all at once (but
transiently), it would trigger the mon to trim too many maps?
I would expect that the remaining up OSDs would still report a safe (low) osd_epoch?
And anyway, I guess that your proposed get_min_last_epoch_clean patch
is equivalent to what we have today if we restart the ceph-mon leader
while an OSD is down.
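To spell out why I'd expect that to stay safe (hypothetical sketch,
invented helper -- IIRC the mon also refuses to trim below a window of
mon_min_osdmap_epochs recent maps, 500 by default):

    #include <algorithm>
    #include <cstdint>

    using epoch_t = uint32_t;

    // Hypothetical sketch: the trim target is bounded both by the min
    // last_epoch_clean of the surviving up OSDs and by a fixed window of
    // recent maps, so a transient mass failure can't trim "too far".
    epoch_t get_trim_target(epoch_t min_lec_of_up_osds,
                            epoch_t newest_epoch,
                            epoch_t mon_min_osdmap_epochs /* e.g. 500 */) {
      epoch_t window_floor = newest_epoch > mon_min_osdmap_epochs
                                 ? newest_epoch - mon_min_osdmap_epochs
                                 : 0;
      return std::min(min_lec_of_up_osds, window_floor);  // trim up to this
    }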
-- Dan