, particularly #note6
You can see what the mon thinks the valid range of osdmaps is:
# ceph report | jq .osdmap_first_committed
113300
# ceph report | jq .osdmap_last_committed
113938
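The difference between the two is how many epochs the mon is still holding
on to; with jq you can also compute it directly (just a convenience, same
information as above):
# ceph report | jq '.osdmap_last_committed - .osdmap_first_committed'
638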
Then the workaround to start trimming is to restart the leader mon.
This shrinks the range on the mon, which then starts telling the OSDs
to trim that range.
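If you're not sure which mon is the leader, something like this should show
it (the mon name below is just a placeholder -- use whatever id/unit name
your deployment has):
# ceph quorum_status | jq -r .quorum_leader_name
mon1
# systemctl restart ceph-mon@mon1    # run on that mon's host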
Note that the OSDs will only trim 30 osdmaps for each new osdmap
generated -- so if you have a lot of osdmaps to trim, you need to
generate more osdmaps (one way to do that is sketched below).
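Any change that produces a new osdmap epoch counts; assuming briefly
toggling the noout flag is acceptable on your cluster, a loop like this is
one way to do it (each set/unset should create a new epoch, so 20 iterations
gives roughly 40 new maps, i.e. up to 40 x 30 = 1200 maps trimmed):
# for i in $(seq 1 20); do ceph osd set noout; ceph osd unset noout; sleep 2; done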
-- dan
On Thu, Mar 12, 2020 at 11:02 AM Nikola Ciprich
<nikola.ciprich(a)linuxbox.cz> wrote:
OK,
so I can confirm that, at least in my case, the problem is caused
by old osd maps not being pruned for some reason, and thus not fitting
into the cache. When I increased the osd map cache size to 5000, the problem went away.
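(In case anyone wants to try the same: roughly speaking that means setting
osd_map_cache_size = 5000 in the [osd] section of ceph.conf and restarting
the OSDs, or trying it at runtime with
# ceph tell osd.* injectargs '--osd_map_cache_size=5000'
though injectargs may report that the change needs an OSD restart to take effect.)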
The question is why they're not being pruned, even though the cluster is in
a healthy state. In the meantime you can try checking:
ceph daemon osd.X status
to see how many maps your OSDs are storing, and
ceph daemon osd.X perf dump | grep osd_map_cache_miss
to see if you're experiencing a similar problem..
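For illustration only (the numbers are made up), the status output contains
oldest_map/newest_map, and the difference is roughly how many maps that OSD
is holding:
# ceph daemon osd.0 status
{
    "cluster_fsid": "...",
    "osd_fsid": "...",
    "whoami": 0,
    "state": "active",
    "oldest_map": 113300,
    "newest_map": 113938,
    "num_pgs": 128
}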
so I'm going to debug further..
BR
nik
On Thu, Mar 12, 2020 at 09:16:58AM +0100, Nikola Ciprich wrote:
> Hi Paul and others,
>
> while digging deeper, I noticed that when the cluster gets into this
> state, osd_map_cache_miss on OSDs starts growing rapidly.. even when
> I increased osd map cache size to 500 (which was the default at least
> for luminous) it behaves the same..
>
> I think this could be related..
>
> I'll try playing more with cache settings..
>
> BR
>
> nik
>
>
>
> On Wed, Mar 11, 2020 at 03:40:04PM +0100, Paul Emmerich wrote:
> > Encountered this one again today, I've updated the issue with new
> > information: https://tracker.ceph.com/issues/44184
> >
> >
> > Paul
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> >
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> > On Sat, Feb 29, 2020 at 10:21 PM Nikola Ciprich
> > <nikola.ciprich(a)linuxbox.cz> wrote:
> > >
> > > Hi,
> > >
> > > I just wanted to report we've just hit a very similar problem.. on mimic
> > > (13.2.6). Any manipulation with an OSD (ie a restart) causes a lot of slow
> > > ops caused by waiting for a new map. It seems those are slowed by SATA
> > > OSDs, which keep being 100% busy reading for a long time until all ops are gone,
> > > blocking ops on unrelated NVMe pools - the SATA pools are completely unused now.
> > >
> > > Is it possible that those maps are being requested from slow SATA OSDs,
> > > and that's why it takes such a long time? Why could it take so long?
> > > The cluster is very small, with a very light load..
> > >
> > > BR
> > >
> > > nik
> > >
> > >
> > >
> > > On Wed, Feb 19, 2020 at 10:03:35AM +0100, Wido den Hollander wrote:
> > > >
> > > >
> > > > On 2/19/20 9:34 AM, Paul Emmerich wrote:
> > > > > On Wed, Feb 19, 2020 at 7:26 AM Wido den Hollander
> > > > > <wido(a)42on.com> wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 2/18/20 6:54 PM, Paul Emmerich wrote:
> > > > >>> I've also seen this problem on Nautilus with no obvious reason
> > > > >>> for the slowness once.
> > > > >>
> > > > >> Did this resolve itself? Or did you remove the pool?
> > > > >
> > > > > I've seen this twice on the same cluster, it fixed itself the first
> > > > > time (maybe with some OSD restarts?) and the other time I removed the
> > > > > pool after a few minutes because the OSDs were running into heartbeat
> > > > > timeouts. There unfortunately seems to be no way to reproduce this :(
> > > > >
> > > >
> > > > Yes, that's the problem. I've been trying to reproduce it, but I can't.
> > > > It works on all my Nautilus systems except for this one.
> > > >
> > > > As you saw it, Bryan saw it, I expect others to encounter this at some
> > > > point as well.
> > > >
> > > > I don't have any extensive logging as this cluster is in production and
> > > > I can't simply crank up the logging and try again.
> > > >
> > > > > In this case it wasn't a new pool that caused problems but a very old one.
> > > > >
> > > > >
> > > > > Paul
> > > > >
> > > > >>
> > > > >>> In my case it was a rather old cluster that was upgraded all the way
> > > > >>> from firefly
> > > > >>>
> > > > >>>
> > > > >>
> > > > >> This cluster was also installed with Firefly, back in 2015, so a while ago.
> > > > >>
> > > > >> Wido
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis(a)linuxbox.cz
-------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io