Ah, I see. Yes, we are already running version 18.2.1
on the server side (we just installed this cluster a few weeks ago from scratch). So I
guess if the fix has already been backported to that version, then we still have a
problem.
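(For what it's worth, I believe something like the following should confirm exactly
which release the MDS daemons are actually running; the second command assumes a
cephadm-managed cluster:)

# ceph versions
# ceph orch ps --daemon-type mds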
Does that mean it could be the locker order bug (https://tracker.ceph.com/issues/62123) as Xiubo suggested?
Then it's possibly the lock order issue. Need to check it later.
Thanks
- Xiubo
Thanks again,
Erich
> On Apr 7, 2024, at 9:00 PM, Alexander E. Patrakov <patrakov(a)gmail.com> wrote:
>
> Hi Erich,
>
>> On Mon, Apr 8, 2024 at 11:51 AM Erich Weiler <weiler(a)soe.ucsc.edu> wrote:
>>
>> Hi Xiubo,
>>
>>> Thanks for your logs, and it should be the same issue with
>>> https://tracker.ceph.com/issues/62052, could you try to test with this
>>> fix again?
>> This sounds good - but I'm not clear on what I should do? I see a patch
>> in that tracker page, is that what you are referring to? If so, how
>> would I apply such a patch? Or is there simply a binary update I can
>> apply somehow to the MDS server software?
> The backport of this patch (https://github.com/ceph/ceph/pull/53241)
> was merged on October 18, 2023, and Ceph 18.2.1 was released on
> December 18, 2023. Therefore, if you are running Ceph 18.2.1 on the
> server side, you already have the fix. If you are already running
> version 18.2.1 or 18.2.2 (to which you should upgrade anyway), please
> complain, as the purported fix is then ineffective.
>
>> Thanks for helping!
>>
>> -erich
>>
>>> Please let me know if you still see this bug; in that case it should be the
>>> locker order bug, as in https://tracker.ceph.com/issues/62123.
>>>
>>> Thanks
>>>
>>> - Xiubo
>>>
>>>
>>> On 3/28/24 04:03, Erich Weiler wrote:
>>>> Hi All,
>>>>
>>>> I've been battling this for a while and I'm not sure where to go from
>>>> here. I have a Ceph health warning as such:
>>>>
>>>> # ceph -s
>>>>   cluster:
>>>>     id:     58bde08a-d7ed-11ee-9098-506b4b4da440
>>>>     health: HEALTH_WARN
>>>>             1 MDSs report slow requests
>>>>             1 MDSs behind on trimming
>>>>
>>>>   services:
>>>>     mon: 5 daemons, quorum pr-md-01,pr-md-02,pr-store-01,pr-store-02,pr-md-03 (age 5d)
>>>>     mgr: pr-md-01.jemmdf(active, since 3w), standbys: pr-md-02.emffhz
>>>>     mds: 1/1 daemons up, 2 standby
>>>>     osd: 46 osds: 46 up (since 9h), 46 in (since 2w)
>>>>
>>>>   data:
>>>>     volumes: 1/1 healthy
>>>>     pools:   4 pools, 1313 pgs
>>>>     objects: 260.72M objects, 466 TiB
>>>>     usage:   704 TiB used, 424 TiB / 1.1 PiB avail
>>>>     pgs:     1306 active+clean
>>>>              4    active+clean+scrubbing+deep
>>>>              3    active+clean+scrubbing
>>>>
>>>>   io:
>>>>     client: 123 MiB/s rd, 75 MiB/s wr, 109 op/s rd, 1.40k op/s wr
>>>>
>>>> And the specifics are:
>>>>
>>>> # ceph health detail
>>>> HEALTH_WARN 1 MDSs report slow requests; 1 MDSs behind on trimming
>>>> [WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
>>>>     mds.slugfs.pr-md-01.xdtppo(mds.0): 99 slow requests are blocked > 30 secs
>>>> [WRN] MDS_TRIM: 1 MDSs behind on trimming
>>>>     mds.slugfs.pr-md-01.xdtppo(mds.0): Behind on trimming (13884/250) max_segments: 250, num_segments: 13884
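>>>>
>>>> (In case it helps narrow things down, I believe the stuck requests can be
>>>> dumped on the node running the active MDS, inside the daemon's container if
>>>> this is a containerized deployment, with something like:)
>>>>
>>>> # ceph daemon mds.slugfs.pr-md-01.xdtppo dump_ops_in_flight
>>>> # ceph daemon mds.slugfs.pr-md-01.xdtppo dump_blocked_ops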
>>>>
>>>> That "num_segments" number slowly keeps increasing. I suspect
I just
>>>> need to tell the MDS servers to trim faster but after hours of
>>>> googling around I just can't figure out the best way to do it. The
>>>> best I could come up with was to decrease "mds_cache_trim_decay_rate"
>>>> from 1.0 to .8 (to start), based on this page:
>>>>
>>>> https://www.suse.com/support/kb/doc/?id=000019740
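>>>>
>>>> (For reference, I believe the runtime change amounts to something like the
>>>> following; the 0.8 value is just a first guess:)
>>>>
>>>> # ceph config get mds mds_cache_trim_decay_rate
>>>> # ceph config set mds mds_cache_trim_decay_rate 0.8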
>>>>
>>>> But it doesn't seem to help, maybe I should decrease it further? I am
>>>> guessing this must be a common issue...? I am running Reef on the MDS
>>>> servers, but most clients are on Quincy.
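>>>>
>>>> (If the client mix matters, I believe the connected clients and the
>>>> version/metadata each one reports can be listed with something like:)
>>>>
>>>> # ceph tell mds.slugfs.pr-md-01.xdtppo session ls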
>>>>
>>>> Thanks for any advice!
>>>>
>>>> cheers,
>>>> erich
>
>
> --
> Alexander E. Patrakov