Never mind: when I enable it on a busier directory, I do see new
ephemeral pins popping up, just not on the directories I originally set
it on. Let's see how that holds up.
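(For reference, this is roughly how I'm checking where the pins end up -- the
jq filter is only an illustration, and the field names may differ slightly
between releases:)

  # show path, authoritative rank and export pin for each subtree known to rank 0
  ceph tell mds.0 get subtrees | jq '.[] | {path: .dir.path, auth: .auth_first, pin: .export_pin}'
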
On 07/12/2020 13:04, Janek Bevendorff wrote:
> Thanks. I tried playing around a bit with
> mds_export_ephemeral_distributed just now, because it's pretty much
> the same thing that your script does manually. Unfortunately, it seems
> to have no effect.
>
> I pinned all top-level directories to mds.0 and then enabled
> ceph.dir.pin.distributed for a few sub trees. Despite
> mds_export_ephemeral_distributed being set to true, all work is done
> by mds.0 now and I also don't see any additional pins in ceph tell
> mds.\* get subtrees.
>
> Any ideas why that might be?
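> (For completeness, the steps were essentially the following -- the paths are
> made up, the xattr names and the config option are the documented ones:)
>
>   ceph config set mds mds_export_ephemeral_distributed true
>   setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/topdir                  # pin parent to mds.0
>   setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/topdir/subtree
>   getfattr -n ceph.dir.pin.distributed /mnt/cephfs/topdir/subtree   # readback may not work on every release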
>
>
> On 07/12/2020 10:49, Dan van der Ster wrote:
>> On Mon, Dec 7, 2020 at 10:39 AM Janek Bevendorff
>> <janek.bevendorff(a)uni-weimar.de> wrote:
>>>
>>>> What exactly do you set to 64k?
>>>> We used to set mds_max_caps_per_client to 50000, but once we started
>>>> using the tuned caps recall config, we reverted that back to the
>>>> default 1M without issue.
>>> mds_max_caps_per_client. As I mentioned, some clients hit this limit
>>> regularly and they aren't entirely idle. I will keep tuning the recall
>>> settings, though.
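>>> (Concretely, something like the following -- 65536 is just the 64k figure
>>> from above, and the recall options are examples of the knobs I'm tuning;
>>> treat the numbers as placeholders rather than recommendations:)
>>>
>>>   ceph config set mds mds_max_caps_per_client 65536
>>>   ceph config set mds mds_recall_max_caps 30000
>>>   ceph config set mds mds_recall_max_decay_rate 1.5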
>>>
>>>> This 15k caps client I mentioned is not related to the max caps per
>>>> client config. In recent nautilus, the MDS will proactively recall
>>>> caps from idle clients -- so a client with even just a few caps like
>>>> this can provoke the caps recall warnings (if it is buggy, like in
>>>> this case). The client doesn't cause any real problems, just the
>>>> annoying warnings.
>>> We only see the warnings during normal operation. I remember having
>>> massive issues with early Nautilus releases, but thanks to more
>>> aggressive recall behaviour in newer releases, that is fixed. Back then
>>> it was virtually impossible to keep the MDS within the bounds of its
>>> memory limit. Nowadays, the warnings only appear when the MDS is really
>>> stressed. In that situation, overall FS performance is already massively
>>> degraded and the MDSs are likely to fail and end up in the rejoin loop.
>>>
>>>> Multi-active + pinning definitely increases the overall MD throughput
>>>> (once you can get the relevant inodes cached), because as you know the
>>>> MDS is single threaded and CPU bound at the limit.
>>>> We could get something like 4-5k handle_client_requests out of a
>>>> single MDS, and that really does scale horizontally as you add MDSs
>>>> (and pin).
>>> Okay, I will definitely re-evaluate options for pinning individual
>>> directories, perhaps a small script can do it.
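>>> (Just to sketch the idea of such a small script -- the mount point and the
>>> number of active ranks are assumptions; it simply round-robins the
>>> ceph.dir.pin xattr over the immediate subdirectories:)
>>>
>>>   #!/bin/bash
>>>   # distribute top-level directories across 4 active MDS ranks
>>>   ranks=4
>>>   i=0
>>>   for dir in /mnt/cephfs/*/; do
>>>       setfattr -n ceph.dir.pin -v $((i % ranks)) "$dir"
>>>       i=$((i + 1))
>>>   done
>>>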
>> There is a new ephemeral pinning option in the latest releases, but we
>> haven't tried it yet.
>> Here's our script -- it assumes the parent dir is pinned to zero or
>> that the balancer is disabled:
>>
>>
>> https://github.com/cernceph/ceph-scripts/blob/master/tools/cephfs/cephfs-ba…
>>
>>
>> Too many pins can cause problems -- we have something like 700 pins at
>> the moment and it's fine, though.
>>
>> Cheers, Dan
>>
>>
>>