+ other ceph-users
On Wed, Jul 24, 2019 at 10:26 AM Janek Bevendorff
<janek.bevendorff(a)uni-weimar.de> wrote:
>> what's the ceph.com mailing list?
>
> I wondered whether this list is dead but it's the list announced on
> the official ceph.com homepage, isn't it?
>
> There are two mailing lists announced on the website. If you go to
> https://ceph.com/resources/ you will find the
> subscribe/unsubscribe/archive links for the (much more active)
> ceph.com MLs. But if you click on "Mailing Lists & IRC page" you will
> get to a page where you can subscribe to this list, which is
> different. Very confusing.
It is confusing. This is supposed to be the new ML but I don't think
the migration has started yet.
>> What did you have the MDS cache size set to at the time?
>
> and an inode count between
> I actually did not think I'd get a reply here. We are a bit further
> than this on the other mailing list. This is the thread:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-July/036095.html
>
> To sum it up: the Ceph client prevents the MDS from freeing its cache,
> so inodes keep piling up until the MDS either becomes too slow
> (fixable by increasing the beacon grace time) or runs out of memory.
> The latter will happen eventually. In the end, my MDSs couldn't even
> rejoin because they hit the host's 128 GB memory limit and crashed.
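As a stopgap while an MDS is working through a large rejoin, you can raise the beacon grace so the monitors don't mark it laggy and replace it mid-recovery. This is only a sketch; 120 is an example value, not a recommendation (the default is 15 seconds):

```shell
# Give the MDS more time between beacons before the mons consider it
# laggy. 120 seconds is an example value; the default is 15.
ceph config set global mds_beacon_grace 120
```

This doesn't fix the cache growth itself, it just buys recovery time.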
It's possible the MDS is not being aggressive enough with asking the
single (?) client to reduce its cache size. There were recent changes
[1] to the MDS to improve this. However, the defaults may not be
aggressive enough for your client's workload. Can you try:
ceph config set mds mds_recall_max_caps 10000
ceph config set mds mds_recall_max_decay_rate 1.0
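You can read the settings back from the central config store to confirm they took effect, e.g.:

```shell
# Verify the recall settings currently applied to the mds daemons.
ceph config get mds mds_recall_max_caps
ceph config get mds mds_recall_max_decay_rate
```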
Also your other mailings made me think you may still be using the old
inode limit for the cache size. Are you using the new
mds_cache_memory_limit config option?
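If not, a sketch of switching to the memory-based limit looks like the following; the 16 GiB figure is only an example and should be sized well below your host's RAM:

```shell
# Memory-based MDS cache limit, in bytes. 16 GiB here is an example
# value; leave ample headroom below the host's physical memory.
ceph config set mds mds_cache_memory_limit 17179869184
```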
Finally, if this fixes your issue (please let us know!) and you decide
to try multiple active MDS, you should definitely use pinning as the
parallel create workload will greatly benefit from it.
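For example, assuming a CephFS mount at /mnt/cephfs with two top-level project directories (hypothetical paths), each subtree can be pinned to a different rank via the ceph.dir.pin virtual xattr:

```shell
# Pin each subtree to a fixed MDS rank so parallel creates in the two
# trees don't contend on one MDS. Paths are hypothetical examples.
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/project-a   # handled by rank 0
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/project-b   # handled by rank 1
```

A pin applies recursively to everything below the directory unless a descendant sets its own pin.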
[1] https://ceph.com/community/nautilus-cephfs/
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D