+ other ceph-users
On Wed, Jul 24, 2019 at 10:26 AM Janek Bevendorff
<janek.bevendorff(a)uni-weimar.de> wrote:
>> what's the ceph.com mailing list?
>
> I wondered whether this list is dead but it's the list announced on
> the official ceph.com homepage, isn't it?
>
> There are two mailing lists announced on the website. If you go to
> https://ceph.com/resources/ you will find the
> subscribe/unsubscribe/archive links for the (much more active)
> ceph.com MLs. But if you click on "Mailing Lists & IRC page" you will
> get to a page where you can subscribe to this list, which is
> different. Very confusing.
It is confusing. This is supposed to be the new ML but I don't think
the migration has started yet.
>> What did you have the MDS cache size set to at the time?
>
> and an inode count between
> I actually did not think I'd get a reply here. We are a bit further
> than this on the other mailing list. This is the thread:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-July/036095.html
>
> To sum it up: the Ceph client prevents the MDS from freeing its cache,
> so inodes keep piling up until the MDS either becomes too slow
> (fixable by increasing the beacon grace time) or runs out of memory.
> The latter will happen eventually. In the end, my MDSs couldn't even
> rejoin because they hit the host's 128 GB memory limit and crashed.
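As a stopgap while an MDS is working through a large rejoin, you can raise the beacon grace so the monitors don't mark it laggy and replace it mid-recovery. This is only a sketch; 120 is an example value, not a recommendation (the default is 15 seconds):

```shell
# Give the MDS more time between beacons before the mons consider it
# laggy. 120 seconds is an example value; the default is 15.
ceph config set global mds_beacon_grace 120
```

This doesn't fix the cache growth itself, it just buys recovery time.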
It's possible the MDS is not being aggressive enough with asking the
single (?) client to reduce its cache size. There were recent changes
[1] to the MDS to improve this. However, the defaults may not be
aggressive enough for your client's workload. Can you try:
ceph config set mds mds_recall_max_caps 10000
ceph config set mds mds_recall_max_decay_rate 1.0
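You can read the settings back from the central config store to confirm they took effect, e.g.:

```shell
# Verify the recall settings currently applied to the mds daemons.
ceph config get mds mds_recall_max_caps
ceph config get mds mds_recall_max_decay_rate
```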
Also your other mailings made me think you may still be using the old
inode limit for the cache size. Are you using the new
mds_cache_memory_limit config option?
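If not, a sketch of switching to the memory-based limit looks like the following; the 16 GiB figure is only an example and should be sized well below your host's RAM:

```shell
# Memory-based MDS cache limit, in bytes. 16 GiB here is an example
# value; leave ample headroom below the host's physical memory.
ceph config set mds mds_cache_memory_limit 17179869184
```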
Finally, if this fixes your issue (please let us know!) and you decide
to try multiple active MDS, you should definitely use pinning as the
parallel create workload will greatly benefit from it.
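For example, assuming a CephFS mount at /mnt/cephfs with two top-level project directories (hypothetical paths), each subtree can be pinned to a different rank via the ceph.dir.pin virtual xattr:

```shell
# Pin each subtree to a fixed MDS rank so parallel creates in the two
# trees don't contend on one MDS. Paths are hypothetical examples.
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/project-a   # handled by rank 0
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/project-b   # handled by rank 1
```

A pin applies recursively to everything below the directory unless a descendant sets its own pin.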
[1] https://ceph.com/community/nautilus-cephfs/
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D