What's the ceph.com mailing list? I wondered whether this list is dead,
but it's the list announced on the official ceph.com homepage, isn't it?
There are two mailing lists announced on the website. In one place you
will find the subscribe/unsubscribe/archive links for the (much more
active) ceph.com MLs. But if you click on the "Mailing Lists & IRC page"
link, you will get to a page where you can subscribe to this list, which
is different. Very confusing.
What did you have the MDS cache size set to at the time, and an inode
count between ...?
I actually did not think I'd get a reply here. We are a bit further than
this on the other mailing list. This is the thread:
To sum it up: the Ceph client does not release its capabilities, which
prevents the MDS from trimming its cache, so inodes keep piling up until
the MDS either becomes too slow (fixable by increasing the beacon grace
time) or runs out of memory. The latter will happen eventually. In the
end, my MDSs couldn't even rejoin because they hit the host's 128 GB
memory limit and crashed.
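For anyone else who runs into this, these are the knobs involved. The
values below are only examples, and on clusters older than Mimic they go
into ceph.conf rather than being set at runtime:

    ceph config set mds mds_cache_memory_limit 17179869184   # 16 GiB cache target; a soft limit the MDS can exceed when clients pin caps
    ceph config set global mds_beacon_grace 60                # tolerate slower beacons before the mon marks the MDS laggy
    ceph daemon mds.<name> cache status                       # check current cache memory usage on the MDS host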
This is probably related to using multiple active MDSs.
There have been stability issues we're still looking to work out with
the MDS balancer, especially with these batch-create style workloads.
However, if you're willing to use subtree pinning, where you statically
assign each directory tree to an MDS rank before each rsync job uploads
its data, then that should be safe, as the balancer will effectively be
disabled.
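For reference, pinning is just an extended attribute set on the
directory and applies to its whole subtree; the paths and rank numbers
here are made up for illustration:

    setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/upload-a   # subtree handled by rank 0
    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/upload-b   # subtree handled by rank 1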
The same happens with only one MDS. I tried it with a fresh CephFS and
after two minutes of rsyncing stuff into it, I hit 900k inodes before I
stopped the process.
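In case anyone wants to reproduce or watch this, the per-rank dentry and
inode counts are visible in the DNS/INOS columns of:

    ceph fs status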
Alternatively, using one active MDS for the duration of the batch upload
should work too, but may be significantly slower.
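A minimal sketch of switching back and forth, assuming the filesystem is
simply named "cephfs":

    ceph fs set cephfs max_mds 1    # single active MDS for the upload window
    # ... run the rsync batch ...
    ceph fs set cephfs max_mds 2    # scale out again afterwards

(On releases before Nautilus, reducing max_mds also required
deactivating the extra ranks by hand.)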
The point was to have multiple active MDSs for the transfer, because a
single one couldn't keep up. But now it seems the problem wasn't really
the single-MDS bottleneck; it was rather the unbounded growth of cached
inodes, which I do not have a solution for.
Same error? Missed heartbeats?
Yeah, same thing.