> Thanks, that helps. Looks like the problem is that the MDS is not
> automatically trimming its cache fast enough. Please try bumping
> mds_cache_trim_threshold:
>
> bin/ceph config set mds mds_cache_trim_threshold 512K
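(In case it helps others: the new value can be checked both in the
central config and on the running daemon; mds.a below is just a
placeholder for the active MDS name.)

ceph config get mds mds_cache_trim_threshold
ceph daemon mds.a config get mds_cache_trim_threshold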
That did help. Somewhat. I removed the aggressive recall settings I had
set before and set only this option instead. The cache size seems to be
quite stable now, although still increasing in the long run (at least
no longer strictly monotonically).
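(For anyone who wants to reproduce the observation: the cache size as
the MDS itself reports it can be watched with something like the
following, where mds.a again stands in for the active daemon:)

ceph daemon mds.a cache status
ceph daemon mds.a perf dump mds_mem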
However, now my client processes are almost constantly in I/O wait and
the CephFS is slow for everybody. After I restarted the copy job, I got
around 4k reqs/s, which then dropped to about 100 reqs/s with everybody
waiting their turn. So yes, the setting does seem to help, but it
increases latency by an order of magnitude.
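(The request rates above are easy to follow live: the per-rank rate
shows up in the activity column of ceph fs status, and ceph daemonperf,
run on the MDS host, gives a top-like view of the MDS counters; mds.a
is again just a placeholder.)

ceph fs status
ceph daemonperf mds.a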
As always, it would be great if these options were documented somewhere.
Google has like five results, one of them being this thread. ;-)
> Increase it further if it's not aggressive enough. Please let us know
> if that helps.
>
> It shouldn't be necessary to do this so I'll make a tracker ticket
> once we confirm that's the issue.