On Mon, Dec 7, 2020 at 1:28 PM Janek Bevendorff
<janek.bevendorff(a)uni-weimar.de> wrote:
This sounds like there is one or a few clients acquiring too many
caps. Have you checked this? Are there any messages about the OOM
killer? What config changes for the MDS have you made?
Yes, it's individual clients acquiring too many caps. I first ran the
adjusted recall settings you suggested after we had gone through several
bugs. Right now I am trying distributed ephemeral pinning with 3 MDS
and Dan's suggestion of 6x the default values for recall from the MDS
documentation thread. So far, it's working quite well.
Wow! Distributed epins :) Thanks for trying it. How many
sub-directories are under the distributed epin'd directory? (There are
a lot of stability problems associated with lots of subtrees that are
to be fixed in Pacific, so if the directory is too large, things could
get ugly!)
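
If it helps to get a quick number, here is a minimal sketch for counting the immediate sub-directories of a directory from a client mount. The function name and the example path are hypothetical, chosen only for illustration, and this is just one simple way to count; it only looks one level deep.

```python
import os


def count_subdirs(path):
    """Return the number of immediate sub-directories of `path`.

    Symlinks are not followed, so a symlink to a directory is not counted.
    """
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


# Example call (the mount path is an assumption, adjust to your setup):
# print(count_subdirs("/mnt/cephfs/home"))
```

On a very large directory the listing itself can take a while, so it may be worth running this during a quiet period.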
I'm hopeful your problems will be addressed by:
https://tracker.ceph.com/issues/47307
That does indeed sound a bit like it might fix these kinds of issues.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D