On Thu, 2021-04-01 at 11:04 +0200, Dan van der Ster wrote:
> Hi,
>
> Context: one of our users is mounting 350 ceph kernel PVCs per 30GB VM
> and they notice "memory pressure".

Manifested how?

> When planning for k8s hosts, what would be a reasonable limit on the
> number of ceph kernel PVCs to mount per host?
This seems like a really difficult thing to gauge. It depends on a
number of different factors, including the amount of RAM and CPU on the
box, mount options, workload and applications, etc.
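As a starting point for that kind of audit, you can see how many cephfs mounts a host is carrying and what options each uses straight from /proc/mounts (kernel cephfs mounts have fstype "ceph"); a minimal sketch:

```shell
#!/bin/sh
# List current kernel cephfs mounts (fstype "ceph" in /proc/mounts)
# with their mount options, then count them. Prints nothing but the
# total if the host has no cephfs mounts.
awk '$3 == "ceph" { print $2, $4 }' /proc/mounts
echo "total ceph mounts: $(awk '$3 == "ceph"' /proc/mounts | wc -l)"
```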
> If one kernel mounts the same cephfs several times (with different
> prefixes), we observed that each mount shows up as a unique client
> session. But does the ceph module globally share a single copy of
> cluster metadata, e.g. osdmaps, or is that all duplicated per session?

One copy per cluster client, which should generally be shared between
mounts to the same cluster, provided that you're using similar-enough
mount options for the kernel to do that.
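One way to see that sharing in practice (a sketch; assumes debugfs is mounted at the usual /sys/kernel/debug, which needs root): each in-kernel ceph client instance appears as one <fsid>.client<id> directory there, so multiple mounts that share a client contribute only a single entry.

```shell
#!/bin/sh
# Count distinct in-kernel ceph client instances. Each directory under
# /sys/kernel/debug/ceph is one <fsid>.client<id> instance; mounts that
# share a client share one entry. Requires root and CONFIG_DEBUG_FS.
if [ -d /sys/kernel/debug/ceph ]; then
    echo "client instances: $(ls /sys/kernel/debug/ceph | wc -l)"
else
    echo "client instances: unknown (debugfs unavailable or no ceph mounts)"
fi
```

If two mounts with identical options to the same cluster produce one directory, they're sharing a client; different options can force a second instance.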
> Can anyone estimate how much memory is consumed by each mount (assuming
> it is a client of an O(1k) osd ceph cluster)?
Again, hard to tell, and somewhat nebulous. Each mount will get its own
superblock, but most of the client info is shared, so the overhead from
an additional mount itself should be fairly trivial.
The big question mark is how many inodes and dentries you have in core
at the time, and how much data (particularly, dirty data) you have in
the pagecache.
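To make that concrete, here's a back-of-envelope sketch. The per-object sizes below are illustrative assumptions only, not authoritative figures; read the real object sizes for your kernel from the ceph_inode_info and dentry rows of /proc/slabinfo.

```shell
#!/bin/sh
# Rough metadata-cache footprint estimate for one host. The struct sizes
# are assumed values for illustration; check /proc/slabinfo (rows
# ceph_inode_info and dentry, needs root) for your kernel's real sizes.
INODES=100000     # ceph inodes currently in core (assumed count)
DENTRIES=100000   # dentries currently in core (assumed count)
INODE_SZ=1100     # assumed bytes per ceph_inode_info
DENTRY_SZ=192     # assumed bytes per dentry
echo "$(( (INODES * INODE_SZ + DENTRIES * DENTRY_SZ) / 1048576 )) MiB of cache metadata"
```

Dirty data sitting in the pagecache comes on top of this, and is bounded by the usual vm.dirty_* writeback knobs rather than anything ceph-specific.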
> Also, k8s makes it trivial for a user to mount a single PVC from
> hundreds or thousands of clients. Suppose we wanted to be able to
> limit the number of clients per PVC -- do you think a new
> `max_sessions=N` cephx cap would be the best approach for this?
Why do you want to limit the number of clients per PVC? I'm not sure
that would really solve anything.
FWIW, I'm not a fan of solutions that end up with clients pooping
themselves because they get back some esoteric error due to exceeding a
limit when trying to mount or something.
--
Jeff Layton <jlayton(a)redhat.com>