On Mon, Apr 5, 2021 at 12:58 PM Sage Weil <sage(a)newdream.net> wrote:
On Mon, Apr 5, 2021 at 2:48 PM Jeff Layton <jlayton(a)redhat.com> wrote:
On Mon, 2021-04-05 at 13:55 -0500, Sage Weil wrote:
On Mon, Apr 5, 2021 at 1:33 PM Jeff Layton <jlayton(a)redhat.com> wrote:
On Thu, 2021-04-01 at 11:04 +0200, Dan van der Ster wrote:
> If one kernel mounts the
> same cephfs several times (with different prefixes), we observed that
> each mount is a unique client session. But does the ceph module globally
> share a single copy of cluster metadata, e.g. osdmaps, or is that all
> duplicated per session?
>
One copy per-cluster client, which should generally be shared between
mounts to the same cluster, provided that you're using similar-enough
mount options for the kernel to do that.
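(Side note: one way to see how many distinct client instances the kernel
actually ended up with is to look at debugfs -- each ceph_client instance
shows up as its own directory under /sys/kernel/debug/ceph, named
<fsid>.client<global_id>. A rough sketch of counting them, assuming
debugfs is mounted in the usual place and you're root:)

#!/usr/bin/env python3
# Count distinct kernel ceph client instances via debugfs.
# Each ceph_client instance appears as a directory named
# <fsid>.client<global_id> under /sys/kernel/debug/ceph
# (requires root and debugfs mounted).
import os

DEBUGFS_CEPH = "/sys/kernel/debug/ceph"

def kernel_ceph_clients(path=DEBUGFS_CEPH):
    try:
        entries = os.listdir(path)
    except OSError as e:
        raise SystemExit(f"cannot read {path}: {e}")
    # one directory per ceph_client instance
    return [e for e in entries if os.path.isdir(os.path.join(path, e))]

if __name__ == "__main__":
    clients = kernel_ceph_clients()
    print(f"{len(clients)} kernel client instance(s):")
    for name in clients:
        print("   ", name)

If the mounts are sharing, you'll see fewer directories there than you
have mounts.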
I suspect the problem is that if these are coming from mgr/volumes,
then each mount has a unique cephx user (and a client cap that locks
them into the exported directory), which means that the client
instances can't be shared.
Oof. You're probably right. In that case, you're sort of SoL since you
really do have to have a different client if the creds are different.
We might consider:
1. An alternate mgr/volumes auth mode/model where a single user has
access to the whole volume (i.e., all subvolumes). This might not
require any change in mgr/volumes itself, actually--just use
volume-granularity creds for the client (rough sketch after the list).
2. A hybrid kernel client mode where we can share a single
ceph_{mon,osd}_client for the data path, but have independent
ceph_mds_clients for each mount. (As a practical matter, the osd caps
are identical, so it's annoying that each mount has independent OSD
connections.)
This comes with the downside that if the auth credential is
blocklisted for any reason, it takes down every other mount too. You
also have the inverse problem: if the MDS blocklists a misbehaving
client, that client may still blindly continue reading/writing because
it's using another instance for the OSD communication.
3. A mechanism for the caps to be refreshed for a client after the
connection is established. That might allow a per-client auth identity
to be used, and the caps for that client to be adjusted as volumes are
added/removed from that host (second sketch after the list).
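For 1., the mon-side piece arguably exists already: rather than one
cephx user per subvolume, the CSI node could use a single credential
authorized for the whole /volumes tree. A rough sketch -- the fs name
and client name below are just placeholders, not anything mgr/volumes
creates today:

#!/usr/bin/env python3
# Sketch for option 1: a single volume-granularity credential instead of
# one cephx user per subvolume. The fs name ("cephfs") and client name
# ("client.csi-cephfs") are placeholders.
import subprocess

FS_NAME = "cephfs"            # placeholder filesystem name
CLIENT = "client.csi-cephfs"  # placeholder shared credential

# "ceph fs authorize" mints a key whose MDS cap is restricted to the
# given path; authorizing /volumes covers every subvolume beneath it.
subprocess.run(
    ["ceph", "fs", "authorize", FS_NAME, CLIENT, "/volumes", "rw"],
    check=True,
)

With one MDS path cap covering everything under /volumes, all the
mounts on a host could share a single client instance.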
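For 3., the mon-side half also exists today as "ceph auth caps", which
replaces the full cap list for a user; the missing piece is an
established kernel session actually noticing the new caps. A sketch of
the mon-side update as subvolumes come and go (all names and paths are
placeholders):

#!/usr/bin/env python3
# Sketch for option 3 (mon side only): keep one cephx user per host and
# rewrite its caps as subvolumes are added/removed. "ceph auth caps"
# replaces the full cap list, so the complete set of allowed paths has
# to be passed every time. All names and paths here are placeholders.
import subprocess

CLIENT = "client.node-17"   # placeholder per-host credential
FS_NAME = "cephfs"          # placeholder filesystem name

def set_subvolume_caps(paths):
    """Rewrite CLIENT's caps so its MDS cap covers exactly `paths`."""
    mds_cap = ", ".join(f"allow rw path={p}" for p in paths)
    subprocess.run(
        ["ceph", "auth", "caps", CLIENT,
         "mon", "allow r",
         "mds", mds_cap,
         "osd", f"allow rw tag cephfs data={FS_NAME}"],
        check=True,
    )

# e.g. after another subvolume gets mapped to this host:
set_subvolume_caps([
    "/volumes/_nogroup/subvol-a",
    "/volumes/_nogroup/subvol-b",
])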
Not really wild about any of these except for the first one, since it
probably requires only minimal changes to ceph-csi... :)
Still, it's hard to imagine that it's _that_ much overhead, even at
350 mounts, but I guess it depends on the amount of memory in the host.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D