On Wed, Apr 24, 2024 at 7:32 AM John Mulligan
<phlogistonjohn(a)asynchrono.us> wrote:
Hi all,
One of the items we need to do in preparation for supporting sharing CephFS
with SMB is to make the management layer, in particular our smb and nfs MGR
modules, assist the admins by avoiding the situation where the same
directories are configured for nfs and smb access. We're mostly focused on
sharing out subvolumes. I've taken to calling the need to make a subvolume
exclusive to one protocol or the other "earmarking" the subvolume. We have a
rough design sketched out, but I want to ask a few questions, largely aimed at
the file system team, before we proceed any further.
Have you thought about how to handle it when we have deeper protocol
integration and actually do want to enable sharing of directories
across protocols?
Even without that, I know there's a problem mapping Windows ACLs
accurately into the NFS or POSIX security space, but IIUC there are
plenty of users who don't grab onto those edges and will actively move
data between applications using different protocols, so the ability to
swap them back and forth may also be important.
First off, we intended to apply metadata to a subvolume when it is used by nfs
or smb for sharing. I've identified two candidates for storing the metadata:
1. storing it in xattrs in the root of the subvolume
2. using the `ceph fs subvolume metadata ...` commands
I am fairly familiar with xattrs and think they would be pretty appropriate
for this. Accessing them is easy with standard (libcephfs) file system APIs.
One advantage of using xattrs is that they'd also be visible to the protocol
servers (samba/ganesha) and if we wanted to we could (re)use the metadata
there too, if needed.
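
As a rough illustration of option (1), here is a minimal Python sketch of
earmarking through an xattr on the subvolume root. It assumes the file system
is mounted on Linux and uses the standard `os.getxattr`/`os.setxattr` calls
rather than libcephfs. The attribute name `user.earmark` and both function
names are hypothetical; nothing in this thread has settled on them.

```python
import errno
import os

# Hypothetical xattr name; the thread does not settle on one.
EARMARK_XATTR = "user.earmark"
# XATTR_CREATE is only defined on Linux, where its value is 1.
XATTR_CREATE = getattr(os, "XATTR_CREATE", 1)


def read_earmark(path, getxattr=None):
    """Return the earmark stored on `path`, or None if it is not set."""
    getxattr = getxattr or os.getxattr
    try:
        return getxattr(path, EARMARK_XATTR).decode("utf-8")
    except OSError as exc:
        # ENODATA means the attribute does not exist on this inode.
        if exc.errno == errno.ENODATA:
            return None
        raise


def claim_earmark(path, protocol, getxattr=None, setxattr=None):
    """Earmark `path` for `protocol`, refusing if another protocol holds it."""
    setxattr = setxattr or os.setxattr
    current = read_earmark(path, getxattr)
    if current is not None and current != protocol:
        raise ValueError(f"{path} is already earmarked for {current!r}")
    if current is None:
        # XATTR_CREATE makes the write fail if the attribute appeared
        # between our read and our write.
        setxattr(path, EARMARK_XATTR, protocol.encode("utf-8"), XATTR_CREATE)
    return protocol
```

The getter and setter are injectable so the same logic could be pointed at a
libcephfs handle instead of a kernel mount.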
I'm less familiar with the subvolume metadata commands. The docs say they're
for "custom metadata", something I think this falls under. What I'm less
certain of is whether this is a good use case for this particular kind of
custom metadata. One good thing about this is that it is clearly specific to
subvolumes. It also has the advantage that you can't accidentally modify or
remove the metadata via the network fs protocol, like you possibly could with
an xattr over nfs/smb.
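
For comparison, option (2) could be sketched as below. This simply shells out
to the `ceph fs subvolume metadata set` command; the key name `earmark` and
the helper names are hypothetical, and the argument order and `--group_name`
flag should be checked against the docs for the release in use. A MGR module
would more likely call into the volumes code directly than spawn the CLI.

```python
import subprocess


def metadata_set_argv(volume, subvolume, key, value, group=None):
    """Build the argv for `ceph fs subvolume metadata set`."""
    argv = ["ceph", "fs", "subvolume", "metadata", "set",
            volume, subvolume, key, value]
    if group is not None:
        # Only needed when the subvolume lives in a non-default group.
        argv += ["--group_name", group]
    return argv


def set_earmark(volume, subvolume, protocol, group=None, run=subprocess.run):
    """Record the earmark as custom metadata; `earmark` is a made-up key."""
    argv = metadata_set_argv(volume, subvolume, "earmark", protocol, group)
    # check=True raises CalledProcessError if the ceph command fails.
    return run(argv, check=True)
```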
It looks to me like "subvolume metadata" shoves the custom metadata
into a file in the fs, and was built for Ceph-CSI. Using it will put
it in a well-defined place, but will preclude this tooling from being
available to anybody who wants to export via Samba without using
subvolumes. I know you're focused on a subvolume solution, but I think
we'd rather not limit it to that if we don't have to?
However, a separate thing that just came up is that we want this
solution to support encrypted volumes as well — right now, we only
have fscrypt support via the kernel client, but we're working on
extending it to our userspace implementations as well. That has
implications for the design solution since we can't count on being
able to read user-visible metadata within lower directories until
after a key has been supplied (it's impacting our options for handling
samba's case-insensitivity). Not sure if this will matter for the
orchestration — for subvolumes, and probably for anything else, you
can always read the top-level info directory since the encryption
starts beneath that point.
One last thought: this is not a requirement, but something that might be
interesting down the road is whether we could hide directories, at the CephFS
level, from some clients unless they explicitly opt in to having subvolumes
earmarked to type X visible. IIRC hiding directories is something that's
already done for .snapshot dirs, right? If this seems too strange, do not
worry, I am not going to insist on it... I just thought it might be cool to
have in the future, but only if it's easy. I only mention it in case this idea
tips the scales a bit to either option (1) or (2).
This isn't really a thing we do. The .snap directory is special-cased
throughout the code base, and it's not a real directory to begin with,
so it's not like we can set a flag and have clients hide an arbitrary
dir from listing. Sorry. :/
-Greg
Once we decide on a method of storing metadata for a subvolume, I am also
curious whether anyone has opinions on whether we should have a single
key-value pair for all protocol-specific "earmarking" metadata or split things
across multiple items. At least for smb, I want to reuse the feature not only
to block sharing a dir that's already shared with nfs but also to block
(re)sharing dirs with incompatible idmapping/acl metadata. I see two options.
One is to store everything in a single key (hypothetically, something like
"smb.ad,idmapV1" or "smb.ad,id=foo"). The other is to use multiple key-value
pairs. The first option has the advantage of being more "atomic" and keeping
everything together, but the disadvantage of needing to be parsed.
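
To make the parsing cost concrete, here is a minimal Python sketch of what
handling the single-value form could look like, plus a conversion to the
multiple-pair form. The grammar (comma-separated tokens, a leading
`protocol.variant` scope, options that are either bare flags or `key=value`)
is only an assumption inferred from the two hypothetical examples above, and
the function names and the `earmark.` key prefix are made up for illustration.

```python
def parse_earmark(value):
    """Split a single-value earmark into protocol, variant and options."""
    scope, *options = value.split(",")
    protocol, _, variant = scope.partition(".")
    parsed = {}
    for opt in options:
        key, sep, val = opt.partition("=")
        # A token without "=" is treated as a bare flag.
        parsed[key] = val if sep else True
    return {"protocol": protocol, "variant": variant or None, "options": parsed}


def to_pairs(value, prefix="earmark"):
    """Express the same earmark as multiple key-value pairs."""
    earmark = parse_earmark(value)
    pairs = {f"{prefix}.protocol": earmark["protocol"]}
    if earmark["variant"]:
        pairs[f"{prefix}.variant"] = earmark["variant"]
    for key, val in earmark["options"].items():
        # Bare flags are stored with an empty value.
        pairs[f"{prefix}.{key}"] = "" if val is True else val
    return pairs
```

With the single-value form, one write updates everything at once; with the
pairs form, each item can be read without a parser but several writes are
needed to change the earmark.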
Any other thoughts on the subject would be welcome!
_______________________________________________
Dev mailing list -- dev(a)ceph.io