Earmarking Subvolumes for Protocol Access - Dev

24 Apr 2024

Hi all,

One of the items we need to do in preparation for supporting sharing CephFS 
with SMB is to make the management layer, in particular our smb and nfs MGR 
modules, assist the admins by avoiding the situation where the same 
directories are configured for nfs and smb access. We're mostly focused on 
sharing out subvolumes. I've taken to calling the need to make a subvolume 
exclusive to one protocol or the other as "earmarking" the subvolume. We have 
rough design sketched out but I want to ask a few questions, largely aimed at 
the file system team, before we proceed any further.

First off, we intended to apply metadata to a subvolume when it is used by nfs 
or smb for sharing. I've identified two candidates for storing the metadata:
1. storing it in xattrs in the root of the subvolume
2. using the `ceph fs subvolume metadata ...` commands

I am fairly familiar with xattrs and think they would be pretty appropriate 
for this. Accessing them is easy with standard (libcephfs) file system APIs. 
One advantage of using xattrs is that they'd also be visible to the protocol 
servers (samba/ganesha) and if we wanted to we could (re)use the metadata 
there too, if needed.

I'm less familiar with the subvolume metadata commands. The docs say they're 
for "custom metadata", something I think this falls under. What I'm less
certain of is if this is a good user case for this particular kind of custom 
metadata. One good thing about this is that is clearly specific to subvolumes.
It also has the advantage that you can't accidentally modify/remove an xattr 
via the network fs protocol like you possibly could with an xattr over nfs/
smb.

One last thought, and this is not a requirement but something that might be 
interesting down the road is if we could hide directories, at the CephFS 
level, from some clients unless they explicitly opt-in to having subvolumes 
earmarked to type X visible. IIRC hiding directories is something that's 
already done for .snapshot dirs, right? If this seems too strange, do not 
worry, I am not going to insist on it... I just thought it might be cool to 
have in the future, but only if it's easy. I only mention it in case this idea 
tips the scales a bit to either option (1) or (2).

Once we decide on a method of storing metadata for a subvolume, I am also 
curious if anyone has opinions if we should have a single key-value pair for
all protocol specific "earmarking" metadata or split things across multiple  
items. At least for smb, I want to reuse the feature not only to block sharing
a dir that's already shared with nfs but for blocking (re)sharing dirs with 
incompatible idmapping/acl metadata. I see two options, one where we store
everything in a single key (hypothetically, something like "smb.ad,idmapV1" or 
"smb.ad,id=foo". The other option is to use multiple key-value pairs. The first
option has an advantage of being more "atomic" and keeping everything together 
but the disadvantage of needing to be parsed.

Any other thoughts on the subject would be welcome!