On Monday, March 25, 2024 3:22:26 PM EDT Alexander E. Patrakov wrote:
On Mon, Mar 25, 2024 at 11:01 PM John Mulligan
<phlogistonjohn(a)asynchrono.us> wrote:
On Friday, March 22, 2024 2:56:22 PM EDT Alexander E. Patrakov wrote:
Hi John,
A few major features we have planned include:
* Standalone servers (internally defined users/groups)
No concerns here
* Active Directory Domain Member Servers
In the second case, what is the plan regarding UID mapping? Is NFS
coexistence planned, or a concurrent mount of the same directory using
CephFS directly?
In the immediate future the plan is to have a very simple, fairly
"opinionated" idmapping scheme based on the autorid backend.
OK, the docs for clustered SAMBA do mention the autorid backend in
examples. It's a shame that the manual page does not explicitly list
it as compatible with clustered setups.
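For reference, such an autorid-based member-server setup is typically just a handful of lines; the realm, workgroup, and ranges below are placeholders, not anyone's final defaults:

    [global]
        security = ads
        realm = AD.EXAMPLE.COM
        workgroup = EXAMPLE
        # a single algorithmic backend covering all domains
        idmap config * : backend = autorid
        idmap config * : range = 100000-1999999999
        idmap config * : rangesize = 100000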
However, please consider that the majority of Linux distributions (tested: CentOS, Fedora, ALT Linux, Ubuntu, openSUSE) use "realmd" to join AD domains by default (where "default" means the pointy-clicky way in a workstation setup), and realmd uses SSSD. Therefore, with this opinionated choice of the autorid backend, you create mappings that disagree with the supposed majority and the default. This will create problems later, when you do consider NFS coexistence.
Thanks, I'll keep that in mind.
Well, it's a different topic, but most organizations that I have seen seem to ignore this default. Maybe those that don't have any problems don't have any reason to talk to me? I think that more research is needed here on whether Red Hat's and GNOME's push of SSSD is something not-ready or indeed the de-facto standard setup.
I think it's a bit of a mix, but am not sure either.
Even if you don't want to use SSSD, providing an option to provision a few domains with the idmap rid backend and statically configured ranges (as an override to autorid) would be a good step forward, as this can be made compatible with the default Red Hat setup.
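Roughly along these lines (the domain name and ranges are examples only):

    [global]
        idmap config * : backend = tdb
        idmap config * : range = 3000-7999
        # one statically assigned range per domain, chosen to match what SSSD uses elsewhere
        idmap config CORP : backend = rid
        idmap config CORP : range = 100000-999999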
That's reasonable. Thanks for the suggestion.
Sharing the same directories over both NFS and SMB at the same time, also known as "multi-protocol", is not planned for now. However, we're all aware that there's often demand for this feature, and of the complexity it brings. I expect we'll work on that at some point, but not initially. Similarly, sharing the same directories over an SMB share and directly on a CephFS mount won't be blocked, but we won't recommend it.
OK. Feature request: in the case where there are several CephFS file systems, support configuring which one to serve.
Putting it on the list.
In fact, I am quite skeptical because, at least in my experience, every customer's SAMBA configuration as a domain member is a unique snowflake, and cephadm would need the ability to specify arbitrary UID mapping configuration to match what the customer uses elsewhere - and the match must be precise.
I agree - our initial use case is something along these lines: users of a Ceph cluster that have Windows systems, Mac systems, or appliances that are joined to an existing AD but are not currently interoperating with the Ceph cluster.
I expect to add some idmapping configuration and agility down the line, especially supporting some form of rfc2307 idmapping (where unix IDs are stored in AD).
Yes, for whatever reason, people do this, even though it is cumbersome
to manage.
But those who already have idmapping schemes and
samba accessing ceph will
probably need to just continue using the existing setups as we don't have
an immediate plan for migrating those users.
Here is what I have seen or was told about:
1. We don't care about interoperability with NFS or CephFS, so we just
let SAMBA invent whatever UIDs and GIDs it needs using the "tdb2"
idmap backend. It's completely OK that workstations get different UIDs
and GIDs, as only SIDs traverse the wire.
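Configuration-wise this boils down to something like the following (the range is just an example):

    [global]
        # allocating backend: IDs are handed out in first-seen order
        idmap config * : backend = tdb2
        idmap config * : range = 1000000-1999999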
This is pretty close to our initial plan, but I'm not clear why you'd think that "workstations get different UIDs and GIDs". For all systems accessing the (same) ceph cluster the id mapping should be consistent.
You did make me consider multi-cluster use cases with something like cephfs volume mirroring - that's something I hadn't thought of before, *but* using an algorithmic mapping backend like autorid (and testing) I think we're mostly OK there.
The tdb2 backend (used in my example) is not algorithmic, it is allocating. That is, it sequentially allocates IDs on a first-seen-first-allocated basis. Yet this is what this customer uses, presumably because it is the only backend whose manual page explicitly mentions clustered operation.
And the "autorid" backend is also not fully algorithmic: it allocates ranges to domains on the same sequential basis (see https://github.com/samba-team/samba/blob/6fb98f70c6274e172787c8d5f73aa93920171e7c/source3/winbindd/idmap_autorid_tdb.c#L82), and can therefore create mismatching mappings if two workstations or servers have seen the users DOMA\usera and DOMB\userb in a different order. This is even mentioned in the manual page. SSSD largely avoids this problem by hashing the domain portion of the SID instead of allocating the subranges on a sequential basis.
Agreed. Thanks for the reminder. This will certainly need to go on the test
plan.
2. [not seen in the wild; the customer did not actually implement it, it's a product of internal miscommunication, and I am not sure if it is valid at all] We don't care about interoperability with CephFS and, while we have NFS, the security guys would not allow running NFS non-kerberized. Therefore, no UIDs or GIDs traverse the wire, only SIDs and names. Therefore, all we need is to allow both SAMBA and NFS to use a shared UID mapping allocated on an as-needed basis using the "tdb2" idmap module, and it doesn't matter that these UIDs and GIDs are inconsistent with what clients choose.
Unfortunately, I don't really understand this item. Fortunately, you say it was only considered, not implemented. :-)
3. We don't care about ACLs at all, and
don't care about CephFS
interoperability. We set ownership of all new files to root:root 0666
using whatever options are available [well, I would rather use a
dedicated nobody-style uid/gid here]. All we care about is that only
authorized workstations or authorized users can connect to each NFS or
SMB share, and we absolutely don't want them to be able to set custom
ownership or ACLs.
Sometimes known as the "drop-box" use case, I think (not to be confused with the cloud app of a similar name).
We could probably implement something like that as an option, but I had not considered it before.
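As a sketch, the Samba side of such an option could look roughly like this (a hypothetical share section, not something the module does today; the exact option choices would need review):

    [dropbox]
        path = /dropbox
        # present all files as owned by a fixed, unprivileged identity
        force user = nobody
        force group = nobody
        create mask = 0666
        directory mask = 0777
        # keep clients from setting their own ownership or ACLs
        nt acl support = no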
4. We care about NFS and CephFS file ownership
being consistent with
what Windows clients see. We store all UIDs and GIDs in Active
Directory using the rfc2307 schema, and it's mandatory that all
servers (especially SAMBA - thanks to the "ad" idmap backend) respect
that and don't try to invent anything [well, they do - BUILTIN/Users
gets its GID through tdb2]. Oh, and by the way, we have this strangely
low-numbered group that everybody gets wrong unless they set "idmap
config CORP : range = 500-999999".
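In smb.conf terms this is essentially the following (the domain name is an example; the low range is the point):

    [global]
        # take UIDs/GIDs from the rfc2307 attributes stored in AD
        idmap config CORP : backend = ad
        idmap config CORP : schema_mode = rfc2307
        idmap config CORP : range = 500-999999
        # anything not covered by AD (e.g. BUILTIN) still falls back to an allocating backend
        idmap config * : backend = tdb2
        idmap config * : range = 1000000-1999999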
This is oh so similar to a project I worked on prior to working with Ceph. I think we'll need to do this one eventually, but maybe not this year.
One nice side-effect of running in containers is that the low ID number is less of an issue, because the IDs only matter within the container context (and even then only when using the kernel file system access methods). We have much more flexibility with IDs in a container.
So - are you going to use the kernel-based mount or the ceph vfs module? My tests indicate that, in situations where there are frequently accessed files, allowing the kernel to cache them in RAM (which the vfs module does not do) can give a big boost in performance. Also, SUSE considers the ceph vfs module a non-recommended solution, apparently for the same performance-related reason; see https://documentation.suse.com/ses/7/html/ses-all/cha-ses-cifs.html
The prototype module only uses the vfs module for now, due to the extreme simplicity of setting it up in containers. Otherwise, we're trying to keep our options open and are investigating multiple approaches currently.
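For reference, a vfs-based share definition is roughly this shape (the cephx user and paths are placeholders; see the vfs_ceph manual page for the authoritative options):

    [share1]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        # recommended with vfs_ceph, since the path is not a locally mounted filesystem
        kernel share modes = no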
5. We use a few static ranges for algorithmic ID
translation using the
idmap rid backend. Everything works.
See above.
6. We use SSSD, which provides consistent IDs
everywhere, and for a
few devices which can't use it, we configured compatible idmap rid
ranges for use with winbindd. The only problem is that we like
user-private groups, and only SSSD has support for them (although we
admit it's our fault that we enabled this non-default option).
7. We store ID mappings in non-AD LDAP and use winbindd with the
"ldap" idmap backend.
For now, we're only planning to do idmapping with winbind and AD. We'd probably only consider non-AD LDAP and/or SSSD if there were strong and loud demand for it.
See above.
However, as I said, providing a way to use the "rid" backend with
statically defined domains and ranges in addition to the default
"autorid" backend would be, for me, a good-enough substitute for SSSD.
Sounds reasonable. I've done it that way in a prior role too, so it's somewhat
familiar. Thanks!
I am sure other weird but valid setups exist - please
extend the list
if you can.
Which of the above scenarios would be supportable without resorting to
the old way of installing SAMBA manually alongside the cluster?
I hope I covered the above with some inline replies. This was great food for thought and at just the right level of technical detail. So thank you very much for replying; this is exactly the kind of discussion I want to have now, while the design is still young and flexible.
One other cool thing I plan on doing is supporting multiple samba containers running on the same cluster (even on the same node, if I can wrangle the network properly). So one could in fact have completely different domain joins and/or configurations. While I wouldn't suggest anyone run a whole lot of different configurations on the same cluster, this idea already allows for some level of agility between schemes. Later on we might be able to use that as a building block for migration tools, either from an existing samba setup or between configurations.
Multiple SAMBA containers are also good for high availability (with
ctdb) or scale-out (with round-robin DNS).
Also, I plan on adding `global_custom_options` and `share_custom_options` for special overrides for development, QA, and experimentation, but those are strongly within the "you break it, you bought it" realm. These could be used for experimenting with idmapping schemes without having them all baked into the smb mgr module code.
Great, thanks!
Once again, thanks for the feedback. This discussion is very welcome!