On Wed, Oct 23, 2019 at 9:56 AM Sage Weil <sweil(a)redhat.com> wrote:
I'm trying to implement MDS daemon management for mgr/ssh and am
confused by the intent of the orchestrator interface.
- The add_mds() method takes a 'spec' StatelessServiceSpec that has
a ctor like
def __init__(self, name, placement=None, count=None):
but it is constructed only with a name:
    @_write_cli('orchestrator mds add',
                "name=svc_arg,type=CephString",
                'Create an MDS service')
    def _mds_add(self, svc_arg):
        spec = orchestrator.StatelessServiceSpec(svc_arg)
That means count=1 and placement is unspecified. That's fine for Rook,
sort of, as long as you want exactly 1 MDS for each file system.
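(For illustration, here is a minimal sketch of that ctor and the two ways
of calling it; the class body is my own stand-in, only the signature is
from the code quoted above, and 'myfs' is just an example name.)

```python
class StatelessServiceSpec(object):
    # Sketch only: signature matches the ctor quoted above; the
    # attribute handling here is illustrative, not the real code.
    def __init__(self, name, placement=None, count=None):
        self.name = name
        self.placement = placement
        # Treating an unspecified count as 1 is what makes the
        # mds-add path deploy exactly one MDS per file system.
        self.count = count if count is not None else 1

spec = StatelessServiceSpec('myfs')             # what _mds_add does today
spec3 = StatelessServiceSpec('myfs', count=3)   # explicit count
```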
- Given that, can we rename the 'svc_arg' arg to 'name'?
- The 'name' here, IIUC, is the name of the grouping of daemons. I think
it was intended to be a file system, as per the docs:
    The ``name`` parameter is an identifier of the group of instances:
    * a CephFS file system for a group of MDS daemons,
    * a zone name for a group of RGWs
but IIRC the new CephFS behavior is that all standby daemons go into the
same pool and are doled out to file systems that need them arbitrarily.
In that case, I think the only thing we would want to specify (in the rook
case where we don't pick daemon location) is the count of MDSs... and
then have a single name grouping. Is that right for CephFS?
Yes. One issue we need to consider is that when we have the mgr
creating/deleting MDS daemons based on the needs of the file systems,
we will need to delete a specific standby and not just any daemon.
Otherwise, we have unnecessary failovers.
Perhaps the MDS name should just be a random short string of letters
and not identify a "group" of MDS daemons.
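(For illustration, generating a name like that could be as simple as the
following; this is a hypothetical helper, not part of the orchestrator
API, and the 'mds.' prefix is just an example.)

```python
import secrets
import string


def random_mds_name(length=6):
    # Random lowercase suffix, e.g. 'mds.kfzxqa'. Purely illustrative;
    # not the actual naming scheme used by the orchestrator.
    suffix = ''.join(secrets.choice(string.ascii_lowercase)
                     for _ in range(length))
    return 'mds.' + suffix
```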
I have a feeling it won't work for the other daemon types, though, like
NFS servers, which *do* care what they are serving up.
- For SSH, none of that works, since we need to pass a location when
adding daemons. It seems like we want something closer to nfs_add,
which is
    @_write_cli('orchestrator nfs add',
                "name=svc_arg,type=CephString "
                "name=pool,type=CephString "
                "name=namespace,type=CephString,req=false",
                'Create an NFS service')
i.e.,
  * 'add' takes a 'name' (the actual daemon name) and a location (if the
    orch needs it).
  * 'rm' takes the same name and removes it.
  * 'update' does the smarts of adding ($want - $have) daemons for a
    given group and generating names for them. Something else organizes
    these into groups (a common name prefix?). I.e., 'update' basically
    builds on 'add' and 'rm'.
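A rough sketch of that 'update' reconciliation, assuming hypothetical
add/rm callbacks and a common name prefix for grouping (all names here
are made up for illustration):

```python
import secrets
import string


def update_group(prefix, want, have, add_fn, rm_fn):
    """Reconcile a daemon group toward 'want' instances.

    prefix -- common name prefix identifying the group, e.g. 'mds.myfs'
    have   -- list of existing daemon names in the group
    add_fn -- callback to create one named daemon (stand-in for 'add')
    rm_fn  -- callback to remove one named daemon (stand-in for 'rm')
    """
    delta = want - len(have)
    if delta > 0:
        for _ in range(delta):
            suffix = ''.join(secrets.choice(string.ascii_lowercase)
                             for _ in range(6))
            add_fn('%s.%s' % (prefix, suffix))
    elif delta < 0:
        # As noted above, picking *which* standby to remove matters
        # (to avoid unnecessary failovers); trimming the tail here is
        # only for illustration.
        for name in have[delta:]:
            rm_fn(name)
```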
And/or, we introduce some basic scheduling into the ssh orchestrator (or
orchestrator_cli). I'm not sure this actually needs to be that smart,
since we can probably get away with something quite simple: round-robin
assignment of daemons to hosts, and the ability to label nodes for a
daemon type or daemon type + grouping. This would basically give the ssh
orch what ansible does as far as mapping out the deployment, and
gracefully degrade to something that "just works" (well enough) when you
don't know/care where things land. Obviously having a real scheduler
like the one in k8s do this is better, but for non-kube deployments,
there is still a need to place daemons on hosts in a way that is easy
for the human operator.
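The simple placement described above might look roughly like the
following (a hypothetical helper, not real orchestrator code: labels
filter candidate hosts, then daemons are assigned round-robin):

```python
def place_daemons(daemons, hosts, labels=None, required_label=None):
    """Round-robin daemons onto hosts, optionally filtered by label.

    daemons        -- list of daemon names to place
    hosts          -- list of candidate host names
    labels         -- optional dict of host -> set of labels
    required_label -- only place on hosts carrying this label, if set
    """
    if required_label and labels:
        hosts = [h for h in hosts
                 if required_label in labels.get(h, set())]
    if not hosts:
        raise RuntimeError('no candidate hosts')
    # Round-robin: daemon i lands on host i mod len(hosts).
    return {d: hosts[i % len(hosts)] for i, d in enumerate(daemons)}
```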
Agreed.
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D