Sounds good, thanks guys! It does compile so go for it :)
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Wed, Feb 10, 2021 at 6:29 AM Jiffin Thottan <jthottan(a)redhat.com> wrote:
Hey Seb,
I will test the PR against HPA and let you know the results (within one or two days).
--
Jiffin
----- Original Message -----
From: "Sebastien Han" <shan(a)redhat.com>
To: "Matt Benjamin" <mbenjami(a)redhat.com>
Cc: "Jiffin Thottan" <jthottan(a)redhat.com>, "ceph-rgw-eng"
<ceph-rgw-eng(a)redhat.com>, "ceph-tech-list"
<ceph-tech-list(a)redhat.com>, "dev" <dev(a)ceph.io>, "Matt
Benjamin" <mbenjamin(a)redhat.com>, "Kaleb Keithley"
<kkeithle(a)redhat.com>, "Orit Wasserman" <owasserm(a)redhat.com>,
"Travis Nielsen" <tnielsen(a)redhat.com>
Sent: Tuesday, February 9, 2021 10:11:47 PM
Subject: Re: Running different rgw daemon with same cephxuser
Thanks Matt, I just sent this to kick off the discussion:
https://github.com/ceph/ceph/pull/39380
If someone wants to take over it's preferable I guess, this is mainly
due to my limited C++ knowledge.
So feel free to assign someone from your team to take over so we can
move faster with this one.
Thanks!
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Mon, Feb 8, 2021 at 3:53 PM Matt Benjamin <mbenjami(a)redhat.com> wrote:
HI Sebastien,
That seems like a concise and reasonable solution to me. It seems
like the metrics from a single instance should in fact be transient
(leaving the problem of maintaining aggregate values to prometheus or
even downstream of that)?
Matt
On Mon, Feb 8, 2021 at 9:47 AM Sebastien Han <shan(a)redhat.com> wrote:
Hi Jiffin,
From my perspective, one simple way to fix this (although we must be
careful with backward compatibility) would be for rgw to register with
the service map differently.
Today it uses the daemon name: for rgw.foo, it registers
as foo. Essentially, if you try to run that pod twice you would still
see a single instance in the service map as well as in the prometheus
metrics.
It would be nice to register with the RADOS client session ID instead,
just like rbd-mirror does by using instance_id. Something like:
std::string instance_id = stringify(rados.get_instance_id());
int ret = rados.service_daemon_register(daemon_type, instance_id, metadata);
Here
https://github.com/ceph/ceph/blob/master/src/rgw/rgw_rados.cc#L1139
With that we can re-use the same cephx user and scale to any number of
instances: all of them will use the same cephx user to authenticate to the
cluster, but they will show up as N entries in the service map.
I guess one downside is that as soon as the daemon restarts, we get a
new RADOS client session ID, and thus our name changes, which means we
lose all the metrics...
Thoughts?
Thanks!
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Thu, Feb 4, 2021 at 3:39 PM Jiffin Thottan <jthottan(a)redhat.com> wrote:
Hi all,
In the OCS (Rook) env, the workflow for RGW daemons is as follows.
To create a ceph object store, Rook first creates the pools for the rgw daemon with
the specified configuration.
Then, depending on the number of instances, Rook creates a cephx user and spawns the rgw daemon in
a container (pod) using its id,
with the following arguments to the radosgw binary:
Args:
--fsid=91501490-4b55-47db-b226-f9d9968774c1
--keyring=/etc/ceph/keyring-store/keyring
--log-to-stderr=true
--err-to-stderr=true
--mon-cluster-log-to-stderr=true
--log-stderr-prefix=debug
--default-log-to-file=false
--default-mon-cluster-log-to-file=false
--mon-host=$(ROOK_CEPH_MON_HOST)
--mon-initial-members=$(ROOK_CEPH_MON_INITIAL_MEMBERS)
--id=rgw.my.store.a
--setuser=ceph
--setgroup=ceph
--foreground
--rgw-frontends=beast port=8080
--host=$(POD_NAME)
--rgw-mime-types-file=/etc/ceph/rgw/mime.types
--rgw-realm=my-store
--rgw-zonegroup=my-store
--rgw-zone=my-store
And here the cephx user will be "client.rgw.my.store.a", and all the pools for rgw
will be created as my-store*. Normally, if
another instance is requested in the ceph-object-store config file [1]
for rook, another user "client.rgw.my.store.b"
will be created by rook and will consume the same pools.
There is a feature in Kubernetes known as autoscaling, in which pods can be automatically
scaled based on specified metrics. If we apply that
feature to rgw pods, Kubernetes will automatically scale the rgw pods (like a clone of
the existing pod) with the same "--id" argument,
based on the metrics, but ceph cannot distinguish those as different rgw daemons even
though multiple rgw pods are running simultaneously.
"ceph status" shows only one rgw daemon as well.
In vstart or ceph-ansible (Ali helped me figure this out), I can see that a
cephx user is created for each rgw daemon.
Is this behaviour intended? Or am I hitting a corner case which was never tested
before?
There is no point in autoscaling the rgw pods if they are all considered the same daemon: the s3
client will talk to only one of the pods, and the metrics ceph mgr
provides can give incorrect data as well, which can affect the autoscale feature.
Also opened an issue in rook for the time being [2]
[1]
https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/o…
[2]
https://github.com/rook/rook/issues/6943
Regards,
Jiffin
--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103
http://www.redhat.com/en/technologies/storage
tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309