You would need new TCP connections for kube-proxy to send traffic to the new hosts.
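To illustrate that point, here is a minimal local sketch (plain Python stdlib, no Kubernetes involved; the toy HTTP server just stands in for an rgw backend): a keep-alive client connection keeps reusing the socket it was opened on, and kube-proxy (iptables mode) only picks a backend at TCP connect time, so only new connections can land on newly scaled pods.

```python
import http.client
import http.server
import socketserver
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive, like an S3 endpoint

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

class Srv(socketserver.ThreadingTCPServer):
    daemon_threads = True
    allow_reuse_address = True

# Stand-in for one rgw pod behind a Service.
srv = Srv(("127.0.0.1", 0), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
port = srv.server_address[1]

# A keep-alive client connection: both requests ride the same TCP socket,
# i.e. they would keep hitting whichever pod the socket was opened to.
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/")
conn.getresponse().read()
sock_first = conn.sock.getsockname()
conn.request("GET", "/")
conn.getresponse().read()
sock_second = conn.sock.getsockname()
assert sock_first == sock_second  # same socket reused across requests

# Only a brand-new connection goes through TCP connect again -- the point
# where kube-proxy could choose a newly scaled backend.
conn2 = http.client.HTTPConnection("127.0.0.1", port)
conn2.request("GET", "/")
conn2.getresponse().read()
conn2_local = conn2.sock.getsockname()
assert conn2_local != sock_first  # new connection, new socket

conn.close()
conn2.close()
srv.shutdown()
```

That matches what was observed below: the second recursive copy opens a fresh set of connections, and only those get balanced across all pods.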
On Thu, Feb 11, 2021 at 03:47 Jiffin Thottan <jthottan(a)redhat.com> wrote:
I was able to test the PR against HPA in minikube and it is working as expected.
# ceph status
  cluster:
    id:     c7a87662-dccb-4143-bf68-58ff676a0362
    health: HEALTH_WARN
            mon a is low on available space
            8 pool(s) have no replicas configured

  services:
    mon: 1 daemons, quorum a (age 20m)
    mgr: a(active, since 19m)
    osd: 1 osds: 1 up (since 19m), 1 in (since 19m)
    rgw: 3 daemons active (my.store.a.my-store.my-store.4383,
         my.store.a.my-store.my-store.4715, my.store.a.my-store.my-store.4717)

  data:
    pools:   8 pools, 96 pgs
    objects: 2.57k objects, 8.5 MiB
    usage:   85 MiB used, 20 GiB / 20 GiB avail
    pgs:     96 active+clean

  io:
    client: 611 KiB/s rd, 386 KiB/s wr, 696 op/s rd, 1.27k op/s wr
The metrics are also shown separately per daemon by ceph mgr.
@Matt @Casey: I saw the following w.r.t. the s3 client.
I created an HPA for the rgw pod which scales pods based on the number of
requests, then triggered a recursive directory copy (4480 directories,
67705 files) from the s3 client using the following command:

aws s3 cp <directory> s3://$BUCKET_NAME --recursive --no-verify-ssl \
    --endpoint-url http://$BUCKET_HOST:$BUCKET_PORT
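For reference, an HPA of the kind described can be sketched roughly like this. Note this is illustrative only: the object names and the request-rate custom metric are my assumptions, not taken from this thread, and a request-rate metric requires a custom-metrics adapter to be installed.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rgw-hpa                          # hypothetical name
  namespace: rook-ceph
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rook-ceph-rgw-my-store-a       # assumed rgw Deployment name
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # custom metric, needs an adapter
      target:
        type: AverageValue
        averageValue: "100"
```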
Even though the HPA scaled the rgw pods, requests were not sent to the newly
created rgw pods (daemons). But when I triggered another recursive copy, it
was sent to all the pods.
Is this behaviour expected?
--
Jiffin
----- Original Message -----
From: "Sebastien Han" <shan(a)redhat.com>
To: "Jiffin Thottan" <jthottan(a)redhat.com>
Cc: "Matt Benjamin" <mbenjami(a)redhat.com>, "ceph-rgw-eng"
<ceph-rgw-eng(a)redhat.com>, "ceph-tech-list" <ceph-tech-list(a)redhat.com>,
"dev" <dev(a)ceph.io>, "Matt Benjamin" <mbenjamin(a)redhat.com>,
"Kaleb Keithley" <kkeithle(a)redhat.com>, "Orit Wasserman" <owasserm(a)redhat.com>,
"Travis Nielsen" <tnielsen(a)redhat.com>
Sent: Wednesday, February 10, 2021 1:20:14 PM
Subject: Re: Running different rgw daemon with same cephxuser
Sounds good, thanks guys! It does compile so go for it :)
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Wed, Feb 10, 2021 at 6:29 AM Jiffin Thottan <jthottan(a)redhat.com>
wrote:
Hey Seb,
I will test the PR against HPA and let you know the results (within one or
two days).
--
Jiffin
----- Original Message -----
From: "Sebastien Han" <shan(a)redhat.com>
To: "Matt Benjamin" <mbenjami(a)redhat.com>
Cc: "Jiffin Thottan" <jthottan(a)redhat.com>, "ceph-rgw-eng"
<ceph-rgw-eng(a)redhat.com>, "ceph-tech-list" <ceph-tech-list(a)redhat.com>,
"dev" <dev(a)ceph.io>, "Matt Benjamin" <mbenjamin(a)redhat.com>,
"Kaleb Keithley" <kkeithle(a)redhat.com>, "Orit Wasserman" <owasserm(a)redhat.com>,
"Travis Nielsen" <tnielsen(a)redhat.com>
Sent: Tuesday, February 9, 2021 10:11:47 PM
Subject: Re: Running different rgw daemon with same cephxuser
Thanks Matt, I just sent this to kick off the discussion:
https://github.com/ceph/ceph/pull/39380
If someone wants to take it over, that's preferable I guess; this is mainly
due to my limited C++ knowledge.
So feel free to assign someone from your team to take over so we can
move faster with this one.
Thanks!
On Mon, Feb 8, 2021 at 3:53 PM Matt Benjamin <mbenjami(a)redhat.com>
wrote:
>
> Hi Sebastien,
>
> That seems like a concise and reasonable solution to me. It seems
> like the metrics from a single instance should in fact be transient
> (leaving the problem of maintaining aggregate values to prometheus or
> even downstream of that?).
>
> Matt
>
> On Mon, Feb 8, 2021 at 9:47 AM Sebastien Han <shan(a)redhat.com> wrote:
> >
> > Hi Jiffin,
> >
> > From my perspective, one simple way to fix this (although we must be
> > careful with backward compatibility) would be for rgw to register to the
> > service map differently.
> > Today it uses the daemon name like rgw.foo, so it registers
> > as foo. Essentially, if you try to run that pod twice you would still
> > see a single instance in the service map as well as in the prometheus
> > metrics.
> >
> > It would be nice to register with the RADOS client session ID instead,
> > just like rbd-mirror does by using instance_id. Something like:
> >
> > std::string instance_id = stringify(rados->get_instance_id());
> > int ret = rados.service_daemon_register(daemon_type, instance_id, metadata);
> >
> > Here: https://github.com/ceph/ceph/blob/master/src/rgw/rgw_rados.cc#L1139
> > With that we can re-use the same cephx user and scale to any number;
> > all instances will use the same cephx user to authenticate to the cluster
> > but they will show up as N instances in the service map.
> >
> > I guess one downside is that as soon as the daemon restarts, we get a
> > new RADOS client session ID, and thus our name changes, which means we
> > are losing all the metrics...
> > Thoughts?
> >
> > Thanks!
> >
> > On Thu, Feb 4, 2021 at 3:39 PM Jiffin Thottan <jthottan(a)redhat.com> wrote:
> > >
> > > Hi all,
> > >
> > > In the OCS (Rook) env, the workflow for RGW daemons is as follows.
> > >
> > > Normally, to create a ceph object store, Rook first creates the
> > > pools for the rgw daemon with the specified configuration.
> > >
> > > Then, depending on the number of instances, Rook creates a cephx user
> > > and spawns the rgw daemon in a container (pod) using its id,
> > > with the following arguments for the radosgw binary:
> > > Args:
> > > --fsid=91501490-4b55-47db-b226-f9d9968774c1
> > > --keyring=/etc/ceph/keyring-store/keyring
> > > --log-to-stderr=true
> > > --err-to-stderr=true
> > > --mon-cluster-log-to-stderr=true
> > > --log-stderr-prefix=debug
> > > --default-log-to-file=false
> > > --default-mon-cluster-log-to-file=false
> > > --mon-host=$(ROOK_CEPH_MON_HOST)
> > > --mon-initial-members=$(ROOK_CEPH_MON_INITIAL_MEMBERS)
> > > --id=rgw.my.store.a
> > > --setuser=ceph
> > > --setgroup=ceph
> > > --foreground
> > > --rgw-frontends=beast port=8080
> > > --host=$(POD_NAME)
> > > --rgw-mime-types-file=/etc/ceph/rgw/mime.types
> > > --rgw-realm=my-store
> > > --rgw-zonegroup=my-store
> > > --rgw-zone=my-store
> > >
> > > And here the cephx user will be "client.rgw.my.store.a" and all the
> > > pools for rgw will be created as my-store*. Normally, if there is
> > > a request for another instance in the ceph-object-store config file [1]
> > > for Rook, another user "client.rgw.my.store.b" will be created by Rook
> > > and will consume the same pools.
> > >
> > > There is a feature in Kubernetes known as autoscaling, in which pods
> > > can be automatically scaled based on specified metrics. If we apply
> > > that feature to rgw pods, Kubernetes will automatically scale the rgw
> > > pods (like clones of the existing pod) with the same "--id" argument
> > > based on the metrics, but ceph cannot distinguish those as different
> > > rgw daemons even though multiple rgw pods are running simultaneously.
> > > "ceph status" shows only one rgw daemon as well.
> > >
> > > In vstart or ceph-ansible (Ali helped me figure it out), I can see
> > > that for each rgw daemon a cephx user is getting created as well.
> > >
> > > Is this behaviour intended? Or am I hitting a corner case which
> > > was never tested before?
> > >
> > > There is no point in autoscaling the rgw pod if it is considered the
> > > same daemon; the s3 client will talk to only one of the pods, and the
> > > metrics provided by ceph mgr can give incorrect data as well, which
> > > can affect the autoscale feature.
> > >
> > > Also opened an issue in rook for the time being [2]
> > >
> > > [1]
https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/o…
--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103
http://www.redhat.com/en/technologies/storage
tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
_______________________________________________
Dev mailing list -- dev(a)ceph.io
To unsubscribe send an email to dev-leave(a)ceph.io