Hi everyone,
we recently upgraded our production cluster to:
ceph version 14.2.3-349-g7b1552ea82
(7b1552ea827cf5167b6edbba96dd1c4a9dc16937) nautilus (stable)
We then enabled the pg_autoscaler on two pools that had a bad pg_num,
and the result is satisfying.
However, after the rebalance finished, the cluster became laggy. We
noticed that two of the three MONs had much higher CPU usage than
usual; according to `top`, the MON processes were consuming more than
100% CPU. Restarting the MON services and disabling the pg_autoscaler
resolved the issue. I've read that the balancer module can cause a
higher load on the MGR daemon; is this somehow related?
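For reference, and without assuming anything about the root cause: the autoscaler runs inside the MGR, and it can be toggled per pool or disabled module-wide while debugging. The pool name below is an example:

```shell
# Disable the autoscaler for a single pool (pool name is an example).
ceph osd pool set images pg_autoscale_mode off

# Or disable the MGR module entirely while investigating.
ceph mgr module disable pg_autoscaler
```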
Another thing worth mentioning is the confusing calculation done by
the autoscaler. After the pg numbers had been corrected, we got
warnings about overcommitted pools:
> 1 subtrees have overcommitted pool target_size_bytes
> 1 subtrees have overcommitted pool target_size_ratio
The images pool was responsible for that. The confusing part was that
autoscale-status sometimes reported the size of that pool as more than
14 TB:
ceph osd pool autoscale-status
POOL    SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
images  14399G               3.0   33713G        1.2813                1.0   128                 on
And a couple of minutes later the pool suddenly only had around 4 TB of data:
ceph osd pool autoscale-status
POOL    SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
images  4112G                3.0   33713G        0.3659                1.0   128                 on
There seems to be some kind of inconsistency here. The actual used
storage of this pool according to `ceph df` is:
POOLS:
    POOL      ID  STORED   OBJECTS  USED    %USED  MAX AVAIL
    images     1  4.1 TiB  1.01M    12 TiB  49.73    4.1 TiB
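One possible reading (an assumption on my part, not a confirmed explanation): the 4112G figure matches STORED in `ceph df`, and multiplied by the replica count it matches USED, while the origin of the 14399G number is unclear. The arithmetic, assuming images is a replicated pool with size 3:

```shell
# Assumption: 'images' is a replicated pool with size=3.
stored_gib=4112            # STORED from 'ceph df', in GiB
replicas=3                 # pool size (assumed)
echo "$((stored_gib * replicas)) GiB"   # prints 12336 GiB, roughly the 12 TiB USED
```

That would at least make the 4112G report internally consistent with `ceph df`; the 14 TB report remains unexplained.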
Has anyone experienced something similar? Are these known issues?
Regards,
Eugen
Dear all,
Can you show me the steps to integrate Ceph object metadata with
Elasticsearch to improve metadata search performance?
Thank you very much.
-----------------------------
Br,
Dương Tuấn Dũng
0986153686
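For context: RGW metadata search is provided by the Elasticsearch sync module, configured as a tier type on a secondary zone in a multisite setup. A minimal sketch; the zone name, hosts, ports, and shard counts below are all placeholders:

```shell
# Create a secondary zone whose tier type is 'elasticsearch'
# (zone name, endpoints, and tier-config values are placeholders).
radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=es-zone \
    --endpoints=http://rgw-meta-host:8002
radosgw-admin zone modify --rgw-zone=es-zone \
    --tier-type=elasticsearch \
    --tier-config=endpoint=http://es-host:9200,num_shards=10,num_replicas=1
radosgw-admin period update --commit
```

The gateway serving that zone then indexes object metadata into Elasticsearch, and queries go through the RGW metadata search API rather than Elasticsearch directly.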
---------- Forwarded message ---------
From: <ceph-users-request(a)ceph.io>
Date: Mon, Sep 16, 2019 at 7:22 PM
Subject: ceph-users Digest, Vol 80, Issue 54
To: <ceph-users(a)ceph.io>
Send ceph-users mailing list submissions to
ceph-users(a)ceph.io
To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
ceph-users-request(a)ceph.io
You can reach the person managing the list at
ceph-users-owner(a)ceph.io
When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."
Today's Topics:
1. Re: upmap supported in SLES 12SPx (Ilya Dryomov)
2. Re: upmap supported in SLES 12SPx (Thomas Schneider)
3. Re: upmap supported in SLES 12SPx (Ilya Dryomov)
4. Re: Using same instance name for rgw (Eric Choi)
5. Re: RGW Passthrough (Casey Bodley)
----------------------------------------------------------------------
Date: Mon, 16 Sep 2019 16:56:19 +0200
From: Ilya Dryomov <idryomov(a)gmail.com>
Subject: [ceph-users] Re: upmap supported in SLES 12SPx
To: Thomas Schneider <74cmonty(a)gmail.com>
Cc: Konstantin Shalygin <k0ste(a)k0ste.ru>, ceph-users
<ceph-users(a)ceph.io>
Message-ID:
<CAOi1vP-YzuMeL-u=hPDMr6fSTWd+F7RXX61kfO9609pxv98qNw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Mon, Sep 16, 2019 at 4:40 PM Thomas Schneider <74cmonty(a)gmail.com> wrote:
>
> Hi,
>
> thanks for your valuable input.
>
> Question:
> Can I get more information of the 6 clients (those with features
> 0x40106b84a842a42), e.g. IP, that allows me to identify it easily?
Yes, although it's not integrated into "ceph features". Log into
a monitor node and run "ceph daemon mon.a sessions" (mon.a is the name
of the monitor, substitute accordingly).
Thanks,
Ilya
------------------------------
Date: Mon, 16 Sep 2019 17:10:37 +0200
From: Thomas Schneider <74cmonty(a)gmail.com>
Subject: [ceph-users] Re: upmap supported in SLES 12SPx
To: Ilya Dryomov <idryomov(a)gmail.com>
Cc: Konstantin Shalygin <k0ste(a)k0ste.ru>, ceph-users
<ceph-users(a)ceph.io>
Message-ID: <79b342ac-be00-e3e7-cbec-b6c96d3a0a59(a)gmail.com>
Content-Type: text/plain; charset=utf-8
Wunderbar.
I found some relevant sessions on 2 of 3 monitor nodes.
And I found some others:
root@ld5505:~# ceph daemon mon.ld5505 sessions | grep 0x40106b84a842a42
root@ld5505:~# ceph daemon mon.ld5505 sessions | grep -v luminous
[
    "MonSession(client.32679861 v1:10.97.206.92:0/1183647891 is open allow *, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.32692978 v1:10.97.206.91:0/3689092992 is open allow *, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.11935413 v1:10.96.6.116:0/3187655474 is open allow r, features 0x27018eb84aa42a52 (jewel))",
    "MonSession(client.3941901 v1:10.76.179.23:0/2967896845 is open allow r, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.28313343 v1:10.76.177.108:0/1303617860 is open allow r, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.29311725 v1:10.97.206.94:0/224438037 is open allow *, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.4535833 v1:10.76.177.133:0/1269608815 is open allow r, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.3919902 v1:10.96.4.243:0/293623521 is open allow r, features 0x27018eb84aa42a52 (jewel))",
    "MonSession(client.35678944 v1:10.76.179.211:0/4218086982 is open allow r, features 0x27018eb84aa42a52 (jewel))",
    "MonSession(client.35751316 v1:10.76.179.30:0/1348696702 is open allow r, features 0x27018eb84aa42a52 (jewel))",
    "MonSession(client.28246527 v1:10.96.4.228:0/1495661381 is open allow r, features 0x27018fb86aa42ada (jewel))",
    "MonSession(client.3917843 v1:10.76.179.22:0/489863209 is open allow r, features 0x27018fb86aa42ada (jewel))",
    "MonSession(unknown.0 - is open allow r, features 0x27018eb84aa42a52 (jewel))",
]
Would it make sense to shut down these clients, too?
What confuses me is that the list includes clients that belong to the
Ceph cluster, namely 10.97.206.0/24.
All nodes of the Ceph cluster are identical in terms of OS, kernel, Ceph.
Regards
Thomas
Am 16.09.2019 um 16:56 schrieb Ilya Dryomov:
> On Mon, Sep 16, 2019 at 4:40 PM Thomas Schneider <74cmonty(a)gmail.com>
wrote:
>> Hi,
>>
>> thanks for your valuable input.
>>
>> Question:
>> Can I get more information of the 6 clients (those with features
>> 0x40106b84a842a42), e.g. IP, that allows me to identify it easily?
> Yes, although it's not integrated into "ceph features". Log into
> a monitor node and run "ceph daemon mon.a sessions" (mon.a is the name
> of the monitor, substitute accordingly).
>
> Thanks,
>
> Ilya
------------------------------
Date: Mon, 16 Sep 2019 17:36:35 +0200
From: Ilya Dryomov <idryomov(a)gmail.com>
Subject: [ceph-users] Re: upmap supported in SLES 12SPx
To: Thomas Schneider <74cmonty(a)gmail.com>
Cc: Konstantin Shalygin <k0ste(a)k0ste.ru>, ceph-users
<ceph-users(a)ceph.io>
Message-ID:
<CAOi1vP9MsfoQFPNfUYPRn2MzS=5buZ-s2Jcorkym84K8hJ6cWw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Mon, Sep 16, 2019 at 5:10 PM Thomas Schneider <74cmonty(a)gmail.com> wrote:
>
> Wonderbra.
>
> I found some relevant sessions on 2 of 3 monitor nodes.
> And I found some others:
> root@ld5505:~# ceph daemon mon.ld5505 sessions | grep 0x40106b84a842a42
> root@ld5505:~# ceph daemon mon.ld5505 sessions | grep -v luminous
> [
>     "MonSession(client.32679861 v1:10.97.206.92:0/1183647891 is open allow *, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.32692978 v1:10.97.206.91:0/3689092992 is open allow *, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.11935413 v1:10.96.6.116:0/3187655474 is open allow r, features 0x27018eb84aa42a52 (jewel))",
>     "MonSession(client.3941901 v1:10.76.179.23:0/2967896845 is open allow r, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.28313343 v1:10.76.177.108:0/1303617860 is open allow r, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.29311725 v1:10.97.206.94:0/224438037 is open allow *, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.4535833 v1:10.76.177.133:0/1269608815 is open allow r, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.3919902 v1:10.96.4.243:0/293623521 is open allow r, features 0x27018eb84aa42a52 (jewel))",
>     "MonSession(client.35678944 v1:10.76.179.211:0/4218086982 is open allow r, features 0x27018eb84aa42a52 (jewel))",
>     "MonSession(client.35751316 v1:10.76.179.30:0/1348696702 is open allow r, features 0x27018eb84aa42a52 (jewel))",
>     "MonSession(client.28246527 v1:10.96.4.228:0/1495661381 is open allow r, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(client.3917843 v1:10.76.179.22:0/489863209 is open allow r, features 0x27018fb86aa42ada (jewel))",
>     "MonSession(unknown.0 - is open allow r, features 0x27018eb84aa42a52 (jewel))",
> ]
>
> Would it make sense to shutdown these clients, too?
>
> What confuses me is that the list includes clients that belong to the
> Ceph cluster, namely 10.97.206.0/24.
> All nodes of the Ceph cluster are identical in terms of OS, kernel, Ceph.
The above output seems consistent with your "ceph features" output: it
lists clients with features 0x27018eb84aa42a52 and 0x27018fb86aa42ada.
Like I said in my previous email, both of these support upmap.
If you temporarily shut them down, set-require-min-compat-client will
work without --yes-i-really-mean-it.
Thanks,
Ilya
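As an aside, a quick way to pull just the client IPs out of such a sessions dump is a regex filter. A sketch using one MonSession line from this thread as sample input; in practice you would pipe `ceph daemon mon.<id> sessions` (optionally grepped for a feature mask first) into the same filter:

```shell
# Extract unique IPv4 addresses from MonSession entries.
# Sample line taken from the thread; replace the echo with
# 'ceph daemon mon.ld5505 sessions | grep <feature-mask>'.
sample='"MonSession(client.3919902 v1:10.96.4.243:0/293623521 is open allow r, features 0x27018eb84aa42a52 (jewel))"'
echo "$sample" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
```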
------------------------------
Date: Mon, 16 Sep 2019 16:05:47 -0000
From: "Eric Choi" <eric.yongjun.choi(a)gmail.com>
Subject: [ceph-users] Re: Using same instance name for rgw
To: ceph-users(a)ceph.io
Message-ID: <156864994756.18.7543223789861447910@mailman-web>
Content-Type: text/plain; charset="utf-8"
bump. anyone?
------------------------------
Date: Mon, 16 Sep 2019 12:22:16 -0400
From: Casey Bodley <cbodley(a)redhat.com>
Subject: [ceph-users] Re: RGW Passthrough
To: ceph-users(a)ceph.io
Message-ID: <fdf93930-794e-b2d0-46ee-5288a4d91605(a)redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Robert,
So far the cloud tiering features are still in the design stages. We're
working on some initial refactoring work to support this abstraction
(ie. to either satisfy a request against the local rados cluster, or to
proxy it somewhere else). With respect to passthrough/tiering to AWS, we
could use help thinking through the user/credential mapping in
particular. We have a weekly 'RGW Refactoring' meeting on Wednesdays
where we discuss design and refactoring progress - it's on the upstream
community calendar, I'll send you an invite.
On 9/13/19 9:59 PM, Robert LeBlanc wrote:
> We are very interested in the RGW Passthrough mentioned for Octopus.
> What's the status and how can we help? We want to connect with AWS S3.
>
> Thank you,
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
------------------------------
Subject: Digest Footer
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
------------------------------
End of ceph-users Digest, Vol 80, Issue 54
******************************************
We are very interested in the RGW Passthrough mentioned for Octopus. What's
the status and how can we help? We want to connect with AWS S3.
Thank you,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
I previously posted this question to lists.ceph.com, not realizing that lists.ceph.io has replaced it. Posting it again here with some edits.
---
Hi there, we have been using Ceph for a few years now, and it's only
now that I've noticed we have been using the same instance name for
all RGW hosts. As a result, when you run ceph -s you get:
rgw: 1 daemon active (..)
Our ceph.conf also looks like this (for RGW):
...
[client.radosgw.gateway]
...
despite having more than 10 RGW hosts.
* What are the side effects of doing this? Is this a no-no? I can see
that the metrics (ceph daemon ... perf dump) can be wrong; are the
metrics tracked independently per host? Can this affect performance
negatively? (We are using the same key, obviously.)
* We recently upgraded from Luminous to Nautilus, and I've noticed
that the newer docs all prescribe the radosgw config section as
[client.rgw.{instance-name}]
..
Should we make this change?
Much appreciated!
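For what it's worth, the usual pattern is one config section per gateway, so each daemon registers under a unique name. A sketch with a placeholder hostname:

```ini
; ceph.conf on each RGW host: give every gateway its own instance name
; (here a placeholder short hostname), so daemons no longer collide.
[client.rgw.gw-host-01]
rgw_frontends = "beast port=7480"
log_file = /var/log/ceph/client.rgw.gw-host-01.log
```

With distinct names, `ceph -s` should count each gateway separately, and per-daemon perf counters stop being conflated.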
Hi,
The CFP for Ceph Day London on October 24th ends today.
If you have a talk you would like to submit, please follow the link below!
Wido
On 7/18/19 3:43 PM, Wido den Hollander wrote:
> Hi,
>
> We will be having Ceph Day London October 24th!
>
> https://ceph.com/cephdays/ceph-day-london-2019/
>
> The CFP is now open for you to get your Ceph related content in front
> of the Ceph community ranging from all levels of expertise:
>
> https://forms.zohopublic.com/thingee/form/CephDayLondon2019/formperma/h96jZ…
>
> If your company is interested in sponsoring the event, we would be
> delighted to have you. Please contact me directly for further information.
>
> The Ceph Day is co-located with the Apache CloudStack project. There
> will be two tracks where people can choose between Ceph and CloudStack.
>
> After the Ceph Day there's going to be beers in the pub nearby to make
> new friends.
>
> Join us in London on October 24th!
>
> Wido
> _______________________________________________
> ceph-users mailing list
> ceph-users(a)lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Dear Ceph team,
I built an object store using Ceph: a cluster of 5 nodes, each with 9
SAS 10k 2 TB drives (for data) plus 1 NVMe drive (for metadata),
running ceph version 14.2.3, nautilus (stable). The cluster stores
over 100M (million) objects (files) of ~50 KB each, and I want to
delete some of them to free capacity and improve performance. Deletion
is very slow, about 35-36 objects/s once the cluster holds 100M
objects, using the command: s3cmd del -r s3://mybucket/2019
Can you help me with:
- How does Ceph's object deletion work (i.e. what is the delete flow
inside Ceph)?
- How can I improve deletion speed (objects/s)?
- How can I delete multiple objects at once?
- What are the best practices for this use case?
- Any other recommendations?
Thank you very much.
-----------------------------
Br,
Dương Tuấn Dũng
0986153686
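Regarding bulk deletes: a single s3cmd process deletes serially, so one client-side option (a sketch, not a benchmark; the bucket name is taken from the command above) is to batch keys and fan the deletes out across parallel processes:

```shell
# List keys once, then delete 500 per invocation across 8 parallel
# s3cmd processes ('s3cmd rm' accepts multiple URIs per call).
s3cmd ls -r s3://mybucket/2019 | awk '{print $4}' \
    | xargs -n 500 -P 8 s3cmd rm
```

S3 lifecycle expiration rules are another option worth considering, since they move the deletion work onto the gateway itself.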