I likely missed an announcement, and if so, please forgive me.
I'm seeing failures when running apt on a cluster of Ubuntu machines; it looks like a directory has changed on https://download.ceph.com/
Was:
debian-reef/
Now appears to be:
debian-reef_OLD/
Was reef pulled?
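A quick way to check what the mirror currently publishes before touching anything (just a sketch; the grep pattern assumes the index page links directories as debian-<codename>):

# list the debian-* entries on the mirror's index page
curl -s https://download.ceph.com/ | grep -o 'debian-[^"/]*' | sort -u
# and find which repo line apt is using on the affected machines
grep -r download.ceph.com /etc/apt/sources.list /etc/apt/sources.list.d/ 2>/dev/null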
Hi all,
I'm pretty sure this isn't the first time you've seen a thread like this.
Our cluster consists of 12 nodes / 153 OSDs, with 1.2 PiB used and 708 TiB of 1.9 PiB available.
The data pool is 2048 PGs, exactly the same number as when the cluster started. We have no issues with the cluster; everything runs as expected and very efficiently. We support about 1000 clients. The question is: should we increase the number of PGs? If you think so, what is a sensible number to go to? 4096? More?
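For context, a back-of-the-envelope check (assuming a replicated data pool with size 3; erasure-coded pools count differently):

# PGs per OSD ~= pg_num * replica_size / number of OSDs
echo $(( 2048 * 3 / 153 ))   # ~40 per OSD today
echo $(( 4096 * 3 / 153 ))   # ~80 per OSD with pg_num 4096, near the commonly cited ~100 target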
I eagerly await your responses.
Best,
Nick
P.S. Yes, autoscaler is off :)
Hi,
I have upgraded my test and production cephadm-managed clusters from
16.2.14 to 16.2.15. The upgrade was smooth and completed without issues.
There were a few things which I noticed after each upgrade:
1. RocksDB options, which I had provided to each mon via its configuration
file, were overwritten during mon redeployment, and I had to re-add
mon_rocksdb_options (see the sketch after this list).
2. The monitor debug_rocksdb option was silently reset to the default 4/5,
and I had to set it back to 1/5.
3. For roughly 2 hours after the upgrade, despite the clusters being
healthy and operating normally, all monitors would run manual compactions
very often and write to disks at very high rates. For example, production
monitors had their rocksdb:low0 thread write to store.db:
monitors without RocksDB compression: ~8 GB/5 min, or ~96 GB/hour;
monitors with RocksDB compression: ~1.5 GB/5 min, or ~18 GB/hour.
After roughly 2 hours, with no changes to the cluster, the write rates
dropped to ~0.4-0.6 GB/5 min and ~120 MB/5 min respectively. The reason for
the frequent manual compactions and high write rates wasn't immediately
apparent.
4. Crash deployment broke the ownership of /var/lib/ceph/FSID/crash and
/var/lib/ceph/FSID/crash/posted, even though I had already fixed this
manually after the upgrade to 16.2.14, which had broken it as well (fix
sketched below).
5. Mgr RAM usage appears to be increasing at a slower rate than it did with
16.2.14, although it's too early to tell whether the issue with mgrs
randomly consuming all RAM and getting OOM-killed has been fixed; with
16.2.14 this would normally take several days.
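For reference, here is roughly how I put things back, in case it saves someone time (paths follow the cephadm layout, FSID is left as a placeholder, and uid/gid 167 is what the cephadm containers use for the ceph user on my hosts; verify on yours):

# item 1: re-add the RocksDB options to each mon's local config file, then restart the mon
cat >> /var/lib/ceph/FSID/mon.$(hostname -s)/config <<'EOF'
[mon]
mon_rocksdb_options = <your previous options>
EOF
ceph orch daemon restart mon.$(hostname -s)
# item 2: the debug level can be restored centrally
ceph config set mon debug_rocksdb 1/5
# item 4: restore crash directory ownership (the -R covers posted/ as well)
chown -R 167:167 /var/lib/ceph/FSID/crash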
Overall, things look good. Thanks to the Ceph team for this release!
Zakhar
Dear Ceph users,
in order to reduce the deep scrub load on my cluster I set the deep
scrub interval to 2 weeks, and tuned other parameters as follows:
# ceph config get osd osd_deep_scrub_interval
1209600.000000
# ceph config get osd osd_scrub_sleep
0.100000
# ceph config get osd osd_scrub_load_threshold
0.300000
# ceph config get osd osd_deep_scrub_randomize_ratio
0.100000
# ceph config get osd osd_scrub_min_interval
259200.000000
# ceph config get osd osd_scrub_max_interval
1209600.000000
To my admittedly limited knowledge of Ceph's deep scrub machinery, these
settings should spread the deep scrub operations over two weeks instead of
the default one week, lowering the scrub frequency and the related load.
But I'm currently getting warnings like:
[WRN] PG_NOT_DEEP_SCRUBBED: 56 pgs not deep-scrubbed in time
pg 3.1e1 not deep-scrubbed since 2024-02-22T00:22:55.296213+0000
pg 3.1d9 not deep-scrubbed since 2024-02-20T03:41:25.461002+0000
pg 3.1d5 not deep-scrubbed since 2024-02-20T09:52:57.334058+0000
pg 3.1cb not deep-scrubbed since 2024-02-20T03:30:40.510979+0000
. . .
I don't understand the first one: since the deep scrub interval should
be two weeks, I don't expect warnings for PGs that were deep-scrubbed
less than 14 days ago (as I write this it is Tue Mar 5 07:39:07 UTC 2024,
so the deep scrub of pg 3.1e1 on Feb 22 is only about 12 days old).
Moreover, I don't understand why the deep scrub of so many PGs is
lagging behind. Is there something wrong with my settings?
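In case it helps: my working theory (just a guess, not verified) is that the warning is raised by the mon/mgr, which might not be picking up the interval I set for the OSDs, so I plan to compare the values each daemon actually sees:

# what the warning-raising daemons think the interval is
ceph config get mon osd_deep_scrub_interval
ceph config get mgr osd_deep_scrub_interval
# the effective value on an OSD, defaults included
ceph config show-with-defaults osd.0 | grep deep_scrub_interval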
Thanks in advance for any help,
Nicola
Dear community,
I have a Ceph Quincy cluster with 5 nodes currently, but only 3 of them
have SSDs; the others have NVMe drives, in separate pools. I have had many alerts from PGs with active+clean+laggy status.
This has caused slow-write problems. I wanted to know how to
troubleshoot this properly. I checked several things related to the network;
I have 10 Gbit/s NICs on all nodes and everything seems to be correct.
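These are the checks I have run so far (a sketch; osd.0 is just an example daemon):

# per-OSD commit/apply latency, to spot a slow device
ceph osd perf
# which PGs are currently flagged as laggy
ceph health detail | grep -i laggy
# OSD heartbeat round-trip times, to rule out the network (run on the host of osd.0)
ceph daemon osd.0 dump_osd_network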
Many thanks
Hi everyone,
I'm currently trying to understand how to deploy rgw, so I tested a few
things, but now I'm not sure what is installed and what is not.
First I try to install according to
https://docs.ceph.com/en/quincy/cephadm/services/rgw/
then I saw that the page also points to
https://docs.ceph.com/en/quincy/mgr/rgw/#mgr-rgw-module
so now I have some rgw daemons running.
But I would like to clean up and "erase" everything about rgw, not only
to start over and understand it, but also because I think I mixed up
realm and zonegroup...
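To make the question concrete, this is roughly the clean-up I have in mind (service, realm, and pool names are only examples, and I am not sure this is the right order):

# see what cephadm deployed and remove the rgw service(s)
ceph orch ls --service-type rgw
ceph orch rm rgw.default
# the mgr rgw module may have created services of its own; disable it too
ceph mgr module disable rgw
# inspect the multisite metadata I may have mixed up
radosgw-admin realm list
radosgw-admin zonegroup list
# the realm/zonegroup/period configuration lives in the .rgw.root pool
ceph osd pool ls | grep rgw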
Regards
--
Albert SHIH 🦫 🐸
France
Local time:
Tue 05 Mar 2024 11:01:30 CET
Hi community,
I have a user that owns some buckets. I want to create a subuser that has permission to access only one bucket. What can I do to achieve this?
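From what I have read so far, a subuser may not be restrictable to a single bucket, and a bucket policy set by the owning user might be the usual route; something like this is what I am after (user and bucket names are examples, and I have not verified it end to end):

cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/limited-user"]},
    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
    "Resource": ["arn:aws:s3:::the-one-bucket", "arn:aws:s3:::the-one-bucket/*"]
  }]
}
EOF
# applied with the owning user's credentials
s3cmd setpolicy policy.json s3://the-one-bucket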
Thanks
Hi,
We have 2 clusters (v18.2.1), primarily used for RGW, holding over 2 billion RGW objects. They are in a multisite configuration with 2 zones, and we have around 2 Gbps of bandwidth dedicated (P2P) to the multisite traffic. Using "radosgw-admin sync status" on zone 2, we see that all 128 shards are recovering, and unfortunately there is very little data transfer from the primary zone, i.e., the link utilization is barely 100 Mbps out of 2 Gbps. Our objects are quite small as well, averaging about 1 MB in size.
On further inspection, we noticed that the RGW access logs at the primary site mostly show "304 Not Modified" for requests from the site-2 RGWs. Is this expected? Here are some of the logs (information is redacted):
root@host-04:~# tail -f /var/log/haproxy-msync.log
Feb 12 05:06:51 host-04 haproxy[971171]: 10.1.85.14:33730 [12/Feb/2024:05:06:51.047] https~ backend/host-04-msync 0/0/0/2/2 304 143 - - ---- 56/55/1/0/0 0/0 "GET /bucket1/object1.jpg?rgwx-zonegroup=71dceb3d-3092-4dc6-897f-a9abf60c9972&rgwx-prepend-metadata=true&rgwx-sync-manifest&rgwx-sync-cloudtiered&rgwx-skip-decrypt&rgwx-if-not-replicated-to=a8204ce2-b69e-4d90-bca1-93edd05a1a29%3Abucket1%3A8b96aea5-c763-40a3-8430-efd67cff0c62.20010.7 HTTP/1.1"
Feb 12 05:06:51 host-04 haproxy[971171]: 10.1.85.14:59730 [12/Feb/2024:05:06:51.048] https~ backend/host-04-msync 0/0/0/2/2 304 143 - - ---- 56/55/3/1/0 0/0 "GET /bucket1/object91.jpg?rgwx-zonegroup=71dceb3d-3092-4dc6-897f-a9abf60c9972&rgwx-prepend-metadata=true&rgwx-sync-manifest&rgwx-sync-cloudtiered&rgwx-skip-decrypt&rgwx-if-not-replicated-to=a8204ce2-b69e-4d90-bca1-93edd05a1a29%3Abucket1%3A8b96aea5-c763-40a3-8430-efd67cff0c62.20010.7 HTTP/1.1"
We also took a look at our grafana instance and out of 1000 requests / second, 200 are "200 OK" and 800 are "304 Not Modified". Sync threads are run on only 2 rgw daemons per zone and are behind a Load Balancer. "# radosgw-admin sync error list" also contains around 20 errors which are mostly automatically recoverable.
Do we understand correctly that this means the RGW multisite sync logs in the log pool are yet to be generated, or something similar? Please give us some insight and let us know how to resolve this.
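For reference, these are the commands we have been using to watch the recovery (bucket, zone, and shard values are redacted examples):

radosgw-admin sync status
radosgw-admin bucket sync status --bucket=bucket1
radosgw-admin data sync status --shard-id=0 --source-zone=zone1
radosgw-admin sync error list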
Thanks,
Saif