Hi all,
I have a problem upgrading a Ceph cluster from Pacific to Quincy with
cephadm. I successfully upgraded the cluster to the latest Pacific
(16.2.11), but when I run the following command to upgrade to 17.2.5, the
upgrade stops with an "Unexpected error" after upgrading 3 of 4 mgrs
(everything is on a private network):
ceph orch upgrade start my-private-repo/quay-io/ceph/ceph:v17.2.5
I also tried the 17.2.4 version.
cephadm fails to check the hosts' status and marks them as offline:
cephadm 2023-04-06T10:19:59.998510+0000 mgr.host9.arhpnd (mgr.4516356) 5782 : cephadm [DBG] host host4 (x.x.x.x) failed check
cephadm 2023-04-06T10:19:59.998553+0000 mgr.host9.arhpnd (mgr.4516356) 5783 : cephadm [DBG] Host "host4" marked as offline. Skipping daemon refresh
cephadm 2023-04-06T10:19:59.998581+0000 mgr.host9.arhpnd (mgr.4516356) 5784 : cephadm [DBG] Host "host4" marked as offline. Skipping gather facts refresh
cephadm 2023-04-06T10:19:59.998609+0000 mgr.host9.arhpnd (mgr.4516356) 5785 : cephadm [DBG] Host "host4" marked as offline. Skipping network refresh
cephadm 2023-04-06T10:19:59.998633+0000 mgr.host9.arhpnd (mgr.4516356) 5786 : cephadm [DBG] Host "host4" marked as offline. Skipping device refresh
cephadm 2023-04-06T10:19:59.998659+0000 mgr.host9.arhpnd (mgr.4516356) 5787 : cephadm [DBG] Host "host4" marked as offline. Skipping osdspec preview refresh
cephadm 2023-04-06T10:19:59.998682+0000 mgr.host9.arhpnd (mgr.4516356) 5788 : cephadm [DBG] Host "host4" marked as offline. Skipping autotune
cluster 2023-04-06T10:20:00.000151+0000 mon.host8 (mon.0) 158587 : cluster [ERR] Health detail: HEALTH_ERR 9 hosts fail cephadm check; Upgrade: failed due to an unexpected exception
cluster 2023-04-06T10:20:00.000191+0000 mon.host8 (mon.0) 158588 : cluster [ERR] [WRN] CEPHADM_HOST_CHECK_FAILED: 9 hosts fail cephadm check
cluster 2023-04-06T10:20:00.000202+0000 mon.host8 (mon.0) 158589 : cluster [ERR] host host7 (x.x.x.x) failed check: Unable to reach remote host host7. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000213+0000 mon.host8 (mon.0) 158590 : cluster [ERR] host host2 (x.x.x.x) failed check: Unable to reach remote host host2. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000220+0000 mon.host8 (mon.0) 158591 : cluster [ERR] host host8 (x.x.x.x) failed check: Unable to reach remote host host8. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000228+0000 mon.host8 (mon.0) 158592 : cluster [ERR] host host4 (x.x.x.x) failed check: Unable to reach remote host host4. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000240+0000 mon.host8 (mon.0) 158593 : cluster [ERR] host host3 (x.x.x.x) failed check: Unable to reach remote host host3. Process exited with non-zero exit status 3
Here is the output of some commands:
[root@host8 ~]# ceph -s
  cluster:
    id:     xxx
    health: HEALTH_ERR
            9 hosts fail cephadm check
            Upgrade: failed due to an unexpected exception

  services:
    mon: 5 daemons, quorum host8,host1,host7,host2,host9 (age 2w)
    mgr: host9.arhpnd(active, since 105m), standbys: host8.jowfih, host1.warjsr, host2.qyavjj
    mds: 1/1 daemons up, 3 standby
    osd: 37 osds: 37 up (since 8h), 37 in (since 3w)

  data:

  io:
    client:

  progress:
    Upgrade to 17.2.5 (0s)
      [............................]
[root@host8 ~]# ceph orch upgrade status
{
    "target_image": "my-private-repo/quay-io/ceph/ceph@sha256:34c763383e3323c6bb35f3f2229af9f466518d9db926111277f5e27ed543c427",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [],
    "progress": "3/59 daemons upgraded",
    "message": "Error: UPGRADE_EXCEPTION: Upgrade: failed due to an unexpected exception",
    "is_paused": true
}
[root@host8 ~]# ceph cephadm check-host host7
check-host failed:
Host 'host7' not found. Use 'ceph orch host ls' to see all managed hosts.
[root@host8 ~]# ceph versions
{
    "mon": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 5
    },
    "mgr": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 1,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 3
    },
    "osd": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 37
    },
    "mds": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 4
    },
    "overall": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 47,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 3
    }
}
The strange thing is that I can roll back the cluster by failing over to a
not-yet-upgraded mgr, like this:
ceph mgr fail
ceph orch upgrade start my-private-repo/quay-io/ceph/ceph:v16.2.11
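(For what it's worth, 'ceph orch upgrade status' and 'ceph versions' can be
used to re-check the state after the rollback.)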
Would you happen to have any idea about this?
Best regards,
Reza
Hi Ceph users,
We are using Ceph Pacific (16) in this specific deployment.
In our use case we do not want our users to be able to generate signature v4 URLs, because these bypass the policies that we set on buckets (e.g. IP restrictions).
Currently we have a sidecar reverse proxy running that filters out requests carrying signature-URL-specific request parameters (roughly as sketched below).
This is obviously not very efficient, and we are looking to replace it somehow in the future.
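For illustration, this is the kind of request the proxy rejects: a SigV4 presigned URL carries its authentication in the query string (hostname, bucket and values here are made up):

  curl "https://rgw.example.com/mybucket/mykey?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ACCESSKEY%2F20240101%2Fdefault%2Fs3%2Faws4_request&X-Amz-Date=20240101T000000Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=abcdef0123456789"

The proxy matches on the X-Amz-Algorithm/X-Amz-Signature query parameters and returns 403 before the request ever reaches RGW.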
1. Is there an option in RGW to disable these signed URLs (e.g. by returning status 403)?
2. If not, is this planned, or would it make sense to add it as a configuration option?
3. Or is RGW's behaviour of not respecting bucket policies for signature v4 URLs a bug, and should they actually be applied?
Thank you for your help, and let me know if you have any questions.
Marc Singer
Hi,
Other than getting all objects of the pool and filtering by image ID,
is there any easier way to get the number of allocated objects for
an RBD image?
What I really want to know is the actual usage of an image.
An allocated object could be used only partially, but that's fine;
it doesn't need to be 100% accurate. Getting the object count and
multiplying it by the object size should be sufficient.
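For illustration, the counting approach I have in mind looks roughly like
this (pool/image names are placeholders):

  # every data object of the image starts with its block_name_prefix,
  # e.g. rbd_data.<id>
  prefix=$(rbd info mypool/myimage | awk '/block_name_prefix/ {print $2}')
  # count the allocated objects; usage ~= count * object size
  rados -p mypool ls | grep -c "^${prefix}"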
"rbd export" exports actual used data, but to get the actual usage
by exporting the image seems too much. This brings up another
question, is there any way to know the export size before running it?
Thanks!
Tony
Hi Eugen,
please find the details below:
root@meghdootctr1:/var/log/ceph# ceph -s
  cluster:
    id:     c59da971-57d1-43bd-b2b7-865d392412a5
    health: HEALTH_WARN
            nodeep-scrub flag(s) set
            544 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum meghdootctr1,meghdootctr2,meghdootctr3 (age 5d)
    mgr: meghdootctr1(active, since 5d), standbys: meghdootctr2, meghdootctr3
    mds: 3 up:standby
    osd: 36 osds: 36 up (since 34h), 36 in (since 34h)
         flags nodeep-scrub

  data:
    pools:   2 pools, 544 pgs
    objects: 10.14M objects, 39 TiB
    usage:   116 TiB used, 63 TiB / 179 TiB avail
    pgs:     544 active+clean

  io:
    client: 24 MiB/s rd, 16 MiB/s wr, 2.02k op/s rd, 907 op/s wr
Ceph version:
root@meghdootctr1:/var/log/ceph# ceph --version
ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus (stable)
Ceph df -h
https://pastebin.com/1ffucyJg
Ceph OSD performance dump
https://pastebin.com/1R6YQksE
ceph tell osd.XX bench (out of 36 OSDs, only 8 give a high IOPS value of
250+; of those, 4 are from HP 3PAR and 4 from DELL EMC. The 4 HP 3PAR OSDs
we use have worked fine from the beginning, without any latency or IOPS
issues, but the remaining 32 OSDs are from DELL EMC, of which only 4 perform
much better than the other 28.)
https://pastebin.com/CixaQmBi
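For reference, the per-OSD numbers above can be collected with a loop like
this:

  for i in $(seq 0 35); do ceph tell osd.$i bench; done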
Please help me identify whether the issue is with the DELL EMC storage, with
Ceph configuration parameter tuning, or with overload in the cloud setup.
On November 1, 2023 at 9:48 PM Eugen Block <eblock(a)nde.ag> wrote:
> Hi,
>
> for starters please add more cluster details like 'ceph status', 'ceph
> versions', 'ceph osd df tree'. Increasing the network to 10G was the right
> thing to do, you don't get far with 1G under real cluster load. How are
> the OSDs configured (HDD only, SSD only, or HDD with rocksdb on SSD)?
> How is the disk utilization?
>
> Regards,
> Eugen
>
> Quoting prabhav(a)cdac.in:
>
> > In a production setup of 36 OSDs (SAS disks) totalling 180 TB,
> > allocated to a single Ceph cluster with 3 monitors and 3 managers,
> > there were 830 volumes and VMs created in OpenStack with Ceph as the
> > backend. On Sep 21, users reported slowness in accessing the VMs.
> > Analysing the logs led us to problems with SAS, network congestion
> > and Ceph configuration (as all default values were used). We updated
> > the network from 1 Gbps to 10 Gbps for both public and cluster
> > networking. There was no change.
> > The Ceph benchmark showed that 28 OSDs out of 36 reported very low
> > IOPS of 30 to 50, while the remaining ones showed 300+ IOPS.
> > We gradually started reducing the load on the Ceph cluster, and the
> > volume count is now 650. The slow operations have gradually reduced,
> > but I am aware that this is not the solution.
> > The Ceph configuration was updated, increasing osd_journal_size to
> > 10 GB and setting:
> > osd_max_backfills = 1
> > osd_recovery_max_active = 1
> > osd_recovery_op_priority = 1
> > bluestore_cache_trim_max_skip_pinned = 10000
> >
> > After one month, we now face another issue: the mgr daemon stopped on
> > all 3 quorum members and 16 OSDs went down. From the ceph-mon and
> > ceph-mgr logs I could not determine the reason. Please guide me, as
> > it is a production setup.
Thanks & Regards,
Ms V A Prabha / श्रीमती प्रभा वी ए
Joint Director / संयुक्त निदेशक
Centre for Development of Advanced Computing(C-DAC) / प्रगत संगणन विकास
केन्द्र(सी-डैक)
Tidel Park”, 8th Floor, “D” Block, (North &South) / “टाइडल पार्क”,8वीं मंजिल,
“डी” ब्लॉक, (उत्तर और दक्षिण)
No.4, Rajiv Gandhi Salai / नं.4, राजीव गांधी सलाई
Taramani / तारामणि
Chennai / चेन्नई – 600113
Ph.No.:044-22542226/27
Fax No.: 044-22542294
Hi,
I'm facing a rather new issue with our Ceph cluster: from time to time,
ceph-mgr on one of the two mgr nodes gets OOM-killed after consuming over
100 GB of RAM:
[Nov21 15:02] tp_osd_tp invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ +0.000010] oom_kill_process.cold+0xb/0x10
[ +0.000002] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ +0.000008] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=504d37b566d9fd442d45904a00584b4f61c93c5d49dc59eb1c948b3d1c096907,mems_allowed=0-1,global_oom,task_memcg=/docker/3826be8f9115479117ddb8b721ca57585b2bdd58a27c7ed7b38e8d83eb795957,task=ceph-mgr,pid=3941610,uid=167
[ +0.000697] Out of memory: Killed process 3941610 (ceph-mgr) total-vm:146986656kB, anon-rss:125340436kB, file-rss:0kB, shmem-rss:0kB, UID:167 pgtables:260356kB oom_score_adj:0
[ +6.509769] oom_reaper: reaped process 3941610 (ceph-mgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
The cluster is stable and operating normally; there is nothing unusual going
on before, during or after the kill, so it's unclear what causes the mgr to
balloon, use up all RAM and get killed. The systemd logs aren't very helpful
either: they just show normal mgr operations until it fails to allocate
memory and gets killed: https://pastebin.com/MLyw9iVi
The mgr has experienced this issue several times in the last 2 months, and
the events don't appear to correlate with any other events in the cluster;
basically nothing else happened at around those times. How can I investigate
this and figure out what's causing the mgr to consume all memory and get
killed?
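So far the only thing I can think of is tracking the active mgr's memory
over time, so the growth can be lined up with the logs - a crude sketch
(log path is made up):

  # poll the ceph-mgr RSS (kB) once a minute
  while sleep 60; do
      echo "$(date -Is) $(ps -o rss= -p "$(pgrep -f ceph-mgr | head -1)")"
  done >> /var/log/mgr-rss.log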
I would very much appreciate any advice!
Best regards,
Zakhar
Hi,
I've read and thought a lot about this migration, as it is a bigger project, and I was wondering if anyone has done it already and might share some notes or playbooks, because in everything I've read some parts were missing or unclear to me.
I have some different approaches in mind, so maybe you have some suggestions or hints.
a) upgrade Nautilus on CentOS 7, with the few missing features like dashboard and Prometheus. After that, migrate one node after another to Ubuntu 20.04 with Octopus, and then upgrade Ceph to the recent stable version.
b) migrate one node after another to Ubuntu 18.04 with Nautilus, then upgrade to Octopus, and after that move to Ubuntu 20.04.
or
c) upgrade one node after another to Ubuntu 20.04 with Octopus and join it to the cluster, until all nodes are upgraded.
As a test I tried c) with a mon node, but adding it to the cluster fails with a failed state, still probing for the other mons. (I don't have the right log at hand right now.)
So my questions are:
a) What would be the best (most stable) migration path, and
b) is it in general possible to add a new Octopus mon (not an upgraded one) to a Nautilus cluster where the other mons are still on Nautilus?
I hope my thoughts and questions are understandable :)
Thanks for any hints and suggestions. Best, Götz
Hello,
I would like to share a quite worrying experience I've just had on one of my production clusters.
A user successfully created a bucket with the name of a bucket that already exists!
He is not the bucket owner - the original user is - but he is able to see it when he does ListBuckets over the S3 API. (Both accounts are able to see it now; only the original owner is able to interact with it.)
This bucket is also counted towards the new user's usage stats.
Has anyone noticed this before? This cluster is running Quincy - 17.2.6.
Is there a way to detach the bucket from the new owner, so he doesn't have a bucket that doesn't belong to him?
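The closest I've found so far is something like this (bucket and user names
are placeholders; I'm not sure it's the right tool here):

  radosgw-admin bucket stats --bucket=mybucket                # shows the current owner
  radosgw-admin bucket unlink --bucket=mybucket --uid=newuser # detach the bucket from a user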
Regards,
Ondrej
Hey Ceph-Users,
RGW does have options [1] to rate limit ops or bandwidth per bucket or user.
But those only come into play when the request is authenticated.
I'd like to also protect the authentication subsystem from malicious or
invalid requests.
So in case e.g. some EC2 credentials are not valid (anymore) and clients
start hammering the RGW with such requests, I'd like to make it cheap to
deal with them. Especially when external authentication like OpenStack
Keystone [2] is used: valid access tokens are cached within the RGW, but
requests with invalid credentials end up being sent at full rate to the
external API [3], as there is no negative caching. And even if there were,
it would only limit the external auth requests for the same set of invalid
credentials - but it would surely reduce the load in that case.
Since the HTTP request is blocking ...
> [...]
> 2023-12-18T15:25:55.861+0000 7fec91dbb640 20 sending request to https://keystone.example.com/v3/s3tokens
> 2023-12-18T15:25:55.861+0000 7fec91dbb640 20 register_request mgr=0x561a407ae0c0 req_data->id=778, curl_handle=0x7fedaccb36e0
> 2023-12-18T15:25:55.861+0000 7fec91dbb640 20 WARNING: blocking http request
> 2023-12-18T15:25:55.861+0000 7fede37fe640 20 link_request req_data=0x561a40a418b0 req_data->id=778, curl_handle=0x7fedaccb36e0
> [...]
this not only stresses the external authentication API (Keystone in this
case), but it also blocks RGW threads for the duration of the external call.
I am currently looking into using the capabilities of HAProxy to rate limit
requests based on their resulting HTTP response [4] - in essence, to
rate-limit or tarpit clients that "produce" a high number of 403
"InvalidAccessKeyId" responses. To cause less collateral damage, it might
make sense to limit based on the presented credentials themselves, but that
would require extracting and tracking HTTP headers or URL parameters (for
presigned URLs) [5] and putting them into stick tables.
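A minimal sketch of the direction I'm exploring (untested; names and
thresholds are made up, and http_err_rate counts all 4xx responses, not just
the auth failures):

  frontend rgw
      bind :8080
      # track per-client-IP rate of error responses
      stick-table type ip size 100k expire 10m store http_err_rate(60s)
      http-request track-sc0 src
      # slow down clients producing more than 20 error responses per minute
      http-request tarpit deny_status 429 if { sc_http_err_rate(0) gt 20 }
      default_backend rgw_servers

  backend rgw_servers
      server rgw1 127.0.0.1:8000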
* What are your thoughts on the matter?
* What kind of measures did you put in place?
* Does it make sense to extend RGW's capabilities to deal with those cases itself?
** adding negative caching
** rate limits on concurrent external authentication requests (or is there a pool of connections for those requests?)
Regards
Christian
[1] https://docs.ceph.com/en/latest/radosgw/admin/#rate-limit-management
[2]
https://docs.ceph.com/en/latest/radosgw/keystone/#integrating-with-openstac…
[3]
https://github.com/ceph/ceph/blob/86bb77eb9633bfd002e73b5e58b863bc2d0df594/…
[4]
https://www.haproxy.com/documentation/haproxy-configuration-manual/latest/#…
[5]
https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-reque…
Hi,
I'm developing a backup system for RBD images. In my case, backup data
must be stored for at least two weeks. To meet this requirement, I'd like
to take backups as follows (rough commands below):
1. Take a full backup with rbd export first.
2. Take differential backups every day.
3. Merge the full backup and the oldest diff (taken two weeks ago).
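In commands, the scheme looks roughly like this (pool, image and snapshot
names are placeholders):

  rbd export mypool/myimage@day0 full.img                         # 1. full backup
  rbd export-diff --from-snap day0 mypool/myimage@day1 day1.diff  # 2. daily diff
  rbd merge-diff full.img day1.diff merged.img                    # 3. this is what fails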
As a result of my evaluation, I confirmed there is no problem with steps 1
and 2. However, I found that step 3 can't be accomplished by
`rbd merge-diff <full backup> <diff>`, because `rbd merge-diff` only accepts
a diff as its first parameter. Is there any way to merge a full backup and
a diff?
Thanks,
Satoru