Hi,
I wonder whether you have ever faced issues with snaptrimming when following the Ceph PG allocation recommendation (100 PGs/OSD)?
We have a Nautilus cluster and we are afraid to increase the PG counts of the pools, because it seems that even with 4 OSDs per NVMe, a higher PG count means slower snaptrimming.
E.g., we have these pools:
Db1: pool size 64,504G with 512 PGs
Db2: pool size 92,242G with 256 PGs
Snapshots on Db2 are removed faster than on Db1.
Because of this, our OSDs are very underutilized from a PG point of view: each OSD holds at most ~25 gigantic PGs, which makes all maintenance very difficult due to backfillfull and OSD-full issues.
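For reference, these are the snaptrim-related OSD options we have been looking at so far (names as they exist on Nautilus; listed only as a starting point, not as a recommendation):
ceph config get osd osd_snap_trim_sleep
ceph config get osd osd_pg_max_concurrent_snap_trims
ceph config get osd osd_snap_trim_priority
or, on a running OSD, `ceph daemon osd.<id> config show | grep snap_trim`.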
Do you have any recommendations if you use this feature?
Thank you
________________________________
Hi Dongdong,
thanks a lot for your post, it's really helpful.
Thanks,
Igor
On 1/5/2023 6:12 AM, Dongdong Tao wrote:
>
> I see many users recently reporting that they have been struggling
> with this Onode::put race condition issue [1] on both the latest
> Octopus and Pacific.
> Igor opened a PR [2] to address this issue. I've reviewed it
> carefully, and it looks good to me. I'm hoping this could get some
> priority from the community.
>
> For those who had been hitting this issue, I would like to share a
> workaround that could unblock you:
>
> During the investigation of this issue, I found this race condition
> always happens after the bluestore onode cache size becomes 0.
> Setting debug_bluestore = 1/30 will allow you to see the cache size
> after the crash:
> ---
> 2022-10-25T00:47:26.562+0000 7f424f78e700 30
> bluestore.MempoolThread(0x564a9dae2a68) _resize_shards
> max_shard_onodes: 0 max_shard_buffer: 8388608
> ---
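> (For reference, one way to apply that logging level, assuming you manage
> options through the config database, is
>
>   ceph config set osd debug_bluestore 1/30
>
> or the equivalent "debug_bluestore = 1/30" under [osd] in ceph.conf,
> followed by an OSD restart.)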
>
> This is apparently wrong, as it means the bluestore metadata cache is
> basically disabled, but it goes a long way toward explaining why we are
> hitting the race condition so easily -- an onode will be trimmed right
> away after it's unpinned.
>
> Continuing the investigation, it turned out that the culprit for the
> 0-sized cache is a leak in the bluestore_cache_other mempool.
> Please refer to the bug tracker [3], which has the details of the leak
> issue. It was already fixed by [4], and the next Pacific point
> release will have it, but it was never backported to Octopus.
> So if you are hitting the same issue:
> For those who are on Octopus, you can manually backport this patch to
> fix the leak and prevent the race condition from happening.
> For those who are on Pacific, you can wait for the next Pacific point
> release.
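> If you want to check whether you might be affected, one thing to look at
> (purely as a sanity check, not an official diagnostic) is the mempool stats
> of a long-running OSD:
>
>   ceph daemon osd.<id> dump_mempools
>
> A bluestore_cache_other pool that keeps growing while the onode cache is
> reported as 0-sized (as in the log above) would be consistent with this leak.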
>
> By the way, I'm backporting the fix to Ubuntu Octopus and Pacific
> through this SRU [5], so it will land in Ubuntu's packages soon.
>
> [1] https://tracker.ceph.com/issues/56382
> [2] https://github.com/ceph/ceph/pull/47702
> [3] https://tracker.ceph.com/issues/56424
> [4] https://github.com/ceph/ceph/pull/46911
> [5] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010
>
> Cheers,
> Dongdong
>
>
--
Igor Fedotov
Ceph Lead Developer
--
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web <https://croit.io/> | LinkedIn <http://linkedin.com/company/croit> |
Youtube <https://www.youtube.com/channel/UCIJJSKVdcSLGLBtwSFx_epw> |
Twitter <https://twitter.com/croit_io>
Hi,
I keep getting scrub errors in my index pool and log pool that I always have to repair.
HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 2 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 20.19 is active+clean+inconsistent, acting [39,41,37]
Why is this?
I have no clue at all, no log entries, nothing ☹
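(For context, the PG id is from the health output above; the repair I run is just the standard one, and the list-inconsistent-obj output is what I'd check for details:)
rados list-inconsistent-obj 20.19 --format=json-pretty
ceph pg repair 20.19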
________________________________
Hi,
This is running Quincy 17.2.5 deployed by Rook on k8s. An RGW NFS export
crashes the Ganesha server pod, while a CephFS export works just fine. Here are
the steps to reproduce it:
1, create export:
bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path
/bucketexport --bucket testbk
{
"bind": "/bucketexport",
"path": "testbk",
"cluster": "nfs4rgw",
"mode": "RW",
"squash": "none"
}
2, check pod status afterwards:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx 2/2 Running
0 4h3m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 1/2 Error
2 4h6m
3, check the failing pod's logs:
11/01/2023 08:11:53 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
11/01/2023 08:11:54 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_start_grace :STATE :EVENT :grace reload client info completed from
backend
11/01/2023 08:11:54 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid
count(0)
11/01/2023 08:11:57 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
11/01/2023 08:11:57 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
export_defaults_commit :CONFIG :INFO :Export Defaults now
(options=03303002/00080000 , , , ,
, , , , expire= 0)
2023-01-11T08:11:57.853+0000 7f59dac7c200 -1 auth: unable to find a keyring
on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.853+0000 7f59dac7c200 -1 AuthRegistry(0x56476817a480)
no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
cephx
2023-01-11T08:11:57.855+0000 7f59dac7c200 -1 auth: unable to find a keyring
on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.855+0000 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90)
no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
cephx
2023-01-11T08:11:57.856+0000 7f5987537700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:11:57.856+0000 7f5986535700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+0000 7f5986d36700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+0000 7f59dac7c200 -1 monclient: authenticate NOTE:
no keyring found; disabled cephx authentication
failed to fetch mon config (--no-mon-config to skip)
4, delete the export:
ceph nfs export delete nfs4rgw /bucketexport
Ganesha servers go back to normal:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx 2/2 Running
0 4h30m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 2/2 Running
10 4h33m
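In case it is useful, the export definition itself should still be queryable from the mgr nfs module even while the pod is failing (pseudo path as created above):
bash-4.4$ ceph nfs export ls nfs4rgw
bash-4.4$ ceph nfs export info nfs4rgw /bucketexport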
Any ideas on how to make it work?
Thanks
Ben
I am trying to adopt a cluster with cephadm, and everything was OK when it came to the mon and mgr servers.
But when I try to run "cephadm adopt --name osd.340 --style legacy --cluster prod",
it runs through everything, but when the container starts it says that it cannot open /etc/ceph/prod.conf, because the config is bound into the container as /etc/ceph/ceph.conf.
If I change the unit.run file so it mounts the config in as prod.conf, the OSD starts but has issues connecting to the mon servers.
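For context, the file I am editing is the adopted OSD's unit.run, which in the standard cephadm layout should be at /var/lib/ceph/<fsid>/osd.340/unit.run (fsid abbreviated here). The generated container command contains a config bind mount roughly like "-v /var/lib/ceph/<fsid>/osd.340/config:/etc/ceph/ceph.conf", and the change I described above is only to the target path (prod.conf instead of ceph.conf).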
Has someone else experienced this, and are there solutions for this issue?
Armsby
Re-adding the dev list and adding the user list because others might
benefit from this information.
Thanks,
Neha
On Tue, Jan 10, 2023 at 10:21 AM Wyll Ingersoll <
wyllys.ingersoll(a)keepertech.com> wrote:
> Also, it was only my ceph-users account that was lost, dev account was
> still active.
> ------------------------------
> *From:* Wyll Ingersoll <wyllys.ingersoll(a)keepertech.com>
> *Sent:* Tuesday, January 10, 2023 1:20 PM
> *To:* Neha Ojha <nojha(a)redhat.com>; Adam Kraitman <akraitma(a)redhat.com>;
> Dan Mick <dan.mick(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> I ended up re-subscribing this morning. But it might be worth
> investigating if others are having similar issues.
> ------------------------------
> *From:* Neha Ojha <nojha(a)redhat.com>
> *Sent:* Tuesday, January 10, 2023 1:14 PM
> *To:* Wyll Ingersoll <wyllys.ingersoll(a)keepertech.com>; Adam Kraitman <
> akraitma(a)redhat.com>; Dan Mick <dan.mick(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> +Adam Kraitman <akraitma(a)redhat.com> +Dan Mick <dan.mick(a)redhat.com> Is
> this expected?
>
> On Tue, Jan 10, 2023 at 6:15 AM Wyll Ingersoll <
> wyllys.ingersoll(a)keepertech.com> wrote:
>
>
> All of my subscriptions to the ceph.io lists (users and developers) seem
> to have been deleted. Do we need to re-subscribe or is this something
> that is being fixed?
> ------------------------------
> *From:* Neha Ojha <nojha(a)redhat.com>
> *Sent:* Monday, January 9, 2023 2:40 PM
> *To:* Dan van der Ster <dvanders(a)gmail.com>
> *Cc:* Ceph Developers <dev(a)ceph.io>; Josh Durgin <jdurgin(a)redhat.com>;
> Mike Perez <miperez(a)redhat.com>; Adam Kraitman <akraitma(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> Our mailing lists were down due to the recent lab issues. They should be
> back up now. Please let us know if you see any issues.
>
> Thanks,
> Neha
>
> On Sun, Jan 8, 2023 at 9:53 AM Dan van der Ster <dvanders(a)gmail.com>
> wrote:
>
> Hi,
>
> Has ceph-users been down for a few days? And now it seems to have been
> reverted to an old backup? (I'm referring to mail arriving at an address I
> unsubscribed many months ago.)
>
> Thanks, Dan
>
>
Hi John,
firstly, image attachments are filtered out by the list. How about you upload the image somewhere like https://imgur.com/ and post a link instead?
In my browser, the sticky header contains only "home" and "edit on github", which are both entirely useless for a user. What exactly is "header navigation" expected to do if it contains nothing else? Unless I'm looking at the wrong thing (I can't see the attached image), this header can be removed. The "edit on github" link can be added to the end of a page.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: John Zachary Dover <zac.dover(a)gmail.com>
Sent: 04 January 2023 16:35:56
To: ceph-users
Subject: [ceph-users] docs.ceph.com -- Do you use the header navigation bar? (RESPONSES REQUESTED)
Do you use the header navigation bar on docs.ceph.com? See the attached
file (sticky_header.png) if you are unsure of what "header navigation bar"
means. In the attached file, the header navigation bar is indicated by
means of two large, ugly, red-and-green arrows.
*Cards on the Table*
The navigation bar is the kind of thing that is sometimes referred to as a
"sticky header", and it can get in the way of linked-to sections. I would
like to remove this header bar. If there is community support for the
header bar, though, I won't remove it.
*What is Zac Complaining About?*
Follow this procedure to see the behavior that has provoked my complaint:
1. Go to https://docs.ceph.com/en/quincy/glossary/
2. Scroll down to the "Ceph Cluster Map" entry.
3. Click the "Cluster Map" link in the line that reads "See Cluster Map".
4. Notice that the header navigation bar obscures the headword "Cluster
Map".
If you have any opinion at all on this matter, voice it. Please.
Zac Dover
Docs
Upstream Ceph
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi,
Actually, the test case was even simpler than that. A misaligned
discard (discard_granularity_bytes=4096, offset=0, length=4096+512)
made the journal stop replaying entries. This is now well covered in
tests and example e2e tests.
The workaround is quite easy: set `rbd_discard_granularity_bytes = 0`
in the client ceph conf, and all discards will be applied to the rbd
image. The fix should hopefully be backported to stable releases.
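To be explicit, on our clients that means a snippet like the following in the
ceph.conf read by the hypervisor/librbd client (section name is the generic
one; adjust if you use a named client):

[client]
    rbd_discard_granularity_bytes = 0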
Thanks to the Ceph team for reviewing this.
If anyone can confirm that this indeed solves the problem, please
let me know either way.
Regards
Josef
On Thu, Dec 8, 2022 at 11:15 AM Josef Johansson <josef86(a)gmail.com> wrote:
>
> Hi,
>
> Running a simple
> `echo 1>a;sync;rm a;sync;fstrim --all`
> triggers the problem. There is no need to have the mount point mounted with discard.
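> (If it helps with reproducing, the discard granularity that the guest
> advertises can be checked with something like the following inside the
> guest; the device name is just an example:
>   cat /sys/block/sda/queue/discard_granularity
>   cat /sys/block/sda/queue/discard_max_bytes
> )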
>
> On Thu, Dec 8, 2022 at 12:33 AM Josef Johansson <josef86(a)gmail.com> wrote:
> >
> > Hi,
> >
> > I've updated https://tracker.ceph.com/issues/57396 with some more
> > info, it seems that disabling discard within a guest solves the
> > problem (or switching from virtio-scsi-single to virtio-blk in older
> > kernels). I'm testing two different VMs on the same hypervisor with
> > identical configs, one works the other doesn't.
> >
> > Not sure what to make of it; it seems that kernels around 4.18+ are
> > sending a weird discard?
> >
> > On Tue, Aug 30, 2022 at 8:43 AM Josef Johansson <josef86(a)gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > There's nothing special in the cluster when it stops replaying. It
> > > seems there is a journal entry that the local replayer doesn't handle, and it
> > > just stops. Since it's the local replayer that stops, there are no logs
> > > in rbd-mirror. The odd part is that rbd-mirror handles this totally
> > > fine and is the one syncing correctly.
> > >
> > > What's worse is that this is reported as HEALTHY in the status
> > > information, even though when restarting that VM it will stall until
> > > replaying is complete. The replay function inside the rbd client seems to
> > > handle the journal fine, but only on VM start. I will try
> > > to get a ticket opened on tracker.ceph.com as soon as my account is
> > > approved.
> > >
> > > I have tried to see what component is responsible for local replay but
> > > I have not been successful yet.
> > >
> > > Thanks for answering :)
> > >
> > > On Mon, Aug 22, 2022 at 11:05 AM Eugen Block <eblock(a)nde.ag> wrote:
> > > >
> > > > Hi,
> > > >
> > > > IIRC the rbd-mirror journals will grow if the sync stops working,
> > > > which seems to be the case here. Does the primary cluster experience
> > > > any high load when the replay stops? How is the connection between the
> > > > two sites, and is the link saturated? Does the rbd-mirror log reveal
> > > > anything useful (maybe also in debug mode)?
> > > >
> > > > Regards,
> > > > Eugen
> > > >
> > > > Zitat von Josef Johansson <josef(a)oderland.se>:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm running Ceph Octopus 15.2.16 and I'm trying out two-way mirroring.
> > > > >
> > > > > Everything seems to be running fine, except that sometimes the replay
> > > > > stops at the primary cluster.
> > > > >
> > > > > This means that VMs will not start properly until all journal
> > > > > entries are replayed, but also that the journal grows over time.
> > > > >
> > > > > I am trying to find out why this occurs, and where to look for more
> > > > > information.
> > > > >
> > > > > I am currently using rbd --pool <pool> --image <image> journal
> > > > > status to see if the clients are in sync or not.
> > > > >
> > > > > Example output when things went sideways
> > > > >
> > > > > minimum_set: 0
> > > > > active_set: 2
> > > > > registered clients:
> > > > > [id=, commit_position=[positions=[[object_number=0, tag_tid=1,
> > > > > entry_tid=4592], [object_number=3, tag_tid=1, entry_tid=4591],
> > > > > [object_number=2, tag_tid=1, entry_tid=4590], [object_number=1,
> > > > > tag_tid=1, entry_tid=4589]]], state=connected]
> > > > > [id=bdde9b90-df26-4e3d-84b3-66605dc45608,
> > > > > commit_position=[positions=[[object_number=5, tag_tid=1,
> > > > > entry_tid=19913], [object_number=4, tag_tid=1, entry_tid=19912],
> > > > > [object_number=7, tag_tid=1, entry_tid=19911], [object_number=6,
> > > > > tag_tid=1, entry_tid=19910]]], state=disconnected]
> > > > >
> > > > > Right now I'm trying to catch it red-handed in the primary OSD logs,
> > > > > but I'm not even sure that's the process that is replaying the
> > > > > journal...
> > > > >
> > > > > Regards
> > > > > Josef
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io