Hi,
I wonder whether you have ever faced issues with snaptrimming when following the Ceph PG allocation recommendation (100 PGs/OSD)?
We have a Nautilus cluster and we are afraid to increase the PG counts of the pools, because it seems that even with 4 OSDs per NVMe, a higher PG count means slower snaptrimming.
E.g., we have these pools:
Db1: pool size 64,504G with 512 PGs
Db2: pool size 92,242G with 256 PGs
Snapshots on Db2 are removed faster than on Db1.
Because of this, our OSDs are very underutilized from a PG point of view: each OSD holds at most ~25 gigantic PGs, which makes all maintenance very difficult due to backfillfull and OSD-full issues.
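For reference, these are the snaptrim-related OSD options we have been looking at so far (names as they exist on Nautilus; listed only as a starting point, not as a recommendation):
ceph config get osd osd_snap_trim_sleep
ceph config get osd osd_pg_max_concurrent_snap_trims
ceph config get osd osd_snap_trim_priority
or, on a running OSD, `ceph daemon osd.<id> config show | grep snap_trim`.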
Do you have any recommendations if you use this feature?
Thank you
________________________________
Hi Dongdong,
thanks a lot for your post, it's really helpful.
Thanks,
Igor
On 1/5/2023 6:12 AM, Dongdong Tao wrote:
>
> I see many users recently reporting that they have been struggling
> with this Onode::put race condition issue [1] on both the latest
> Octopus and Pacific.
> Igor opened a PR [2] to address this issue. I've reviewed it
> carefully, and it looks good to me. I'm hoping this could get some
> priority from the community.
>
> For those who had been hitting this issue, I would like to share a
> workaround that could unblock you:
>
> During the investigation of this issue, I found this race condition
> always happens after the bluestore onode cache size becomes 0.
> Setting debug_bluestore = 1/30 will allow you to see the cache size
> after the crash:
> ---
> 2022-10-25T00:47:26.562+0000 7f424f78e700 30
> bluestore.MempoolThread(0x564a9dae2a68) _resize_shards
> max_shard_onodes: 0 max_shard_buffer: 8388608
> ---
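> (For reference, one way to apply that logging level, assuming you manage
> options through the config database, is
>
>   ceph config set osd debug_bluestore 1/30
>
> or the equivalent "debug_bluestore = 1/30" under [osd] in ceph.conf,
> followed by an OSD restart.)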
>
> This is apparently wrong, as it means the bluestore metadata cache is
> basically disabled, but it goes a long way toward explaining why we are
> hitting the race condition so easily -- an onode will be trimmed right
> away after it's unpinned.
>
> Continuing the investigation, it turned out that the culprit for the
> 0-sized cache is a leak in the bluestore_cache_other mempool.
> Please refer to the bug tracker [3], which has the details of the leak
> issue. It was already fixed by [4], and the next Pacific point
> release will have it, but it was never backported to Octopus.
> So if you are hitting the same issue:
> For those who are on Octopus, you can manually backport this patch to
> fix the leak and prevent the race condition from happening.
> For those who are on Pacific, you can wait for the next Pacific point
> release.
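> If you want to check whether you might be affected, one thing to look at
> (purely as a sanity check, not an official diagnostic) is the mempool stats
> of a long-running OSD:
>
>   ceph daemon osd.<id> dump_mempools
>
> A bluestore_cache_other pool that keeps growing while the onode cache is
> reported as 0-sized (as in the log above) would be consistent with this leak.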
>
> By the way, I'm backporting the fix to Ubuntu Octopus and Pacific
> through this SRU [5], so it will land in Ubuntu's packages soon.
>
> [1] https://tracker.ceph.com/issues/56382
> [2] https://github.com/ceph/ceph/pull/47702
> [3] https://tracker.ceph.com/issues/56424
> [4] https://github.com/ceph/ceph/pull/46911
> [5] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1996010
>
> Cheers,
> Dongdong
>
>
--
Igor Fedotov
Ceph Lead Developer
--
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web <https://croit.io/> | LinkedIn <http://linkedin.com/company/croit> |
Youtube <https://www.youtube.com/channel/UCIJJSKVdcSLGLBtwSFx_epw> |
Twitter <https://twitter.com/croit_io>
Hi,
I keep getting scrub errors in my index pool and log pool that I always have to repair.
HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 2 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 20.19 is active+clean+inconsistent, acting [39,41,37]
Why is this?
I have no clue at all, no log entries, nothing ☹
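(For context, the PG id is from the health output above; the repair I run is just the standard one, and the list-inconsistent-obj output is what I'd check for details:)
rados list-inconsistent-obj 20.19 --format=json-pretty
ceph pg repair 20.19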
________________________________
Hi,
This is running Quincy 17.2.5 deployed by Rook on k8s. An RGW NFS export
crashes the Ganesha server pod, while a CephFS export works just fine. Here are
the steps to reproduce it:
1, create export:
bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path
/bucketexport --bucket testbk
{
"bind": "/bucketexport",
"path": "testbk",
"cluster": "nfs4rgw",
"mode": "RW",
"squash": "none"
}
2, check pod status afterwards:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx 2/2 Running
0 4h3m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 1/2 Error
2 4h6m
3, check the failing pod's logs:
11/01/2023 08:11:53 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
11/01/2023 08:11:54 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_start_grace :STATE :EVENT :grace reload client info completed from
backend
11/01/2023 08:11:54 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid
count(0)
11/01/2023 08:11:57 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
11/01/2023 08:11:57 : epoch 63be6f49 :
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
export_defaults_commit :CONFIG :INFO :Export Defaults now
(options=03303002/00080000 , , , ,
, , , , expire= 0)
2023-01-11T08:11:57.853+0000 7f59dac7c200 -1 auth: unable to find a keyring
on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.853+0000 7f59dac7c200 -1 AuthRegistry(0x56476817a480)
no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
cephx
2023-01-11T08:11:57.855+0000 7f59dac7c200 -1 auth: unable to find a keyring
on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.855+0000 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90)
no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
cephx
2023-01-11T08:11:57.856+0000 7f5987537700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:11:57.856+0000 7f5986535700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+0000 7f5986d36700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+0000 7f59dac7c200 -1 monclient: authenticate NOTE:
no keyring found; disabled cephx authentication
failed to fetch mon config (--no-mon-config to skip)
4, delete the export:
ceph nfs export delete nfs4rgw /bucketexport
Ganesha servers go back to normal:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx 2/2 Running
0 4h30m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 2/2 Running
10 4h33m
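In case it is useful, the export definition itself should still be queryable from the mgr nfs module even while the pod is failing (pseudo path as created above):
bash-4.4$ ceph nfs export ls nfs4rgw
bash-4.4$ ceph nfs export info nfs4rgw /bucketexport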
Any ideas on how to make it work?
Thanks
Ben
I am trying to adopt a cluster with cephadm, and everything was OK when it came to the mon and mgr servers.
But when I try to run "cephadm adopt --name osd.340 --style legacy --cluster prod",
it runs through everything, but when the container starts it says that it cannot open /etc/ceph/prod.conf, because the config is bound into the container as /etc/ceph/ceph.conf.
If I change the unit.run file so it mounts the config in as prod.conf, the OSD starts but has issues connecting to the mon servers.
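For context, the file I am editing is the adopted OSD's unit.run, which in the standard cephadm layout should be at /var/lib/ceph/<fsid>/osd.340/unit.run (fsid abbreviated here). The generated container command contains a config bind mount roughly like "-v /var/lib/ceph/<fsid>/osd.340/config:/etc/ceph/ceph.conf", and the change I described above is only to the target path (prod.conf instead of ceph.conf).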
Has someone else experienced this, and are there solutions for this issue?
Armsby
Re-adding the dev list and adding the user list because others might
benefit from this information.
Thanks,
Neha
On Tue, Jan 10, 2023 at 10:21 AM Wyll Ingersoll <
wyllys.ingersoll(a)keepertech.com> wrote:
> Also, it was only my ceph-users account that was lost, dev account was
> still active.
> ------------------------------
> *From:* Wyll Ingersoll <wyllys.ingersoll(a)keepertech.com>
> *Sent:* Tuesday, January 10, 2023 1:20 PM
> *To:* Neha Ojha <nojha(a)redhat.com>; Adam Kraitman <akraitma(a)redhat.com>;
> Dan Mick <dan.mick(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> I ended up re-subscribing this morning. But it might be worth
> investigating if others are having similar issues.
> ------------------------------
> *From:* Neha Ojha <nojha(a)redhat.com>
> *Sent:* Tuesday, January 10, 2023 1:14 PM
> *To:* Wyll Ingersoll <wyllys.ingersoll(a)keepertech.com>; Adam Kraitman <
> akraitma(a)redhat.com>; Dan Mick <dan.mick(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> +Adam Kraitman <akraitma(a)redhat.com> +Dan Mick <dan.mick(a)redhat.com> Is
> this expected?
>
> On Tue, Jan 10, 2023 at 6:15 AM Wyll Ingersoll <
> wyllys.ingersoll(a)keepertech.com> wrote:
>
>
> All of my subscriptions to the ceph.io lists (users and developers) seem
> to have been deleted. Do we need to re-subscribe or is this something
> that is being fixed?
> ------------------------------
> *From:* Neha Ojha <nojha(a)redhat.com>
> *Sent:* Monday, January 9, 2023 2:40 PM
> *To:* Dan van der Ster <dvanders(a)gmail.com>
> *Cc:* Ceph Developers <dev(a)ceph.io>; Josh Durgin <jdurgin(a)redhat.com>;
> Mike Perez <miperez(a)redhat.com>; Adam Kraitman <akraitma(a)redhat.com>
> *Subject:* Re: What's happening with ceph-users?
>
> Our mailing lists were down due to the recent lab issues. They should be
> back up now. Please let us know if you see any issues.
>
> Thanks,
> Neha
>
> On Sun, Jan 8, 2023 at 9:53 AM Dan van der Ster <dvanders(a)gmail.com>
> wrote:
>
> Hi,
>
> Has ceph-users been down for a few days? And now it seems to have been
> reverted to an old backup? (I'm referring to mail arriving at an address I
> unsubscribed many months ago.)
>
> Thanks, Dan
>
>
Hi John,
firstly, image attachments are filtered out by the list. How about you upload the image somewhere like https://imgur.com/ and post a link instead?
In my browser, the sticky header contains only "home" and "edit on github", which are both entirely useless for a user. What exactly is "header navigation" expected to do if it contains nothing else? Unless I'm looking at the wrong thing (I can't see the attached image), this header can be removed. The "edit on github" link can be added to the end of a page.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: John Zachary Dover <zac.dover(a)gmail.com>
Sent: 04 January 2023 16:35:56
To: ceph-users
Subject: [ceph-users] docs.ceph.com -- Do you use the header navigation bar? (RESPONSES REQUESTED)
Do you use the header navigation bar on docs.ceph.com? See the attached
file (sticky_header.png) if you are unsure of what "header navigation bar"
means. In the attached file, the header navigation bar is indicated by
means of two large, ugly, red-and-green arrows.
*Cards on the Table*
The navigation bar is the kind of thing that is sometimes referred to as a
"sticky header", and it can get in the way of linked-to sections. I would
like to remove this header bar. If there is community support for the
header bar, though, I won't remove it.
*What is Zac Complaining About?*
Follow this procedure to see the behavior that has provoked my complaint:
1. Go to https://docs.ceph.com/en/quincy/glossary/
2. Scroll down to the "Ceph Cluster Map" entry.
3. Click the "Cluster Map" link in the line that reads "See Cluster Map".
4. Notice that the header navigation bar obscures the headword "Cluster
Map".
If you have any opinion at all on this matter, voice it. Please.
Zac Dover
Docs
Upstream Ceph
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi,
Actually, the test case was even simpler than that. A misaligned
discard (discard_granularity_bytes=4096, offset=0, length=4096+512)
made the journal stop replaying entries. This is now well covered in
tests and example e2e tests.
The workaround is quite easy: set `rbd_discard_granularity_bytes = 0`
in the client ceph conf, and all discards will be applied to the rbd
image. The fix should hopefully be backported to stable releases.
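To be explicit, on our clients that means a snippet like the following in the
ceph.conf read by the hypervisor/librbd client (section name is the generic
one; adjust if you use a named client):

[client]
    rbd_discard_granularity_bytes = 0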
Thanks to the Ceph team for reviewing this.
If anyone can confirm that this indeed solves the problem, please
let me know either way.
Regards
Josef
On Thu, Dec 8, 2022 at 11:15 AM Josef Johansson <josef86(a)gmail.com> wrote:
>
> Hi,
>
> Running a simple
> `echo 1>a;sync;rm a;sync;fstrim --all`
> triggers the problem. There is no need to have the mount point mounted with discard.
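> (If it helps with reproducing, the discard granularity that the guest
> advertises can be checked with something like the following inside the
> guest; the device name is just an example:
>   cat /sys/block/sda/queue/discard_granularity
>   cat /sys/block/sda/queue/discard_max_bytes
> )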
>
> On Thu, Dec 8, 2022 at 12:33 AM Josef Johansson <josef86(a)gmail.com> wrote:
> >
> > Hi,
> >
> > I've updated https://tracker.ceph.com/issues/57396 with some more
> > info, it seems that disabling discard within a guest solves the
> > problem (or switching from virtio-scsi-single to virtio-blk in older
> > kernels). I'm testing two different VMs on the same hypervisor with
> > identical configs, one works the other doesn't.
> >
> > Not sure what to make of it; it seems that kernels around 4.18+ are
> > sending a weird discard?
> >
> > On Tue, Aug 30, 2022 at 8:43 AM Josef Johansson <josef86(a)gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > There's nothing special in the cluster when it stops replaying. It
> > > seems there is a journal entry that the local replayer doesn't handle, and it
> > > just stops. Since it's the local replayer that stops, there are no logs
> > > in rbd-mirror. The odd part is that rbd-mirror handles this totally
> > > fine and is the one syncing correctly.
> > >
> > > What's worse is that this is reported as HEALTHY in the status
> > > information, even though when restarting that VM it will stall until
> > > replaying is complete. The replay function inside the rbd client seems to
> > > handle the journal fine, but only on VM start. I will try
> > > to get a ticket opened on tracker.ceph.com as soon as my account is
> > > approved.
> > >
> > > I have tried to see what component is responsible for local replay but
> > > I have not been successful yet.
> > >
> > > Thanks for answering :)
> > >
> > > On Mon, Aug 22, 2022 at 11:05 AM Eugen Block <eblock(a)nde.ag> wrote:
> > > >
> > > > Hi,
> > > >
> > > > IIRC the rbd-mirror journals will grow if the sync stops working,
> > > > which seems to be the case here. Does the primary cluster experience
> > > > any high load when the replay stops? How is the connection between the
> > > > two sites, and is the link saturated? Does the rbd-mirror log reveal
> > > > anything useful (maybe also in debug mode)?
> > > >
> > > > Regards,
> > > > Eugen
> > > >
> > > > Zitat von Josef Johansson <josef(a)oderland.se>:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm running Ceph Octopus 15.2.16 and I'm trying out two-way mirroring.
> > > > >
> > > > > Everything seems to be running fine, except that sometimes the replay
> > > > > stops at the primary cluster.
> > > > >
> > > > > This means that VMs will not start properly until all journal
> > > > > entries are replayed, but also that the journal grows over time.
> > > > >
> > > > > I am trying to find out why this occurs, and where to look for more
> > > > > information.
> > > > >
> > > > > I am currently using rbd --pool <pool> --image <image> journal
> > > > > status to see if the clients are in sync or not.
> > > > >
> > > > > Example output when things went sideways
> > > > >
> > > > > minimum_set: 0
> > > > > active_set: 2
> > > > > registered clients:
> > > > > [id=, commit_position=[positions=[[object_number=0, tag_tid=1,
> > > > > entry_tid=4592], [object_number=3, tag_tid=1, entry_tid=4591],
> > > > > [object_number=2, tag_tid=1, entry_tid=4590], [object_number=1,
> > > > > tag_tid=1, entry_tid=4589]]], state=connected]
> > > > > [id=bdde9b90-df26-4e3d-84b3-66605dc45608,
> > > > > commit_position=[positions=[[object_number=5, tag_tid=1,
> > > > > entry_tid=19913], [object_number=4, tag_tid=1, entry_tid=19912],
> > > > > [object_number=7, tag_tid=1, entry_tid=19911], [object_number=6,
> > > > > tag_tid=1, entry_tid=19910]]], state=disconnected]
> > > > >
> > > > > Right now I'm trying to catch it red-handed in the primary OSD logs,
> > > > > but I'm not even sure that's the process that is replaying the
> > > > > journal...
> > > > >
> > > > > Regards
> > > > > Josef
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io