September 2023 - ceph-users

by Frank Schilder

Hi all, we had a client with the warning "[WRN] MDS_CLIENT_OLDEST_TID: 1 clients failing to advance oldest client/flush tid". I looked at the client and there was nothing going on, so I rebooted it. After the client was back, the message was still there. To clean this up I failed the MDS. Unfortunately, the MDS that took over is remained stuck in rejoin without doing anything. All that happened in the log was: [root@ceph-10 ceph]# tail -f ceph-mds.ceph-10.log 2023-07-20T15:54:29.147+0200 7fedb9c9f700 1 mds.2.896604 rejoin_start 2023-07-20T15:54:29.161+0200 7fedb9c9f700 1 mds.2.896604 rejoin_joint_start 2023-07-20T15:55:28.005+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896614 from mon.4 2023-07-20T15:56:00.278+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896615 from mon.4 [...] 2023-07-20T16:02:54.935+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896653 from mon.4 2023-07-20T16:03:07.276+0200 7fedb9c9f700 1 mds.ceph-10 Updating MDS map to version 896654 from mon.4 After some time I decided to give another fail a try and, this time, the replacement daemon went to active state really fast. If I have a message like the above, what is the clean way of getting the client clean again (version: 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable))? Thanks and best regards, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14

6 months

2
16
0 0

6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

by Stefan Kooman

Hi, Since the 6.5 kernel addressed the issue with regards to regression in the readahead handling code... we went ahead and installed this kernel for a couple of mail / web clusters (Ubuntu 6.5.1-060501-generic #202309020842 SMP PREEMPT_DYNAMIC Sat Sep 2 08:48:34 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux). Since then we occasionally see the following being logged by the kernel: [Sun Sep 10 07:19:00 2023] workqueue: delayed_work [ceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND [Sun Sep 10 08:41:24 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND [Sun Sep 10 11:05:55 2023] workqueue: delayed_work [ceph] hogged CPU for >10000us 8 times, consider switching to WQ_UNBOUND [Sun Sep 10 12:54:38 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 8 times, consider switching to WQ_UNBOUND [Sun Sep 10 19:06:37 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 16 times, consider switching to WQ_UNBOUND [Mon Sep 11 10:53:33 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 32 times, consider switching to WQ_UNBOUND [Tue Sep 12 10:14:03 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 64 times, consider switching to WQ_UNBOUND [Tue Sep 12 11:14:33 2023] workqueue: ceph_cap_reclaim_work [ceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND We wonder if this is a new phenomenon, or that it's rather logged in the new kernel and it was not before. However, we have hit a few OOM situations since we switched to the new kernel because of ceph_cap_reclaim_work events (OOM is because Apache threads keep piling up as it cannot access CephFS). We then also see MDS slow ops reported. This might be related to a backup job that is running on a backup server. We did not observe this behavior on 5.12.19 kernel. Ceph cluster is on 16.2.11 currently. Anyone has some insight on this? Thanks, Stefan

6 months

3
12
0 0

Re: RGW access logs with bucket name

by Boris Behrens

Bringing up that topic again: is it possible to log the bucket name in the rgw client logs? currently I am only to know the bucket name when someone access the bucket via https://TLD/bucket/object instead of https://bucket.TLD/object. Am Di., 3. Jan. 2023 um 10:25 Uhr schrieb Boris Behrens <bb(a)kervyn.de>: > Hi, > I am looking forward to move our logs from > /var/log/ceph/ceph-client...log to our logaggregator. > > Is there a way to have the bucket name in the log file? > > Or can I write the rgw_enable_ops_log into a file? Maybe I could work with > this. > > Cheers and happy new year > Boris > -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

6 months, 1 week

4
6
0 0

Re: Status of Quincy 17.2.5 ?

by Christian Rohmann

Hey everyone, On 20/10/2022 10:12, Christian Rohmann wrote: > 1) May I bring up again my remarks about the timing: > > On 19/10/2022 11:46, Christian Rohmann wrote: > >> I believe the upload of a new release to the repo prior to the >> announcement happens quite regularly - it might just be due to the >> technical process of releasing. >> But I agree it would be nice to have a more "bit flip" approach to >> new releases in the repo and not have the packages appear as updates >> prior to the announcement and final release and update notes. > By my observations sometimes there are packages available on the > download servers via the "last stable" folders such as > https://download.ceph.com/debian-quincy/ quite some time before the > announcement of a release is out. > I know it's hard to time this right with mirrors requiring some time > to sync files, but would be nice to not see the packages or have > people install them before there are the release notes and potential > pointers to changes out. Todays 16.2.11 release shows the exact issue I described above .... 1) 16.2.11 packages are already available via e.g. https://download.ceph.com/debian-pacific 2) release notes not yet merged: (https://github.com/ceph/ceph/pull/49839), thus https://ceph.io/en/news/blog/2022/v16-2-11-pacific-released/ show a 404 :-) 3) No announcement like https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/QOCU563UD3… to the ML yet. Regards Christian

6 months, 1 week

2
2
0 0

[Pacific] ceph orch device ls do not returns any HDD

by Patrick Begou

Hi everyone I'm new to CEPH, just a french 4 days training session with Octopus on VMs that convince me to build my first cluster. At this time I have 4 old identical nodes for testing with 3 HDDs each, 2 network interfaces and running Alma Linux8 (el8). I try to replay the training session but it fails, breaking the web interface because of some problems with podman 4.2 not compatible with Octopus. So I try to deploy Pacific with cephadm tool on my first node (mostha1) (to enable testing also an upgrade later). dnf -y install https://download.ceph.com/rpm-16.2.13/el8/noarch/cephadm-16.2.13-0.el8.noar… monip=$(getent ahostsv4 mostha1 |head -n 1| awk '{ print $1 }') cephadm bootstrap --mon-ip $monip --initial-dashboard-password xxxxx \ --initial-dashboard-user admceph \ --allow-fqdn-hostname --cluster-network 10.1.0.0/16 This was sucessfull. But running "*c**eph orch device ls*" do not show any HDD even if I have /dev/sda (used by the OS), /dev/sdb and /dev/sdc The web interface shows a row capacity which is an aggregate of the sizes of the 3 HDDs for the node. I've also tried to reset /dev/sdb but cephadm do not see it: [ceph: root@mostha1 /]# ceph orch device zap mostha1.legi.grenoble-inp.fr /dev/sdb --force Error EINVAL: Device path '/dev/sdb' not found on host 'mostha1.legi.grenoble-inp.fr' On my first attempt with octopus, I was able to list the available HDD with this command line. Before moving to Pacific, the OS on this node has been reinstalled from scratch. Any advices for a CEPH beginner ? Thanks Patrick

6 months, 2 weeks

9
51
3 0

Specify priority for active MGR and MDS

by Nicolas FONTAINE

Hi everyone, Is there a way to specify which MGR and which MDS should be the active one? Thanks, Nicolas.

6 months, 3 weeks

3
2
0 0

Remove empty orphaned PGs not mapped to a pool

by Malte Stroem

Hello, we removed an SSD cache tier and its pool. The PGs for the pool do still exist. The cluster is healthy. The PGs are empty and they reside on the cache tier pool's SSDs. We like to take out the disks but it is not possible. The cluster sees the PGs and answers with a HEALTH_WARN. Because of the replication of three there are still 128 PGs on three of the 24 OSDs. We were able to remove the other OSDs. Summary: - pool removed - 3 x 128 empty PGs still exist - 3 of 24 OSDs still exist How is it possible to remove these empty and healthy PGs? The only way I found was something like: ceph pg {pg-id} mark_unfound_lost delete Is that the right way? Some output of: ceph pg ls-by-osd 23 PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP 3.0 0 0 0 0 0 0 0 0 active+clean 27h 0'0 2627265:196316 [15,6,23]p15 [15,6,23]p15 2023-09-28T12:41:52.982955+0200 2023-09-27T06:48:23.265838+0200 3.1 0 0 0 0 0 0 0 0 active+clean 9h 0'0 2627266:19330 [6,23,15]p6 [6,23,15]p6 2023-09-29T06:30:57.630016+0200 2023-09-27T22:58:21.992451+0200 3.2 0 0 0 0 0 0 0 0 active+clean 2h 0'0 2627265:1135185 [23,15,6]p23 [23,15,6]p23 2023-09-29T13:42:07.346658+0200 2023-09-24T14:31:52.844427+0200 3.3 0 0 0 0 0 0 0 0 active+clean 13h 0'0 2627266:193170 [6,15,23]p6 [6,15,23]p6 2023-09-29T01:56:54.517337+0200 2023-09-27T17:47:24.961279+0200 3.4 0 0 0 0 0 0 0 0 active+clean 14h 0'0 2627265:2343551 [23,6,15]p23 [23,6,15]p23 2023-09-29T00:47:47.548860+0200 2023-09-25T09:39:51.259304+0200 3.5 0 0 0 0 0 0 0 0 active+clean 2h 0'0 2627265:194111 [15,6,23]p15 [15,6,23]p15 2023-09-29T13:28:48.879959+0200 2023-09-26T15:35:44.217302+0200 3.6 0 0 0 0 0 0 0 0 active+clean 6h 0'0 2627265:2345717 [23,15,6]p23 [23,15,6]p23 2023-09-29T09:26:02.534825+0200 2023-09-27T21:56:57.500126+0200 Best regards, Malte

6 months, 3 weeks

4
12
0 0

CephFS: convert directory into subvolume

by Eugen Block

Hi, while writing a response to [1] I tried to convert an existing directory within a single cephfs into a subvolume. According to [2] that should be possible, I'm just wondering how to confirm that it actually worked. Because setting the xattr works fine, the directory just doesn't show up in the subvolume ls command. This is what I tried (in Reef and Pacific): # one "regular" subvolume already exists $ ceph fs subvolume ls cephfs [ { "name": "subvol1" } ] # mounted / and created new subdir $ mkdir /mnt/volumes/subvol2 $ setfattr -n ceph.dir.subvolume -v 1 /mnt/volumes/subvol2 # still only one subvolume $ ceph fs subvolume ls cephfs [ { "name": "subvol1" } ] I also tried it directly underneath /mnt: $ mkdir /mnt/subvol2 $ setfattr -n ceph.dir.subvolume -v 1 /mnt/subvol2 But still no subvolume2 available. What am I missing here? Thanks Eugen [1] https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/G4ZWGGUPPFQ… [2] https://www.spinics.net/lists/ceph-users/msg72341.html

6 months, 4 weeks

3
7
0 0

Clients failing to respond to capability release

by Tim Bishop

Hi, I've seen this issue mentioned in the past, but with older releases. So I'm wondering if anybody has any pointers. The Ceph cluster is running Pacific 16.2.13 on Ubuntu 20.04. Almost all clients are working fine, with the exception of our backup server. This is using the kernel CephFS client on Ubuntu 22.04 with kernel 6.2.0 [1] (so I suspect a newer Ceph version?). The backup server has multiple (12) CephFS mount points. One of them, the busiest, regularly causes this error on the cluster: HEALTH_WARN 1 clients failing to respond to capability release [WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability release mds.mds-server(mds.0): Client backupserver:cephfs-backupserver failing to respond to capability release client_id: 521306112 And occasionally, which may be unrelated, but occurs at the same time: [WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests mds.mds-server(mds.0): 1 slow requests are blocked > 30 secs The second one clears itself, but the first sticks until I can unmount the filesystem on the client after the backup completes. It appears that whilst it's in this stuck state there may be one or more directory trees that are inaccessible to all clients. The backup server is walking the whole tree but never gets stuck itself, so either the inaccessible directory entry is caused after it has gone past, or it's not affected. Maybe the backup server is holding a directory when it shouldn't? It may be that an upgrade to Quincy resolves this, since it's more likely to be inline with the kernel client version wise, but I don't want to knee-jerk upgrade just to try and fix this problem. Thanks for any advice. Tim. [1] The reason for the newer kernel is that the backup performance from CephFS was terrible with older kernels. This newer kernel does at least resolve that issue.

6 months, 4 weeks

4
5
0 0

ceph-dashboard python warning with new pyo3 0.17 lib (debian12)

by DERUMIER, Alexandre

Hi, on debian12, ceph-dashboard is throwing a warning "Module 'dashboard' has failed dependency: PyO3 modules may only be initialized once per interpreter process" Seem to be related to pyo3 0.17 change https://github.com/PyO3/pyo3/blob/7bdc504252a2f972ba3490c44249b202a4ce6180/… " Each #[pymodule] can now only be initialized once per process To make PyO3 modules sound in the presence of Python sub-interpreters, for now it has been necessary to explicitly disable the ability to initialize a #[pymodule] more than once in the same process. Attempting to do this will now raise an ImportError. "

7 months

3
3
0 0

2024

2023

2022

2021

2020

2019

ceph-users September 2023