Hello,
we'd like to upgrade our cluster from the latest Ceph 15 to Ceph 18.
It's running with cephadm.
What's the right way to do it?
Latest Ceph 15 to the latest 16, then the latest 17, then the latest 18?
Does that work?
Or is it possible to skip a release and jump from the latest Ceph 16 straight to the
latest Ceph 18, i.e. latest Ceph 15 -> latest Ceph 16 -> latest Ceph 18?
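If the hop-by-hop route is the right one, we assume each hop with cephadm would look
roughly like this (just a sketch; the point releases are examples, not necessarily the
ones we'd pin):

    ceph -s                                          # confirm the cluster is healthy first
    ceph orch upgrade start --ceph-version 16.2.15   # staged upgrade to the next major release
    ceph orch upgrade status                         # watch until it completes, then repeat for 17.x and 18.x

Is that the recommended procedure?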
Best,
Malte
Hello all! I hope somebody can help us.
The initial situation: a Ceph cluster v15.2 (installed and managed by Proxmox) with 3 nodes based on physical servers rented from a cloud provider. Volumes are provided by Ceph via both CephFS and RBD. We run 2 MDS daemons with max_mds=1, so one daemon is active and the other is on standby.
On Thursday some of the applications stopped working. After investigating, it was clear that we had a problem with Ceph, more precisely with CephFS - both MDS daemons had suddenly crashed. We tried to restart them and found that they crash again immediately after starting. The crash information:
2024-04-17T17:47:42.841+0000 7f959ced9700 1 mds.0.29134 recovery_done -- successful recovery!
2024-04-17T17:47:42.853+0000 7f959ced9700 1 mds.0.29134 active_start
2024-04-17T17:47:42.881+0000 7f959ced9700 1 mds.0.29134 cluster recovered.
2024-04-17T17:47:43.825+0000 7f959aed5700 -1 ./src/mds/OpenFileTable.cc: In function 'void OpenFileTable::commit(MDSContext*, uint64_t, int)' thread 7f959aed5700 time 2024-04-17T17:47:43.831243+0000
./src/mds/OpenFileTable.cc: 549: FAILED ceph_assert(count > 0)
Over the next hours we read tons of articles, studied the documentation, and checked the cluster status with various diagnostic commands - but didn't find anything wrong. In the evening we decided to upgrade our Ceph cluster, so we upgraded it to v16 and finally to v17.2.7. Unfortunately, that didn't solve the problem; the MDS daemons continue to crash with the same error. The only difference we found is the "1 MDSs report damaged metadata" warning in the output of ceph -s - see below.
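As far as we understand, the damaged entries themselves can be listed with something like this (assuming rank 0 of our only filesystem 'cephfs'):

    ceph tell mds.cephfs:0 damage ls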
I suspected it might be a known bug, but couldn't find a matching one on https://tracker.ceph.com - there are several bugs involving OpenFileTable.cc, but none related to ceph_assert(count > 0).
We also looked at the source code of OpenFileTable.cc; here is a fragment of it, from the function OpenFileTable::_journal_finish:
int omap_idx = anchor.omap_idx;                 // which omap object this anchor is tracked in
unsigned& count = omap_num_items.at(omap_idx);  // item counter for that omap object
ceph_assert(count > 0);                         // this is the assert that fires for us
So we guess that the object map is empty for some object in Ceph, which is unexpected behaviour. But again, we found nothing obviously wrong in our cluster...
Next, we turned to the https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ article: we reset the journal (even though it had been OK the whole time) and wiped the sessions with the cephfs-table-tool all reset session command. No result...
I have now decided to continue following that article and ran the cephfs-data-scan scan_extents command. We started it on Friday and it is still running (2 of 3 workers have finished, so I'm waiting for the last one; maybe I need more workers for the next command, cephfs-data-scan scan_inodes, that I plan to run - see the sketch below). But I doubt it will solve the issue because, again, we believe the problem is not with the data objects in Ceph but with the metadata only...
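If more workers do help, we assume the parallel invocation for the next phase would look roughly like this (one command per worker; '<data pool>' is a placeholder for our data pool name):

    cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 <data pool>
    cephfs-data-scan scan_inodes --worker_n 1 --worker_m 4 <data pool>
    cephfs-data-scan scan_inodes --worker_n 2 --worker_m 4 <data pool>
    cephfs-data-scan scan_inodes --worker_n 3 --worker_m 4 <data pool>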
Is this a new bug? Or something else? What else should we try in order to get our MDS daemons running again? Any idea is welcome!
The important outputs:
ceph -s
cluster:
id: 4cd1c477-c8d0-4855-a1f1-cb71d89427ed
health: HEALTH_ERR
1 MDSs report damaged metadata
insufficient standby MDS daemons available
83 daemons have recently crashed
3 mgr modules have recently crashed
services:
mon: 3 daemons, quorum asrv-dev-stor-2,asrv-dev-stor-3,asrv-dev-stor-1 (age 22h)
mgr: asrv-dev-stor-2(active, since 22h), standbys: asrv-dev-stor-1
mds: 1/1 daemons up
osd: 18 osds: 18 up (since 22h), 18 in (since 29h)
data:
volumes: 1/1 healthy
pools: 5 pools, 289 pgs
objects: 29.72M objects, 5.6 TiB
usage: 21 TiB used, 47 TiB / 68 TiB avail
pgs: 287 active+clean
2 active+clean+scrubbing+deep
io:
client: 2.5 KiB/s rd, 172 KiB/s wr, 261 op/s rd, 195 op/s wr
ceph fs dump
e29480
enable_multiple, ever_enabled_multiple: 0,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1
Filesystem 'cephfs' (1)
fs_name cephfs
epoch 29480
flags 12 joinable allow_snaps allow_multimds_snaps
created 2022-11-25T15:56:08.507407+0000
modified 2024-04-18T16:52:29.970504+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 14728
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=156636152}
failed
damaged
stopped
data_pools [5]
metadata_pool 6
inline_data disabled
balancer
standby_count_wanted 1
[mds.asrv-dev-stor-1{0:156636152} state up:active seq 6 laggy since 2024-04-18T16:52:29.970479+0000 addr [v2:172.22.2.91:6800/2487054023,v1:172.22.2.91:6801/2487054023] compat {c=[1],r=[1],i=[7ff]}]
cephfs-journal-tool --rank=cephfs:0 journal inspect
Overall journal integrity: OK
ceph pg dump summary
version 41137
stamp 2024-04-18T21:17:59.133536+0000
last_osdmap_epoch 0
last_pg_scan 0
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG
sum 29717605 0 0 0 0 6112544251872 13374192956 28493480 1806575 1806575
OSD_STAT USED AVAIL USED_RAW TOTAL
sum 21 TiB 47 TiB 21 TiB 68 TiB
ceph pg dump pools
POOLID OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG DISK_LOG
8 31771 0 0 0 0 131337887503 2482 140 401246 401246
7 839707 0 0 0 0 3519034650971 736 61 399328 399328
6 1319576 0 0 0 0 421044421 13374189738 28493279 206749 206749
5 27526539 0 0 0 0 2461702171417 0 0 792165 792165
2 12 0 0 0 0 48497560 0 0 6991 6991
---
Best regards,
Alexey Gerasimov
System Manager
www.opencascade.com
www.capgemini.com
We operate a tiny ceph cluster (v16.2.7) across three machines, each
running two OSDs and one of each mds, mgr, and mon. The cluster serves
one main erasure-coded (2+1) storage pool and a few other
management-related pools. The cluster has been running smoothly for
several months.
A few weeks ago we noticed a health warning reporting backfillfull/nearfull OSDs and pools. Here is the output of `ceph -s` at that point (extracted from logs):
--------------------------------------------------------------------------------
cluster:
health: HEALTH_WARN
1 backfillfull osd(s)
2 nearfull osd(s)
Reduced data availability: 163 pgs inactive, 1 pg peering
Low space hindering backfill (add storage if this doesn't
resolve itself): 2 pgs backfill_toofull
Degraded data redundancy: 1486709/10911157 objects degraded
(13.626%), 68 pgs degraded, 68 pgs undersized
162 pgs not scrubbed in time
6 pool(s) backfillfull
services:
mon: 3 daemons, quorum mon.101,mon.102,mon.100 (age 5m)
mgr: mgr-102(active, since 54m), standbys: mgr-101, mgr-100
mds: 1/1 daemons up, 1 standby, 1 hot standby
osd: 6 osds: 6 up (since 4m), 6 in (since 2w); 7 remapped pgs
data:
volumes: 1/1 healthy
pools: 6 pools, 338 pgs
objects: 3.64M objects, 14 TiB
usage: 13 TiB used, 1.7 TiB / 15 TiB avail
pgs: 47.929% pgs unknown
0.296% pgs not active
1486709/10911157 objects degraded (13.626%)
52771/10911157 objects misplaced (0.484%)
162 unknown
106 active+clean
67 active+undersized+degraded
1 active+undersized+degraded+remapped+backfill_toofull
1 remapped+peering
1 active+remapped+backfill_toofull
--------------------------------------------------------------------------------
In hindsight I now see the large number of PGs in state unknown, and the fact that a significant fraction of objects was degraded despite all OSDs being up, but we didn't notice this back then.
Because the cluster continued to behave fine from the perspective of the mounted filesystem, we didn't really register the potential problem and did not intervene. From then on, things have mostly gone downhill.
Now, `ceph -s` reports the following:
--------------------------------------------------------------------------------
cluster:
health: HEALTH_WARN
noout flag(s) set
Reduced data availability: 117 pgs inactive
Degraded data redundancy: 2095625/12121767 objects degraded
(17.288%), 114 pgs degraded, 114 pgs undersized
117 pgs not scrubbed in time
services:
mon: 3 daemons, quorum mon.101,mon.102,mon.100 (age 15h)
mgr: mgr-102(active, since 7d), standbys: mgr-100, mgr-101
mds: 1/1 daemons up, 1 standby, 1 hot standby
osd: 6 osds: 6 up (since 55m), 6 in (since 3w)
flags noout
data:
volumes: 1/1 healthy
pools: 6 pools, 338 pgs
objects: 4.04M objects, 15 TiB
usage: 12 TiB used, 2.8 TiB / 15 TiB avail
pgs: 34.615% pgs unknown
2095625/12121767 objects degraded (17.288%)
117 unknown
114 active+undersized+degraded
107 active+clean
--------------------------------------------------------------------------------
Note in particular the still very large number of PGs in state unknown, which hasn't changed in days. The same goes for the degraded PGs. Also, the cluster should have around 37 TiB of storage available, but it now reports only 15 TiB.
We did a bit of digging around but couldn't really get to the bottom of the unknown PGs and how we can recover from that. One other data point is that the command `ceph osd df tree` gets stuck on two of the three machines, and on the one where it does return something, it looks like this:
--------------------------------------------------------------------------------
ID   CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP    META    AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
-1          47.67506         -      0 B      0 B      0 B     0 B     0 B      0 B      0     0     -          root default
-13         18.26408         -      0 B      0 B      0 B     0 B     0 B      0 B      0     0     -          datacenter dc.100
 -5         18.26408         -      0 B      0 B      0 B     0 B     0 B      0 B      0     0     -          host osd-100
  3   hdd   10.91409   1.00000      0 B      0 B      0 B     0 B     0 B      0 B      0     0    91      up  osd.3
  5   hdd    7.34999   1.00000      0 B      0 B      0 B     0 B     0 B      0 B      0     0    48      up  osd.5
 -9         14.69998         -      0 B      0 B      0 B     0 B     0 B      0 B      0     0     -          datacenter dc.101
 -7         14.69998         -      0 B      0 B      0 B     0 B     0 B      0 B      0     0     -          host osd-101
  0   hdd    7.34999   1.00000      0 B      0 B      0 B     0 B     0 B      0 B      0     0    83      up  osd.0
  1   hdd    7.34999   1.00000      0 B      0 B      0 B     0 B     0 B      0 B      0     0    86      up  osd.1
-11         14.71100         -   15 TiB   12 TiB   12 TiB  77 MiB  21 GiB  2.6 TiB  82.00  1.00     -          datacenter dc.102
-17          7.35550         -  7.4 TiB  6.3 TiB  6.2 TiB  16 MiB  11 GiB  1.1 TiB  85.16  1.04     -          host osdroid-102-1
  4   hdd    7.35550   1.00000  7.4 TiB  6.3 TiB  6.2 TiB  16 MiB  11 GiB  1.1 TiB  85.16  1.04   114      up  osd.4
-15          7.35550         -  7.4 TiB  5.8 TiB  5.7 TiB  61 MiB  10 GiB  1.6 TiB  78.83  0.96     -          host osdroid-102-2
  2   hdd    7.35550   1.00000  7.4 TiB  5.8 TiB  5.7 TiB  61 MiB  10 GiB  1.6 TiB  78.83  0.96   107      up  osd.2
                         TOTAL   15 TiB   12 TiB   12 TiB  77 MiB  21 GiB  2.6 TiB  82.00
MIN/MAX VAR: 0/1.04  STDDEV: 66.97
--------------------------------------------------------------------------------
The odd part here is that for some reason only osd.2 and osd.4 seem to
contribute size to the cluster. Interestingly, accessing content from
the storage pool works mostly without issues, which shouldn't work if 4
out of 6 OSDs weren't properly up.
Even more odd is that while `ceph health detail` reports a lot of PGs in state unknown, undersized, and degraded, inspecting the respective PGs with `ceph pg <pgid> query` reports active+clean for *all* of them...
I'm not sure which of the two pieces of information I am supposed to trust...
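For example, this is the kind of check we ran (the PG id below is hypothetical; we took the ids straight from `ceph health detail`):

    ceph pg 2.1f query | grep '"state"'    # prints "active+clean" even for PGs listed as unknown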
Any ideas what we can do to get our cluster back into a sane state? I'm
happy to provide more logs or command output, please let me know.
Thanks!
Hi All,
*Something* is chewing up a lot of space on our `/var` partition, to the point where we're getting warnings about the Ceph monitor running out of space (i.e. > 70% full).
I've been looking, but I can't find anything significant (i.e. log files aren't too big, etc.). BUT there seem to be a hell of a lot (15) of sub-directories (with GUIDs for names) under the `/var/lib/containers/storage/overlay/` folder, all ending with `merged` - i.e. `/var/lib/containers/storage/overlay/{{GUID}}/merged`.
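In case it helps, this is roughly how I've been checking usage so far (assuming GNU du and podman are available on the host):

    du -xh --max-depth=1 /var | sort -h   # which directory tree is eating the space
    podman system df                      # space used by container images/containers/volumes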
Is this normal, or is something going wrong somewhere, or am I looking
in the wrong place?
Also, if this is the issue, can I delete these folders?
Sorry for asking such a noob Q, but the Cephadm/Podman stuff is
extremely new to me :-)
Thanks in advance
Cheers
Dulux-Oz
Hi,
Trying to delete images in a Ceph pool is causing errors in one of
the clusters. I rebooted all the monitor nodes sequentially to see if the
error went away, but it still persists. What is the best way to fix this?
The Ceph cluster is in an OK state, with no rebalancing or scrubbing happening (I did set the noscrub and nodeep-scrub flags), and there is almost no load on the cluster, very little I/O.
root@ceph-mon01 ~# rbd rm 000dca3d-4f2b-4033-b8f5-95458e0c3444_disk_delete -p compute
Removing image: 31% complete...2024-04-18 20:42:52.525135 7f6de0c79700 -1
NetHandler create_socket couldn't create socket (24) Too many open files
Removing image: 32% complete...2024-04-18 20:42:52.539882 7f6de9c7b700 -1
NetHandler create_socket couldn't create socket (24) Too many open files
2024-04-18 20:42:52.541508 7f6de947a700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.546613 7f6de0c79700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.558133 7f6de9c7b700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.573819 7f6de947a700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.589733 7f6de0c79700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
Removing image: 33% complete...2024-04-18 20:42:52.643489 7f6de9c7b700 -1
NetHandler create_socket couldn't create socket (24) Too many open files
2024-04-18 20:42:52.727262 7f6de0c79700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.737135 7f6de9c7b700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.743292 7f6de947a700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.746167 7f6de0c79700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.757404 7f6de9c7b700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
Removing image: 34% complete...2024-04-18 20:42:52.773182 7f6de947a700 -1
NetHandler create_socket couldn't create socket (24) Too many open files
2024-04-18 20:42:52.773222 7f6de947a700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.789847 7f6de0c79700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
2024-04-18 20:42:52.844201 7f6de9c7b700 -1 NetHandler create_socket
couldn't create socket (24) Too many open files
^C
root@ceph-mon01 ~#
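In case it's relevant: error (24) is EMFILE, so it looks like the rbd client process is hitting its open-file limit. This is what I was planning to check next (a sketch; <pid> would be the PID of the running rbd command):

    ulimit -n                              # per-shell file-descriptor limit
    grep 'open files' /proc/<pid>/limits   # effective limit of the running process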
Thanks,
Pardh
Hi,
We recently upgraded one of our clusters from Quincy 17.2.6 to Reef 18.2.1; since then we have had 3 instances of our RGWs stopping processing requests. We have 3 hosts that each run a single instance of RGW, and all 3 seem to stop processing requests at the same time, causing our storage to become unavailable. A restart or redeploy of the RGW service brings them back OK. The cluster was originally deployed using ceph-ansible, but has since been adopted by cephadm, which is how the upgrade was performed.
We have enabled debug logging, as there was nothing out of the ordinary in the normal logs, and are currently sifting through the output from the last crash.
We are just wondering whether it is possible to run Quincy RGWs instead of Reef ones, as we didn't have this issue prior to the upgrade?
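If mixing versions like that is supported at all, we assume pinning a single RGW daemon back to a Quincy image with cephadm would look something like this (the daemon name and image tag below are just examples, not verified):

    ceph orch daemon redeploy rgw.default.host1.abcdef quay.io/ceph/ceph:v17.2.7

though we don't know whether running Quincy RGWs against a Reef cluster is a supported combination.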
We have 3 clusters in a multisite setup; we are holding off on upgrading the other 2 clusters due to this issue.
Thanks
Iain
Iain Stott
OpenStack Engineer
Iain.Stott(a)thg.com
www.thg.com
Hello,
I am using Ceph RGW for S3. Is it possible to create (sub)users that
cannot create/delete buckets and are limited to specific buckets?
In the end, I want to create 3 separate users and a bucket for each of them. Each user should only have access to its own bucket and should not be able to create new buckets or delete existing ones.
One approach could be to limit max_buckets to 1 so the user cannot create new buckets, but the user would still have access to other buckets and would still be able to delete buckets.
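For what it's worth, the direction I was considering combines max_buckets with a per-bucket policy - a sketch with made-up user/bucket names (user1/bucket1), assuming the awscli is pointed at the RGW endpoint:

    # create the user and keep it from creating additional buckets
    radosgw-admin user create --uid=user1 --display-name="User 1" --max-buckets=1

    # policy.json: allow only user1 to use bucket1
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/user1"]},
        "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Resource": ["arn:aws:s3:::bucket1", "arn:aws:s3:::bucket1/*"]
      }]
    }

    # apply it (endpoint URL is an example)
    aws --endpoint-url http://rgw.example.com:8080 s3api put-bucket-policy --bucket bucket1 --policy file://policy.json

But I'm not sure this actually stops a user from deleting its own bucket, hence the question.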
Any advice here? Thanks!
Sinan
Hi all,
Do the Mons store any crushmap history, and if so how does one get at it
please?
I ask because we've recently encountered an issue in a medium-scale (~5 PB raw), EC-based, RGW-focused cluster where "something" happened - we still don't know what - that suddenly caused 94% of objects (5.4 billion of them) to be reported as misplaced. We've tracked down the first log message of that pgmap state change:
Mar 29 10:30:31 mon1 bash[5804]: debug 2024-03-29T10:30:31.152+0000 7f3b6e378700 0 log_channel(cluster) log [DBG] : pgmap v44327: 2273 pgs: 225 active+clean, 2038 active+remapped+backfill_wait, 10 active+remapped+backfilling; 1.6 PiB data, 2.1 PiB used, 2.2 PiB / 4.3 PiB avail; 5426274136/5752755429 objects misplaced (94.325%); 248 MiB/s, 109 objects/s recovering
This appears to have been preceded (aside from a single HTTP HEAD request coming into RGW) by a 5-minute gap in the logs, where either journald couldn't keep up with the debug messages or the mons were stuck. The last log entry before that gap seems to be a compaction event kicking off:
mon1 bash[25927]:      Int      0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0    0.0   0.00   0.00    0   0.000     0     0
Mar 29 10:24:14 mon1 bash[25927]: ** Compaction Stats [L] **
Mar 29 10:24:14 mon1 bash[25927]: Priority  Files  Size  Score  Read(GB)  Rn(GB)  Rnp1(GB)  Write(GB)  Wnew(GB)  Moved(GB)  W-Amp  Rd(MB/s)  Wr(MB/s)  Comp(sec)  CompMergeCPU(sec)  Comp(cnt)  Avg(sec)  KeyIn  KeyDrop
Mar 29 10:24:14 mon1 bash[25927]: -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Mar 29 10:24:14 mon1 bash[25927]:      Low      0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0  116.0   11.4   0.02   0.01    7   0.003   490   462
Mar 29 10:24:14 mon1 bash[25927]:     High      0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0    1.9   1.23   1.20   28   0.044     0     0
Mar 29 10:24:14 mon1 bash[25927]:     User      0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   16.4   0.00   0.00    1   0.001     0     0
We're left wondering what on earth happened to cause such a huge redistribution of data in the cluster when we've made no corresponding changes, so we want to see if there are any breadcrumbs we can find.
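The only avenue we've come up with ourselves is pulling historical OSD maps from the mons (assuming the relevant epochs haven't been trimmed yet) and decompiling the CRUSH map out of each, roughly:

    ceph report | grep -E '"osdmap_(first|last)_committed"'   # epoch range the mons still hold
    ceph osd getmap 12345 -o osdmap.12345                     # 12345 is an example epoch
    osdmaptool osdmap.12345 --export-crush crush.12345
    crushtool -d crush.12345 -o crush.12345.txt

but if there's a better way to see what (if anything) changed around that time, we're all ears.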
Appreciate any pointers!
--
Cheers,
~Blairo