Hello Ceph List,
I'd like to formally let the wider community know of some work I've been
involved with for a while now: adding Managed SMB Protocol Support to Ceph.
SMB is the well-known network file protocol native to Windows systems and
supported by macOS (and Linux). The other key word, "managed", means
integrating with Ceph management tooling - in this particular case cephadm for
orchestration and, eventually, a new MGR module for managing SMB shares.
The effort is still in its very early stages. We have a PR adding initial
support for Samba Containers to cephadm [1] and a prototype for an smb MGR
module [2]. We plan on using container images based on the samba-container
project [3] - a team I am already part of. What we're aiming for is a feature
set similar to the current NFS integration in Ceph, but with a focus on
bridging non-Linux/Unix clients to CephFS using a protocol built into those
systems.
A few major features we have planned include:
* Standalone servers (internally defined users/groups)
* Active Directory Domain Member Servers
* Clustered Samba support
* Exporting Samba stats via Prometheus metrics
* A `ceph` cli workflow loosely based on the nfs mgr module
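To give a flavor of that last item, here's a rough sketch of the kind of workflow
we have in mind, loosely mirroring the `ceph nfs` commands. Everything below is
illustrative only - the command names and arguments are not final and will likely
change as the prototype in [2] evolves:
# Illustrative only; not the final interface:
ceph smb cluster create mycluster
ceph smb share create mycluster share1 cephfs /volumes/group1/share1
ceph smb share ls mycluster
ceph smb share rm mycluster share1
ceph smb cluster rm mycluster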
I wanted to share this information in case there's wider community interest in
this effort. I'm happy to take your questions / thoughts / suggestions in this
email thread, via Ceph slack (or IRC), or feel free to attend a Ceph
Orchestration weekly meeting! I try to attend regularly, and we sometimes discuss
design aspects of the smb effort there. It's on the Ceph Community Calendar.
Thanks!
[1] - https://github.com/ceph/ceph/pull/55068
[2] - https://github.com/ceph/ceph/pull/56350
[3] - https://github.com/samba-in-kubernetes/samba-container/
Thanks for reading,
--John Mulligan
Hello
I have been working on the famous "rbd: unmap failed: (16) Device or resource busy"
issue, which causes problems for some projects such as Incus (LXD), and which many
people just live with... I posted it on the ceph-users mailing list, and Murilo
Morais worked with me on it for several days.
https://discuss.linuxcontainers.org/t/incus-0-x-and-ceph-rbd-map-is-sometim…
<= Work with Stephane Graber, Incus/LXD main developer =>
https://discuss.linuxcontainers.org/t/howto-delete-container-with-ceph-rbd-…
I worked out an easily reproducible setup that needs nothing beyond a stock
Ceph install: create an image, map it, format & mount it, ADD AN OSD, and try
to unmap the image ... tada! "rbd: unmap failed: (16) Device or resource
busy" is there.
Here is the more complete explanation (which is the great work of Murilo
Morais):
> I managed to reproduce. The problem is how docker/podman binds "/" to
> "/rootfs" in containers. When ceph creates the files for systemd to start
> the services, it includes a bind of the system root to /rootfs [1]; I do not
> recommend removing this bind, as it will break the MON.
>
> By default they use "rprivate" [2][3], which causes the host's mount points
> to be propagated into the container, but the container does not receive any
> "mount" or "umount" events from the host afterwards [4]. This causes this
> behavior in your cluster.
>
> This will happen whenever any container starts/restarts, regardless of
> whether it is a new daemon or not.
>
> A quick alternative would be to change the unit files in
> /var/lib/ceph/<fsid>/<daemon>/ and add "slave" or "rslave" to the podman
> bind argument: where it contains "-v /:/rootfs", add ":slave", leaving
> "-v /:/rootfs:slave". The inconvenience is that it will be necessary to
> restart all daemons, and, when adding/redeploying a daemon, you will have
> to perform the same steps. A definitive solution would be to change the
> source code, but I didn't have time to try this option. Tomorrow, as soon
> as I get to the office, I will try to do this; I will report back to you
> as soon as I discover something!
>
> If you wish, you can respond to the public list about your problem and the
> need to change the bind to rootfs, for everyone to see and so that some of
> the project's devs can perhaps comment on something.
>
> Have a good night!
> [1] https://github.com/ceph/ceph/blob/7714874efb08facee80f92b358993fa56854bb01/…
> [2] https://docs.docker.com/storage/bind-mounts/#configure-bind-propagation
> [3] https://docs.podman.io/en/latest/markdown/podman-create.1.html#mount-type-t…
> [4] https://man7.org/linux/man-pages/man7/mount_namespaces.7.html
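For reference, here is a minimal sketch of the quick workaround Murilo describes,
assuming the per-daemon unit.run files under /var/lib/ceph/<fsid>/<daemon>/ contain
a literal "-v /:/rootfs" podman argument (check your own files first; the sed
pattern and the systemd unit name are illustrative, not tested advice):
# Append ":rslave" to the rootfs bind so umount events propagate into the container:
sed -i 's|-v /:/rootfs|-v /:/rootfs:rslave|' /var/lib/ceph/<fsid>/<daemon>/unit.run
# Then restart the daemon so the new bind option takes effect, e.g.:
systemctl restart ceph-<fsid>@<daemon>.service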
And here is the reproduction log:
First I create a new image and map it (to keep it separate from the Incus ones):
root@ceph02-r2b-fl1:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 53.12473 root default
...
-5 3.63879 host ceph02-r2b-fl1
6 hdd 0.90970 osd.6 up 1.00000 1.00000
9 hdd 0.90970 osd.9 up 1.00000 1.00000
2 nvme 0.90970 osd.2 up 1.00000 1.00000
4 nvme 0.90970 osd.4 up 1.00000 1.00000
...
root@ceph02-r2b-fl1:~# rbd create image1 --size 1024 --pool
customers-clouds.ix-mrs2.fr.eho
root@ceph02-r2b-fl1:~# RBD_DEVICE=$(rbd map
customers-clouds.ix-mrs2.fr.eho/image1)
root@ceph02-r2b-fl1:~# mkfs.ext4 ${RBD_DEVICE}
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: c97362e1-11db-4ff3-ba62-ede6d58884b9
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
root@ceph02-r2b-fl1:~# mount ${RBD_DEVICE} /media/test
Let's list current mapped devices.
root@ceph02-r2b-fl1:~# mount | grep rbd
/dev/rbd4 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd5 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd2 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd1 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd3 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd0 on /var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd6 on /media/test type ext4 (rw,relatime,stripe=16)
============================================================================================
===============> Adding the new OSD (an old small disk to be faster...) <===================
===========================   (via service task in dashboard)   ===========================
============================================================================================
root@ceph02-r2b-fl1.ep-ws.fr.eholab.admin:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 53.25822 root default
...
-5 3.77229 host ceph02-r2b-fl1
6 hdd 0.90970 osd.6 up 1.00000 1.00000
9 hdd 0.90970 osd.9 up 1.00000 1.00000
 26 hdd 0.13350 osd.26 up 1.00000 1.00000   <=== Here's the brand new OSD
2 nvme 0.90970 osd.2 up 1.00000 1.00000
4 nvme 0.90970 osd.4 up 1.00000 1.00000
....
============ Let's check what the new OSD container's namespace contains ... =======================
root@ceph02-r2b-fl1:~# podman ps
CONTAINER ID IMAGE
COMMAND
CREATED STATUS PORTS NAMES
.....
6dbd4fd4e4f3 cephpodregistry:5000/ceph@sha256:e205163225ec8ce460d6581df66ba4866585e3a4817866910f85bedcdcff7935
-n osd.26 -f --se... 2 minutes ago Up 2 minutes ago
ceph-c3f59906-c43d-11ee-a2d6-3a82cb8036b6-osd-26
root@ceph02-r2b-fl1.ep-ws.fr.eholab.admin:~# podman exec 6dbd4fd4e4f3 mount | grep rbd
/dev/rbd4 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd5 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd2 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd1 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd3 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd0 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-xx type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd6 on /rootfs/media/test type ext4 (rw,relatime,stripe=16)
======= Of course these mountpoints are not needed by the newly created OSD ... and this full copy is problematic ============
And then:
root@ceph02-r2b-fl1:~# umount /dev/rbd6
- I can unmount the rbd device ... no reference left in the host NS -
root@ceph02-r2b-fl1:~# rbd unmap /dev/rbd6
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
root@ceph02-r2b-fl1:~# podman exec 6dbd4fd4e4f3 mount | grep rbd
/dev/rbd4 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-06c995c3 type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd5 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-0bc99da2 type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd2 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-62cc652e type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd1 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-5dcc5d4f type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd3 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-efd3feea type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd0 on /rootfs/var/lib/incus/storage-pools/default/containers/ec-59cc5703 type ext4 (rw,relatime,discard,stripe=16)
/dev/rbd6 on /rootfs/media/test type ext4 (rw,relatime,stripe=16)
... because the OSD container's namespace still holds the mount (as it does for all the mapped rbd devices ...)
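A quick way to confirm that from the host (the container name is taken from the
podman ps output above; the exact commands are illustrative):
# Does the OSD container's mount namespace still reference rbd6?
OSD_PID=$(podman inspect --format '{{.State.Pid}}' \
  ceph-c3f59906-c43d-11ee-a2d6-3a82cb8036b6-osd-26)
grep rbd6 /proc/${OSD_PID}/mountinfo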
I hope someone can create a ticket for this bug in the Ceph bug tracker; I
couldn't find a way to create a ticket without being "a member".
Regards
Nicolas FOURNIL
https://www.eho.link
I am modifying the run-clang-tidy script so that it does NOT run clang-tidy
checks on specified files. My plan is to use a JSON/CSV file listing all the
files that clang-tidy should not run on. Should I continue with this approach,
or is there a better way to achieve this? If I continue on this path, which is
better in terms of ease of use for users, CSV or JSON?
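For illustration, here is a rough sketch of the kind of JSON file I have in mind
and how it could be consumed (the file name, schema, and paths below are just
placeholders, nothing is decided, and this is not an existing run-clang-tidy feature):
# Placeholder exclusion list; name and schema are only an example:
cat > clang-tidy-exclude.json <<'EOF'
{ "exclude": [ "src/rocksdb/", "src/seastar/" ] }
EOF
# A modified run-clang-tidy could read the same list and skip any source file
# whose path starts with one of these prefixes, e.g.:
jq -r '.exclude[]' clang-tidy-exclude.json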
Regards,
Suyash Dongre
there was a consensus to drop support for ubuntu focal and centos
stream 8 with the squid release, and i'd love to remove those distros
from the shaman build matrix for squid and main branches asap
however, i see that quincy never supported ubuntu jammy, so our quincy
upgrade tests still have to run against focal. that means we'd still
have to build focal packages for squid
would it be possible to start building jammy packages for quincy to
allow those upgrade tests to run jammy instead?
this isn't an issue on the centos side because we've been building
centos 9 packages for quincy even though it's not listed in
https://docs.ceph.com/en/latest/start/os-recommendations/#platforms
Hi Folks,
I'm off to go see the eclipse! Have a great week everyone.
Thanks,
Mark
--
Best Regards,
Mark Nelson
Head of R&D (USA)
Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nelson(a)clyso.com
We are hiring: https://www.clyso.com/jobs/
this ceph-object-corpus repo is the basis of our ceph-dencoder test
src/test/encoding/readable.sh, which verifies that we can still decode
all of the data structures encoded by older ceph versions
i'd like to raise awareness that this ceph-object-corpus repo hasn't
been updated with new encodings since pacific 16.2.0, so we're missing
important regression test coverage since then
Nitzan prepared the encodings for reef 18.2.0 in
https://github.com/ceph/ceph-object-corpus/pull/17, but those haven't
merged yet. i had opened https://github.com/ceph/ceph/pull/54735 to
test that, but 'make check' identified failures like:
> The following tests FAILED:
> 147 - readable.sh (Failed)
>
> **** reencode of /home/jenkins-build/build/workspace/ceph-pull-requests/ceph-object-corpus/archive/18.2.0/objects/chunk_refs_t/ccb69d9ecd572c1f6ed9598899773cf1 resulted in a different dump ****
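for context, readable.sh essentially walks the corpus archive and runs
ceph-dencoder over each archived object, checking that it still decodes and that
a reencode dumps identically. a rough manual equivalent for the object in the
failure above (paths are illustrative, relative to a ceph checkout):
# decode one archived corpus object and dump it as JSON:
ceph-dencoder type chunk_refs_t \
  import ceph-object-corpus/archive/18.2.0/objects/chunk_refs_t/ccb69d9ecd572c1f6ed9598899773cf1 \
  decode dump_json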
can we find a way to prioritize this? it would be great to have these
reef encodings while we're validating the squid release
Hello,
If you have never used Trello for PR testing, you may stop reading this email.
We used to keep track of PR testing and collaboration in Trello.
However, after Trello introduced a limit of 10 collaborators in
the free version, we decided to move to Tracker.
Ceph QA Project
Here is the project we started using for Ceph QA:
https://tracker.ceph.com/projects/ceph-qa/issues
Getting Help
If you have any questions or suggestions, please post in the Slack
chat channel "ps_testing" or reply to this email.
Regards,
YuriW
Hi all,
CDM is happening this week, April 3rd at 11:00 AM ET. See more meeting
details below.
Please add any topics you'd like to discuss to the agenda:
https://tracker.ceph.com/projects/ceph/wiki/CDM_03-APR-2024
Thanks,
Laura Flores
Meeting link: https://meet.jit.si/ceph-dev-monthly
Time conversions:
UTC: Wednesday, April 3, 15:00 UTC
Mountain View, CA, US: Wednesday, April 3, 8:00 PDT
Phoenix, AZ, US: Wednesday, April 3, 8:00 MST
Denver, CO, US: Wednesday, April 3, 9:00 MDT
Huntsville, AL, US: Wednesday, April 3, 10:00 CDT
Raleigh, NC, US: Wednesday, April 3, 11:00 EDT
London, England: Wednesday, April 3, 16:00 BST
Paris, France: Wednesday, April 3, 17:00 CEST
Helsinki, Finland: Wednesday, April 3, 18:00 EEST
Tel Aviv, Israel: Wednesday, April 3, 18:00 IDT
Pune, India: Wednesday, April 3, 20:30 IST
Brisbane, Australia: Thursday, April 4, 1:00 AEST
Singapore, Asia: Wednesday, April 3, 23:00 +08
Auckland, New Zealand: Thursday, April 4, 4:00 NZDT
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores(a)ibm.com | lflores(a)redhat.com
M: +17087388804