hey Gal and Eric,
in today's standup, we discussed the version of our apache arrow
submodule. it's currently pinned at 6.0.1, which was tagged in nov.
2021. the centos9 builds are using the system package
libarrow-devel-9.0.0. arrow's upstream recently tagged an 11.0.0
release.
as far as i know, there still aren't any system packages for ubuntu,
so we're likely to be stuck with the submodule for quite a while. how
do you guys want to handle these updates? is it worth trying to update
before the reef release?
hi Ernesto and lists,
> [1] https://github.com/ceph/ceph/pull/47501
are we planning to backport this to quincy so we can support centos 9
there? enabling that upgrade path on centos 9 was one of the
conditions for dropping centos 8 support in reef, which i'm still keen
to do
if not, can we find another resolution to
https://tracker.ceph.com/issues/58832? as i understand it, all of
those python packages exist in centos 8. do we know why they were
dropped for centos 9? have we looked into making those available in
epel? (cc Ken and Kaleb)
On Fri, Sep 2, 2022 at 12:01 PM Ernesto Puerta <epuertat(a)redhat.com> wrote:
>
> Hi Kevin,
>
>>
>> Isn't this one of the reasons containers were pushed, so that the packaging isn't as big a deal?
>
>
> Yes, but the Ceph community has a strong commitment to provide distro packages for those users who are not interested in moving to containers.
>
>> Is it the continued push to support lots of distros without using containers that is the problem?
>
>
> If not a problem, it definitely makes it more challenging. Compiled components often sort this out by statically linking deps whose packages are not widely available in distros. The approach we're proposing here would be the closest equivalent to static linking for interpreted code (bundling).
>
> Thanks for sharing your questions!
>
> Kind regards,
> Ernesto
> _______________________________________________
> Dev mailing list -- dev(a)ceph.io
> To unsubscribe send an email to dev-leave(a)ceph.io
Hello,
There are so many ways to build Ceph from source that I'm pretty confused, so I
need some help.
I want to build Ceph regularly from "main/master" and create Debian packages
out of it.
I have a solution that more or less works, but what's the current best practice
for doing this?
On top of that, I couldn't find ANY way to build a "crimson-osd" package from
the latest sources, even after spending hours on it. What's the correct
way to do this?
Thanks!
Sascha
Hi all,
I wanted to call attention to some RGW issues that we've observed on a
Pacific cluster over the past several weeks. The problems relate to versioned
buckets and index entries that can be left behind after transactions complete
abnormally. The scenario is multi-faceted and we're still investigating some of
the details, but I wanted to provide a big-picture summary of what we've found
so far. It looks like most of these issues should be reproducible on versions
before and after Pacific as well. I'll enumerate the individual issues below:
1. PUT requests during reshard of versioned bucket fail with 404 and leave
behind dark data
Tracker: https://tracker.ceph.com/issues/61359
2. When bucket index ops are cancelled it can leave behind zombie index entries
The fix for this one was merged a few months ago and did make the v16.2.13
release, but in our case we had accumulated billions of extra index entries
by the time we upgraded to the patched version.
Tracker: https://tracker.ceph.com/issues/58673
3. Issuing a delete for a key that already has a delete marker as the current
version leaves behind index entries and OLH objects
Note that the tracker's original description describes the problem a bit
differently, but I've clarified the nature of the issue in a comment.
Tracker: https://tracker.ceph.com/issues/59663
The extra index entries and OLH objects left behind by these sorts of issues
are obviously annoying in that they unnecessarily consume space, but we've
found that they can also cause severe performance degradation for bucket
listings, lifecycle processing, and other ops, indirectly via higher OSD
latencies.
The reason for the performance impact is that bucket listing calls must
repeatedly perform additional OSD ops until they find the requisite number
of entries to return. The OSD cls method for bucket listing also does its own
internal iteration for the same purpose. Since these entries are invalid, they
are skipped. In the case that we observed, where some of our bucket indexes were
filled with a sea of contiguous leftover entries, the process of continually
iterating over and skipping invalid entries caused enormous read amplification.
I believe that the following tracker is describing symptoms that are related to
the same issue: https://tracker.ceph.com/issues/59164.
Note that this can also cause LC processing to repeatedly fail in cases where
there are enough contiguous invalid entries, since the OSD cls code eventually
gives up and returns an error that isn't handled.
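To illustrate the read amplification, here's a toy model of the listing behavior described above (my own simplification for illustration, not the actual cls_rgw code): the caller keeps issuing OSD ops until it has enough valid entries, and a long run of leftover entries turns one logical listing into many ops.

```python
# Toy model of bucket-listing read amplification. True = valid index
# entry, False = leftover/invalid entry that must be skipped.

def list_bucket(index, want, per_op=1000):
    """Scan `index` until `want` valid entries are found, reading at
    most `per_op` index keys per OSD op. Returns (entries, osd_ops)."""
    entries, ops, pos = [], 0, 0
    while len(entries) < want and pos < len(index):
        ops += 1
        chunk = index[pos:pos + per_op]
        pos += len(chunk)
        entries.extend(e for e in chunk if e)  # invalid entries are skipped
    return entries, ops

# A healthy index: 1000 valid entries -> a single op suffices.
healthy = [True] * 1000
# A polluted index: a sea of a million leftover entries before the
# valid ones -> every one of them is read and skipped first.
polluted = [False] * 1_000_000 + [True] * 1000
```

In this model the polluted bucket needs over a thousand OSD ops to return the same 1000 entries the healthy bucket returns in one, which is the shape of the amplification we observed.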
The severity of these issues likely varies greatly based upon client behavior.
If anyone has experienced similar problems, we'd love to hear about the nature
of how they've manifested for you so that we can be more confident that we've
plugged all of the holes.
Thanks,
Cory Snyder
11:11 Systems
Details of this release are summarized here:
https://tracker.ceph.com/issues/61515#note-1
Release Notes - TBD
Seeking approvals/reviews for:
rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
merge https://github.com/ceph/ceph/pull/51788 for
the core)
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/octopus-x - deprecated
upgrade/pacific-x - known issues, Ilya, Laura?
upgrade/reef-p2p - N/A
clients upgrades - not run yet
powercycle - Brad
ceph-volume - in progress
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
gibba upgrade was done and will need to be done again this week.
LRC upgrade TBD
TIA
Hi Chris,
I think you have missed one step, which is to change the mtime of the directory explicitly. Please have a look at the highlighted steps.
CEPHFS
===========================================================================
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# mkdir dir1
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# stat dir1
File: dir1
Size: 0 Blocks: 0 IO Block: 65536 directory
Device: 28h/40d Inode: 1099511714911 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-05-24 11:09:25.260851345 +0530
Modify: 2023-05-24 11:09:25.260851345 +0530
Change: 2023-05-24 11:09:25.260851345 +0530
Birth: 2023-05-24 11:09:25.260851345 +0530
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# touch -m -d '26 Aug 1982 22:00' dir1
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# stat dir1/
File: dir1/
Size: 0 Blocks: 0 IO Block: 65536 directory
Device: 28h/40d Inode: 1099511714911 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-05-24 11:09:25.260851345 +0530
Modify: 1982-08-26 22:00:00.000000000 +0530
Change: 2023-05-24 11:10:04.881454967 +0530
Birth: 2023-05-24 11:09:25.260851345 +0530
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# mkdir dir1/dir2
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538# stat dir1/
File: dir1/
Size: 1 Blocks: 0 IO Block: 65536 directory
Device: 28h/40d Inode: 1099511714911 Links: 3
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-05-24 11:09:25.260851345 +0530
Modify: 1982-08-26 22:00:00.000000000 +0530
Change: 2023-05-24 11:10:19.141672220 +0530
Birth: 2023-05-24 11:09:25.260851345 +0530
root@sds-ceph:/mnt/cephfs/volumes/_nogroup/test1/d5052b71-39ec-4d0a-9b0b-2091e1723538#
Note: in the last step, the "Modify" time is expected to change, but it does not.
Thanks and Regards
Sandip Divekar
From: Chris Palmer <chris.palmer(a)idnet.com>
Sent: Thursday, May 25, 2023 9:46 PM
To: Sandip Divekar <sandip.divekar(a)hitachivantara.com>; ceph-users(a)ceph.io
Cc: dev(a)ceph.io; Gavin Lucas <gavin.lucas(a)hitachivantara.com>; Joseph Fernandes <joseph.fernandes(a)hitachivantara.com>; Simon Crosland <Simon.Crosland(a)hitachivantara.com>
Subject: Re: [ceph-users] Re: Unexpected behavior of directory mtime after being set explicitly
***** EXTERNAL EMAIL *****
Hi Sandip
Ceph servers (debian11/ceph base with Proxmox installed on top - NOT the ceph that comes with Proxmox!):
ceph@pve1:~$ uname -a
Linux pve1 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) x86_64 GNU/Linux
ceph@pve1:~$ ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
Fedora workstation. I waited until the minute had clicked over before doing each step:
[chris@rex mtime]$ uname -a
Linux rex.palmer 6.2.15-300.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 11 17:37:39 UTC 2023 x86_64 GNU/Linux
[chris@rex mtime]$ rpm -q ceph-common
ceph-common-17.2.6-2.fc38.x86_64
[chris@rex mtime]$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
192.168.80.121,192.168.80.122,192.168.80.123:/data2 8589930496 4944801792 3645128704 58% /mnt/data2
[chris@rex mtime]$ mount|grep data2
systemd-1 on /mnt/data2 type autofs (rw,relatime,fd=61,pgrp=1,timeout=600,minproto=5,maxproto=5,direct,pipe_ino=22804)
192.168.80.121,192.168.80.122,192.168.80.123:/data2 on /mnt/data2 type ceph (rw,noatime,nodiratime,name=data2-rex,secret=<hidden>,fsid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx,acl,_netdev,x-systemd.mount-timeout=30,x-systemd.automount,x-systemd.idle-timeout=600)
[chris@rex mtime]$ date; mkdir one; ls -ld one
Thu 25 May 16:57:28 BST 2023
drwxrwxr-x 2 chris groupname 0 May 25 16:57 one
[chris@rex mtime]$ date; touch one; ls -ld one
Thu 25 May 16:58:14 BST 2023
drwxrwxr-x 2 chris groupname 0 May 25 16:58 one
[chris@rex mtime]$ date; mkdir one/two; ls -ld one
Thu 25 May 16:59:26 BST 2023
drwxrwxr-x 3 chris groupname 1 May 25 16:59 one
I also repeated it with the test run on the ceph debian11 server, having mounted the cephfs filesystem on the ceph server - exactly the same result.
I then repeated it again on a pure debian11 ceph 17.2.6 cluster, using a debian11 client, and it also worked as expected.
All systems have latest patches applied.
Hope that helps
Chris
On 25/05/2023 15:57, Sandip Divekar wrote:
Hi Chris,
Could you please follow the steps given in my previous mail and paste the output here?
We have encountered an issue that is easily reproducible on the latest versions of
both Quincy and Pacific. We have thoroughly investigated the matter and are certain
that no other factors are at play in this scenario.
Note : We have used Debian 11 for testing.
sdsadmin@ceph-pacific-1:~$ uname -a
Linux ceph-pacific-1 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux
sdsadmin@ceph-pacific-1:~$ sudo ceph -v
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
Thanks for your prompt reply.
Regards
Sandip Divekar
-----Original Message-----
From: Chris Palmer <chris.palmer(a)idnet.com>
Sent: Thursday, May 25, 2023 7:25 PM
To: ceph-users(a)ceph.io
Subject: [ceph-users] Re: Unexpected behavior of directory mtime after being set explicitly
***** EXTERNAL EMAIL *****
Hi Milind
I just tried this using the ceph kernel client and the ceph-common 17.2.6 package on the latest Fedora kernel, against Ceph 17.2.6, and it worked perfectly...
There must be some other factor in play.
Chris
On 25/05/2023 13:04, Sandip Divekar wrote:
Hello Milind,
We are using Ceph Kernel Client.
But we found this same behavior while using Libcephfs library.
Should we treat this as a bug? Or is there an existing bug for a similar issue?
Thanks and Regards,
Sandip Divekar
From: Milind Changire <mchangir(a)redhat.com>
Sent: Thursday, May 25, 2023 4:24 PM
To: Sandip Divekar <sandip.divekar(a)hitachivantara.com>
Cc: ceph-users(a)ceph.io; dev(a)ceph.io
Subject: Re: [ceph-users] Unexpected behavior of directory mtime after
being set explicitly
***** EXTERNAL EMAIL *****
Sandip,
What type of client are you using ?
kernel client or fuse client ?
If it's the kernel client, then it's a bug.
FYI - Pacific and Quincy fuse clients do the right thing
On Wed, May 24, 2023 at 9:24 PM Sandip Divekar <sandip.divekar(a)hitachivantara.com> wrote:
Hi Team,
I'm writing to bring to your attention an issue we have encountered with the "mtime" (modification time) behavior for directories in the Ceph filesystem.
Upon observation, we have noticed that when the mtime of a directory
(let's say: dir1) is explicitly changed in CephFS, subsequent additions of files or directories within 'dir1' fail to update the directory's mtime as expected.
This behavior appears to be specific to CephFS - we have reproduced this issue on both Quincy and Pacific. Similar steps work as expected in the ext4 filesystem amongst others.
Reproduction steps:
1. Create a directory: mkdir dir1
2. Modify its mtime using the touch command: touch dir1
3. Create a file or directory inside 'dir1': mkdir dir1/dir2
Expected result: the mtime of dir1 should change to the time the file or directory was created in step 3.
Actual result: there was no change to the mtime of 'dir1'.
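For comparison, the expected POSIX semantics can be checked with a short script on a local filesystem (paths and the old timestamp here are arbitrary; ext4 and tmpfs update the parent's mtime in step 3, which is the behavior we expected from CephFS):

```python
# Sanity check of directory mtime semantics on a local filesystem.
import os
import tempfile

root = tempfile.mkdtemp()
dir1 = os.path.join(root, "dir1")

os.mkdir(dir1)                        # step 1: create dir1
os.utime(dir1, (0, 404523000))        # step 2: set mtime explicitly to an old timestamp
before = os.stat(dir1).st_mtime

os.mkdir(os.path.join(dir1, "dir2"))  # step 3: create a directory inside dir1
after = os.stat(dir1).st_mtime        # on a local fs this has moved to "now"
```

On a POSIX-conforming filesystem `after > before` holds; in the CephFS reproduction above, the mtime stays at the explicitly set value.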
Note : For more detail, kindly find the attached logs.
Our queries are :
1. Is this expected behavior for CephFS?
2. If so, can you explain why the directory behavior is inconsistent depending on whether the mtime for the directory has previously been manually updated?
Best Regards,
Sandip Divekar
Component QA Lead SDET.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Milind
the public Jitsi rooms started displaying a warning that "The room
name is unsafe" and requiring you to click an 'I understand the risks'
checkbox before joining. i assume it's just warning that the room is
public, which was the intent. is there any way to get rid of that
warning?
Our downstream QE team recently observed an md5 mismatch of replicated
objects when testing rgw's server-side encryption in multisite. This
corruption is specific to s3 multipart uploads, and only affects the
replicated copy - the original object remains intact. The bug likely
affects Ceph releases all the way back to Luminous where server-side
encryption was first introduced.
To expand on the cause of this corruption: Encryption of multipart
uploads requires special handling around the part boundaries, because
each part is uploaded and encrypted separately. In multisite, objects
are replicated in their encrypted form, and multipart uploads are
replicated as a single part. As a result, the replicated copy loses
its knowledge about the original part boundaries required to decrypt
the data correctly.
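As a toy model of why losing the part boundaries corrupts decryption (an illustrative stand-in, not rgw's actual crypto), consider a stream cipher whose keystream restarts at each part boundary, mimicking parts being encrypted independently:

```python
# Toy stream "cipher": each part is XORed with a keystream that starts
# over at offset 0, the way independently encrypted parts would be.

def keystream(length):
    return bytes((i * 7 + 13) % 256 for i in range(length))

def xor(data, ks):
    return bytes(a ^ b for a, b in zip(data, ks))

def encrypt_part(part):
    return xor(part, keystream(len(part)))

parts = [b"A" * 16, b"B" * 16]                          # two uploaded parts
original = b"".join(parts)
ciphertext = b"".join(encrypt_part(p) for p in parts)   # what gets replicated

# Decrypting with the known part boundaries recovers the data...
good = b"".join(xor(c, keystream(16))
                for c in (ciphertext[:16], ciphertext[16:]))
# ...but decrypting the replica as a single part does not: the keystream
# no longer restarts where the second part began.
bad = xor(ciphertext, keystream(len(ciphertext)))
```

In this model only the first part of the single-part decryption comes out intact, which matches the kind of mismatch QE saw on the replicated copy.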
We don't have a fix yet, but we're tracking it in
https://tracker.ceph.com/issues/46062. The fix will only modify the
replication logic, so won't repair any objects that have already
replicated incorrectly. We'll need to develop a radosgw-admin command
to search for affected objects and reschedule their replication.
In the meantime, I can only advise multisite users to avoid using
encryption for multipart uploads. If you'd like to scan your cluster
for existing encrypted multipart uploads, you can identify them with a
s3 HeadObject request. The response would include an
x-amz-server-side-encryption header, and the ETag header value (with
the quotes removed) would be longer than 32 characters (multipart ETags
have the special form "<md5sum>-<num parts>"). Take care not to delete the
corrupted replicas, because an active-active multisite configuration
would go on to delete the original copy.
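For example, a minimal sketch of that check applied to HeadObject response headers (the function name is mine; the header names follow S3 conventions):

```python
# Classify a HeadObject response as an encrypted multipart upload:
# SSE header present, and a multipart-style ETag ("<md5sum>-<parts>",
# i.e. longer than 32 characters once the quotes are removed).

def is_encrypted_multipart(headers):
    if "x-amz-server-side-encryption" not in headers:
        return False
    etag = headers.get("ETag", "").strip('"')
    return len(etag) > 32
```

With boto3 you could feed this from `client.head_object(...)["ResponseMetadata"]["HTTPHeaders"]` for each key, and only flag objects where it returns True.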