Matt recently raised the issue of ceph assertions in production code,
which reminded me of Sage's 2016 PR
https://github.com/ceph/ceph/pull/9969 that added a
ceph_assert_always(). the idea was to eventually make ceph_assert()
conditional on NDEBUG to match the behavior of libc's assert(),
leaving ceph_assert_always() as the marked unconditional case.
i would love to see this finally happen, but there are some potential risks:
* ceph_assert()s with side effects won't behave as expected in release
builds. assert() documents this same issue at
https://www.man7.org/linux/man-pages/man3/assert.3.html#BUGS. if we
could at least identify these cases, we can switch them to
* in teuthology, we test the same release builds that we ship to
users. that means teuthology won't catch the code paths that trigger
debug assertions. if those lead to crashes, they could be much harder
to debug without the assertions and backtraces
* conversely, merging pull requests after a successful teuthology run
may introduce new assertions in debug builds. it could be annoying for
developers to track down and fix new assertions after pulling the
latest main or stable release branch
* unused variable warnings in release builds where ceph_assert() was
the only reference. at least the compiler will catch all of these for
us, and [[maybe_unused]] annotations can clear them up
in general, do folks agree that this is a change worth making? if so,
what can we do to mitigate the risks?
if not, how should we handle the use of ceph_assert() vs raw assert()
in new code? should there be some guidance in CodingStyle?
as a half-measure, we might introduce a new ceph_assert_debug() as an
alternative to raw assert(), then convert some existing uses of
ceph_assert() on a case-by-case basis
When Reef was released, the announcement said that Debian packages would
be built once the blocking bug in Bookworm was fixed. As I noted on the
tracker item https://tracker.ceph.com/issues/61845 a couple of weeks
ago, that is now the case after the most recent Bookworm point release.
I also opened a PR to make the minimal change that would build Reef
packages on Bookworm. I subsequently opened another PR to fix some
low-hanging fruit in terms of packaging errors - missing #! in
maintscripts, syntax errors in debian/control, erroneous dependencies on
Essential packages. Neither PR has had any feedback/review as far as
I can see.
Those packages (and the previous state of the debian/ tree) had some
significant problems - no copyright file, and some of them contain
python scripts without declaring a python dependency, so I've today
submitted a slightly larger PR that brings the dh compatibility level up
to what I think the latest lowest-common-denominator level is, as well
as fixing these errors.
I believe these changes all ought to go into the reef branch, but
obviously you might prefer to just make the bare-minimum-to-build change
in the first PR.
Is there any chance of having some reef packages for Bookworm, please?
Relatedly, is there interest in further packaging fixes for future
branches? lintian still has quite a lot to say about the .debs for Ceph,
and while you might reasonably not want to care about crossing every t
of Debian policy, I think there are still changes that would be worth
I should declare a bit of an interest here - I'd like to evaluate
cephadm for work use, which would require us to be able to build our own
packages per local policy, which in turn would mean I'd want to get
Debian-based images going again. But that requires Reef .debs being
available to install onto said images :)
CDM is happening tomorrow, November 1st at 9:00 PM EDT. See more meeting details below.
Please add any topics you'd like to discuss to the agenda:
UTC: Thursday, November 2, 1:00 UTC
Mountain View, CA, US: Wednesday, November 1, 18:00 PDT
Phoenix, AZ, US: Wednesday, November 1, 18:00 MST
Denver, CO, US: Wednesday, November 1, 19:00 MDT
Huntsville, AL, US: Wednesday, November 1, 20:00 CDT
Raleigh, NC, US: Wednesday, November 1, 21:00 EDT
London, England: Thursday, November 2, 1:00 GMT
Paris, France: Thursday, November 2, 2:00 CET
Helsinki, Finland: Thursday, November 2, 3:00 EET
Tel Aviv, Israel: Thursday, November 2, 3:00 IST
Pune, India: Thursday, November 2, 6:30 IST
Brisbane, Australia: Thursday, November 2, 11:00 AEST
Singapore, Asia: Thursday, November 2, 9:00 +08
Auckland, New Zealand: Thursday, November 2, 14:00 NZDT
Software Engineer, Ceph Storage <https://ceph.io>
lflores(a)ibm.com | lflores(a)redhat.com <lflores(a)redhat.com>
We're happy to announce the 7th backport release in the Quincy series.
* `ceph mgr dump` command now displays the name of the Manager module that
registered a RADOS client in the `name` field added to elements of the
`active_clients` array. Previously, only the address of a module's RADOS
client was shown in the `active_clients` array.
* mClock Scheduler: The mClock scheduler (default scheduler in Quincy) has
undergone significant usability and design improvements to address the slow
backfill issue. Some important changes are:
* The 'balanced' profile is set as the default mClock profile because it
represents a compromise between prioritizing client IO and recovery IO. Users
can then choose either the 'high_client_ops' profile to prioritize client IO
or the 'high_recovery_ops' profile to prioritize recovery IO.
* QoS parameters including reservation and limit are now specified in terms
of a fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
* The cost parameters (osd_mclock_cost_per_io_usec_* and
osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
is now determined using the random IOPS and maximum sequential bandwidth
capability of the OSD's underlying device.
* Degraded object recovery is given higher priority when compared to misplaced
object recovery because degraded objects present a data safety issue not
present with objects that are merely misplaced. Therefore, backfilling
operations with the 'balanced' and 'high_client_ops' mClock profiles may
progress slower than what was seen with the 'WeightedPriorityQueue' (WPQ)
scheduler.
* The QoS allocations in all mClock profiles are optimized based on the above
fixes and enhancements.
* For more detailed information see:
* RGW: S3 multipart uploads using Server-Side Encryption now replicate
correctly in multi-site. Previously, the replicas of such objects were
corrupted on decryption. A new tool, ``radosgw-admin bucket resync encrypted
multipart``, can be used to identify these original multipart uploads. The
``LastModified`` timestamp of any identified object is incremented by 1
nanosecond to cause peer zones to replicate it again. For multi-site
deployments that make any use of Server-Side Encryption, we recommend
running this command against every bucket in every zone after all zones have
upgraded.
* CephFS: The MDS now evicts clients that are not advancing their request
tids. Such clients cause a large buildup of session metadata, which can
result in the MDS going read-only when the RADOS operation exceeds the size
threshold. The `mds_session_metadata_threshold` config option controls the
maximum size to which the (encoded) session metadata can grow.
* CephFS: After recovering a Ceph File System by following the disaster
recovery procedure, the recovered files under the `lost+found` directory can
now be deleted.
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-17.2.7.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: b12291d110049b2f35e32e0de30d70e9a4c060d2
I've been experimenting with tracing configurations for ceph from the docs and it seems like it doesn't work as described.
There is an option using jaeger, described in the documentation - https://docs.ceph.com/en/latest/jaegertracing/#jaeger-distributed-tracing/.
Unfortunately, only a few spans are currently left inside the traces, and there is no end-to-end tracing between components. This is not enough to be useful.
There is also an option using LTTng and zipkin for visualization, described in the documentation - https://docs.ceph.com/en/latest/dev/blkin/#tracing-ceph-with-lttng.
When the compilation flags are added, the system stops functioning:
With -DWITH_LTTNG=ON, a crash occurs while rados bench is running.
With -DWITH_BLKIN=ON, the cluster cannot create a pool.
With -DWITH_EVENTTRACE=ON, the application does not build at all.
Are there any plans to restore LTTng functionality?
Are there any plans to improve Jaeger tracing?
Is there any recommended way to use tracing in ceph today?
Thanks in advance.
Hi Ceph users and developers,
You are invited to join us at the User + Dev meeting tomorrow at 10:00 AM
EDT! See below for more meeting details.
We have two guest speakers joining us tomorrow:
1. "CRUSH Changes at Scale" by Joshua Baergen, DigitalOcean
In this talk, Joshua Baergen will discuss the problems that operators
encounter with CRUSH changes at scale and how DigitalOcean built
pg-remapper to control and speed up CRUSH-induced backfill.
2. "CephFS Management with Ceph Dashboard" by Pedro Gonzalez Gomez, IBM
This talk will demonstrate new Dashboard behavior regarding CephFS
The last part of the meeting will be dedicated to open discussion. Feel
free to add questions for the speakers or additional topics under the "Open
Discussion" section on the agenda:
If you have an idea for a focus topic you'd like to present at a future
meeting, you are welcome to submit it to this Google Form:
Any Ceph user or developer is eligible to submit!
Meeting link: https://meet.jit.si/ceph-user-dev-monthly
UTC: Thursday, October 19, 14:00 UTC
Mountain View, CA, US: Thursday, October 19, 7:00 PDT
Phoenix, AZ, US: Thursday, October 19, 7:00 MST
Denver, CO, US: Thursday, October 19, 8:00 MDT
Huntsville, AL, US: Thursday, October 19, 9:00 CDT
Raleigh, NC, US: Thursday, October 19, 10:00 EDT
London, England: Thursday, October 19, 15:00 BST
Paris, France: Thursday, October 19, 16:00 CEST
Helsinki, Finland: Thursday, October 19, 17:00 EEST
Tel Aviv, Israel: Thursday, October 19, 17:00 IDT
Pune, India: Thursday, October 19, 19:30 IST
Brisbane, Australia: Friday, October 20, 0:00 AEST
Singapore, Asia: Thursday, October 19, 22:00 +08
Auckland, New Zealand: Friday, October 20, 3:00 NZDT
Details of this release are summarized here:
Release Notes - TBD
Issue https://tracker.ceph.com/issues/63192 appears to be failing several runs.
Should it be fixed for this release?
Seeking approvals/reviews for:
smoke - Laura
rados - Laura, Radek, Travis, Ernesto, Adam King
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/quincy-p2p - Known issue IIRC, Casey pls confirm/approve
client-upgrade-quincy-reef - Laura
powercycle - Brad pls confirm
ceph-volume - Guillaume pls take a look
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
Josh, Neha - gibba and LRC upgrades -- N/A for quincy now after reef release.
Here are this week's notes from the CLT:
* Collective review of the Reef/Squid "State of Cephalopod" slides.
* Smoke test suite was unscheduled but it's back on now.
* 17.2.7: about to start building last week, delayed by a few
https://github.com/ceph/ceph/pull/54169). ceph_exporter test coverage
will be prioritized.
* 18.2.1: all PRs in testing or merged.
* Ceph Board approved a new Foundation member tier model: Silver,
Gold, Platinum, Diamond. Working on implementation with LF.
How can I build rgw with tcmalloc/jemalloc? I have tcmalloc installed. Although build.ninja lists libtcmalloc.so for bin/radosgw, the binary still uses libc's malloc. I then installed jemalloc, with the same result. What are the proper steps to build rgw with tcmalloc/jemalloc?