All,
At the CLT today we discussed the proliferation of write/admin access
on the ceph repository. One of the consequences of this has been that
Ceph's code guidelines have not been followed in merges [1].
Additionally, having too many folks -- many of whom have retired from
active development -- with write access to the repository presents
security concerns.
With the CLT's support, I have addressed this by pruning write/admin
access to the Ceph repository to only these Github teams:
- https://github.com/orgs/ceph/teams/ceph-maintainers
- https://github.com/orgs/ceph/teams/ceph-release-team
- https://github.com/orgs/ceph/teams/admins
- https://github.com/orgs/ceph/people?query=role%3Aowner
"ceph-maintainers" is a new team that includes component team leads
and senior Ceph engineers. If you feel you should be in this list and
were missed (sorry!), please reply to this mail.
"ceph-release-team" is a new team that includes folks working on Ceph
releases, right now Yuri.
"admins" is an extant team that includes members who help administrate
the Ceph project.
The members of the Ceph org who are owners have write/admin privileges
regardless of team organization. I've included that for completeness.
Anyone not in these aforementioned teams will (should) be unable to
push to ceph.git [2]. Please coordinate with your component team lead
for merging your changes.
[1] https://github.com/ceph/ceph/blob/main/SubmittingPatches.rst
[2] https://github.com/ceph/ceph/settings/access
--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hello,
Topics discussed:
- a noticeable backlog of "make check" jobs and shaman builds
(>6 hours)
- mostly self-inflicted because folks have been retriggering "make
check" a lot recently in the hopes of working around a number of
transient failures
- even more so if the pool of jenkins workers used for "make check"
and shaman builds is the same
- Laura will confirm in the infra meeting
- Patrick will downgrade github org owners that aren't active to
regular members
- proposal to prune down the list of individuals with write access to
ceph.git repo (Patrick)
- component leads and long-time senior contributors only
- the goal is to enforce our SubmittingPatches.rst rules better
- also some additional security
- concerns over fractured issue tracking
- the most recent example is ceph-nvmeof using github issues, but
there is also nvmeof subproject on tracker.ceph.com
- other notable examples are ceph-csi and go-ceph, although there
hasn't been anything on tracker.ceph.com for these
- conclusion: github issues or any other issue tracking system is
fine as long as there is no tight coupling to ceph.git
- question: does a repo being brought in as a submodule, as is the
case with ceph-nvmeof in https://github.com/ceph/ceph/pull/54671,
constitute tight coupling?
- in this case the submodule is intended to bring in two .proto
files for use by the NVMeofGwMonitorClient daemon; everything else
in the ceph-nvmeof repo shouldn't be looked at
- is this the best way to do that -- could these files just be
copied?
- dashboard already carries a copy of one of them
(src/pybind/mgr/dashboard/services/proto/gateway.proto)
- creating a PyPI package is another option
- QA nightlies
- now that they are back, need to ensure they are looked at!
- poll among component leads: is status quo where all results go to
ceph-qa list OK or do we need to have teuthology email people/teams
directly?
- let's tally up next week
Topics moved to next week:
- 19.1.0 status
- trello to limit free workspaces to 10 collaborators
- need a replacement for Yuri's board at the very least
Thanks,
Ilya
Hi there.
I am creating a static analysis tool using clang-tidy for the Ceph codebase
to quickly find out potential bugs.
I am looking for references to, or examples of, bad code patterns --
or of frequently used code that, when written badly, can result in
failures.
With these I hope to build checks for such bad code.
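For reference, here is the kind of minimal `.clang-tidy` configuration I
have in mind as a starting point (the check selection below is only an
illustration, not a vetted list for the Ceph tree):

```yaml
# Illustrative starting point only -- the check selection here is a
# suggestion, not a vetted list for Ceph.
Checks: >
  -*,
  bugprone-*,
  performance-*,
  cert-err34-c
WarningsAsErrors: ''
HeaderFilterRegex: '^src/'
```

The tool would then run clang-tidy over the build's compilation database,
with checks added for the failure-prone patterns people point me to.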
Any help would be appreciated.
Thank you.
Regards,
Suyash
Details of this release are summarized here:
https://tracker.ceph.com/issues/64721#note-1
Release Notes - TBD
LRC upgrade - TBD
Seeking approvals/reviews for:
smoke - in progress
rados - Radek, Laura?
quincy-x - in progress
Also need approval from Travis, Redouane for Prometheus fix testing.
Hi Cephers,
These are the topics covered in today's meeting:
- *Releases*
- *Hotfix Releases*
- *18.2.2*
- https://github.com/ceph/ceph/pull/55491 - reef: mgr/prometheus: fix
orch check to prevent Prometheus crash
- https://github.com/ceph/ceph/pull/55709 - reef: debian/*.postinst: add
adduser as a dependency and specify --home when adduser
- https://github.com/ceph/ceph/pull/55712 - reef: src/osd/OSDMap.cc: Fix
encoder to produce same bytestream
- [Laura] When/who can upgrade the LRC for this release? (Dan will after
we do some last checks today)
- Gibba has been upgraded with no apparent issues
- *17.2.8* (on hold pending the osdmap fix requirement) - *No
longer needed*
- osdmap fix (not needed:
https://github.com/ceph/ceph/blob/quincy/src/osd/OSDMap.cc#L3087-L3089)
- Rook requests to include this c-v fix that blocks OSDs in some
scenarios (not worthy of its own hotfix, just please include if we do the
hotfix)
- https://github.com/ceph/ceph/pull/54522 - quincy: ceph-volume: fix a
regression in raw list
- As crc fix is not needed, the Rook request can be included in a
regular quincy release
- *Regular releases*
- *18.2.3* - exporter fixes for rook and debian-derived users make this
more urgent than quincy
- mgr usage of pyO3/cryptography an issue for debian - and possibly
centos9 (https://tracker.ceph.com/issues/64213#note-2) - see notes from
02/07
- Any updates on potentially dropping modules or another fix? Adam?
- Squid *19.1.0*
- CephFS waiting for 2 feature PRs
- RGW PRs
- NVMe? To be confirmed with Aviv
- squid blockers:
- build centos 9 containers:
https://github.com/ceph/ceph-container/pull/2183
- ceph-object-corpus:
https://github.com/ceph/ceph-object-corpus/pull/17 (testing
in https://github.com/ceph/ceph/pull/54735)
- Milestone for squid blockers (used to tag blockers for the first 19.1.0
RC): https://github.com/ceph/ceph/milestone/21
- Squid RCs and community testing
- https://pad.ceph.com/p/squid_scale_testing
- Target date March ~20
- *17.2.9*
- need jammy builds for quincy before a squid release. Maybe we can just
build them for the 17.2.7 release? (Do you mean the 17.2.8 release?)
- *Meeting time - *change days to Monday or Thursday? (added by Josh -
who has a conflict on Wednesdays now)
- Thursday has several conflicting community meetings
- Any objections to Monday at the same time?
- Note the change to US daylight saving time next week
- Let's do a poll (Doodle)
- *debian-reef_OLD email thread "[ceph-users] debian-reef_OLD?"*
- Fixed by Yuri
- *CDM APAC tonight*:
https://tracker.ceph.com/projects/ceph/wiki/CDM_06-MAR-2024
- *Sepia Lab*:
- PSA: https://github.com/ceph/ceph/pull/55820 merged (squid crontab
additions and overhaul to nightlies)
- New grafana widget for smithi node utilization:
-
https://grafana-route-grafana.apps.os.sepia.ceph.com/d/teuthology/teutholog…
- (Basically: unlocked machine-hours / total machine-hours)
- [Zac] *ceph-exporter release notes question from Jan Horacek* (from
the upstream community)
- Route to Juanmi Olmo
- [Zac] - *Eugen Block's question about removing sensitive information
from ceph-users mailing list*
- No easy way to request or perform removal of sensitive information.
- [Zac] - *Anthony D'Atri submits Index HQ in Toronto as a possible
venue for Cephalocon 2024*
- Venue already booked (Patrick)
- [Zac] - *CQ issue 4 -- submit your requests before 25 Mar 2024* --
zac.dover(a)proton.me
Kind Regards,
Ernesto Puerta
Hi everyone,
CDM (APAC) is happening tomorrow, March 6th at 9:00pm ET. See more meeting
details below.
Please add any topics you'd like to discuss to the agenda:
https://tracker.ceph.com/projects/ceph/wiki/CDM_06-MAR-2024
Thanks,
Laura Flores
Meeting link:
https://meet.jit.si/ceph-dev-monthly
Time conversions:
UTC: Thursday, March 7, 2:00 UTC
Mountain View, CA, US: Wednesday, March 6, 18:00 PST
Phoenix, AZ, US: Wednesday, March 6, 19:00 MST
Denver, CO, US: Wednesday, March 6, 19:00 MST
Huntsville, AL, US: Wednesday, March 6, 20:00 CST
Raleigh, NC, US: Wednesday, March 6, 21:00 EST
London, England: Thursday, March 7, 2:00 GMT
Paris, France: Thursday, March 7, 3:00 CET
Helsinki, Finland: Thursday, March 7, 4:00 EET
Tel Aviv, Israel: Thursday, March 7, 4:00 IST
Pune, India: Thursday, March 7, 7:30 IST
Brisbane, Australia: Thursday, March 7, 12:00 AEST
Singapore, Asia: Thursday, March 7, 10:00 +08
Auckland, New Zealand: Thursday, March 7, 15:00 NZDT
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores(a)ibm.com | lflores(a)redhat.com <lflores(a)redhat.com>
M: +17087388804
We're happy to announce the 15th, and expected to be the last,
backport release in the Pacific series.
https://ceph.io/en/news/blog/2024/v16-2-15-pacific-released/
Notable Changes
---------------
* `ceph config dump --format <json|xml>` output will display the localized
option names instead of their normalized version. For example,
"mgr/prometheus/x/server_port" will be displayed instead of
"mgr/prometheus/server_port". This matches the output of the non pretty-print
formatted version of the command.
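To illustrate the naming scheme (a hypothetical sketch, not Ceph code): a
localized option name simply embeds the entity instance between the module
prefix and the option name:

```python
def localize(normalized_name: str, instance: str) -> str:
    """Insert an entity instance into a normalized mgr option name.

    Hypothetical helper for illustration only: localizing
    "mgr/prometheus/server_port" for instance "x" yields
    "mgr/prometheus/x/server_port".
    """
    prefix, option = normalized_name.rsplit("/", 1)
    return f"{prefix}/{instance}/{option}"

print(localize("mgr/prometheus/server_port", "x"))  # mgr/prometheus/x/server_port
```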
* CephFS: The MDS evicts clients which are not advancing their request tids,
since those cause a large buildup of session metadata, resulting in the MDS
going read-only due to the RADOS operation exceeding the size threshold.
The `mds_session_metadata_threshold` config controls the maximum size that
(encoded) session metadata can grow to.
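For example, the threshold could be adjusted with a ceph.conf fragment like
the following (the value shown is illustrative; check the documented default
before changing it in production):

```ini
[mds]
# Illustrative 16 MiB cap on the size of encoded per-session metadata.
mds_session_metadata_threshold = 16777216
```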
* RADOS: The `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
due to its susceptibility to false negative results. Its safer replacement is
`pool_is_in_selfmanaged_snaps_mode`.
* RBD: When diffing against the beginning of time (`fromsnapname == NULL`) in
fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled
and valid), diff-iterate is now guaranteed to execute locally if exclusive
lock is available. This brings a dramatic performance improvement for QEMU
live disk synchronization and backup use cases.
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-16.2.15.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 618f440892089921c3e944a991122ddc44e60516
Hello all,
We are using Ceph as the storage backend for some cloud research which involves offloading functions to storage nodes to benefit from near-storage processing. We use rados_exec to achieve this, calling a class method on the object so that the function executes locally on the OSD. However, we have been running into an issue where rados_exec fails with EIO: the request never reaches the storage node, and the method is never called.
While debugging this, I noticed that if I re-put the same object under a different key, it works (provided it lands on a different OSD). It appears that the OSD cannot serve the rados_exec request.
This bug happens under a few conditions:
1. If we invoke the function before uploading it.
2. Non-deterministically, when the OSD is under load.
I cannot seem to debug it for the life of me, and the only thing I have to go on is that the OSDs cannot serve requests. I have tried removing the object from the pool and putting it back with the same key, and it does the exact same thing.
Any advice on where to get started / help debugging this would be greatly appreciated as my thesis depends on it. (any request to OSD.0 and OSD.1 fails)🙁
Donald