All,
At the CLT today we discussed the proliferation of write/admin access
on the ceph repository. One of the consequences of this has been that
Ceph's code guidelines have not been followed in merges [1].
Additionally, having too many folks -- many of whom have retired from
active development -- with write access to the repository presents
security concerns.
With the CLT's support, I have addressed this by pruning write/admin
access to the Ceph repository to only these Github teams:
- https://github.com/orgs/ceph/teams/ceph-maintainers
- https://github.com/orgs/ceph/teams/ceph-release-team
- https://github.com/orgs/ceph/teams/admins
- https://github.com/orgs/ceph/people?query=role%3Aowner
"ceph-maintainers" is a new team that includes component team leads
and senior Ceph engineers. If you feel you should be in this list and
were missed (sorry!), please reply to this mail.
"ceph-release-team" is a new team that includes folks working on Ceph
releases, right now Yuri.
"admins" is an extant team that includes members who help administrate
the Ceph project.
The members of the Ceph org who are owners have write/admin privileges
regardless of team organization. I've included that for completeness.
Anyone not in these aforementioned teams will (should) be unable to
push to ceph.git [2]. Please coordinate with your component team lead
for merging your changes.
[1] https://github.com/ceph/ceph/blob/main/SubmittingPatches.rst
[2] https://github.com/ceph/ceph/settings/access
--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hello,
Topics discussed:
- a noticeable backlog of "make check" jobs and shaman builds
(>6 hours)
- mostly self-inflicted because folks have been retriggering "make
check" a lot recently in the hopes of working around a number of
transient failures
- even more so if the pool of jenkins workers used for "make check"
and shaman builds is the same
- Laura will confirm in the infra meeting
- Patrick will downgrade github org owners that aren't active to
regular members
- proposal to prune down the list of individuals with write access to
ceph.git repo (Patrick)
- component leads and long-time senior contributors only
- the goal is to enforce our SubmittingPatches.rst rules better
- also some additional security
- concerns over fractured issue tracking
- the most recent example is ceph-nvmeof using github issues, but
there is also nvmeof subproject on tracker.ceph.com
- other notable examples are ceph-csi and go-ceph, although there
hasn't been anything on tracker.ceph.com for these
- conclusion: github issues or any other issue tracking system is
fine as long as there is no tight coupling to ceph.git
- question: does a repo being brought in as a submodule, as is the
case with ceph-nvmeof in https://github.com/ceph/ceph/pull/54671,
constitute tight coupling?
- in this case the submodule is intended to bring in two .proto
files for use by the NVMeofGwMonitorClient daemon; everything else
in the ceph-nvmeof repo shouldn't be looked at
- is this the best way to do that -- could these files just be
copied?
- dashboard already carries a copy of one of them
(src/pybind/mgr/dashboard/services/proto/gateway.proto)
- creating a PyPI package is another option
- QA nightlies
- now that they are back, need to ensure they are looked at!
- poll among component leads: is status quo where all results go to
ceph-qa list OK or do we need to have teuthology email people/teams
directly?
- let's tally up next week
Topics moved to next week:
- 19.1.0 status
- trello to limit free workspaces to 10 collaborators
- need a replacement for Yuri's board at the very least
Thanks,
Ilya
Hi there.
I am creating a static analysis tool using clang-tidy for the Ceph codebase
to quickly find out potential bugs.
I am looking for references to, or examples of, bad code patterns --
or of frequently used code that, when written badly, can result in
failures.
With these I hope to build checks for such bad code.
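For reference, here is the kind of minimal `.clang-tidy` configuration I
have in mind as a starting point (the check selection below is only an
illustration, not a vetted list for the Ceph tree):

```yaml
# Illustrative starting point only -- the check selection here is a
# suggestion, not a vetted list for Ceph.
Checks: >
  -*,
  bugprone-*,
  performance-*,
  cert-err34-c
WarningsAsErrors: ''
HeaderFilterRegex: '^src/'
```

The tool would then run clang-tidy over the build's compilation database,
with checks added for the failure-prone patterns people point me to.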
Any help would be appreciated.
Thank you.
Regards,
Suyash
Details of this release are summarized here:
https://tracker.ceph.com/issues/64721#note-1
Release Notes - TBD
LRC upgrade - TBD
Seeking approvals/reviews for:
smoke - in progress
rados - Radek, Laura?
quincy-x - in progress
Also need approval from Travis, Redouane for Prometheus fix testing.
Hi Cephers,
These are the topics covered in today's meeting:
- *Releases*
- *Hotfix Releases*
- *18.2.2*
- https://github.com/ceph/ceph/pull/55491 - reef: mgr/prometheus: fix
orch check to prevent Prometheus crash
- https://github.com/ceph/ceph/pull/55709 - reef: debian/*.postinst: add
adduser as a dependency and specify --home when adduser
- https://github.com/ceph/ceph/pull/55712 - reef: src/osd/OSDMap.cc: Fix
encoder to produce same bytestream
- [Laura] When/who can upgrade the LRC for this release? (Dan will after
we do some last checks today)
- Gibba has been upgraded with no apparent issues
- *17.2.8* (on hold pending the osdmap fix requirement) - *No
longer needed*
- osdmap fix (not needed:
https://github.com/ceph/ceph/blob/quincy/src/osd/OSDMap.cc#L3087-L3089)
- Rook requests to include this c-v fix that blocks OSDs in some
scenarios (not worthy of its own hotfix, just please include if we do the
hotfix)
- https://github.com/ceph/ceph/pull/54522 - quincy: ceph-volume: fix a
regression in raw list
- As crc fix is not needed, the Rook request can be included in a
regular quincy release
- *Regular releases*
- *18.2.3* - exporter fixes for rook and debian-derived users make this
more urgent than quincy
- mgr usage of pyO3/cryptography an issue for debian - and possibly
centos9 (https://tracker.ceph.com/issues/64213#note-2) - see notes from
02/07
- Any updates on potentially dropping modules or another fix? Adam?
- Squid *19.1.0*
- CephFS waiting for 2 feature PRs
- RGW PRs
- NVMe? To be confirmed with Aviv
- squid blockers:
- build centos 9 containers:
https://github.com/ceph/ceph-container/pull/2183
- ceph-object-corpus:
https://github.com/ceph/ceph-object-corpus/pull/17 (testing
in https://github.com/ceph/ceph/pull/54735)
- Milestone for squid blockers (used to tag blockers for the first 19.1.0
RC): https://github.com/ceph/ceph/milestone/21
- Squid RCs and community testing
- https://pad.ceph.com/p/squid_scale_testing
- Target date March ~20
- *17.2.9*
- need jammy builds for quincy before a squid release. Maybe we can just
build them for the 17.2.7 release? (Do you mean the 17.2.8 release?)
- *Meeting time - *change days to Monday or Thursday? (added by Josh -
who has a conflict on Wednesdays now)
- Thursday has several conflicting community meetings
- Any objections to Monday at the same time?
- Note the change to US daylight saving time next week
- Let's do a poll (Doodle)
- *debian-reef_OLD email thread "[ceph-users] debian-reef_OLD?"*
- Fixed by Yuri
- *CDM APAC tonight*:
https://tracker.ceph.com/projects/ceph/wiki/CDM_06-MAR-2024
- *Sepia Lab*:
- PSA: https://github.com/ceph/ceph/pull/55820 merged (squid crontab
additions and overhaul to nightlies)
- New grafana widget for smithi node utilization:
-
https://grafana-route-grafana.apps.os.sepia.ceph.com/d/teuthology/teutholog…
- (Basically: unlocked machine-hours / total machine-hours)
- [Zac] *ceph-exporter release notes question from Jan Horacek* (from
the upstream community)
- Route to Juanmi Olmo
- [Zac] - *Eugen Block's question about removing sensitive information
from ceph-users mailing list*
- No easy way to request or perform removal of sensitive information.
- [Zac] - *Anthony D'Atri submits Index HQ in Toronto as a possible
venue for Cephalocon 2024*
- Venue already booked (Patrick)
- [Zac] - *CQ issue 4 -- submit your requests before 25 Mar 2024* --
zac.dover(a)proton.me
Kind Regards,
Ernesto Puerta
Hi everyone,
CDM (APAC) is happening tomorrow, March 6th at 9:00pm ET. See more meeting
details below.
Please add any topics you'd like to discuss to the agenda:
https://tracker.ceph.com/projects/ceph/wiki/CDM_06-MAR-2024
Thanks,
Laura Flores
Meeting link:
https://meet.jit.si/ceph-dev-monthly
Time conversions:
UTC: Thursday, March 7, 2:00 UTC
Mountain View, CA, US: Wednesday, March 6, 18:00 PST
Phoenix, AZ, US: Wednesday, March 6, 19:00 MST
Denver, CO, US: Wednesday, March 6, 19:00 MST
Huntsville, AL, US: Wednesday, March 6, 20:00 CST
Raleigh, NC, US: Wednesday, March 6, 21:00 EST
London, England: Thursday, March 7, 2:00 GMT
Paris, France: Thursday, March 7, 3:00 CET
Helsinki, Finland: Thursday, March 7, 4:00 EET
Tel Aviv, Israel: Thursday, March 7, 4:00 IST
Pune, India: Thursday, March 7, 7:30 IST
Brisbane, Australia: Thursday, March 7, 12:00 AEST
Singapore, Asia: Thursday, March 7, 10:00 +08
Auckland, New Zealand: Thursday, March 7, 15:00 NZDT
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores(a)ibm.com | lflores(a)redhat.com <lflores(a)redhat.com>
M: +17087388804
We're happy to announce the 15th, and expected to be the last,
backport release in the Pacific series.
https://ceph.io/en/news/blog/2024/v16-2-15-pacific-released/
Notable Changes
---------------
* `ceph config dump --format <json|xml>` output will display the localized
option names instead of their normalized version. For example,
"mgr/prometheus/x/server_port" will be displayed instead of
"mgr/prometheus/server_port". This matches the output of the non pretty-print
formatted version of the command.
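To illustrate the naming scheme (a hypothetical sketch, not Ceph code): a
localized option name simply embeds the entity instance between the module
prefix and the option name:

```python
def localize(normalized_name: str, instance: str) -> str:
    """Insert an entity instance into a normalized mgr option name.

    Hypothetical helper for illustration only: localizing
    "mgr/prometheus/server_port" for instance "x" yields
    "mgr/prometheus/x/server_port".
    """
    prefix, option = normalized_name.rsplit("/", 1)
    return f"{prefix}/{instance}/{option}"

print(localize("mgr/prometheus/server_port", "x"))  # mgr/prometheus/x/server_port
```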
* CephFS: The MDS evicts clients which are not advancing their request tids,
since those cause a large buildup of session metadata, resulting in the MDS
going read-only due to the RADOS operation exceeding the size threshold.
The `mds_session_metadata_threshold` config controls the maximum size that
(encoded) session metadata can grow to.
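For example, the threshold could be adjusted with a ceph.conf fragment like
the following (the value shown is illustrative; check the documented default
before changing it in production):

```ini
[mds]
# Illustrative 16 MiB cap on the size of encoded per-session metadata.
mds_session_metadata_threshold = 16777216
```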
* RADOS: The `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated
due to its susceptibility to false negative results. Its safer replacement is
`pool_is_in_selfmanaged_snaps_mode`.
* RBD: When diffing against the beginning of time (`fromsnapname == NULL`) in
fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled
and valid), diff-iterate is now guaranteed to execute locally if exclusive
lock is available. This brings a dramatic performance improvement for QEMU
live disk synchronization and backup use cases.
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-16.2.15.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 618f440892089921c3e944a991122ddc44e60516
Hello all,
We are using Ceph as the storage backend for some cloud research which involves offloading functions to storage nodes to benefit from near-storage processing. We use rados_exec to achieve this, calling a class method on the object so that the function executes locally on the OSD. However, we have been running into an issue where rados_exec fails with EIO: the request never reaches the storage node, and the method is never called.
While debugging this, I noticed that if I re-put the same object under a different key, it works (provided it lands on a different OSD). It appears that the OSD cannot serve the rados_exec request.
This bug happens under a few conditions:
1. If we invoke the function before uploading it.
2. Non-deterministically, when the OSD is under load.
I cannot seem to debug it for the life of me, and the only thing I have to go on is that the OSDs cannot serve requests. I have tried removing the object from the pool and putting it back with the same key, and it does the exact same thing.
Any advice on where to get started / help debugging this would be greatly appreciated as my thesis depends on it. (any request to OSD.0 and OSD.1 fails)🙁
Donald