hey Gal and Eric,
in today's standup, we discussed the version of our apache arrow
submodule. it's currently pinned at 6.0.1, which was tagged in nov.
2021. the centos9 builds are using the system package
libarrow-devel-9.0.0. arrow's upstream recently tagged an 11.0.0
release.
as far as i know, there still aren't any system packages for ubuntu,
so we're likely to be stuck with the submodule for quite a while. how
do you guys want to handle these updates? is it worth trying to update
before the reef release?
Hi Folks,
Recently we discovered a flaw in how the upstream Ubuntu and Debian
builds of Ceph compile RocksDB. It causes a variety of performance
issues including slower than expected write performance, 3X longer
compaction times, and significantly higher than expected CPU utilization
when RocksDB is heavily utilized. The issue has now been fixed in main.
Igor Fedotov, however, observed during the performance meeting today
that there were no backports for the fix in place. He also rightly
pointed out that it would be helpful to make an announcement about the
issue given the severity for the affected users. I wanted to give a bit
more background and make sure people are aware and understand what's
going on.
1) Who's affected?
Anyone running an upstream Ubuntu/Debian build of Ceph from the last
several years. External builds from Canonical and Gentoo suffered from
this issue as well, but were fixed independently.
2) How can you check?
There's no easy way to tell at the moment. We are investigating if
running "strings" on the OSD executable may provide a clue. For now,
assume that if you are using our Debian/Ubuntu builds in a non-container
configuration you are affected. Proxmox for instance was affected prior
to adopting the fix.
3) Are Cephadm deployments affected?
Not as far as we know. Ceph container builds are compiled slightly
differently from stand-alone Debian builds. They do not appear to
suffer from the bug.
4) What versions of Ceph will get the fix?
Casey Bodley kindly offered to backport the fix to both Reef and Quincy.
He also verified that the fix builds properly with Pacific. We now
have 3 separate backport PRs for the releases here:
https://github.com/ceph/ceph/pull/55500
https://github.com/ceph/ceph/pull/55501
https://github.com/ceph/ceph/pull/55502
Please feel free to reply if you have any questions!
Thanks,
Mark
--
Best Regards,
Mark Nelson
Head of Research and Development
Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nelson(a)clyso.com
We are hiring: https://www.clyso.com/jobs/
Dear CEPH dev team,
I have a CEPH cluster with three MONs, two of which are down. When I try
to start them they crash, and journalctl shows that a core dump was
created. Would that be a bug? Or a corrupt DB?
I have then a third MON that starts fine but when I get mon_status
through the admin-socket I see
"quorum": []
"state": "probing"
because of that (I believe) I cannot use 'ceph orch' to create new MONs.
So my questions:
- Is there a way that I can use 'ceph orch' to create new MONs?
- Can I just rsync the store.db from this running node to the crashing
MON nodes?
- Or do I have to rebuild the store.db using a script as in
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#…
Regards,
Raul
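The mon_status check described in the message above can be scripted. The following is only a minimal sketch: it assumes the JSON has already been captured from the monitor's admin socket, and the helper name summarize_mon_status and the sample document are hypothetical. Only the "quorum" and "state" fields quoted in the message are read.

```python
import json


def summarize_mon_status(raw_json):
    """Summarize a mon_status JSON document (hypothetical helper).

    Reads only the "quorum" and "state" fields shown in the message.
    """
    status = json.loads(raw_json)
    quorum = status.get("quorum", [])
    state = status.get("state", "unknown")
    return {
        "state": state,
        "quorum_size": len(quorum),
        # A monitor that is part of a quorum reports "leader" or "peon".
        "in_quorum": bool(quorum) and state in ("leader", "peon"),
    }


# Sample input mirroring the fields quoted above (not real cluster output).
sample = '{"state": "probing", "quorum": []}'
summary = summarize_mon_status(sample)
print(summary)
```

An empty quorum list together with the "probing" state means the surviving monitor has not been able to form a quorum with its peers, which is consistent with orchestrator commands not working.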
Hi Folks,
Meeting starts in 2 minutes! Nothing hard set on the agenda today,
though we have a couple of new PRs to look at. Please feel free to add
a discussion topic if you have one!
Thanks,
Mark
Etherpad:
https://pad.ceph.com/p/performance_weekly
Meeting URL:
https://meet.google.com/uhb-cysu-nvg
--
Best Regards,
Mark Nelson
Head of R&D (USA)
Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nelson(a)clyso.com
We are hiring: https://www.clyso.com/jobs/
hi James and Thomas,
sorry to revive this ancient thread, but i wanted to follow up on
debian packaging for this arrow dependency. Kaleb (cc'ed) worked
through this on the rpm side quite a while ago. it would be nice to
get the debs too so we can drop the submodule
the upstream apache arrow project has been publishing deb packages,
see https://arrow.apache.org/install/
can you share any guidance to help us get these packages into the distros?
On Fri, Dec 16, 2022 at 11:13 AM Casey Bodley <cbodley(a)redhat.com> wrote:
>
> On Wed, Dec 14, 2022 at 11:47 AM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >
> >
> >
> > On Wed, Dec 14, 2022 at 11:25 AM Casey Bodley <cbodley(a)redhat.com> wrote:
> >>
> >> On Wed, Dec 14, 2022 at 11:03 AM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >
> >> >
> >> > On Wed, Dec 14, 2022 at 10:39 AM Casey Bodley <cbodley(a)redhat.com> wrote:
> >> >>
> >> >> here's what i'm seeing from the latest builds in
> >> >> https://shaman.ceph.com/builds/ceph/wip-use-epel-arrow/1ffcfa14e73b6ac92168…
> >> >>
> >> >
> >> > I'm kinda lost here. Of the six builds listed there, I can't find any mention of libarrow in any of the centos8 logs I can find. (Just "no matching package for arrow-devel".) Would you please send me a link directly to the log file(s) you're looking at?
> >>
> >> there's a "Log Output" link in the individual build pages. here's a
> >> plain text link to the centos8 x86-64 build log:
> >
> >
> > Yes, those were the links I was following trying to find something. My fault, I did not look at the last build on that list (326003) where the only mention of arrow is:
> > Package arrow-devel-6.0.1-1.el8.x86_64 is already installed.
> >
> > The other two centos8 logs that I did look at have only
> > No matching package to install: 'arrow-devel'
> >
> > That aside, the RPM names in EPEL (and Fedora, and CBS) are libarrow and libarrow-devel. (And parquet-libs and parquet-libs-devel.)
> >
> > I've never packaged Arrow 6. Not in EPEL, not in Fedora, not in CBS. EPEL8 has only ever had Arrow 8 (libarrow-8.0.z, currently libarrow-8.0.1-1.el8) EPEL9 has Arrow 9 (currently libarrow-9.0.0-8.el9)
> >
> > I have no idea where that one build got arrow-devel-6.0.1-1.el8 from.
>
> maybe those 6.0.1 packages were left over from the old apache.jfrog.io
> repo we were using for initial parquet testing, but i thought we had
> removed that repo from all the builders (this was discussed in
> https://www.spinics.net/lists/dev-ceph/msg04048.html)
>
> with those package names corrected to libarrow-devel and
> parquet-libs-devel, i'm now getting clean builds for centos 9. for
> centos 8, we'll need to keep building arrow from submodule until that
> subversion-devel module is available
>
> thanks Kaleb!
>
> >
> >
> >>
> >>
> >> https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=…
> >>
> >> >
> >> > Also, epel8 (and Stream8 in the CBS Storage SIG) have libarrow-8. I've never even built libarrow-6 in either epel8 or for Stream8. I'm not sure how anything is getting libarrow-6.
> >> >
> >> >
> >> >
> >> >>
> >> >> on centos8 x86-64, it installs the arrow-devel-6.0.1-1 and
> >> >> parquet-devel-6.0.1-1 packages and compiles successfully, but the
> >> >> packaging step fails at the end with a bunch of errors like:
> >> >> - nothing provides libarrow.so.600()(64bit) needed by
> >> >> librgw2-2:18.0.0-1351.g1ffcfa14.el8.x86_64
> >> >> - nothing provides libparquet.so.600()(64bit) needed by
> >> >> librgw2-2:18.0.0-1351.g1ffcfa14.el8.x86_64
> >> >>
> >> >> would those errors indicate an issue with the arrow/parquet packages
> >> >> themselves, or is the ceph build doing something wrong here?
> >> >>
> >> >> it fails to install the arrow- and parquet-devel packages for all
> >> >> other rpm builds. for centos8 arm64, i assume it's because there are
> >> >> no arm packages. but for x86-64 builds of centos9 and centos8+crimson,
> >> >> i couldn't tell why it behaves differently from the centos8 build
> >> >>
> >> >> On Wed, Dec 14, 2022 at 9:00 AM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >> >
> >> >> > <top post>
> >> >> > So, in their infinite wisdom, RHEL engineering has decided that utf8proc-devel is to be included in the subversion-devel *module*.
> >> >> >
> >> >> > I haven't used modules much (and modules have gone through a lot of churn in Fedora) so I'm not especially familiar with them. That aside I have learned—
> >> >> >
> >> >> > To install utf8proc-devel on RHEL you need to:
> >> >> > sudo dnf module enable subversion-devel
> >> >> > sudo dnf install -y utf8proc-devel
> >> >> >
> >> >> > That will also work on Stream8 – eventually – after they fix the missing subversion-devel module.
> >> >> > </top post>
> >> >> >
> >> >> > On Tue, Dec 13, 2022 at 1:52 PM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >> >>
> >> >> >> On Tue, Dec 13, 2022 at 12:39 PM Ken Dreyer <kdreyer(a)redhat.com> wrote:
> >> >> >>>
> >> >> >>> This is an example of why we should move to el9-only (and drop el8
> >> >> >>> support).
> >> >> >>
> >> >> >>
> >> >> >> RHEL8, C8, and Stream8 usage numbers are still (anecdotally by CentOS and Red Hat peeps) much higher than RHEL9/Stream9 numbers. If you want people using & testing it upstream, it needs to be built or buildable on el8. If you don't want to build it for el8 in teuthology, e.g., that's outside my area of interest and/or responsibility. Do whatever you want there.
> >> >> >>
> >> >> >> AFAICT the utf8proc-epel package — which provides utf8proc-devel for epel8 — seems to have been allowed to fall behind what's in RHEL. Perhaps in anticipation of retiring it due to an apparent misunderstanding of what is in RHEL8. (A mistake?)
> >> >> >>
> >> >> >> We should be good community members and get the bug fixed in epel8, instead of using it as justification for yanking support for el8. Which would be premature IMO.
> >> >> >>
> >> >> >>>
> >> >> >>> It's in RHEL 9 in
> >> >> >>> https://bugzilla.redhat.com/show_bug.cgi?id=2041057
> >> >> >>
> >> >> >>
> >> >> >> Uh, yes. As I already indicated in my earlier email. But it's in AppStream and CodeReadyBuilder. Not the base.
> >> >> >>
> >> >> >> And utf8proc is in RHEL8, also in AppStream (in a module). And AFAICT we need to resurrect utf8proc-devel in epel8 and get it updated to the same version that's in RHEL and Stream8.
> >> >> >>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> - Ken
> >> >> >>>
> >> >> >>> On Sat, Dec 10, 2022 at 6:29 AM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >> >>> >
> >> >> >>> > I opened https://bugzilla.redhat.com/show_bug.cgi?id=2152211 and the utf8proc-epel packager has added the following comment:
> >> >> >>> >
> >> >> >>> > Actually, utf8proc-devel was added to the subversion-devel module in CRB - bz#1996237, so I have now retired this package. I noticed that subversion-devel seems to be missing from CentOS Stream 8, so I filed bz#2152265.
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > On Fri, Dec 9, 2022 at 1:39 PM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >> >>> >>
> >> >> >>> >> For rhel8 and centos8/centos8s utf8proc is in base, and utf8proc-devel is in epel[1]. It appears that utf8proc-devel in epel is behind the times and installing utf8proc-devel will downgrade utf8proc to version 2.1.1. utf8proc was updated for rhel8.7 back in July and the epel maintainer has not updated it yet. I'll look into it.
> >> >> >>> >>
> >> >> >>> >> For rhel9 and centos9s utf8proc is in AppStream and utf8proc-devel is in codeready-builder (CRB)
> >> >> >>> >>
> >> >> >>> >> For both el8 and el9 libarrow(-devel) and libparquet(-devel) are in epel8[2] and epel9[3].
> >> >> >>> >>
> >> >> >>> >> [1] https://koji.fedoraproject.org/koji/packageinfo?packageID=34406
> >> >> >>> >> [2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2019255
> >> >> >>> >> [3] https://koji.fedoraproject.org/koji/buildinfo?buildID=2059829
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> On Fri, Dec 9, 2022 at 1:16 PM Casey Bodley <cbodley(a)redhat.com> wrote:
> >> >> >>> >>>
> >> >> >>> >>> i've been trying to get our rpms to build without these arrow and
> >> >> >>> >>> utf8proc submodules in https://github.com/ceph/ceph/pull/46535, but
> >> >> >>> >>> the shaman centos builds can't find their EPEL packages:
> >> >> >>> >>>
> >> >> >>> >>> > No matching package to install: 'arrow-devel'
> >> >> >>> >>> > No matching package to install: 'parquet-devel'
> >> >> >>> >>>
> >> >> >>> >>> can anyone help me figure out why? do we need to do anything special
> >> >> >>> >>> to enable EPEL here, or would it already be available?
> >> >> >>> >>>
> >> >> >>> >>> On Thu, Mar 31, 2022 at 2:51 PM Casey Bodley <cbodley(a)redhat.com> wrote:
> >> >> >>> >>> >
> >> >> >>> >>> > i've opened https://github.com/ceph/ceph/pull/45740 to add a paragraph
> >> >> >>> >>> > about these bundled dependencies under "CMake Options" of README.md.
> >> >> >>> >>> > it doesn't list all of these options, but a `grep WITH_SYSTEM_
> >> >> >>> >>> > CMakeLists.txt` will
> >> >> >>> >>> >
> >> >> >>> >>> > On Thu, Mar 31, 2022 at 9:05 AM Mark Nelson <mnelson(a)redhat.com> wrote:
> >> >> >>> >>> > >
> >> >> >>> >>> > >
> >> >> >>> >>> > > On 3/30/22 10:05, Casey Bodley wrote:
> >> >> >>> >>> > > > On Tue, Mar 29, 2022 at 8:18 PM Kaleb Keithley <kkeithle(a)redhat.com> wrote:
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> On Tue, Mar 29, 2022 at 7:59 PM Gregory Farnum <gfarnum(a)redhat.com> wrote:
> >> >> >>> >>> > > >>>
> >> >> >>> >>> > > >>>
> >> >> >>> >>> > > >>> install-deps.sh should handle grabbing any relevant packages.
> >> >> >>> >>> > > >>>
> >> >> >>> >>> > > >> Nope. Bad assumption.
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> I'll be happy to walk you through building packages for RHEL, Fedora, and CentOS SIGs some time.
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> Brew, Koji, and CBS all run builds in mock. Dependencies are preloaded and installed before rpmbuild runs. mock has networking disabled by default, so you quite simply can't do things like install other rpms, download outside sources, or do a git submodule init.
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> I have some experience with pbuilder on Debian and Ubuntu too, but I don't know if it disables networking like mock does, although I suspect it may.
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> --
> >> >> >>> >>> > > >>
> >> >> >>> >>> > > >> Kaleb
> >> >> >>> >>> > > > since this is a discussion about our default values for WITH_SYSTEM_
> >> >> >>> >>> > > > cmake variables, i think it's best to separate out packaging concerns
> >> >> >>> >>> > > > - we assume that distros are customizing these WITH_SYSTEM_ variables
> >> >> >>> >>> > > > to match what's available, as we do in-tree with ceph.spec.in and
> >> >> >>> >>> > > > debian/rules
> >> >> >>> >>> > > >
> >> >> >>> >>> > > > these cmake defaults are most important for ceph developers doing
> >> >> >>> >>> > > > their local builds. and here, we do expect that install-deps.sh will
> >> >> >>> >>> > > > install any system packages that are available on the user's distro
> >> >> >>> >>> > > >
> >> >> >>> >>> > > > from the perspective of a ceph developer trying to get the fastest
> >> >> >>> >>> > > > local builds, i agree that it would be nice for some of these
> >> >> >>> >>> > > > variables to default to ON instead of tuning them manually
> >> >> >>> >>> > > >
> >> >> >>> >>> > > > but from the perspective of maintaining an open source project, i also
> >> >> >>> >>> > > > see a value in making sure that our default cmake configuration 'just
> >> >> >>> >>> > > > works'. ceph is a big project with a complicated build system. by
> >> >> >>> >>> > > > minimizing barriers to entry, we encourage more participation in the
> >> >> >>> >>> > > > project
> >> >> >>> >>> > >
> >> >> >>> >>> > >
> >> >> >>> >>> > > 100% Casey. Right now we are going to release quincy in a state where
> >> >> >>> >>> > > it fails to build properly from source on RHEL and CentOS stream and
> >> >> >>> >>> > > installing the system level utf8proc doesn't fix it. The only way to fix
> >> >> >>> >>> > > it is to manually apply the fix from master or pass in
> >> >> >>> >>> > > -DWITH_SYSTEM_UTF8PROC=OFF to do_cmake.sh (which we don't give the
> >> >> >>> >>> > > user/developer instructions to do anywhere afaik). It's the kind of
> >> >> >>> >>> > > thing that makes a developer wonder if they are better off installing a
> >> >> >>> >>> > > non-redhat distro.
> >> >> >>> >>> > >
> >> >> >>> >>> > >
> >> >> >>> >>> > > Mark
> >> >> >>> >>> > >
> >> >> >>> >>> > >
> >> >> >>> >>> > > >
> >> >> >>> >>> > > > _______________________________________________
> >> >> >>> >>> > > > Dev mailing list -- dev(a)ceph.io
> >> >> >>> >>> > > > To unsubscribe send an email to dev-leave(a)ceph.io
> >> >> >>> >>> > > >
> >> >> >>> >>> > >
> >> >> >>> >>> > > _______________________________________________
> >> >> >>> >>> > > Dev mailing list -- dev(a)ceph.io
> >> >> >>> >>> > > To unsubscribe send an email to dev-leave(a)ceph.io
> >> >> >>> >>>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> --
> >> >> >>> >>
> >> >> >>> >> Kaleb
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > --
> >> >> >>> >
> >> >> >>> > Kaleb
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >>
> >> >> >> Kaleb
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> >
> >> >> > Kaleb
> >> >>
> >> >
> >> >
> >> > --
> >> >
> >> > Kaleb
> >>
> >
> >
> > --
> >
> > Kaleb
Hi everyone,
CDM is happening this week, February 7th at 11:00am ET. See more meeting
details below.
Please add any topics you'd like to discuss to the agenda:
https://tracker.ceph.com/projects/ceph/wiki/CDM_07-FEB-2024
Thanks,
Laura Flores
Meeting link:
https://meet.jit.si/ceph-dev-monthly
Time conversions:
UTC: Wednesday, February 7, 16:00 UTC
Mountain View, CA, US: Wednesday, February 7, 8:00 PST
Phoenix, AZ, US: Wednesday, February 7, 9:00 MST
Denver, CO, US: Wednesday, February 7, 9:00 MST
Huntsville, AL, US: Wednesday, February 7, 10:00 CST
Raleigh, NC, US: Wednesday, February 7, 11:00 EST
London, England: Wednesday, February 7, 16:00 GMT
Paris, France: Wednesday, February 7, 17:00 CET
Helsinki, Finland: Wednesday, February 7, 18:00 EET
Tel Aviv, Israel: Wednesday, February 7, 18:00 IST
Pune, India: Wednesday, February 7, 21:30 IST
Brisbane, Australia: Thursday, February 8, 2:00 AEST
Singapore, Asia: Thursday, February 8, 0:00 +08
Auckland, New Zealand: Thursday, February 8, 5:00 NZDT
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores(a)ibm.com | lflores(a)redhat.com
M: +17087388804
Hi cephers,
I've been looking into better balancing our clusters with upmaps lately,
and ran into upmap cases that behave in a less than ideal way. If there
is any cycle in the upmaps like
    ceph osd pg-upmap-items <pgid> a b b a
or
    ceph osd pg-upmap-items <pgid> a b b c c a
the upmap validation passes, the upmap gets added to the osdmap, but
then gets silently ignored. Obviously this is for EC pools - irrelevant
for replicated pools where the order of OSDs is not significant.
The relevant code in OSDMap::_apply_upmap even has a comment about this:
    if (q != pg_upmap_items.end()) {
      // NOTE: this approach does not allow a bidirectional swap,
      // e.g., [[1,2],[2,1]] applied to [0,1,2] -> [0,2,1].
      for (auto& r : q->second) {
        // make sure the replacement value doesn't already appear
        ...
I'm trying to understand the reasons for this limitation: is it the case
that this is just a matter of convenience of coding
(OSDMap::_apply_upmap could do this correctly with a bit more careful
approach), or is there some inherent limitation somewhere else that
prevents these cases from working? I did notice that just updating
crush weights (without using upmaps) produces similar changes to the UP
set (swaps OSDs in EC pools sometimes), so the OSDs seem to be perfectly
capable of doing backfills for osdmap changes that shuffle the order of
OSDs in the UP set. Some insight/history here would be appreciated.
Either way, the behavior of validation passing on an upmap and then the
upmap getting silently ignored is not ideal. I do realize that all
clients would have to agree on this code, since clients independently
execute it to find the OSDs to access (so rolling out a change to this
is challenging).
Andras
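To illustrate the behavior described in the message above, here is a simplified Python model. It is not the Ceph implementation: it assumes only what the quoted comment states, namely that pairs are applied one at a time and a pair is skipped when its replacement value already appears in the set. The function names are hypothetical, and apply_simultaneously is an illustrative alternative, not something Ceph does.

```python
def apply_upmap_items(raw, items):
    """Simplified model: apply (from, to) pairs one at a time, skipping
    any pair whose replacement value already appears in the set."""
    out = list(raw)
    for src, dst in items:
        if dst in out:
            continue  # replacement already present: pair is ignored
        if src in out:
            out[out.index(src)] = dst
    return out


def has_cycle(items):
    """Return True if the (from, to) pairs contain a cycle.

    Assumes each source OSD appears at most once in the pairs.
    """
    nxt = dict(items)
    for start in nxt:
        cur = start
        for _ in range(len(nxt)):
            cur = nxt.get(cur)
            if cur is None:
                break
            if cur == start:
                return True
    return False


def apply_simultaneously(raw, items):
    """Hypothetical alternative: substitute all pairs at once, which
    would permit swaps. Shown only for comparison."""
    mapping = dict(items)
    return [mapping.get(osd, osd) for osd in raw]


swap = [(1, 2), (2, 1)]
print(has_cycle(swap))                        # True
print(apply_upmap_items([0, 1, 2], swap))     # [0, 1, 2]
print(apply_simultaneously([0, 1, 2], swap))  # [0, 2, 1]
```

A check like has_cycle could in principle be used to flag such upmaps before they are submitted, so that they are not accepted and then silently ignored. Whether the validation path should reject them is the open question raised in the message.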
Hi Folks,
Meeting starts in 2 minutes! Today we have a new meeting URL for
google-meet since jitsi has been so flaky. Topics today include the 1
TiB/s results published on ceph.io. Josh Salomon may be presenting his
new work on the balancer as well. Hope to see you there!
Thanks,
Mark
Etherpad:
https://pad.ceph.com/p/performance_weekly
Meeting URL:
https://meet.google.com/ryg-fsip-pwi
Hello
Dev Leads - just a friendly reminder that we've defined new milestones
for reef and quincy releases: 18.2.2 and 17.2.8
Please assign the PRs that are to be tested and merged as part of
those milestones and add the "needs-qa" label.
TIA