Hi guys,
We recently upgraded the ceph-mgr to 15.2.4 (Octopus) in our production
clusters. The status of the cluster is now as follows:
# ceph versions
{
    "mon": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 5
    },
    "mgr": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 3
    },
    "osd": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 1933
    },
    "mds": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 14
    },
    "overall": {
        "ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)": 1955
    }
}
We are now seeing several problems in this cluster:
1. `ceph pg dump` always takes significantly longer than before to return a result.
2. the ceph-exporter sometimes fails to fetch cluster metrics.
3. the cluster occasionally shows a few inactive/down PGs, but they recover very quickly.
We investigated the ceph-mgr but haven't found the root cause yet. There
are some scattered clues (I am not sure whether they help):
1. the ms_dispatch thread is constantly saturating one core.
2. the message size is significantly large, over 40 KB:
2020-09-24T14:47:50.216+0000 7f8f811f6700 1 -- [v2:{mgr_ip}:6800/111,v1:{mgr_ip}:6801/111] <== osd.3038 v2:{osd_ip}:6800/384927 431 ==== pg_stats(17 pgs tid 0 v 0) v2 ==== 42153+0+0 (secure 0 0 0) 0x55dae07c1800 con 0x55daf6dde400
3. we get some "Fail to parse JSON result" errors:
2020-09-24T15:47:42.739+0000 7f8f8da0f700 0 [devicehealth ERROR root] Fail to parse JSON result from daemon osd.1292 ()
4. in the sending channel, we can see lots of faults:
2020-09-24T14:53:17.725+0000 7f8fa866e700 1 -- [v2:{mgr_ip}:6800/111,v1:{mgr_ip}:6801/111] >> v1:{osd_ip}:0/1442957044 conn(0x55db38757400 legacy=0x55db03d8e800 unknown :6801 s=STATE_CONNECTION_ESTABLISHED l=1).tick idle (909347879) for more than 900000000 us, fault.
2020-09-24T14:53:17.725+0000 7f8fa866e700 1 --1- [v2:{mgr_ip}:6800/111,v1:{mgr_ip}:6801/111] >> v1:{osd_ip}:0/1442957044 conn(0x55db38757400 0x55db03d8e800 :6801 s=OPENED pgs=1572189 cs=1 l=1).fault on lossy channel, failing
5. sometimes the mgr-fin thread is also saturating one core.
and from the perf dump we got:
"finisher-Mgr": {
    "queue_len": 1359862,
    "complete_latency": {
        "avgcount": 14,
        "sum": 40300.307764855,
        "avgtime": 2878.593411775
    }
},
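For what it's worth, the avgtime in a perf dump appears to be just sum / avgcount, and those numbers already look pathological. A quick check of the arithmetic (values copied from the dump above; the interpretation is mine):

```python
# Values copied from the "finisher-Mgr" perf dump above.
queue_len = 1359862            # items still waiting in the Mgr finisher queue
avgcount = 14                  # completions measured so far
total = 40300.307764855        # total seconds spent on those completions

avgtime = total / avgcount
print(f"average completion latency: {avgtime:.2f} s")
# At roughly 2878 s (~48 min) per completion, a queue of ~1.36M items
# will effectively never drain, which would explain a busy mgr-fin thread.
```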
Sorry these clues are a little messy. Do you have any comments on
this?
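One footnote on clue 3: the empty parentheses in that devicehealth log suggest osd.1292 returned an empty reply, which json.loads rejects. A minimal sketch of that failure mode (the function is my illustration, not the actual module code):

```python
import json

def parse_daemon_result(daemon, raw):
    """Loosely mimic parsing a daemon's JSON reply; an empty reply fails."""
    try:
        return json.loads(raw)
    except ValueError:
        # Mirrors the shape of the log line above; the "()" held the raw reply.
        print(f"Fail to parse JSON result from daemon {daemon} ({raw})")
        return None

parse_daemon_result("osd.1292", "")   # empty reply -> parse failure, as logged
ok = parse_daemon_result("osd.1292", '{"smart_status": {"passed": true}}')
print(ok["smart_status"])
```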
Thanks.
Regards,
Hao
Today it came to my attention that not all Ceph developers agree with the
following cherry-picking rule:
"if a commit could not be cherry-picked from master, the commit message must
explain why that was not possible" [1]
[1]
https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst#ch…
Now, I (Nathan) am the one who wrote these rules down, but I'm not their author.
These rules are a codification of a set of best practices I "inherited" from
Loic. Although I hesitate to speak on his behalf, I don't think he's around here
anymore, so I'll just go ahead and present what I think is the rationale for this
particular rule.
In the past, regressions often happened because bugs got fixed directly in
a stable branch, but not in master. Later, after a new major stable release was
split off from master and users upgraded their clusters to it, BOOM the bugs
were back! Of course, nobody initially knew why, but it was clear that the bug
was a regression. Therefore, forensic investigations of the git history were
undertaken to find the answer to the question: "which commit fixed this bug
in N-1 and why is that commit not in N?".
One possible tactic in such an investigation is to find all commits in the
N-1 stable branch (which does not exhibit the bug) that aren't cherry-picks, but
potentially should have been. One of these might be the fix, but which one?
Some bugs have to be fixed directly in a stable branch: they cannot be
cherry-picked from master for any number of valid reasons. So, in our
hypothetical forensic investigation, we are faced with the necessity of
distinguishing these "good" direct bug-fixing commits from "bad" ones which
should have been cherry-picks, but are not. But how to make that distinction
when the commit messages themselves are silent on the question of why they
aren't cherry picks? That, I believe, is where this rule came from.
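One way to mechanize the tactic described above is `git cherry`, which compares patch-ids between two branches. A toy sketch in a throwaway repo (branch names and commit messages are illustrative, not Ceph's real history):

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command and return its stdout."""
    return subprocess.run(("git",) + args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
git("symbolic-ref", "HEAD", "refs/heads/master", cwd=repo)
git("config", "user.email", "demo@example.com", cwd=repo)
git("config", "user.name", "Demo", cwd=repo)

# Shared history, then the branches diverge.
with open(os.path.join(repo, "f"), "w") as fh:
    fh.write("shared\n")
git("add", "f", cwd=repo)
git("commit", "-qm", "shared history", cwd=repo)
git("branch", "stable", cwd=repo)
with open(os.path.join(repo, "f"), "a") as fh:
    fh.write("master work\n")
git("commit", "-aqm", "master-only change", cwd=repo)

# A bug fixed directly on the stable branch, never landed on master.
git("checkout", "-q", "stable", cwd=repo)
with open(os.path.join(repo, "g"), "w") as fh:
    fh.write("fix\n")
git("add", "g", cwd=repo)
git("commit", "-qm", "direct fix on stable, not a cherry-pick", cwd=repo)

# Lines starting with "+" are commits on stable with no equivalent patch on
# master -- exactly the set a forensic investigation has to sift through.
print(git("cherry", "-v", "master", "stable", cwd=repo), end="")
```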
Nowadays, it would seem that this type of forensic investigation is rarely
undertaken. BUT let us ask ourselves, could that be because (1) we have these
cherry-picking rules and (2) they are - for the most part - enforced?
Anyway, I thought I'd bring the matter up here in the hope of finding
a consensus on whether this rule should stand as-is or be revised. I don't
relish being in the position of enforcing a rule that the leads (and the
developer community as a whole) don't understand or agree with.
Thanks,
Nathan
I'm looking for clarification on which command should be used to manage configuration settings in Nautilus.
It's not clear which of the "config" commands are supposed to be used. The documentation refers to "ceph config set/get ...", but the man page for ceph(8) only references "ceph config-key ...".
What is the difference between "ceph config set" and "ceph config-key set" and which one is supposed to be used in Nautilus (and later) ?
For example, setting values for the dashboard appears to work using "ceph config set mgr ...":
ceph config set mgr mgr/dashboard/mon02/server_addr 10.4.3.22
But trying to read this value back using "config get" fails:
ceph config get mgr mgr/dashboard/mon02/server_addr
Error EINVAL: unrecognized entity 'mgr'
Furthermore, "ceph config ls" shows a rather long list of keys that should be readable, but attempting to fetch any of them with "ceph config get" always results in the same error, "EINVAL: unrecognized entity 'mgr'".
Is this a known bug? The inconsistencies between the commands and the documentation are confusing.
thanks,
Wyllys Ingersoll
Hi!
After almost a year of development in my spare time I present my own software-defined block storage system: Vitastor - https://vitastor.io
I designed it similarly to Ceph in many ways: it also has Pools, PGs, OSDs, different coding schemes, rebalancing and so on. However, it's much simpler and much faster. In a test cluster with SATA SSDs it achieved a Q1T1 latency of 0.14 ms, which is especially good compared to Ceph RBD's 1 ms for writes and 0.57 ms for reads. In an "iops saturation" parallel-load benchmark it reached 895k read / 162k write iops, compared to Ceph's 480k / 100k on the same hardware. But the most interesting part was CPU usage: the Ceph OSDs were using 40 of the 64 CPU cores on each node, while Vitastor was using only 4.
Of course it's an early pre-release, which means that, for example, it lacks snapshot support and other useful features. However, the base is finished: it works and runs QEMU VMs. I like the design and I plan to develop it further.
There are more details in the README file, which is currently served from https://vitastor.io
Sorry if it was a bit off-topic, I just thought it could be interesting for you :)
--
With best regards,
Vitaliy Filippov
Hi Ceph Developers,
The Ceph community is planning on participating in the upcoming round of
Outreachy (https://www.outreachy.org/).
Applicants will be applying for internships during the month of October and
interns would work on their projects from December - March.
If you're interested in mentoring a project, please add your ideas to this
projects list:
https://pad.ceph.com/p/project-ideas
I will be visiting various standup meetings over the coming weeks to
discuss project ideas as well. If you have any questions, please reach out
to me.
Best,
Ali
Hoi,
Clearly some code has been backported to Nautilus, since my FreeBSD
Nautilus builds fail on:
gmake[2]: Leaving directory '/home/jenkins/workspace/ceph-nautilus/build'
/home/jenkins/workspace/ceph-nautilus/src/test/libcephfs/lazyio.cc:24:10: fatal error: 'sys/xattr.h' file not found
#include <sys/xattr.h>
^~~~~~~~~~~~~
gmake[2]: Entering directory '/home/jenkins/workspace/ceph-nautilus/build'
Something I have fixed in:
https://github.com/ceph/ceph/pull/30505
And tracked in:
https://tracker.ceph.com/issues/42448
So what do I need to do to get this fix backported as well?
Thanx,
--WjW
Thanks Marc :)
It's easier to write code than to cooperate :) I can do whatever I want in my own project.
Ceph is rather complex. For example, I failed to find bottlenecks in OSD when I tried to profile it - I'm not an expert of course, but still... The only bottleneck I found was cephx_sign_messages=true by default. Now I always disable it. In fact I don't think Ceph needs those signatures at all because 99.9% of setups live in private networks. Ceph has ~1M lines of code. Vitastor has 22k :). Bluestore is complicated, SeaStore seems like it may also end up being complicated, there are a lot of other architectural things like RBD cache, RBD object map, immediate commit semantics for all writes and so on that can't be easily fixed. It would take MUCH more than a year to fix everything.
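For reference, turning off message signing as mentioned above is a one-line ceph.conf change. This is a sketch of the trade-off being described, not a blanket recommendation, since it does remove an integrity check that some deployments may want:

```ini
[global]
# Disable cephx message signing: saves CPU per message, but drops
# per-message integrity protection -- only sensible on trusted
# private networks.
cephx_sign_messages = false
```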
Ceph is great for object storage, but 1ms write latency in an NVMe cluster is something that annoyed me so much that I basically had to try to reinvent the wheel. So I hope my wheel will make its way into production at some point :)
> Vitaliy you are crazy ;) But really cool work. Why not combine efforts
> with ceph? Especially with something as important as SDS and PB's of
> clients data stored on it, everyone with a little bit of brain chooses a
> solution from a 'reliable' source. For me it was decisive to learn that
> CERN and NASA were using this on a large scale. I do not have the
> expertise nor time (like probably 90% of ceph users) to test how they
> have been testing and using ceph.
>
> I often see opensource projects that could benefit from cooperation.
> Some teams totally lack the expertise that others have, and vice versa.
> Providing the community with 10 or 20 'shitty' projects instead of 3
> 'good' projects.
> I think opensource projects should more often embrace a sort of modular
> development solution. Where others can change functionality by replacing
> just a module. If I ever get my idea funded, I would make it like this.
Hi Folks,
The weekly performance meeting will be starting in 20 minutes! Last week
was mostly spent discussing Igor's recent testing, so today we are going
to continue discussing refactoring onodes in bluestore to improve memory
usage and CPU overhead. See you there!
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Thanks,
Mark
Hi Marc,
On 9/17/20 11:16 AM, Marc Roos wrote:
> This[1] and natural evolution(?)
>
> [1]
> https://bootstrap-datepicker.readthedocs.io/en/v1.9.0/
> Support Read the Docs!
>
> Please help keep us sustainable by allowing our Ethical Ads in your ad
> blocker or go ad-free by subscribing.
>
> Thank you! ❤️
Thanks for the info! That prompted me to read up some more on this.
RTD is actually quite open and honest about their advertising model [1],
and they do provide options to opt out of paid ads [2], so disabling
those is definitely an option we should consider, if we haven't
done so already.
At present, I personally feel that the benefits of hosting the docs on
their platform, versus doing it on our own infrastructure, make the ads
an acceptable trade-off, especially if they are only about other open
source projects.
As RTD is really an aggregation point for a lot of other open source
projects, it may also help us to gain visibility.
Of course, if "natural evolution" really kicks in and things get too
intrusive/annoying, we should re-evaluate this and take action accordingly.
Lenz
[1] https://docs.readthedocs.io/en/latest/advertising/index.html
[2]
https://docs.readthedocs.io/en/latest/advertising/ethical-advertising.html#…
--
SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)