Hi all,
In last week's performance call I presented benchmark results comparing
logging with dout versus LTTng. The cluster uses all-flash storage. Here
are the results:
https://docs.google.com/spreadsheets/d/1NfNBV8sm1_tgdBYmMAfmoEufZkkUNpkJNZ0…
The PR is here:
https://github.com/ceph/ceph/pull/22458
Note that I only converted the log statements in BlueStore.cc. The rest
of the codebase is unchanged, and the default log levels are left in place.
The take-aways are the following:
- There is a noteworthy performance difference only for high log levels
(debug_bluestore = 20). At a log level of 1, the performance difference
isn't significant.
- The performance difference shows up mostly at high queue depths, which
probably increase contention over some resource on the OSDs. It looks
like there may be a lock on an internal buffer used by dout's log
statements, or the cost may come from serializing the messages to the
output file. LTTng uses per-CPU buffers and per-CPU files, which is why
it doesn't show these scalability issues.
- The difference in the log files' on-disk size is significant: LTTng's
output is about an order of magnitude smaller (this is not shown in
these results).
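The contention suspected in the second takeaway can be sketched as follows. This is a hypothetical illustration, not Ceph or LTTng code: one shared buffer behind a single mutex (the pattern I suspect dout of) versus one buffer per thread (the structural idea behind LTTng's per-CPU ring buffers).

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Shared-buffer variant: every log call contends on one lock,
// so more writers means more serialization on the hot path.
std::mutex log_mutex;
std::vector<std::string> shared_log;

void log_shared(const std::string& msg) {
  std::lock_guard<std::mutex> lock(log_mutex);  // single serialization point
  shared_log.push_back(msg);
}

// Per-thread variant: each thread appends to its own buffer, no lock on the
// hot path; buffers would be flushed to per-CPU files out of band.
thread_local std::vector<std::string> local_log;

void log_local(const std::string& msg) {
  local_log.push_back(msg);  // no contention
}

// Drive `nthreads` writers through the shared-buffer path and return the
// total record count, so the two variants can be compared under load.
size_t run_shared(int nthreads, int msgs_per_thread) {
  shared_log.clear();
  std::vector<std::thread> threads;
  for (int t = 0; t < nthreads; ++t) {
    threads.emplace_back([=] {
      for (int i = 0; i < msgs_per_thread; ++i)
        log_shared("op " + std::to_string(i));
    });
  }
  for (auto& th : threads) th.join();
  return shared_log.size();
}
```

Both variants record the same data; the difference is only where the synchronization cost is paid, which is consistent with the penalty appearing at high queue depths.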
The downside of converting log statements to LTTng is the usability
concern, for both users and developers. For users, LTTng has to be
running, we need to add tracing-session management, and so on. For
developers, adding tracepoints to the code is not as easy as using dout.
In my PR, there are two ways of adding LTTng tracepoints in Ceph:
1)
https://github.com/ceph/ceph/pull/22458/files#diff-a9faffcf40600fd57aea5451…
And
2)
https://github.com/ceph/ceph/pull/22458/files#diff-a9faffcf40600fd57aea5451…
The second way is easier, but it is less efficient because everything is
converted to strings. It would still benefit from LTTng's per-CPU
logging, though; unfortunately, I didn't benchmark it. The benchmark
results above were produced using the first way, which uses basic types
wherever possible instead of converting everything to strings. The
downside is that the type of each field has to be specified.
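The trade-off between the two ways can be sketched like this; the names and structs are invented for illustration and are not the actual PR code or LTTng macros:

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Way 1: typed payload. Recording is just a couple of integer copies, and
// formatting is deferred to trace post-processing, but every field's type
// has to be declared up front.
struct TypedEvent {
  uint64_t offset;
  uint32_t length;
};

std::vector<TypedEvent> typed_trace;

void trace_typed(uint64_t offset, uint32_t length) {
  typed_trace.push_back(TypedEvent{offset, length});  // no string work
}

// Way 2: everything stringified at the call site, like a dout message.
// Easier to write, but the allocation and formatting cost is paid on the
// hot path even though the record still goes through per-CPU buffers.
std::vector<std::string> string_trace;

void trace_string(uint64_t offset, uint32_t length) {
  std::ostringstream oss;  // allocation + formatting up front
  oss << "write offset=" << offset << " len=" << length;
  string_trace.push_back(oss.str());
}
```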
What do you think we should do with these results? One option, obviously,
is to keep using dout, since the performance penalty is acceptable at the
default log levels. We could then rework some of dout's code to reduce
the tail latency and, where possible, the contention. Another option
would be to move the high-log-level messages to LTTng and keep the
critical messages (dout(0) and dout(1)) on dout.
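The second option amounts to routing by level, roughly like this; all names here are invented stand-ins, not actual Ceph or LTTng APIs:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: critical levels (0 and 1) keep going through a
// dout-like sink, while verbose levels go to a tracer sink.
std::vector<std::string> dout_sink;    // stand-in for the dout log
std::vector<std::string> tracer_sink;  // stand-in for an LTTng session

void log_at(int level, const std::string& msg) {
  if (level <= 1) {
    dout_sink.push_back(msg);    // critical: always via dout
  } else {
    tracer_sink.push_back(msg);  // verbose: via the tracer's per-CPU buffers
  }
}
```

This keeps the operator-facing messages in the familiar log file while moving the high-volume debug traffic, which is where the benchmarks showed the contention, onto the tracer.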
Either way, I think we should be more mindful when adding log statements
to the code: log only short messages, reduce string copying as much as
possible, log only the necessary information to limit use of the <<
operator (it does a lot of conversion of basic types to strings), be
conservative when setting dout_prefix, and so on.
Mohamad
Hey Kefu, dev@
I rebased a branch on top of current master and now cmake doesn't work
for me. It builds in Jenkins, so it's obviously some kind of
configuration issue, but I don't know what I should do to resolve it.
I'm running Fedora 27. I ran "rm -rf build"; "./install-deps.sh";
"./do_cmake.sh" and got an error about an unsupported target.
https://paste.fedoraproject.org/paste/kbrTrAbr~GBY0DH0-Ndf8A
In particular, x86_64-native-linux-gcc is not in the (long) list of
target options. This seems to be a DPDK-only thing, so there's not a lot
of help available online.
I rolled master back to the commit before
https://github.com/ceph/ceph/pull/28507 and it worked, but something
else more recent might also have broken it. (My first candidate was
today's update to a newer DPDK, but rolling back to before that didn't
fix it.)
Any ideas?
-Greg
Hi dev,
Nice to meet you
I want to modify the Ceph dashboard code.
I tried modifying the files under /usr/lib/ceph/mgr/dashboard/frontend,
but the web UI doesn't change, even if I delete the whole frontend folder.
I guess the frontend has to be built into the frontend/dist folder with
Angular or Node.js; however, both methods failed for me.
Can you tell me how to build or run the frontend code I modified so that
it shows up in the web UI?
Thank you very much
Best regards
Hausiowen Nayu
Hi all,
I recently read the ceph-monstore-tool code and have a question about
the rebuild operation.
In update_paxos() we read the osdmap, pgmap, auth and pgmap_pg records
into pending_proposal (a bufferlist) as the value of the key paxos_1,
set paxos_pending_v=1, and set paxos_last_committed=0 and
paxos_first_committed=0.
My question is: if we start the mon after the rebuild (say there is
only one mon now), the mon will not commit paxos_pending_v=1, and if we
then change the osdmap with 'ceph osd set noout', the new pending_v=1
will overwrite the one written by the rebuild. So I think we don't need
to set paxos_1=pending_proposal and paxos_pending_v=1 in
'ceph-monstore-tool rebuild'.
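To make the question concrete, here is a simplified sketch of the sequence I mean. The key names and types are placeholders for illustration, not the real mon store schema (the real tool writes bufferlists through a key/value store):

```cpp
#include <cassert>
#include <map>
#include <string>

std::map<std::string, std::string> store;  // stand-in for the mon store

// rebuild stages the recovered maps as the pending proposal for version 1.
void rebuild(const std::string& recovered_maps) {
  store["paxos:1"] = recovered_maps;       // pending_proposal as paxos_1
  store["paxos:pending_v"] = "1";
  store["paxos:last_committed"] = "0";
  store["paxos:first_committed"] = "0";
}

// The first map change after the mon starts (e.g. 'ceph osd set noout')
// proposes version 1 again and overwrites the staged proposal, which is
// why the staged paxos_1/pending_v look redundant to me.
void first_proposal_after_start(const std::string& new_maps) {
  store["paxos:1"] = new_maps;             // clobbers the rebuild's value
  store["paxos:pending_v"] = "1";
}
```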
Thanks!
+dev@ceph.io
On Wed, Jun 19, 2019 at 11:18 AM Rishabh Dave <ridave@redhat.com> wrote:
>
> Hi all,
>
> I am working on a ceph-ansible playbook[1] that removes an MDS from an
> already deployed Ceph cluster. Going through documentation and
> ceph-ansible codebase I found out 3 ways to stop an MDS -
>
> * ceph mds fail <mds-name> && rm -rf /var/lib/ceph/mds/ceph-{id} [2]
> * systemctl stop ceph-mds@$HOSTNAME
> * ceph tell mds.x exit
>
> How do these 3 ways compare to each other? I ran these commands on
> ceph-ansible deployed cluster and all 3 had the very same effect. Is
> any one of these better than the rest?
The first one doesn't cause the mds process to exit. I would suggest
the systemd approach, since systemd may restart a daemon that exits on
its own (as with the third approach).
> What about "ceph mds rm" and "ceph mds rmfailed"? The first time I was
Those are dev commands not meant for this purpose.
> looking for various ways to stop an MDS, I tried "ceph mds fail
> <mds-name> && ceph mds rm <global-id>" and it did not work since "ceph
> mds rm" requires an MDS to be inactive[3]. Is there a way to render an
> MDS inactive? I couldn't find one.
>
> I also tried "ceph mds fail <mds-name> && ceph mds rmfailed
> <mds-rank>" but this did not stop the MDS. It only changed the MDS's
> state to "standby" -
>
> (teuth-venv) $ ./bin/ceph fs dump | grep -A 1 standby_count_wanted 2> /dev/null
> dumped fsmap epoch 4
> standby_count_wanted 0
> 4232: [v2:192.168.0.217:6826/2113356090,v1:192.168.0.217:6827/2113356090]
> 'a' mds.0.3 up:active seq 4
> (teuth-venv) $ ./bin/ceph mds fail a 2> /dev/null && ./bin/ceph mds
> rmfailed --yes-i-really-mean-it 0 2> /dev/null && ./bin/ceph fs dump |
> grep -A 3 Standby 2> /dev/null
> dumped fsmap epoch 6
> Standby daemons:
>
> 4286: [v2:192.168.0.217:6826/401505106,v1:192.168.0.217:6827/401505106]
> 'a' mds.-1.0 up:standby seq 1
> (teuth-venv) $
>
> Also, I find the usage of "remove" in this doc[2] ambiguous -- it can
> mean removing MDS from cluster by changing MDS's state to standby or
> it can mean killing/stopping it altogether. Reading [2] my impression
> was that it meant killing/stopping it but "remove" is also used to
> describe "ceph mds rm" and "ceph mds rmfailed" commands. Of these, at
> least "ceph mds rmfailed" does not stop the MDS. If I am not the only
> one to find this ambiguous, I'll go ahead and change the docs
> accordingly.
[2] is not really useful documentation, unfortunately. The best way to
stop an MDS when you want to permanently remove the daemon is to just
have the service manager (systemd) stop it. The only other
consideration is whether you have a replacement MDS available to take
over (if the operator even wants that to happen).
> [1] https://github.com/ceph/ceph-ansible/pull/4083
> [2] http://docs.ceph.com/docs/master/cephfs/add-remove-mds/
> [3] http://docs.ceph.com/docs/master/man/8/ceph/
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hi everyone,
There has been some interest in a feature that helps users mute health
warnings. There is a Trello card[1] associated with it, and we've had
some discussion[2] about it in a past CDM. In general, we want to
understand a few things:
1. what the level of interest in this feature is
2. how long these warnings should be muted - should the period be
decided by us or by the user?
3. the possible misuse of this feature and the negative impact of
muting some warnings
Let us know what you think.
[1] https://trello.com/c/vINMkfTf/358-mute-health-warnings
[2] https://pad.ceph.com/p/cephalocon-usability-brainstorming
Thanks,
Neha