February 2020 - Dev - lists.ceph.io

DocuBetter meeting 26 Feb 2020

by John Zachary Dover

Hi everyone.

The next DocuBetter meeting is scheduled for tomorrow, at the following time:

1800 PST 26 Feb 2020
0200 UTC 27 Feb 2020
1200 AEST 27 Feb 2020

Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Meeting: https://bluejeans.com/908675367

Agenda: This week we will be discussing documentation requests, the ceph osd df man page and documentation, the PG repair issue (finally complete), and a request to include information in the bootstrapping procedure about how long the bootstrapping command takes to execute.

4 years, 1 month

Introduction and wish to contribute

by tonypro

Dear all,

I am Duplex Kamdjou (you can call me Duplex), a web developer with good knowledge of JavaScript and intermediate knowledge of C/C++ programming. I was looking for a good project to improve my C programming skills and came across this one. I would like to learn more about Ceph and its applications, and I would appreciate any guidance you can give me on how to get involved and contribute. I am mostly interested in features related to C and JavaScript.

Thanks in advance. I hope to hear from you soon.

--
Kamdjou Temfack Duplex M
Phone: +237 670274538
Software Engineer / Full-Stack developer, Open-source Contributor
Twitter: @tony14pro <https://twitter.com/Tony14Pro>
Github: https://github.com/kamdjouduplex
Website: http://bproo.com <https://bproo.com>

4 years, 2 months

Ceph @ SoCal Linux Expo

by Gregory Farnum

Hey all,

We're excited to be returning properly to SCaLE in Pasadena[1] this year (March 5-8) with a Thursday Birds-of-a-Feather session[2] and a booth in the expo hall. Please come by if you're attending the conference or are in the area to get face time with other area users and Ceph developers. :)

Also, I got drafted into organizing this, so if you'd be willing to help man the booth in exchange for an Expo pass, shoot me an email! I think I've got 3 spots left.

-Greg

[1]: https://www.socallinuxexpo.org/scale/18x
[2]: https://www.socallinuxexpo.org/scale/18x/presentations/ceph-storage

4 years, 2 months

make check fails to launch tests on centos 8/senta03

by Rishabh Dave

Hi all,

I ran "make check" on senta03.front.sepia.ceph.com but I get the following error on stdout:

    Ignoring mock: markers 'python_version <= "3.3"' don't match your environment
    Ignoring ipaddress: markers 'python_version < "3.3"' don't match your environment
    Looking in links: file:///home/rishabh/master/src/pybind/mgr/wheelhouse
    Obtaining file:///home/rishabh/master/src/python-common (from -r requirements.txt (line 4))
    ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Here's the requirements.txt[1] referred to above. Do I need a Python version > 2.7 but lower than 3.3? I don't see any such version on a different machine (it runs Fedora 31) where "make check" launched successfully (although the tests there failed). I see every version from python3.4 to python3.9, plus python2.7, on my machine, but nothing that matches "python_version <= 3.3". On senta03 I can see python3.6, python3.7 and python2.7.

Also, where is the log mentioned in the error message saved, and under what name?

I tried running run-make-check.sh too. It also aborted before launching any tests. On CentOS 8, it complained that the packages colm-0.13.0.7-1.el8.x86_64.rpm and ragel-7.0.0.12-2.el8.x86_64.rpm are unsigned, and on Fedora 31 it complained that python37-coverage is missing.

I've attached the output of make check and of run-make-check.sh on senta03 and Fedora 31 as make-check.log, run-make-check-centos8.log and run-make-check-f31.log.

[1] https://github.com/ceph/ceph/tree/master/src/python-common/

Thanks,
- Rishabh
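
One way to read the pip output quoted above: the two "Ignoring ..." lines look like informational notices that a requirement's environment marker evaluated to false on the running interpreter, so mock and ipaddress were skipped, while the actual failure appears to be the `python setup.py egg_info` error for python-common. The sketch below is a simplified stand-in for marker evaluation, not pip's implementation. It handles only markers of the form `python_version <op> "X.Y"`, and the helper names `version_tuple` and `python_version_marker_matches` are hypothetical.

```python
import operator
import sys

# Comparison operators supported by this simplified sketch.
_OPS = {
    "<=": operator.le,
    "<": operator.lt,
    ">=": operator.ge,
    ">": operator.gt,
    "==": operator.eq,
    "!=": operator.ne,
}


def version_tuple(text):
    """Turn a dotted version such as '3.3' into a tuple of ints: (3, 3)."""
    return tuple(int(part) for part in text.split("."))


def python_version_marker_matches(marker, running=None):
    """Evaluate a marker of the form: python_version <op> "X.Y".

    `running` is a (major, minor) tuple; it defaults to the current
    interpreter. Anything beyond this one marker shape is rejected.
    """
    if running is None:
        running = sys.version_info[:2]
    name, op, wanted = marker.split()
    if name != "python_version" or op not in _OPS:
        raise ValueError("unsupported marker: %r" % marker)
    return _OPS[op](tuple(running), version_tuple(wanted.strip("'\"")))


if __name__ == "__main__":
    for interpreter in [(2, 7), (3, 6), (3, 7)]:
        wanted = python_version_marker_matches(
            'python_version <= "3.3"', interpreter)
        print(interpreter, "installs mock" if wanted else "ignores mock")
```

Under this reading, no interpreter between 2.7 and 3.3 is required: the markers only pull in the backported packages on old Pythons, and on 3.6 or 3.7 they are skipped.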

4 years, 2 months

Request for Comments on the Documentation & DocuBetter Meeting 26 Feb

by John Zachary Dover

My name is Zac Dover and I was hired by Sage to improve the Ceph documentation. For the past few months, I have been reading the existing documentation and making bugfixes where I am able. Now I think it's time to ask the general Ceph community for complaints about, and requests for improvements to, the documentation.

There is a general documentation meeting called the "DocuBetter Meeting", and it is held every two weeks. The next DocuBetter Meeting will be on February 26, 2020 at 6 PM PST, and will run for thirty minutes. Everyone with a documentation-related request or complaint is invited. The meeting will be held here: https://bluejeans.com/908675367

Send documentation-related requests and complaints to me by replying to this email and CCing me at zac.dover(a)gmail.com.

This message will be sent to dev(a)ceph.io every Monday morning, North American time.

Zac Dover

4 years, 2 months

02/20/2019 perf meeting is on at 8AM PST!

by Mark Nelson

Hi Folks,

Josh and I are back from PTO, so it's time to get the perf meeting going again! Today I'd like to talk a little bit about some testing I did while on PTO, looking at CephFS performance using the HPC io500 benchmark. If Igor is able to make it, I'm hoping we can also talk a little bit about his new hybrid AVL/bitmap allocator for bluestore and his deferred write PR.

Hope to see you there!

Etherpad: https://pad.ceph.com/p/performance_weekly
Bluejeans: https://bluejeans.com/908675367

Thanks,
Mark

4 years, 2 months

Backport device health fixes to Nautilus

by Benoît Knecht

Hi,

I would like to see https://github.com/ceph/ceph/pull/28848 backported to Nautilus, as I'm currently unable to use devicehealth on 14.2.7 because a smartctl exit code > 0 is not handled properly.

I cherry-picked those commits onto the nautilus branch, and they all apply cleanly, but when I try to follow https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst#cr…, I'm stuck because (as far as I can tell) the "master tracker issue" doesn't exist.

What would be the best way forward in this case? Submit a PR without a backport tracker issue? Manually create the backport issue?

Cheers,
--
Ben

4 years, 2 months

FYI nautilus branch is locked

by Yuri Weinstein

We are getting ready to test 14.2.9, and the nautilus branch is locked for merges until it's done.

sha1 - 4d5b84085009968f557baaa4209183f1374773cd

Nathan, Abhishek, pls confirm.

Thank you
YuriW

4 years, 2 months

Pitfalls when using RBD Snapshot as timely backup

by Xiaoxi Chen

Hi List,

We are using RBD snapshots as periodic backups for DBs: 24 hourly snapshots plus 30 daily snapshots are kept for each RBD image. It worked perfectly at the beginning; however, as the number of volumes increased, more and more significant pitfalls appeared. We are at ~700 volumes, which means creating 700 snapshots and rotating 700 snapshots every hour.

1. Huge and frequent OSDMap updates

The OSDMap is ~640K in size, with a long and scattered "removed_snaps". The holes in the removed_snaps interval set come from two sources:

- In our use case we keep daily snapshots for longer, so each daily snapshot turns out to be a hole in the removed_snaps interval set.
- https://github.com/ceph/ceph/blob/v14.2.4/src/osd/osd_types.cc#L1583-L1586 adds a new snapid for each snapshot removal. According to the comment, the new snapid is intended to keep the interval_set contiguous. However, I cannot understand how it works; it seems to me that this behavior creates more holes when creates and deletes interleave with each other.

After processing 4 or 5 versions of the map, the RocksDB write-ahead log (WAL) is full and the corresponding memtable has to be flushed to disk.

2. pg_stat updates burn out the MGR

Starting from Mimic, each PG by default reports up to 500 (osd_max_snap_prune_intervals_per_epoch) purged_snapshot intervals to the MGR. This significantly inflates the size of pg_stat and causes the MGR to use 20GB+ of memory and 260%+ CPU (mostly in the messenger threads and the MGR_FIN thread) and to become very unresponsive. Reducing osd_max_snap_prune_intervals_per_epoch to 10 fixed the issue in our environment.

3. SnapTrim IO overhead

Though there are tuning knobs to control the speed of snaptrim, it still needs to keep up with the snapshot creation rate. What is more, snaptrim introduces huge amplification in the RocksDB WAL, maybe due to 4K alignment in the WAL. We observed 156GB of WAL written while trimming 100 snapshots, whereas the generated L0 was 4.63GB, which seems related to WAL page-alignment amplification.

The PG purges snapshots from snaptrim_q one by one. We are wondering whether several purged snapshots for a given volume could be compacted and trimmed together, so that we might get better efficiency (we would only need to change the snapset for a given object once).

4. Deep-scrub on objects with hundreds of snapshots is super slow, and osd_op_w_latency surged 10x in our environment as a result. We have not yet dived deep into this.

5. How does the cache tier work with snapshots? Does a cache tier help with write performance in this case?

There are several outstanding PRs, like https://github.com/ceph/ceph/pull/28330, to optimize snaptrim and especially to get rid of removed_snaps. We believe they will partly help with #1, but we are not sure how much they help with the others. As this is a production environment, upgrading to the Octopus RC is not feasible at the moment; we will try it out once the stable release is out.

-Xiaoxi
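
The first source of holes described in item 1 can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Ceph's interval_set code: snapshot IDs are assumed to be consecutive integers taken once per hour, every 24th one is retained as a daily snapshot and all others are removed, and the extra snapid allocated on removal (the osd_types.cc behavior mentioned above) is not modeled. The function names `to_intervals` and `removed_after_rotation` are hypothetical.

```python
def to_intervals(ids):
    """Collapse integer IDs into a sorted list of inclusive (start, end) runs."""
    intervals = []
    for snap_id in sorted(ids):
        if intervals and snap_id == intervals[-1][1] + 1:
            # Extends the current contiguous run.
            intervals[-1][1] = snap_id
        else:
            # A gap (a retained snapshot) starts a new run.
            intervals.append([snap_id, snap_id])
    return [tuple(pair) for pair in intervals]


def removed_after_rotation(total_hours, keep_every=24):
    """IDs removed when only every `keep_every`-th hourly snapshot is kept."""
    return [i for i in range(1, total_hours + 1) if i % keep_every != 0]


if __name__ == "__main__":
    removed = removed_after_rotation(72)  # three days of hourly snapshots
    for start, end in to_intervals(removed):
        print("removed snaps %d..%d" % (start, end))
```

In this model each retained daily snapshot splits the removed range, so the number of intervals grows with the number of dailies kept per volume rather than staying at one.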

4 years, 2 months

flock is held after ceph-osd daemon being stopped

by Yiming Zhang

Hi All,

I noticed a locking issue in the kernel device. After I stop the ceph cluster and all daemons, the kernel device _lock is somehow still held, and the flock call below returns r < 0:

    int KernelDevice::_lock()
    {
      int r = ::flock(fd_directs[WRITE_LIFE_NOT_SET], LOCK_EX | LOCK_NB);
      …
    }

This is how I stop the cluster and daemons:

    sudo ../src/stop.sh
    sudo bin/init-ceph --verbose forcestop

The error happens even after a reboot when I try to use vstart:

    bdev _lock flock failed on ceph/build/dev/osd0/block
    bdev open failed to lock /home/yzhan298/ceph/build/dev/osd0/block: (11) Resource temporarily unavailable
    OSD::mkfs: couldn't mount ObjectStore: error (11) Resource temporarily unavailable
    ** ERROR: error creating empty object store in ceph/build/dev/osd0: (11) Resource temporarily unavailable

Please advise. (On master branch.)

Thanks,
Yiming
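
The "(11) Resource temporarily unavailable" in the log above is what a non-blocking flock() reports when another open file description already holds the lock. The sketch below reproduces that behavior in isolation, assuming a Linux host. It uses Python's fcntl module and a temporary file, not Ceph's KernelDevice code, and the helper names `try_exclusive_lock` and `demo` are hypothetical.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Try LOCK_EX | LOCK_NB on fd; return 0 on success, else the errno."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return 0
    except OSError as exc:
        return exc.errno


def demo():
    """Return the results of three lock attempts on one temporary file."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Two separate open() calls give two open file descriptions,
        # which is what flock() locks are attached to.
        holder = os.open(tmp.name, os.O_RDWR)
        contender = os.open(tmp.name, os.O_RDWR)
        try:
            first = try_exclusive_lock(holder)      # takes the lock
            second = try_exclusive_lock(contender)  # refused while held
            os.close(holder)                        # closing releases it
            holder = -1
            third = try_exclusive_lock(contender)   # now succeeds
        finally:
            if holder != -1:
                os.close(holder)
            os.close(contender)
    return first, second, third


if __name__ == "__main__":
    first, second, third = demo()
    print("holder:", first)
    print("contender while held:", errno.errorcode.get(second, second))
    print("contender after close:", third)
```

Because the kernel drops a flock() lock once the last descriptor for the locking open file description is closed, including at process exit, a persistent (11) may point to some process that still has the block file open. Checking with a tool such as lsof or fuser is one way to test that guess.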

4 years, 2 months
