Here we go again! As usual the conference theme is intended to
inspire, not to restrict; talks on any topic in the world of free and
open source software, hardware, etc. are most welcome, and Ceph talks
definitely fit.
I've added this to https://pad.ceph.com/p/cfp-coordination as well.
-------- Forwarded Message --------
Subject: [lca-announce] linux.conf.au 2020 - Call for Sessions and
Miniconfs now open!
Date: Tue, 25 Jun 2019 21:19:43 +1000
From: linux.conf.au Announcements <lca-announce(a)lists.linux.org.au>
Reply-To: lca-announce(a)lists.linux.org.au
To: lca-announce(a)lists.linux.org.au
The linux.conf.au 2020 organising team is excited to announce that the
linux.conf.au 2020 Call for Sessions and Call for Miniconfs are now open!
These will stay open from now until Sunday 28 July Anywhere on Earth
(AoE) (https://en.wikipedia.org/wiki/Anywhere_on_Earth).
Our theme for linux.conf.au 2020 is "Who's Watching", focusing on
security, privacy and ethics.
As big data and IoT-connected devices become more pervasive, it's no
surprise that we're more concerned about privacy and security than ever
before.
We've set our sights on how open source could play a role in maximising
security and protecting our privacy in times of uncertainty.
With the concept of privacy continuing to blur, open source could be the
solution to give us '2020 vision'.
Call for Sessions
Would you like to talk in the main conference of linux.conf.au 2020?
The main conference runs from Wednesday to Friday, with multiple streams
catering for a wide range of interest areas.
We welcome you to submit a session
(https://linux.conf.au/programme/sessions/) proposal for either a talk
or tutorial now.
Call for Miniconfs
Miniconfs are dedicated day-long streams focusing on single topics,
creating a more immersive experience for delegates than a session.
Miniconfs are run on the first two days of the conference before the
main conference commences on Wednesday.
If you would like to organise a miniconf
(https://linux.conf.au/programme/miniconfs/) at linux.conf.au, we want
to hear from you.
Have we got you interested?
You can find out how to submit your session or miniconf proposals at
https://linux.conf.au/programme/proposals/.
If you have any other questions you can contact us via email at
contact(a)lca2020.linux.org.au.
We are looking forward to reading your submissions.
linux.conf.au 2020 Organising Team
---
Read this online at
https://lca2020.linux.org.au/news/call-for-sessions-miniconfs-now-open/
_______________________________________________
lca-announce mailing list
lca-announce(a)lists.linux.org.au
http://lists.linux.org.au/mailman/listinfo/lca-announce
Hi all,
I've had a chat with Sage & Dan Mick about the current state of
telemetry, and I'd like to propose a few ideas to hopefully improve it
and make the data collected more relevant.
The current data is quite limited. I was able to take a look at, say,
how many pools out there (well, of the ~300ish clusters that ever
reported) have a non-2^n pg_num, but seeing whether this affects
performance or data distribution was not possible.
My goal is to have telemetry data that allows us to make more informed
decisions about what matters to the user base; the comments below are
not necessarily ordered by relevance, since they grew out of a thread on
looking at the current data reported.
Curious about your thoughts - too detailed information? Anything you'd
like to see included? What'd help you in your area?
- The crash section does expose actual hostnames ("entity_name"). If we
want to preserve the ability to tell whether it's the same entity
crashing or a different one, I'd propose that, similar to report_id, we
generate a report_secret_salt in the plugin that we never share with
the server; we can then use it to hash any potentially identifying
strings consistently.
(This will change with Sage's pending PR to point this at a different
channel.)
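A minimal sketch of that salting idea in Python (the function names and the report_secret_salt handling here are illustrative, not the actual plugin code):

```python
import hashlib
import secrets

def generate_salt() -> str:
    # Generated once per cluster and stored only locally,
    # never sent to the telemetry server.
    return secrets.token_hex(32)

def anonymize(value: str, salt: str) -> str:
    # Same input + same salt -> same digest, so the server can still
    # tell "the same entity crashed again" apart from "a different
    # entity crashed", without ever learning the hostname itself.
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

salt = generate_salt()
a = anonymize("mon.host-1", salt)
b = anonymize("mon.host-1", salt)
c = anonymize("mon.host-2", salt)
assert a == b and a != c
```

Since the salt never leaves the cluster, the server cannot run a dictionary attack against common hostnames across clusters, yet repeated crashes from one entity still correlate.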
- The pool reporting should include:
- EC policy (plugin, parameters)
- I can tell whether a pool is EC, and the sum k+m, but not k or m
individually ...
- Pool application association (and it'd be lovely if we could tell
data/metadata pools apart for CephFS/RBD)
- Possibly per-pool usage?
- The report should include the enabled plugins
- Plugins should have a standard API call to report their own telemetry
- e.g., balancer/pg_autoscaler settings come to mind
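As a rough sketch of what such a standard call could look like (the get_telemetry() hook, the class, and the collection helper are hypothetical, not an existing mgr interface):

```python
# Hypothetical convention: each mgr module implements get_telemetry(),
# returning a JSON-serializable dict; the telemetry module collects
# the results under each module's name.

class BalancerModule:
    """Stand-in for a mgr module exposing its own telemetry."""
    def __init__(self):
        self.active = True
        self.mode = "upmap"

    def get_telemetry(self) -> dict:
        return {"active": self.active, "mode": self.mode}

def collect_plugin_telemetry(modules: dict) -> dict:
    # Only modules that implement the hook contribute to the report.
    report = {}
    for name, module in modules.items():
        hook = getattr(module, "get_telemetry", None)
        if callable(hook):
            report[name] = hook()
    return report

report = collect_plugin_telemetry({"balancer": BalancerModule()})
assert report == {"balancer": {"active": True, "mode": "upmap"}}
```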
- The way the ceph version/os/distro/kernel/description/cpu/arch fields
are currently aggregated individually makes them very difficult to
analyze. In case you're not familiar, it looks something like this
(trimmed):
"kernel_version": {
"4%15%0-54-generic": 6,
"4%15%0-50-generic": 20,
"4%18%0-25-generic": 3
},
"ceph_version": {
"ceph version 14%2%1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)": 29
},
"kernel_description": {
"#58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019": 6,
"#54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019": 20,
"#26~18%04%1-Ubuntu SMP Thu Jun 27 07:28:31 UTC 2019": 3
},
"cpu": {
"Intel(R) Xeon(R) CPU E5-1650 v3 @ 3%50GHz": 20,
"Intel(R) Core(TM) i7-7700 CPU @ 3%60GHz": 9
}
}
I'd rather see it aggregated at the tuple level:
environment: [
{
kernel_version: "4%15%0-54-generic",
arch: "x86_64",
distro: "ubuntu",
cpu: "Intel(R) Core(TM) i7-7700 CPU @ 3%60GHz",
kernel_description: "...",
ceph_version: "...",
count: 6
},
...
]
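The tuple-level aggregation itself is cheap to do server-side; a sketch in Python (field names taken from the example above, sample data invented):

```python
from collections import Counter

FIELDS = ("kernel_version", "arch", "distro", "cpu",
          "kernel_description", "ceph_version")

def aggregate_environments(hosts: list) -> list:
    # Count hosts whose metadata tuples are identical, instead of
    # counting each field independently as today.
    counts = Counter(tuple(h.get(f) for f in FIELDS) for h in hosts)
    return [dict(zip(FIELDS, combo), count=n)
            for combo, n in counts.most_common()]

hosts = [
    {"kernel_version": "4.15.0-54-generic", "arch": "x86_64",
     "distro": "ubuntu", "cpu": "Intel(R) Core(TM) i7-7700",
     "kernel_description": "#58-Ubuntu", "ceph_version": "14.2.1"},
] * 6
env = aggregate_environments(hosts)
assert env[0]["count"] == 6
```

With this shape you can still recover today's per-field counts by summing over the tuples, but not vice versa, which is exactly the problem with the current format.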
- The OSD section could be revamped to expose more details; it's
currently overly simplified. Is BlueStore used with rotational media?
NVMe? SSD? Is FileStore? On which media, and, possibly, how big are
the WAL/RocksDB/data partitions?
Is encryption used? Were the OSDs deployed via ceph-volume, ceph-disk,
...? Which file system is used with FileStore? Is uring enabled? Etc.
(In short, ceph osd metadata should probably grow to encompass this,
and telemetry would scrape a subset of it.)
- While we're on hardware, I'd like to know whether there's a separate
cluster/public network, and whether we can deduce the hardware
associated with it (10 GbE? 25 GbE? VLAN? bond? etc.)
- Are there any msgr features we'd want to know about? v2? Encryption?
- Anything on the MDS?
- RFC: include "ceph features"?
- There's no actual performance data (commit latency or anything else).
Could we grab a histogram, or at least min/max/avg/stddev/sum(?), of
some high-level metrics since the last report from the Prometheus
instance that most recent environments would likely have?
(I'd like to see if we can deduce that a certain update made the
clusters in the field slower or faster.)
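Assuming we can pull raw samples for a metric from such a Prometheus instance (or from the daemons directly), the summary itself is trivial to compute before reporting; a sketch:

```python
import math

def summarize(samples: list) -> dict:
    # Reduce raw samples of a high-level metric (e.g. commit latency
    # in ms) to a compact summary suitable for a telemetry report.
    n = len(samples)
    total = sum(samples)
    mean = total / n
    var = sum((s - mean) ** 2 for s in samples) / n  # population variance
    return {"min": min(samples), "max": max(samples), "avg": mean,
            "stddev": math.sqrt(var), "sum": total, "count": n}

s = summarize([1.0, 2.0, 3.0])
assert s["avg"] == 2.0 and s["min"] == 1.0 and s["max"] == 3.0
```

A proper latency histogram (fixed bucket boundaries shared by all reporters) would be better still, since averages hide the tail we'd most want to see regress.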
- I'd love to see data on OSD utilization/variance as well. (I could
have used that this morning to check how this varied for clusters with
non-2^n pg_num, but it'd also help us show the improvement over time
as folks roll out the new automations etc.)
We can either grab this from the OSD daemons, or again ask Prometheus.
- Do we know anything about the client versions talking to us, beyond
require_min_compat_client?
- We may want to get more details on the services/gateways (iSCSI, NFS,
CIFS). Even just knowing whether they're used would be good.
- I'd pull contact/organization/description into a separate section and
channel. We'll need to also document what this information is used
for.
Basically, this is a long laundry list of wishes for more detail. ;-)
I'm wondering what the best way is to track all these wishes and then
decide which to fulfil.
Regards,
Lars
--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)
Hi Folks,
Perf meeting is on in ~15 minutes! No set agenda for today, other than
that the bluestore cache rebase is almost set to merge. Please feel
free to add your own! Otherwise, probably a short meeting today.
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Thanks,
Mark
Can someone help me set up access to an external CephFS cluster without
third-party tools like Rook?
I wasn't able to follow your instructions from GitHub.
I'm looking for step-by-step instructions. My Ceph version is
"nautilus".
Thanks,
Konstantin
Hello,
We are a factory specialising in making various kinds of promotional
products, such as
non-woven drawstring bags with the customer's logo on them
plastic advertising fans with the customer's logo on them
PVC sun visor caps with the customer's logo on them
cotton t-shirts with the customer's logo on them
ball-point pens with the customer's logo on them
...........
We can offer very good prices and fast delivery.
Would you like to receive our online catalog and take a look?
Please give a reply to this email.
Thanks.
This is the second bug fix release of the Ceph Nautilus release series.
We recommend that all Nautilus users upgrade to this release. When
upgrading from older releases of Ceph, the general guidelines for
upgrading to Nautilus must be followed.
Notable Changes
---------------
* The no{up,down,in,out} related commands have been revamped. There are
now two ways to set the no{up,down,in,out} flags: the old 'ceph osd
[un]set <flag>' command, which sets cluster-wide flags; and the new
'ceph osd [un]set-group <flags> <who>' command, which sets flags in
batch at the granularity of any CRUSH node or device class.
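For example, to flag only the OSDs under a single CRUSH host (here assuming a host bucket named foo) or under one device class, following the syntax above:

```
# Old way: cluster-wide flag
ceph osd set noout
ceph osd unset noout

# New way: only the OSDs under one CRUSH node ...
ceph osd set-group noout,noin foo
ceph osd unset-group noout,noin foo

# ... or one device class
ceph osd set-group noout ssd
```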
* radosgw-admin introduces two subcommands that allow managing
expire-stale objects that might be left behind after a bucket reshard
in earlier versions of RGW. One subcommand lists such objects and the
other deletes them. Read the troubleshooting section of the dynamic
resharding docs for details.
* Earlier Nautilus releases (14.2.1 and 14.2.0) have an issue where deploying a
single new (Nautilus) BlueStore OSD on an upgraded cluster (i.e. one that was
originally deployed pre-Nautilus) breaks the pool utilization stats reported
by ceph df. Until all OSDs have been reprovisioned or updated (via
ceph-bluestore-tool repair), the pool stats will show values that are lower
than the true value. This is resolved in 14.2.2, such that the cluster only
switches to using the more accurate per-pool stats after all OSDs are 14.2.2
(or later), are BlueStore, and (if they were created prior to Nautilus) have
been updated via the repair function.
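A repair is run per OSD while that daemon is stopped, along the lines of the following (OSD id 0 and the default data path are used as placeholders):

```
systemctl stop ceph-osd@0
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0
systemctl start ceph-osd@0
```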
* The default value for mon_crush_min_required_version has been changed from
firefly to hammer, which means the cluster will issue a health warning if
your CRUSH tunables are older than hammer. There is generally a small (but
non-zero) amount of data that will move around by making the switch to hammer
tunables.
If possible, we recommend that you set the oldest allowed client to hammer or
later. You can tell what the current oldest allowed client is with:
ceph osd dump | grep min_compat_client
If the current value is older than hammer, you can verify that it is
safe to make this change by checking that there are no clients older
than hammer currently connected to the cluster with:
ceph features
The newer straw2 CRUSH bucket type was introduced in hammer, and ensuring
that all clients are hammer or newer allows new features only supported for
straw2 buckets to be used, including the crush-compat mode for the Balancer.
For a detailed changelog please refer to the official release notes
entry at the ceph blog: https://ceph.com/releases/v14-2-2-nautilus-released/
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-14.2.2.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 4f8fa0a0024755aae7d95567c63f11d6862d55be
Adding Alfredo, Ken, and dev@,
I think there are some open questions about when the needed
py3 dependencies will be in place?
sage
On Mon, 22 Jul 2019, Ricardo Dias wrote:
> Hi Sage,
>
> This morning during our daily dashboard standup, the question was
> raised of whether Python 2 will still need to be supported in Octopus.
>
> The discussion around this topic on the mailing lists is not very
> clear.
> My understanding is that we're going to remove Python 2 support in
> Octopus, but it still hasn't happened in master.
>
> Can we get a clear statement on the mailing list as to whether the
> above is correct, i.e. that we will indeed move to Python 3-only
> support?
>
> Thanks,
> --
> Ricardo Dias
> Senior Software Engineer - Storage Team
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284
> (AG Nürnberg)
>
>
A broad question for the cephfs user community:
I've been looking at adding inline_data write support for kcephfs [1].
It's non-trivial to handle correctly in the kernel (primarily due to
the more complex locking), and I'm finding some bugs in what's already
there.
Is anyone actually enabling inline_data in their environments? Is this
something we should be expending effort to support?
Thanks,
--
Jeff Layton <jlayton(a)redhat.com>
[1]: http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data