Hello,
I have a Ceph cluster running version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable),
4 nodes, each with 11 HDDs, 1 SSD, and a 10 Gbit network.
The cluster was a fresh, empty install. We filled it with data (small blocks) using RGW.
The cluster is now used for testing, so no clients were using it during the admin operations mentioned below.
After a while (7 TB of data / 40M objects uploaded) we decided to increase pg_num from 128 to 256 to spread the data better. To speed up this operation, I set
ceph config set mgr target_max_misplaced_ratio 1
so that the whole cluster rebalances as quickly as it can.
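For reference, the increase itself was done with the usual pool command (the pool name below is just a placeholder):
ceph osd pool set <pool> pg_num 256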
I have 3 issues/questions below:
1)
I noticed that the manual increase from 128 to 256 caused approximately 6 OSDs to restart, logging
heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7f8c84b8b700' had suicide timed out after 150
After a while the OSDs came back, so I continued with my tests.
My question: was increasing the number of PGs with the maximum target_max_misplaced_ratio too much for those OSDs? Is it not recommended to do it
this way? I had no problem with such an increase before, but the cluster configuration was slightly different and it was running Luminous.
2)
The rebuild was still slow, so I increased the number of backfills:
ceph tell osd.* injectargs "--osd-max-backfills 10"
and reduced the recovery sleep time:
ceph tell osd.* injectargs "--osd-recovery-sleep-hdd 0.01"
After a few hours I noticed that some of my OSDs had restarted during recovery; in the log I can see
...
2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:36.780 7fe1da154700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:36.888 7fe1e7769700 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.7 down, but it is still running
2020-03-21 06:41:36.888 7fe1e7769700 0 log_channel(cluster) log [DBG] : map e3574 wrongly marked me down at e3573
2020-03-21 06:41:36.888 7fe1e7769700 1 osd.7 3574 start_waiting_for_healthy
I watched the network usage graphs, and network utilization was low during recovery (the 10 Gbit link was not saturated).
So does a heavy IOPS load on an OSD also cause heartbeat operations to time out? I thought the OSD used separate threads and that HDD timeouts
would not affect heartbeats to other OSDs and the MONs. It looks like that is not true.
3)
After the OSD was wrongly marked down, I can see that the cluster has degraded objects. There were no degraded objects before that.
Degraded data redundancy: 251754/117225048 objects degraded (0.215%), 8 pgs degraded, 8 pgs undersized
Does this mean the OSD disconnection caused degraded data? How is that possible when no OSD was lost? The data should still be on that OSD, and
after peering everything should be OK. With Luminous I had no such problem: after the OSD came back up, the degraded objects were recovered/found
within a few seconds and the cluster was healthy again within seconds.
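If useful, I can also post the list of affected PGs, e.g. from:
ceph pg ls degraded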
Thank you very much for any additional info. I can perform any additional tests you recommend, because the cluster is currently used for testing purposes.
With regards
Jan Pekar
--
============
Ing. Jan Pekař
jan.pekar(a)imatic.cz
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz | +420326555326
============
--
Has anyone ever tried using this feature? I've added it to the [global]
section of ceph.conf on my POC cluster, but I'm not sure how to tell whether
it's actually working. I did find a reference to this feature via Google, and
they had it in their [OSD] section, so I've tried that too.
TIA
Adam
We have deployed a small test cluster consisting of three nodes. Each node runs a mon/mgr and two OSDs (Samsung PM983 3.84 TB NVMe split into two partitions), so six OSDs in total. We started with Ceph 14.2.7 some weeks ago (upgraded to 14.2.9 later) and ran various tests using fio against some rbd volumes in order to get an overview of what performance we could expect. The configuration is unchanged from the defaults; we only set several debugging options to 0/0.
Yesterday we upgraded the whole cluster to Ceph 15.2.3 following the upgrade guidelines, which worked without any problems so far. Nevertheless, when running the same tests we had previously run on Ceph 14.2.9, we are seeing some clear degradations in write performance (alongside some performance improvements, which are also worth mentioning).
Here are the results of concern, each with the relevant fio settings used (values are 14.2.9 -> 15.2.3):
Test "read-latency-max"
(rw=randread, iodepth=64, bs=4k)
read_iops: 32500 -> 87000
Test "write-latency-max"
(rw=randwrite, iodepth=64, bs=4k)
write_iops: 22500 -> 11500
Test "write-throughput-iops-max"
(rw=write, iodepth=64, bs=4k)
write_iops: 7000 -> 14000
Test "usecase1"
(rw=randrw, bssplit=4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/,4k/50:8k/20:16k/20:32k/5:64k/2:128k/:256k/, rwmixread=1, rate_process=poisson, iodepth=64)
write_iops: 21000 -> 8500
Test "usecase1-readonly"
(rw=randread, bssplit=4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/, rate_process=poisson, iodepth=64)
read_iops: 28000 -> 58000
The last two tests represent a typical use case on our systems. Therefore we are especially concerned by the drop in performance from 21000 w/ops to 8500 w/ops (about 60%) after upgrading to Ceph 15.2.3.
We ran all tests several times, the values are averaged over all iterations and fairly consistent and reproducible. We even tried wiping the whole cluster, downgrading to Ceph 14.2.9 again, setting up a new cluster/pool, running the tests and upgrading to Ceph 15.2.3 again. The tests have been performed on one of the three cluster nodes using a 50G rbd volume, which had been prefilled with random data before each test-run.
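If it helps to reproduce this elsewhere, the "usecase1" job corresponds roughly to an invocation like the following (the device path, runtime, and the use of a krbd-mapped volume with libaio are assumptions for illustration, not our exact job file):
fio --name=usecase1 --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
    --rw=randrw --rwmixread=1 --iodepth=64 --rate_process=poisson \
    --bssplit=4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/,4k/50:8k/20:16k/20:32k/5:64k/2:128k/:256k/ \
    --runtime=300 --time_based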
Have any changes been introduced with Octopus that could explain the observed changes in performance?
What we already tried:
- Disabling rbd cache
- Reverting the rbd cache policy to writeback (the default in 14.2); see the example after this list
- Setting rbd io scheduler to none
- Deploying a fresh cluster starting with Ceph 15.2.3
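For reference, the cache-policy change from the list above was applied roughly like this (the pool name is a placeholder; it can also be set client-side in ceph.conf):
rbd config pool set <pool> rbd_cache_policy writeback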
The kernel is 5.4.38 … I don't know whether other system specs would be helpful besides those already mentioned (since we are talking about a relative change in performance after upgrading Ceph without any other changes) - if so, please let us know.
Dear all,
maybe someone can give me a pointer here. We are running OpenNebula with Ceph RBD as a back-end store. We have a pool of spinning disks for creating large, low-demand data disks, mainly for backups and other cold storage. Everything is fine when using Linux VMs. However, Windows VMs perform poorly: they are roughly a factor of 20 slower than a similarly created Linux VM.
If anyone has pointers what to look for, we would be very grateful.
The OpenNebula installation is more or less default. The current OS and libvirt versions we use are:
Centos 7.6 with stock kernel 3.10.0-1062.1.1.el7.x86_64
libvirt-client.x86_64 4.5.0-23.el7_7.1 @updates
qemu-kvm-ev.x86_64 10:2.12.0-33.1.el7 @centos-qemu-ev
Some benchmark results from good to worse workloads:
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 16 : 450MB/s
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 230MB/s
rbd bench --io-size 1M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 190MB/s
rbd bench --io-size 64K --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 150MB/s
rbd bench --io-size 64K --io-total 1G --io-pattern rand --io-type write --io-threads 1 : 26MB/s
dd with conv=fdatasync gives an impressive 500 MB/s inside a Linux VM for a sequential write of 4 GB.
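For reference, that dd test was along these lines (block size and target path are from memory and may have differed slightly):
dd if=/dev/zero of=/mnt/test/ddfile bs=4M count=1024 conv=fdatasync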
We copied a couple of large ISO files inside the Windows VM and for the first ca. 1 to 1.5G it performs as expected. Thereafter, however, write speed drops rapidly to ca. 25MB/s and does not recover. It is almost as if Windows translates large sequential writes to small random writes.
If anyone has seen and solved this before, please let us know.
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hello,
I've upgraded Ceph to Octopus (15.2.3 from the repo) on one of the Ubuntu 18.04 host servers. The upgrade caused a problem with libvirtd, which hangs when it tries to access the storage pools; the problem doesn't exist on Nautilus. The libvirtd process simply hangs and nothing seems to happen. The libvirtd log file shows:
2020-06-29 19:30:51.556+0000: 12040: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch 11
2020-06-29 19:30:51.556+0000: 12040: debug : virNetlinkEventCallback:720 : event not handled.
2020-06-29 19:30:51.556+0000: 12040: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch 11
2020-06-29 19:30:51.556+0000: 12040: debug : virNetlinkEventCallback:720 : event not handled.
2020-06-29 19:30:51.557+0000: 12040: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch 11
2020-06-29 19:30:51.557+0000: 12040: debug : virNetlinkEventCallback:720 : event not handled.
2020-06-29 19:30:51.591+0000: 12040: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch 11
2020-06-29 19:30:51.591+0000: 12040: debug : virNetlinkEventCallback:720 : event not handled.
Running strace on the libvirtd process shows:
root@ais-cloudhost1:/home/andrei# strace -p 12040
strace: Process 12040 attached
restart_syscall(<... resuming interrupted poll ...>
Nothing happens after that point.
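One more thing I can still try (not done yet): strace follows only the main thread by default, so attaching to all threads might show where it is actually stuck:
strace -f -p 12040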
The same host server can still access the Ceph cluster and the pools, for example by running ceph -s or rbd -p <pool> ls -l.
I need some help getting the host servers working again with Octopus.
Cheers
As a follow-up to our recent memory problems with OSDs (with high pglog
values:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LJPJZPBSQRJ…
), we also see high buffer_anon values, e.g. more than 4 GB with "osd
memory target" set to 3 GB. Is there a way to restrict it?
As it is called "anon", I guess that it would first be necessary to find
out what exactly is behind this?
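For reference, the per-OSD mempool breakdown (including buffer_anon) can be dumped via the admin socket, e.g.:
ceph daemon osd.0 dump_mempools
and the 3 GB target mentioned above corresponds to a setting along the lines of (value in bytes):
ceph config set osd osd_memory_target 3221225472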
Well, maybe it is just as Wido said: with lots of small objects there
will be several problems.
Cheers
Harry
Thanks Ramana and David.
So we are using the Shaman search API to get the latest build for
ceph_nautilus flavor of NFS Ganesha, and that's how we get to the mentioned
build. We are doing this since it's part of our CI and it's better for
automation.
Should we use different repos?
Thanks,
V
On Wed, Jun 24, 2020 at 3:33 PM Victoria Martinez de la Cruz <
vkmc(a)redhat.com> wrote:
> Thanks Ramana and David.
>
> So we are using the Shaman search API to get the latest build for
> ceph_nautilus flavor of NFS Ganesha, and that's how we get to the mentioned
> build. We are doing this since it's part of our CI and it's better for
> automation.
>
> Should we use different repos?
>
> Thanks,
>
> V
>
> On Tue, Jun 23, 2020 at 2:42 PM David Galloway <dgallowa(a)redhat.com>
> wrote:
>
>>
>>
>> On 6/23/20 1:21 PM, Ramana Venkatesh Raja wrote:
>> > On Tue, Jun 23, 2020 at 6:59 PM Victoria Martinez de la Cruz
>> > <victoria(a)redhat.com> wrote:
>> >>
>> >> Hi folks,
>> >>
>> >> I'm hitting issues with the nfs-ganesha-stable packages [0], the repo
>> url
>> >> [1] is broken. Is there a known issue for this?
>> >>
>> >
>> > The missing packages in chacra could be due to the recent mishap in
>> > the sepia long running cluster,
>> >
>> https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/YQMAHTB7MUHL25QP7V…
>>
>> Hi Victoria,
>>
>> Ramana is correct. Do you need 2.7.4 specifically? If not, signed
>> nfs-ganesha packages can also be found here:
>> http://download.ceph.com/nfs-ganesha/
>>
>> >
>> >> Thanks,
>> >>
>> >> Victoria
>> >>
>> >> [0]
>> >>
>> https://shaman.ceph.com/repos/nfs-ganesha-stable/V2.7-stable/1a1fb71cdb811c…
>> >> [1]
>> >>
>> https://chacra.ceph.com/r/nfs-ganesha-stable/V2.7-stable/1a1fb71cdb811c1bac…
Thanks for your reply Anastasios,
I was waiting for an answer.
My /etc/apt/sources.list.d/ceph.list content is:
deb https://download.ceph.com/debian-nautilus/ buster main
Even if I do “apt-get update”, the packages stay the same.
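If it helps, I can also post the output of, for example:
apt-cache policy ceph-common
to show which candidate version apt picks.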
The Ceph client (CephFS mount) is working well, but I can't deploy new OSDs.
The error that I posted occurs when I run: “ceph-deploy osd create --data /dev/sdb node1”
I appreciate any help.
Rafael.
From: Anastasios Dados <tdados(a)hotmail.com>
Sent: Monday, 29 June 2020 20:01
To: Rafael Quaglio <quaglio(a)bol.com.br>; ceph-users(a)ceph.io
Subject: Re: [ceph-users] Debian install
Hello Rafael,
Can you check the apt sources list on your ceph-deploy node? Maybe you have the Luminous Debian package versions configured there?
Regards,
Anastasios
On Mon, 2020-06-29 at 06:59 -0300, Rafael Quaglio wrote:
Hi,
We have installed a new Debian (10.4) server and I need to add it to a
Ceph cluster.
When I execute the command to install Ceph on this node:
ceph-deploy install --release nautilus node1
it starts to install version 12.x on my node...
(...)
[serifos][DEBUG ] After this operation, 183 MB of additional disk space will
be used.
[serifos][DEBUG ] Selecting previously unselected package python-cephfs.
(Reading database ... 30440 files and directories currently installed.)
[serifos][DEBUG ] Preparing to unpack
.../python-cephfs_12.2.11+dfsg1-2.1+b1_amd64.deb ...
[serifos][DEBUG ] Unpacking python-cephfs (12.2.11+dfsg1-2.1+b1) ...
[serifos][DEBUG ] Selecting previously unselected package ceph-common.
[serifos][DEBUG ] Preparing to unpack
.../ceph-common_12.2.11+dfsg1-2.1+b1_amd64.deb ...
[serifos][DEBUG ] Unpacking ceph-common (12.2.11+dfsg1-2.1+b1) ...
(...)
How do I upgrade these packages?
Even with the packages installed at this version, the installation
completes without errors.
The question is due to an error message that I'm receiving
when deploying a new OSD:
ceph-deploy osd create --data /dev/sdb node1
At this point:
[ceph_deploy.osd][INFO ] Distro info: debian 10.4 buster
[ceph_deploy.osd][DEBUG ] Deploying osd to node1
[node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[node1][DEBUG ] find the location of an executable
[node1][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph
lvm create --bluestore --data /dev/sdb
[node1][WARNIN] --> RuntimeError: Unable to create a new OSD id
[node1][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[node1][DEBUG ] Running command: /bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i -
osd new 76da6c51-8385-4ffc-9a8e-0dfc11e31feb
[node1][DEBUG ] stderr:
/build/ceph-qtARip/ceph-12.2.11+dfsg1/src/mon/MonMap.cc: In function 'void
MonMap::sanitize_mons(std::map<std::__cxx11::basic_string<char>,
entity_addr_t>&)' thread 7f2bc7fff700 time 2020-06-29 06:56:17.331350
[node1][DEBUG ] stderr:
/build/ceph-qtARip/ceph-12.2.11+dfsg1/src/mon/MonMap.cc: 77: FAILED
assert(mon_info[p.first].public_addr == p.second)
[node1][DEBUG ] stderr: ceph version 12.2.11
(26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
[node1][DEBUG ] stderr: 1: (ceph::__ceph_assert_fail(char const*, char
const*, int, char const*)+0xf5) [0x7f2bdaff5f75]
[node1][DEBUG ] stderr: 2:
(MonMap::sanitize_mons(std::map<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >, entity_addr_t,
std::less<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > >,
std::allocator<std::pair<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const, entity_addr_t> >
&)+0x568) [0x7f2bdb050038]
[node1][DEBUG ] stderr: 3:
(MonMap::decode(ceph::buffer::list::iterator&)+0x4da) [0x7f2bdb05500a]
[node1][DEBUG ] stderr: 4: (MonClient::handle_monmap(MMonMap*)+0x216)
[0x7f2bdb042a06]
[node1][DEBUG ] stderr: 5: (MonClient::ms_dispatch(Message*)+0x4ab)
[0x7f2bdb04729b]
[node1][DEBUG ] stderr: 6: (DispatchQueue::entry()+0xeba) [0x7f2bdb06bf5a]
[node1][DEBUG ] stderr: 7: (DispatchQueue::DispatchThread::entry()+0xd)
[0x7f2bdb1576fd]
[node1][DEBUG ] stderr: 8: (()+0x7fa3) [0x7f2be499dfa3]
[node1][DEBUG ] stderr: 9: (clone()+0x3f) [0x7f2be45234cf]
[node1][DEBUG ] stderr: NOTE: a copy of the executable, or `objdump -rdS
<executable>` is needed to interpret this.
[node1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume
--cluster ceph lvm create --bluestore --data /dev/sdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
I think this error occurs because the wrong package was installed.
Thanks,
Rafael
Hello. This is the first time I need to use the lifecycle feature. I created a
bucket and set it to expire in one day with s3cmd:
s3cmd expire --expiry-days=1 s3://bucket
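For reference, the resulting rule can be checked with, e.g.:
s3cmd getlifecycle s3://bucket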
The rgw_lifecycle_work_time is set to the default value (00:00-06:00), but
I noticed a lot of messages like these in the RGW logs:
2020-06-16 00:00:00.311369 7fe2cac87700 0 RGWLC::process() failed to get obj entry lc.8
2020-06-16 00:00:00.311623 7fe2c8c83700 0 RGWLC::process() failed to get obj entry lc.16
2020-06-16 00:00:00.311862 7fe2c6c7f700 0 RGWLC::process() failed to get obj entry lc.4
2020-06-16 00:00:00.319424 7fe2cac87700 0 RGWLC::process() failed to get obj entry lc.10
2020-06-16 00:00:00.319647 7fe2c8c83700 0 RGWLC::process() failed to get obj entry lc.18
2020-06-16 00:00:00.320682 7fe2c6c7f700 0 RGWLC::process() failed to get obj entry lc.16
2020-06-16 00:00:00.327770 7fe2cac87700 0 RGWLC::process() failed to get obj entry lc.6
2020-06-16 00:00:00.328941 7fe2c8c83700 0 RGWLC::process() failed to get obj entry lc.17
2020-06-16 00:00:00.332463 7fe2c6c7f700 0 RGWLC::process() failed to get obj entry lc.20
2020-06-16 00:00:00.336788 7fe2cac87700 0 RGWLC::process() failed to get obj entry lc.1
2020-06-16 00:00:00.336924 7fe2c8c83700 0 RGWLC::process() failed to get obj entry lc.24
2020-06-16 00:00:00.340915 7fe2c6c7f700 0 RGWLC::process() failed to get obj entry lc.2
The object was deleted, but these messages keep appearing.
Is it safe to ignore them?
For the record, I'm using Red Hat Luminous 12.2.12.
Thanks, Marcelo.