After upgrading one of our clusters from Nautilus 14.2.2 to Nautilus 14.2.5 I'm seeing 100% CPU usage by a single ceph-mgr thread (found using 'top -H'). Attaching to the thread with strace shows a lot of mmap and munmap calls. Here's the distribution after watching it for a few minutes:
48.73% - mmap
49.48% - munmap
1.75% - futex
0.05% - madvise
I've upgraded 3 other clusters so far (120 OSDs, 30 OSDs, 200 OSDs), but this is the only one which has seen the problem (355 OSDs). Perhaps it has something to do with its size?
I suspected it might have to do with one of the modules misbehaving, so I disabled all of them:
# ceph mgr module ls | jq -r '.enabled_modules'
[]
But that didn't help (I restarted the mgrs after disabling the modules too).
I also tried setting debug_mgr and debug_mgrc to 20, but nothing stood out as the cause of the problem.
It only seems to affect the active mgr. If I stop the active mgr the problem moves to one of the other mgrs.
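For concreteness, these are the commands behind the numbers above, plus a possible next step (the thread id is a placeholder for whatever 'top -H' reports):

# find the busy thread (TID) of the active mgr
top -H -p $(pidof ceph-mgr)

# sample its syscalls; Ctrl-C after a minute or so to get the summary table above
strace -c -p <TID>

# possible next step: grab userspace backtraces to see what drives the mmap/munmap churn
gdb -p $(pidof ceph-mgr) --batch -ex 'thread apply all bt' > mgr-backtraces.txt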
Any guesses or tips on what next steps I should take to figure out what's going on?
Thanks,
Bryan
Hi,
I just looked through the rbd driver of OpenStack Cinder. It seems there is no additional clear_volume step implemented for the rbd driver. In my case the objects of this rbd image were only partially deleted, so I suspect the problem lies with Ceph rather than with the Cinder driver.
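For reference, the cinder.conf change suggested below would look roughly like this (the backend section name is only an example, and given the missing clear_volume step I'm not sure the rbd driver even honours it):

# cinder.conf, in the relevant backend section (section name is just an example)
[rbd-backend]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
# wipe only the first 50 MB instead of zeroing the whole volume before deletion
volume_clear_size = 50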
br,
Xu Yun
> On Jan 15, 2020, at 7:36 PM, EDH - Manuel Rios <mriosfer(a)easydatahost.com> wrote:
>
> Hi
>
> For huge volumes in OpenStack and Ceph, set this param in your cinder config:
>
> volume_clear_size = 50
>
> That will wipe only the first 50 MB of the volume and then ask Ceph to delete it fully, instead of overwriting the whole disk with zeros, which for huge volumes can sometimes cause timeouts.
>
> In our deployment that was the solution (OpenStack Queens here).
>
>
> -----Original Message-----
> From: Eugen Block <eblock(a)nde.ag>
> Sent: Wednesday, January 15, 2020 8:51
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Re: Objects not removed (completely) when removing a rbd image
>
> Hi,
>
> this might happen if you try to delete images/instances/volumes in openstack that are somehow linked, e.g. if there are snapshots etc. I have experienced this in Ocata, too. Deleting a base image worked, but since there were existing clones, basically only the openstack database was updated and the base image still existed within ceph.
>
> Try to figure out if that is also the case. If it's something else, check the logs in your openstack environment, maybe they reveal something. Also check the ceph logs.
>
> Regards,
> Eugen
>
>
> Quoting 徐蕴 <yunxu(a)me.com>:
>
>> Hello,
>>
>> My setup is Ceph Pike working with OpenStack. When I deleted an image,
>> I found that the space was not reclaimed. I checked with rbd ls and
>> confirmed that the image had disappeared. But when I checked the
>> objects with rados ls, most objects named rbd_data.xxx still existed
>> in my cluster. The rbd_object_map and rbd_header objects were already
>> deleted. I waited for several hours and no further deletion happened.
>> Is this a known issue, or is something wrong with my configuration?
>>
>> br,
>> Xu Yun
Hello,
My setup is Ceph Pike working with OpenStack. When I deleted an image, I found that the space was not reclaimed. I checked with rbd ls and confirmed that the image had disappeared. But when I checked the objects with rados ls, most objects named rbd_data.xxx still existed in my cluster. The rbd_object_map and rbd_header objects were already deleted. I waited for several hours and no further deletion happened. Is this a known issue, or is something wrong with my configuration?
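For completeness, a rough way to inspect (and, carefully, clean up) the leftovers could be something like this; the pool name and the rbd_data prefix are placeholders for the real values:

# make sure the image isn't just sitting in the RBD trash awaiting deferred deletion
rbd trash ls volumes

# list and count the orphaned data objects of the deleted image
rados -p volumes ls | grep '^rbd_data\.1234567890ab'
rados -p volumes ls | grep -c '^rbd_data\.1234567890ab'

# as a last resort (careful!), remove the leftovers one by one
rados -p volumes ls | grep '^rbd_data\.1234567890ab' | \
    while read obj; do rados -p volumes rm "$obj"; done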
br,
Xu Yun
Has anyone ever tried using this feature? I've added it to the [global]
section of ceph.conf on my POC cluster, but I'm not sure how to tell
whether it's actually working. I did find a reference to this feature via
Google where they had it in their [OSD] section, so I've tried that too.
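For what it's worth, one way to check whether a running daemon actually picked up the setting might be something like this (the option name is just a placeholder, and 'ceph config show' needs mimic or newer):

# ask a running daemon for its effective value (run on the node hosting osd.0)
ceph daemon osd.0 config get some_feature_option

# on mimic+ you can also ask for the daemon's effective config via the mons
ceph config show osd.0 | grep some_feature_option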
TIA
Adam
Hi,
When we tried putting some load on our test cephfs setup by restoring a
backup in Artifactory, we eventually ran out of space (around 95% used
in `df` = 3.5TB), which caused Artifactory to abort the restore and clean
up. However, while a simple `find` no longer shows the files, `df` still
claims that we have around 2.1TB of data on the cephfs. `df -i` also
shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
get 31G used, which is data that really is still there and is expected
to be there.
Consequently, we also get the following warning:
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
> pool cephfs_data objects per pg (38711) is more than 231.802 times cluster average (167)
We are running ceph 14.2.5.
We have snapshots enabled on cephfs, but there are currently no active
snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
below). I can't say for sure if we created snapshots during the backup
restore.
> {
> "last_snap": 39,
> "last_created": 38,
> "last_destroyed": 39,
> "pending_noop": [],
> "snaps": [],
> "need_to_purge": {},
> "pending_update": [],
> "pending_destroy": []
> }
We only have a single CephFS.
We use the pool_namespace xattr for our various directory trees on the
cephfs.
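Since a plain `rados ls` only shows the default namespace, checking which namespaces the leftover objects actually live in might look like this (the namespace name is only an example):

# list objects across all namespaces (prints "<namespace>  <object>")
rados -p cephfs_data ls --all | head

# or restrict the listing to one of our directory-tree namespaces
rados -p cephfs_data --namespace=myproject ls | head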
`ceph df` shows:
> POOL ID STORED OBJECTS USED %USED MAX AVAIL
> cephfs_data 6 2.1 TiB 2.48M 2.1 TiB 24.97 3.1 TiB
`ceph daemon mds.$hostname perf dump | grep stray` shows:
> "num_strays": 0,
> "num_strays_delayed": 0,
> "num_strays_enqueuing": 0,
> "strays_created": 5097138,
> "strays_enqueued": 5097138,
> "strays_reintegrated": 0,
> "strays_migrated": 0,
`rados -p cephfs_data df` shows:
> POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
> cephfs_data 2.1 TiB 2477540 0 4955080 0 0 0 10699626 6.9 TiB 86911076 35 TiB 0 B 0 B
>
> total_objects 29718
> total_used 329 GiB
> total_avail 7.5 TiB
> total_space 7.8 TiB
When I combine the usage and the free space shown by `df`, we would
exceed our cluster size. Our test cluster currently has 7.8TB total
space with a replication size of 2 for all pools. With 2.1TB
"used" on the cephfs according to `df` + 3.1TB shown as "free", I
get 5.2TB total size. This would mean >10TB of raw data when accounting
for replication. Clearly this can't fit on a cluster with only 7.8TB of
capacity.
Do you have any ideas why we see so many objects and so much reported
usage? Is there any way to fix this without recreating the cephfs?
Florian
--
Florian Pritz
Research Industrial Systems Engineering (RISE) Forschungs-,
Entwicklungs- und Großprojektberatung GmbH
Concorde Business Park F
2320 Schwechat
Austria
E-Mail: florian.pritz(a)rise-world.com
Web: www.rise-world.com
Firmenbuch: FN 280353i
Landesgericht Korneuburg
UID: ATU62886416
Hi all,
When upgrading from Luminous to Nautilus, the centralized config options
for cluster_network and public_network were inadvertently set globally to
an incorrect value (10.192.80.0/24):
-----
[root@ceph-osd134 ceph]# ceph config dump | grep network
<snip>
global advanced cluster_network 10.192.80.0/24 *
global advanced public_network 10.192.80.0/24
-----
Ceph.conf on all nodes is correctly set to 10.0.0.0/8. Even after
restarting the mons I see the following errors with every ceph
command:
-----
2020-01-10 20:06:30.815 7f6deffff700 -1 set_mon_vals failed to set cluster_network = 10.192.80.0/24: Configuration option 'cluster_network' may not be modified at runtime
2020-01-10 20:06:30.815 7f6deffff700 -1 set_mon_vals failed to set public_network = 10.192.80.0/24: Configuration option 'public_network' may not be modified at runtime
-----
How do I safely change/remove the centralized config network settings?
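For reference, the commands I would expect to do it, either by overwriting the values or by dropping them from the config database, are roughly these (part of my question is whether this is actually safe for the network options):

# overwrite the bad values with the correct network
ceph config set global public_network 10.0.0.0/8
ceph config set global cluster_network 10.0.0.0/8

# or drop them from the centralized config entirely
ceph config rm global public_network
ceph config rm global cluster_network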
Thanks,
Frank
Hi All,
Sorry for the repost.
How do you unset a global config setting from the centralized config
with mimic+ (specifically public_network and cluster_network)?
"ceph config rm global public_network"
doesn't seem to do the trick.
These were set inadvertently during an upgrade with:
"ceph config assimilate-conf"
https://ceph.io/community/new-mimic-centralized-configuration-management/
The settings I wish to unset:
-----
[root@ceph-mon001 ceph]# ceph config dump | grep network
global advanced cluster_network 10.192.80.0/24 *
global advanced public_network 10.192.80.0/24 *
-----
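Spelled out, what I'm trying (for both options) and how I'm checking the result looks like this; the entries above are still listed afterwards:

ceph config rm global public_network
ceph config rm global cluster_network

# both entries still show up after the rm
ceph config dump | grep network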
thx
Frank