Hi all.
Is there any way to completely health-check one OSD host or instance?
For example, run rados bench just against that OSD, or run some checks on the disks
and the front and back networks?
Thanks.
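I'm not aware of one built-in command that does all of this, but a per-host check can be pieced together from existing tools. A rough sketch (hostname, device, and OSD id below are placeholders, adjust to your cluster):

```shell
HOST=osd-node1   # hypothetical OSD host

# Disk health: per-device SMART status on the host
ssh "$HOST" sudo smartctl -H /dev/sdb

# Per-OSD write benchmark, executed by the OSD daemon itself
# (exercises the local objectstore without a client in the path):
ceph tell osd.3 bench

# Front/back network: raw bandwidth between the host and a peer
# can be measured with iperf3 (server on one end, client on the other):
ssh "$HOST" iperf3 -s -D
iperf3 -c "$HOST"
```

Ceph will also surface slow OSD heartbeats in `ceph health detail`, which often points at back-network trouble.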
The error occurs here:
- name: look up for ceph-volume rejected devices
ceph_volume:
cluster: "{{ cluster }}"
action: "inventory"
register: rejected_devices
environment:
CEPH_VOLUME_DEBUG: 1
CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry + '/' + ceph_docker_image + ':' + ceph_docker_image_tag if containerized_deployment else None }}"
CEPH_CONTAINER_BINARY: "{{ container_binary }}"
PYTHONIOENCODING: utf-8
With Error:
fatal: [18.225.11.17]: FAILED! => changed=false
cmd: ceph-volume inventory --format=json
msg: '[Errno 2] No such file or directory'
rc: 2
I think Ansible is not able to find the ceph-volume command. How can I fix this?
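It may help to first confirm where the command is expected to live. For a containerized deployment, ceph-ansible runs ceph-volume inside the container image, so the image variables have to resolve; on a non-containerized host the binary must be in PATH. A rough check (registry/image/tag values are placeholders for your inventory variables):

```shell
# Non-containerized: is ceph-volume installed on the failing host at all?
command -v ceph-volume || echo "ceph-volume not in PATH"

# Containerized: can the resolved image actually run the same inventory call
# that ceph-ansible issues?
docker run --rm --entrypoint ceph-volume \
    "${CEPH_DOCKER_REGISTRY}/${CEPH_DOCKER_IMAGE}:${CEPH_DOCKER_IMAGE_TAG}" \
    inventory --format=json
```

`[Errno 2] No such file or directory` from the module usually means the executable it tried to spawn does not exist on that host (or in that container), rather than a failure inside ceph-volume itself.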
What does “traffic” mean? Reads? Writes will have to hit the net regardless of any machinations.
> On Jun 29, 2020, at 7:31 PM, Harry G. Coin <hgcoin(a)gmail.com> wrote:
>
> I need exactly what ceph is for a whole lot of work, that work just
> doesn't represent a large fraction of the total local traffic. Ceph is
> the right choice. Plainly ceph has tremendous support for replication
> within a chassis, among chassis and among racks. I just need
> intra-chassis traffic to not hit the net much. Seems not such an
> unreasonable thing given the intra-chassis crush rules and all. After
> all.. ceph's name wasn't chosen for where it can't go....
>
>> On 6/29/20 1:57 PM, Marc Roos wrote:
>> I wonder if you should not have chosen a different product? Ceph is
>> meant to distribute data across nodes, racks, data centers etc. For a
>> nail use a hammer, for a screw use a screw driver.
>> -----Original Message-----
>> To: ceph-users(a)ceph.io
>> Subject: *****SPAM***** [ceph-users] layout help: need chassis local io
>> to minimize net links
>> Hi
>> I have a few servers each with 6 or more disks, with a storage workload
>> that's around 80% done entirely within each server. From a
>> work-to-be-done perspective there's no need for 80% of the load to
>> traverse network interfaces, the rest needs what ceph is all about. So
>> I cooked up a set of crush maps and pools, one map/pool for each server
>> and one map/pool for the whole. Skipping the long story, the
>> performance remains network link speed bound and has got to change.
>> "Chassis local" io is too slow. I even tried putting a mon within each
>> server. I'd like to avoid having to revert to some other HA
>> filesystem per server with ceph at the chassis layer if I can help
>> it.
>> Any notions that would allow 'chassis local' rbd traffic to avoid or
>> mostly avoid leaving the box?
>> Thanks!
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an
>> email to ceph-users-leave(a)ceph.io
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
I've not been replying to the list, apologies.
> just the write metadata to the mon, with the actual write data
content not having to cross a physical ethernet cable but directly to
the chassis-local osds via the 'virtual' internal switch?
This is my understanding as well, yes. I've not explored the ceph source
yet though.
On 2020-06-29 8:37 p.m., Harry G. Coin wrote:
>
> Jeff, thanks for the lead. When a user space rbd write has as a
> destination three replica osds in the same chassis, does the whole
> write get shipped out to the mon and then back, or just the write
> metadata to the mon, with the actual write data content not having to
> cross a physical ethernet cable but directly to the chassis-local osds
> via the 'virtual' internal switch? I thought when I read the layout
> of how ceph works only the control traffic goes to the mons, the data
> directly from the generator to the osds. Did I get that wrong?
>
>
> On 6/29/20 10:32 PM, Jeff W wrote:
>> You mentioned setting up pools per host but still hitting network
>> limits, did you try tcpdumping the NIC to see who's talking to who?
>> Perhaps something isn't configured the way you expect? That may help
>> you narrow down what is using the NIC as well, Mon or osd or what
>> not. If it's local, I would think that the NIC wouldn't be a
>> bottleneck and if it is a bottleneck I would suspect my own configs,
>> but that's just my 2c.
>>
>> Off the top of my head im thinking it's the Mon, because even if you
>> setup multiple pools I can't think of a way to have multiple groups
>> of mons maintaining their own shards of consensus. Unless your
>> workload is largely read only, then .. I'm not sure what the
>> bottleneck would be.
>>
>>
>> On Mon., Jun. 29, 2020, 7:32 p.m. Harry G. Coin, <hgcoin(a)gmail.com
>> <mailto:hgcoin@gmail.com>> wrote:
>>
>> I need exactly what ceph is for a whole lot of work, that work just
>> doesn't represent a large fraction of the total local traffic.
>> Ceph is
>> the right choice. Plainly ceph has tremendous support for
>> replication
>> within a chassis, among chassis and among racks. I just need
>> intra-chassis traffic to not hit the net much. Seems not such an
>> unreasonable thing given the intra-chassis crush rules and all.
>> After
>> all.. ceph's name wasn't chosen for where it can't go....
>>
>> On 6/29/20 1:57 PM, Marc Roos wrote:
>> > I wonder if you should not have chosen a different product?
>> Ceph is
>> > meant to distribute data across nodes, racks, data centers etc.
>> For a
>> > nail use a hammer, for a screw use a screw driver.
>> >
>> >
>> > -----Original Message-----
>> > To: ceph-users(a)ceph.io <mailto:ceph-users@ceph.io>
>> > Subject: *****SPAM***** [ceph-users] layout help: need chassis
>> local io
>> > to minimize net links
>> >
>> > Hi
>> >
>> > I have a few servers each with 6 or more disks, with a storage
>> workload
>> > that's around 80% done entirely within each server. From a
>> > work-to-be-done perspective there's no need for 80% of the load to
>> > traverse network interfaces, the rest needs what ceph is all
>> about. So
>> > I cooked up a set of crush maps and pools, one map/pool for
>> each server
>> > and one map/pool for the whole. Skipping the long story, the
>> > performance remains network link speed bound and has got to
>> change.
>> > "Chassis local" io is too slow. I even tried putting a mon
>> within each
>> > server. I'd like to avoid having to revert to some other HA
>> > filesystem per server with ceph at the chassis layer if I can help
>> > it.
>> >
>> > Any notions that would allow 'chassis local' rbd traffic to
>> avoid or
>> > mostly avoid leaving the box?
>> >
>> > Thanks!
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users(a)ceph.io
>> <mailto:ceph-users@ceph.io> To unsubscribe send an
>> > email to ceph-users-leave(a)ceph.io <mailto:ceph-users-leave@ceph.io>
>> >
>> >
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> <mailto:ceph-users@ceph.io>
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>> <mailto:ceph-users-leave@ceph.io>
>>
Hi,
We have already installed a new Debian (10.4) server and I need to put it into a
Ceph cluster.
When I execute the command to install ceph on this node:
ceph-deploy install --release nautilus node1
It starts installing version 12.x on my node...
(...)
[serifos][DEBUG ] After this operation, 183 MB of additional disk space will
be used.
[serifos][DEBUG ] Selecting previously unselected package python-cephfs.
(Reading database ... 30440 files and directories currently installed.)
[serifos][DEBUG ] Preparing to unpack
.../python-cephfs_12.2.11+dfsg1-2.1+b1_amd64.deb ...
[serifos][DEBUG ] Unpacking python-cephfs (12.2.11+dfsg1-2.1+b1) ...
[serifos][DEBUG ] Selecting previously unselected package ceph-common.
[serifos][DEBUG ] Preparing to unpack
.../ceph-common_12.2.11+dfsg1-2.1+b1_amd64.deb ...
[serifos][DEBUG ] Unpacking ceph-common (12.2.11+dfsg1-2.1+b1) ...
(...)
How do I upgrade these packages?
Even with the packages installed at this version, the installation
completes without errors.
The question comes from an error message that I'm receiving
when deploying a new OSD.
ceph-deploy osd create --data /dev/sdb node1
At this point:
[ceph_deploy.osd][INFO ] Distro info: debian 10.4 buster
[ceph_deploy.osd][DEBUG ] Deploying osd to node1
[node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[node1][DEBUG ] find the location of an executable
[node1][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph
lvm create --bluestore --data /dev/sdb
[node1][WARNIN] --> RuntimeError: Unable to create a new OSD id
[node1][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[node1][DEBUG ] Running command: /bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i -
osd new 76da6c51-8385-4ffc-9a8e-0dfc11e31feb
[node1][DEBUG ] stderr:
/build/ceph-qtARip/ceph-12.2.11+dfsg1/src/mon/MonMap.cc: In function 'void
MonMap::sanitize_mons(std::map<std::__cxx11::basic_string<char>,
entity_addr_t>&)' thread 7f2bc7fff700 time 2020-06-29 06:56:17.331350
[node1][DEBUG ] stderr:
/build/ceph-qtARip/ceph-12.2.11+dfsg1/src/mon/MonMap.cc: 77: FAILED
assert(mon_info[p.first].public_addr == p.second)
[node1][DEBUG ] stderr: ceph version 12.2.11
(26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
[node1][DEBUG ] stderr: 1: (ceph::__ceph_assert_fail(char const*, char
const*, int, char const*)+0xf5) [0x7f2bdaff5f75]
[node1][DEBUG ] stderr: 2:
(MonMap::sanitize_mons(std::map<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >, entity_addr_t,
std::less<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > >,
std::allocator<std::pair<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const, entity_addr_t> >
>&)+0x568) [0x7f2bdb050038]
[node1][DEBUG ] stderr: 3:
(MonMap::decode(ceph::buffer::list::iterator&)+0x4da) [0x7f2bdb05500a]
[node1][DEBUG ] stderr: 4: (MonClient::handle_monmap(MMonMap*)+0x216)
[0x7f2bdb042a06]
[node1][DEBUG ] stderr: 5: (MonClient::ms_dispatch(Message*)+0x4ab)
[0x7f2bdb04729b]
[node1][DEBUG ] stderr: 6: (DispatchQueue::entry()+0xeba) [0x7f2bdb06bf5a]
[node1][DEBUG ] stderr: 7: (DispatchQueue::DispatchThread::entry()+0xd)
[0x7f2bdb1576fd]
[node1][DEBUG ] stderr: 8: (()+0x7fa3) [0x7f2be499dfa3]
[node1][DEBUG ] stderr: 9: (clone()+0x3f) [0x7f2be45234cf]
[node1][DEBUG ] stderr: NOTE: a copy of the executable, or `objdump -rdS
<executable>` is needed to interpret this.
[node1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume
--cluster ceph lvm create --bluestore --data /dev/sdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
I think this error occurs because the wrong package was
installed.
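For what it's worth, the 12.2.11+dfsg1 versions in the log are Debian's own packages; when the upstream Ceph repository is not configured on the node, apt falls back to them. One way to fix it, sketched under the assumption of a stock buster node (run on node1 before retrying ceph-deploy):

```shell
# Add the upstream Ceph Nautilus repository for Debian buster,
# so apt prefers 14.2.x over Debian's bundled 12.2.11 packages.
wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
echo "deb https://download.ceph.com/debian-nautilus/ buster main" | \
    sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt update
sudo apt install ceph-common

# Verify the node now reports a nautilus version (14.2.x):
ceph -v
```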
Thanks,
Rafael
Hello,
I have been struggling a lot with radosgw bucket space wastage: currently about 2/3 of the utilised space is wasted and unaccounted for. I've tried to use the tools to find the orphan objects, but these were running in a loop for weeks without producing any results. Wido and a few others pointed out that this functionality is broken and was deprecated, and that rgw-orphan-list should be used instead.
I have upgraded to Octopus and I have been following the documentation [ https://docs.ceph.com/docs/master/radosgw/orphans/ ]. However, the ceph and radosgw packages for Ubuntu 18.04 do not seem to include this tool. The same applies to the bucket radoslist option of the radosgw-admin command.
root@arh-ibstorage1-ib:~# radosgw-admin bucket radoslist
ERROR: Unrecognized argument: 'radoslist'
Expected one of the following:
check
chown
limit
link
list
reshard
rewrite
rm
stats
sync
unlink
root@arh-ibstorage1-ib:~# dpkg -l *rados\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=======================-================-================-====================================================
un librados <none> <none> (no description available)
ii librados2 15.2.3-1bionic amd64 RADOS distributed object store client library
ii libradosstriper1 15.2.3-1bionic amd64 RADOS striping interface
ii python3-rados 15.2.3-1bionic amd64 Python 3 libraries for the Ceph librados library
ii radosgw 15.2.3-1bionic amd64 REST gateway for RADOS distributed object store
I am running Ubuntu 18.04 with version 15.2.3 of ceph and radosgw.
Please suggest what I should do to reclaim the wasted space that radosgw is creating. I've calculated the wasted space by adding up the reported usage of all the buckets and checking it against the output of the rados df command. The buckets are using around 11TB; rados df reports 68TB of usage with a replica count of 2. Rather alarming!
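For reference, once the nodes run a radosgw-admin build that actually ships the tool, the intended workflow is roughly the following (the pool name is an example; verify candidates before deleting anything):

```shell
# rgw-orphan-list compares the RADOS objects in the RGW data pool against
# what the buckets actually reference; the leftovers are candidate orphans.
rgw-orphan-list default.rgw.buckets.data

# The tool writes a timestamped list of suspect object names. After
# reviewing the list, each confirmed orphan can be removed with e.g.:
#   rados -p default.rgw.buckets.data rm <object-name>
```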
Thanks for your help
Hello
I'm trying to get Grafana working inside the Dashboard.
If I click the "Overall Performance" tab, I get an error, because the iframe tries to connect to the internal hostname, which cannot be resolved from my machine.
If I directly open grafana, everything works.
How can I tell the dashboard to use the full domain name?
I have tried "ceph dashboard set-grafana-api-url https://node01.mycorp.local:3000", but that does not work; the value always resets itself back to "https://node01:3000".
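A sketch of what I would check (the config-key path below is an assumption about where the dashboard module stores the setting; it may differ between releases):

```shell
# What does the dashboard think is configured right now?
ceph dashboard get-grafana-api-url

# Set the externally resolvable URL:
ceph dashboard set-grafana-api-url https://node01.mycorp.local:3000

# If the value keeps reverting, check whether an orchestrator or config
# management tool is rewriting the stored setting behind your back:
ceph config-key get mgr/dashboard/GRAFANA_API_URL
```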
Thanks in advance,
Simon
Hi All,
I really like the idea of warning users against using unsafe practices.
Wouldn't it make more sense to warn against using min_size=1 instead of size=1?
I've seen data loss happen with size=2, min_size=1 when multiple failures
occur and writes have been done between the failures. Effectively the new
warning below says "It is not considered safe to run with no
redundancy", which is true, but when a failure occurs or maintenance is
executed with size=2 and min_size=1, then as soon as data is written there
may be no redundancy for that newly written data. A failure of
an OSD at that moment would result in data loss.
Since you cannot run size=1 with min_size > 1, this use case would also
be covered.
I understand this has implications for size=2 when executing
maintenance, but I think most people are not aware of the risks they are
taking with min_size=1. Those who are aware can suppress the warning.
* Ceph will issue a health warning if a RADOS pool's `size` is set to 1
or in other words the pool is configured with no redundancy. This can
be fixed by setting the pool size to the minimum recommended value
with::
ceph osd pool set <pool-name> size <num-replicas>
The warning can be silenced with::
ceph config set global mon_warn_on_pool_no_redundancy false
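For concreteness, the configuration argued for above versus the risky one can be sketched like this (the pool name is hypothetical):

```shell
# size=3 / min_size=2: every acknowledged write lands on at least two OSDs,
# so a single OSD failure right after a write cannot lose data.
ceph osd pool set mypool size 3
ceph osd pool set mypool min_size 2

# The risky case described above: size=2 / min_size=1 keeps serving writes
# with only one copy during a failure or maintenance window -- losing that
# one OSD then loses the freshly written data.
```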
--
kind regards,
Wout
42on