Hi all,
We have a Ceph cluster in production with 6 OSD servers (each with 16x8TB
disks), 3 mons/mgrs and 3 MDSs. Both the public and cluster networks are
10Gb and work well.
After a major crash in April, we set the option bluefs_buffered_io to
false to work around the large-write bug that occurs when bluefs_buffered_io
is true (we were on version 14.2.8, and the default value at that time was
true).
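(For context, roughly how such a change is applied; a sketch, where osd.16 is just an example daemon:)
  # set the option for all OSDs via the centralized config store
  ceph config set osd bluefs_buffered_io false
  # check what a given daemon actually runs with (on the OSD host)
  ceph daemon osd.16 config get bluefs_buffered_io
  # restart the OSDs if the new value does not take effect at runtime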
Since then, we have regularly had some OSDs wrongly marked down by the
cluster after a heartbeat timeout (heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15).
Generally the OSD restarts and the cluster comes back healthy, but several
times, after many of these kick-offs, the OSD has reached
osd_op_thread_suicide_timeout and gone down for good.
We increased osd_op_thread_timeout and
osd_op_thread_suicide_timeout... The problem still occurs (but less
frequently).
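(Roughly the kind of change we made, as a sketch with illustrative values rather than the exact ones we used:)
  ceph config set osd osd_op_thread_timeout 60             # default is 15 seconds
  ceph config set osd osd_op_thread_suicide_timeout 300    # default is 150 seconds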
A few days ago, we upgraded to 14.2.11 and reverted the timeouts to their
default values, hoping that it would solve the problem (we thought it
might be related to this bug: https://tracker.ceph.com/issues/45943),
but it didn't. We still have some OSDs wrongly marked down.
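(Reverting was just a matter of dropping our overrides from the config database, roughly:)
  ceph config rm osd osd_op_thread_timeout
  ceph config rm osd osd_op_thread_suicide_timeout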
Can somebody help us fix this problem?
Thanks.
Here is an extract of an osd log at failure time:
---------------------------------
2020-08-28 02:19:05.019 7f03f1384700 0 log_channel(cluster) log [DBG] :
44.7d scrub starts
2020-08-28 02:19:25.755 7f040e43d700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:19:25.755 7f040dc3c700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
this last line is repeated more than 1000 times
...
2020-08-28 02:20:17.484 7f040d43b700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:20:17.551 7f03f1384700 0
bluestore(/var/lib/ceph/osd/ceph-16) log_latency_fn slow operation
observed for _collection_list, latency = 67.3532s, lat = 67s cid
=44.7d_head start GHMAX end GHMAX max 25
...
2020-08-28 02:20:22.600 7f040dc3c700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:21:20.774 7f03f1384700 0
bluestore(/var/lib/ceph/osd/ceph-16) log_latency_fn slow operation
observed for _collection_list, latency = 63.223s, lat = 63s cid
=44.7d_head start
#44:beffc78d:::rbd_data.1e48e8ab988992.00000000000011bd:0# end #MAX# max
2147483647
2020-08-28 02:21:20.774 7f03f1384700 1 heartbeat_map reset_timeout
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:21:20.805 7f03f1384700 0 log_channel(cluster) log [DBG] :
44.7d scrub ok
2020-08-28 02:21:21.099 7f03fd997700 0 log_channel(cluster) log [WRN] :
Monitor daemon marked osd.16 down, but it is still running
2020-08-28 02:21:21.099 7f03fd997700 0 log_channel(cluster) log [DBG] :
map e609411 wrongly marked me down at e609410
2020-08-28 02:21:21.099 7f03fd997700 1 osd.16 609411
start_waiting_for_healthy
2020-08-28 02:21:21.119 7f03fd997700 1 osd.16 609411 start_boot
2020-08-28 02:21:21.124 7f03f0b83700 1 osd.16 pg_epoch: 609410
pg[36.3d0( v 609409'481293 (449368'478292,609409'481293]
local-lis/les=609403/609404 n=154651 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609410/609410/608752) [25,72] r=-1
lpr=609410 pi=[609403,609410)/1 luod=0'0 lua=609392'481198
crt=609409'481293 lcod 609409'481292 active mbc={}]
start_peering_interval up [25,72,16] -> [25,72], acting [25,72,16] ->
[25,72], acting_primary 25 -> 25, up_primary 25 -> 25, role 2 -> -1,
features acting 4611087854031667199 upacting 4611087854031667199
...
2020-08-28 02:21:21.166 7f03f0b83700 1 osd.16 pg_epoch: 609411
pg[36.56( v 609409'480511 (449368'477424,609409'480511]
local-lis/les=609403/609404 n=153854 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609410/609410/609410) [103,102]
r=-1 lpr=609410 pi=[609403,609410)/1 crt=609409'480511 lcod
609409'480510 unknown NOTIFY mbc={}] state<Start>: transitioning to Stray
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
public network em1 numa node 0
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
cluster network em2 numa node 0
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
objectstore and network numa nodes do not match
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
not setting numa affinity
2020-08-28 02:21:21.566 7f040a435700 1 osd.16 609413 tick checking mon
for new map
2020-08-28 02:21:22.515 7f03fd997700 1 osd.16 609414 state: booting ->
active
2020-08-28 02:21:22.515 7f03f0382700 1 osd.16 pg_epoch: 609414
pg[36.20( v 609409'483167 (449368'480117,609409'483167]
local-lis/les=609403/609404 n=155171 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609361) [97,16,72]
r=1 lpr=609414 pi=[609403,609414)/1 crt=609409'483167 lcod 609409'483166
unknown NOTIFY mbc={}] start_peering_interval up [97,72] -> [97,16,72],
acting [97,72] -> [97,16,72], acting_primary 97 -> 97, up_primary 97 ->
97, role -1 -> 1, features acting 4611087854031667199 upacting
4611087854031667199
...
2020-08-28 02:21:22.522 7f03f1384700 1 osd.16 pg_epoch: 609414
pg[36.2f3( v 609409'479796 (449368'476712,609409'479796]
local-lis/les=609403/609404 n=154451 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609414) [16,34,21]
r=0 lpr=609414 pi=[609403,609414)/1 crt=609409'479796 lcod 609409'479795
mlcod 0'0 unknown NOTIFY mbc={}] start_peering_interval up [34,21] ->
[16,34,21], acting [34,21] -> [16,34,21], acting_primary 34 -> 16,
up_primary 34 -> 16, role -1 -> 0, features acting 4611087854031667199
upacting 4611087854031667199
2020-08-28 02:21:22.522 7f03f1384700 1 osd.16 pg_epoch: 609414
pg[36.2f3( v 609409'479796 (449368'476712,609409'479796]
local-lis/les=609403/609404 n=154451 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609414) [16,34,21]
r=0 lpr=609414 pi=[609403,609414)/1 crt=609409'479796 lcod 609409'479795
mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
2020-08-28 02:21:24.738 7f03f1384700 0 log_channel(cluster) log [DBG] :
36.2f3 scrub starts
2020-08-28 02:22:18.857 7f03f1384700 0 log_channel(cluster) log [DBG] :
36.2f3 scrub ok
Hi!
We've recently upgraded all our clusters from Mimic to Octopus (15.2.4). Since
then, our largest cluster is experiencing random crashes on OSDs attached to the
mon hosts.
This is the crash we are seeing (cut for brevity, see links in post scriptum):
{
"ceph_version": "15.2.4",
"utsname_release": "4.15.0-72-generic",
"assert_condition": "r == 0",
"assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)",
"assert_file": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc <http://bluestore.cc/>",
"assert_line": 11430,
"assert_thread_name": "bstore_kv_sync",
"assert_msg": "/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc <http://bluestore.cc/>: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7fc56311a700 time 2020-08-26T08:52:24.917083+0200\n/build/ceph-15.2.4/src/os/bluestore/BlueStore.cc <http://bluestore.cc/>: 11430: FAILED ceph_assert(r == 0)\n",
"backtrace": [
"(()+0x12890) [0x7fc576875890]",
"(gsignal()+0xc7) [0x7fc575527e97]",
"(abort()+0x141) [0x7fc575529801]",
"(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a5) [0x559ef9ae97b5]",
"(ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x559ef9ae993f]",
"(BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)+0x3a0) [0x559efa0245b0]",
"(BlueStore::_kv_sync_thread()+0xbdd) [0x559efa07745d]",
"(BlueStore::KVSyncThread::entry()+0xd) [0x559efa09cd3d]",
"(()+0x76db) [0x7fc57686a6db]",
"(clone()+0x3f) [0x7fc57560a88f]"
]
}
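(The dump above is the crash module's metadata; it can be pulled from the cluster with something like the following, where <crash-id> is one of the IDs listed by "ceph crash ls":)
  ceph crash ls
  ceph crash info <crash-id>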
Right before the crash occurs, we see the following message in the crash log:
-3> 2020-08-26T08:52:24.787+0200 7fc569b2d700 2 rocksdb: [db/db_impl_compaction_flush.cc:2212] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2548200440, got 2324967102 in db/815839.sst offset 67107066 size 3808, Accumulated background error counts: 1
-2> 2020-08-26T08:52:24.852+0200 7fc56311a700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2548200440, got 2324967102 in db/815839.sst offset 67107066 size 3808 code = 2 Rocksdb transaction:
In short, when this happens we see a RocksDB corruption error after a background compaction.
When an OSD crashes, which happens about 10-15 times a day, it restarts and
resumes work without any further problems.
We are pretty confident that this is not a hardware issue, due to the following facts:
* The crashes occur on 5 different hosts over 3 different racks.
* There is no smartctl/dmesg output that could explain it.
* It usually happens to a different OSD that did not crash before.
Still, we checked the following on a few OSDs/hosts (commands sketched after this list):
* We can do a manual compaction, both offline and online.
* We successfully ran "ceph-bluestore-tool fsck --deep yes" on one of the OSDs.
* We manually compacted a number of OSDs, one of which crashed hours later.
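The commands involved were roughly the following (a sketch; ceph-16/osd.16 stand in for whichever OSD is being checked):
  # online compaction of a running OSD
  ceph tell osd.16 compact
  # offline compaction, with the OSD stopped
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-16 compact
  # deep fsck, also with the OSD stopped
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-16 --deep yes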
The only thing we have noticed so far: It only happens to OSDs that are attached
to a mon host. *None* of the non-mon host OSDs have had a crash!
Does anyone have a hint as to what could be causing this? We currently have no good
theory that explains it, much less a fix or workaround.
Any help would be greatly appreciated.
Denis
Crash: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/meta.txt
Log: https://public-resources.objects.lpg.cloudscale.ch/osd-crash/log.txt
Hi everyone, a bucket was over quota (the default quota of 300k objects per bucket), so I enabled the object quota for this bucket and set a quota of 600k objects.
We are on Luminous (12.2.12) and dynamic resharding is disabled, so I manually resharded the bucket from 3 to 6 shards.
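(The manual reshard was roughly the following, with <bucket-name> standing in for the real bucket:)
  radosgw-admin bucket reshard --bucket=<bucket-name> --num-shards=6
  radosgw-admin bucket stats --bucket=<bucket-name>    # check the result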
Since then, radosgw-admin bucket stats reports an `rgw.none` entry in the usage section for this bucket.
I searched the mailing lists, Bugzilla and GitHub; it looks like I can ignore the rgw.none stats (0-byte objects, entries left in the index marked as cancelled...),
but the num_objects in rgw.none is counted as part of the quota usage.
I bumped the quota to 800k objects to work around the problem (without resharding).
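(Roughly like this, assuming I recall the flags correctly; <bucket-name> is a placeholder:)
  radosgw-admin quota set --quota-scope=bucket --bucket=<bucket-name> --max-objects=800000
  radosgw-admin quota enable --quota-scope=bucket --bucket=<bucket-name>
  radosgw-admin bucket stats --bucket=<bucket-name>    # verify the new limit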
Is there a way I can garbage-collect the rgw.none entries?
Is this problem fixed in Mimic/Nautilus/Octopus?
"usage": {
"rgw.none": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 417827
},
"rgw.main": {
"size": 1390778138502,
"size_actual": 1391581007872,
"size_utilized": 1390778138502,
"size_kb": 1358181776,
"size_kb_actual": 1358965828,
"size_kb_utilized": 1358181776,
"num_objects": 305637
}
},
Thanks!
Hello again
So I have changed the network configuration.
Now my Ceph is reachable from outside; this also means all OSDs of all nodes are reachable.
I still have the same behaviour, which is a timeout.
The client can resolve all nodes by their hostnames.
The mons are still listening on the internal network, so the NAT rule is still there.
I have set "public bind addr" to the external IP and restarted the mon, but it's still not working.
[root@testnode1 ~]# ceph config get mon.public_bind_addr
WHO MASK LEVEL OPTION VALUE RO
mon advanced public_bind_addr v2:[ext-addr]:0/0 *
Do I have to change them somewhere else too?
Thanks in advance,
Simon
From: Janne Johansson [mailto:icepic.dz@gmail.com]
Sent: 27 August 2020 20:01
To: Simon Sutter <ssutter(a)hosttech.ch>
Subject: Re: [ceph-users] cephfs needs access from two networks
On Thu 27 Aug 2020 at 12:05, Simon Sutter <ssutter(a)hosttech.ch> wrote:
Hello Janne
Oh, I missed that point. No, the client cannot talk directly to the OSDs.
In this case it's extremely difficult to set this up.
This is an absolute requirement to be a ceph client.
How does the mon tell the client which host and port of the OSD it should connect to?
The same port and IP that the OSD called into the mon with when it started up and joined the cluster.
Can I influence that?
Well, you set the IP on the OSD hosts, and the port ranges used by OSDs are changeable/settable, but that would not really help the above-mentioned client.
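If it helps, a sketch of where to look (osd.16 is just an example):
  # the addresses the mons hand out to clients for each OSD
  ceph osd dump | grep '^osd'
  # the port range OSDs bind to (run on the OSD host)
  ceph daemon osd.16 config get ms_bind_port_min
  ceph daemon osd.16 config get ms_bind_port_max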
From: Janne Johansson [mailto:icepic.dz@gmail.com]
Sent: 26 August 2020 15:09
To: Simon Sutter <ssutter(a)hosttech.ch>
Cc: ceph-users(a)ceph.io
Subject: Re: [ceph-users] cephfs needs access from two networks
On Wed 26 Aug 2020 at 14:16, Simon Sutter <ssutter(a)hosttech.ch> wrote:
Hello,
So I know the mon services can only bind to one IP.
But I have to make it accessible from two networks, because internal and external servers have to mount the CephFS.
The internal IP is 10.99.10.1 and the external one is some public IP.
I tried NATing it with this: "firewall-cmd --zone=public --add-forward-port=port=6789:proto=tcp:toport=6789:toaddr=10.99.10.1 --permanent"
The NAT is working, because I get a "ceph v027" (along with some gibberish) when I do a telnet: "telnet *public-ip* 6789"
But when I try to mount it, I just get a timeout:
mount -vvvv -t ceph *public-ip*:6789:/testing /mnt -o name=test,secretfile=/root/ceph.client.test.key
mount error 110 = Connection timed out
tcpdump also sees a "Ceph Connect" packet coming from the mon.
How can I get around this problem?
Is there something I have missed?
Any Ceph client will also need direct access to all the OSDs involved. Your mail doesn't really say whether the cephfs-mounting client can talk to the OSDs?
In Ceph, traffic is not shuffled via the mons; the mons only tell the client which OSDs it needs to talk to, and then all I/O goes directly from the client to any involved OSD servers.
--
May the most significant bit of your life be positive.
The mons get their bind address from the monmap, I believe. So this means
changing the IP addresses of the monitors in the monmap with
monmaptool.
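A rough sketch of that procedure (untested here; mon-a and ext-addr are
placeholders, and the monitor must be stopped before injecting the new map):
  ceph mon dump                                    # what the monmap currently advertises
  ceph mon getmap -o /tmp/monmap                   # grab the current monmap
  monmaptool --print /tmp/monmap
  monmaptool --rm mon-a /tmp/monmap                # drop the old address entry
  monmaptool --add mon-a ext-addr:6789 /tmp/monmap # re-add with the new address
  ceph-mon -i mon-a --inject-monmap /tmp/monmap    # with mon-a stopped; then start it again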
Regards
Marcel
> Hello again
>
> So I have changed the network configuration.
> Now my Ceph is reachable from outside; this also means all OSDs of all
> nodes are reachable.
> I still have the same behaviour, which is a timeout.
>
> The client can resolve all nodes by their hostnames.
> The mons are still listening on the internal network, so the NAT rule is
> still there.
> I have set "public bind addr" to the external IP and restarted the mon,
> but it's still not working.
>
> [root@testnode1 ~]# ceph config get mon.public_bind_addr
> WHO MASK LEVEL OPTION VALUE RO
> mon advanced public_bind_addr v2:[ext-addr]:0/0 *
>
> Do I have to change them somewhere else too?
>
> Thanks in advance,
> Simon
>
Hi all,
I tried to set a bucket quota using the admin API as shown below:
admin/user?quota&uid=bse&bucket=test&quota-type=bucket
with the payload in JSON format:
{
"enabled": true,
"max_size": 1099511627776,
"max_size_kb": 1073741824,
"max_objects": -1
}
It returned success, but the quota change did not happen, as confirmed by
the 'radosgw-admin bucket stats --bucket=test' command.
Am I missing something obvious? Please kindly advise/suggest.
By the way, I am using Ceph Mimic (v13.2.4). Setting the quota with radosgw-admin
quota set --bucket=${BUCK} --max-size=1T --quota-scope=bucket does work, but I
want to do it programmatically.
Thanks in advance,
-Youzhong
Is there a way to remove an OSD spec from the mgr? I've got one in there that I don't want. It shows up when I do "ceph orch osd spec --preview", and I can't find any way to get rid of it.