- ceph-users - lists.ceph.io

by Stefan Priebe - Profihost AG

Hello, is there anything else needed beside running: ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD} bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1 I did so some weeks ago and currently i'm seeing that all osds originally deployed with --block-db show 10-20% I/O waits while all those got converted using ceph-bluestore-tool show 80-100% I/O waits. Also is there some tuning available to use more of the SSD? The SSD (block-db) is only saturated at 0-2%. Greets, Stefan

3 years, 11 months

2
21
0 0

Re: Write Caching to hot tier not working as expected

by Steve Hughes

Thank you Eric. That 'sounds like' exactly my issue. Though I'm surprised to bump into something like that on such a small system and at such low bandwidth. But the information I can find on those parameters is sketchy to say the least. Can you point me at some doco that explains what they do, how to read the current values and how to set them? Cheers, Steve -----Original Message----- From: Eric Smith <Eric.Smith(a)vecima.com> Sent: Monday, 11 May 2020 8:00 PM To: Steve Hughes <steveh(a)scalar.com.au>; ceph-users(a)ceph.io Subject: RE: [ceph-users] Re: Write Caching to hot tier not working as expected It sounds like you might be bumping up against the default objecter_inflight_ops (1024) and/or objecter_inflight_op_bytes (100MB). -----Original Message----- From: steveh(a)scalar.com.au <steveh(a)scalar.com.au> Sent: Monday, May 11, 2020 5:48 AM To: ceph-users(a)ceph.io Subject: [ceph-users] Re: Write Caching to hot tier not working as expected Interestingly, I have found that if I limit the rate at which data is written the tiering behaves as expected. I'm using a robocopy job from a Windows VM to copy large files from my existing storage array to a test Ceph volume. By using the /IPG parameter I can roughly control the rate at which data is written. I've found that if I limit the write rate to around 30MBytes/sec the data all goes to the hot tier, zero data goes to the HDD tier, and the observed write latency is about 5msec. If I go any higher than this I see data being written to the HDDs and the observed write latency goes way up. _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io --

3 years, 11 months

1
0
0 0

ceph-volume/batch fails in non-interactive mode

by Michał Nasiadka

Hello, I stumbled across that weird behaviour today - after initially creating disks and DB LVs on NVMe (using ceph-ansible) - second run of ceph-volume ends up with an error: --> RuntimeError: 4 devices were filtered in non-interactive mode, bailing out What is interesting - the same command in interactive mode (without —yes) ends up without that error. I raised a bug in Tracker - https://tracker.ceph.com/issues/45461 <https://tracker.ceph.com/issues/45461> I found a PR fixing a similar bug here: https://github.com/ceph/ceph/pull/33202 <https://github.com/ceph/ceph/pull/33202> - is that related? Kind regards, Michal

3 years, 11 months

1
0
0 0

Write Caching to hot tier not working as expected

by steveh＠scalar.com.au

Hi all, I'm a newbie to Ceph. I'm an MSP and a small-scale cloud hoster. I'm intending to use Ceph as production storage for a small-scale private hosting cloud. We run ESXi as our HVs so we want to present Ceph as iSCSI. We've got Ceph Nautilus running on a 3-node cluster. Each node contains a pair of Bronze XEONs, 128G RAM, 6 x 10G NICs, 8 x 10TB spinners, 2 x 2TB SATA SSDs and a 4TB NVME. The HDDs and the 2TB SSDs give me an rbd pool of 24 OSDs (1024 PGs) with the SSDs partitioned and used to hold the DB and WAL. Each SSD hold the DB/WAL for 4 HDDs. The NVMes give me a cache pool of 3 OSDs (128 PGs) which I want to use as a hot tier. As far as I can tell I have followed the guidance given here: https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/ The cluster is working, the iSCSI is working, and generally everything is looking pretty good. My only problem at this stage is that the tiering is not handling writes in the way I expect, and I really can't get my brain around why. For my test I start with the hot tier empty. I drained it by setting dirty_ratio=dirty_high_ratio=full_ratio = 0. I then set dirty_ratio = 0.5, dirty_high_ratio = 0.6 and full_ratio = 0.7 and start writing data to it at high speed (using a simple file copy) from a VM on ESXi. My expectation is that all inbound writes will land initially on the hot tier, resulting in low latency writes as seen by the ESXi hosts. I expect that once the hot tier fills to 50% that it will start to flush these writes down to the HDD storage. What I actually see is that as soon as I start throwing data at the cluster, the Ceph dashboard shows writes going to both the NVMEs and to the HDDs, and the write latency seen by ESX hits several hundred milliseconds. It seems that the hot tier is absorbing only a fraction of the writes. Here are my pool settings. [root@ceph00 ~]# ceph osd pool ls detail pool 4 'rbd' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 6971 lfor 6971/6971/6971 flags hashpspool,selfmanaged_snaps tiers 11 read_tier 11 write_tier 11 stripe_width 0 application rbd removed_snaps [1~3] pool 11 'cache' replicated size 3 min_size 2 crush_rule 3 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 7065 lfor 6971/6971/6971 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of 4 cache_mode writeback target_bytes 3078632557772 hit_set bloom{false_positive_probability: 0.001, target_size: 0, seed: 0} 3600s x12 decay_rate 0 search_last_n 0 min_read_recency_for_promote 2 stripe_width 0 application rbd removed_snaps [1~3] pool 12 'pure_hdd' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 7059 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd removed_snaps [1~3] Is anyone able to point me towards a solution? Thanks, Steve

3 years, 11 months

1
1
0 0

radosgw Swift bulk upload

by Martin Zurowietz

Hi, I'm interested in the Swift extract archive/bulk upload feature [1]. First I thought that this feature is not implemented in radosgw, as it is not mentioned anywhere in the docs [2] and some test requests of mine were not successful. However, while browsing the issue tracker, I found some issues that mention a bulk upload feature [3, 4]. Can someone please clarify whether this feature exists in radosgw or not? Thanks a lot, Martin [1]: https://docs.openstack.org/swift/latest/middleware.html#module-swift.common… [2]: https://docs.ceph.com/docs/master/radosgw/swift/ [3]: https://tracker.ceph.com/issues/19645 [4]: https://tracker.ceph.com/issues/8945 -- Biodata Mining Group Faculty of Technology Bielefeld University https://biodatamining.cebitec.uni-bielefeld.de/ Office: V10-103 +49 521 106 60 58

3 years, 11 months

1
0
0 0

OSD Inbalance - upmap mode

by Ashley Merrick

I have a cluster running 15.2.1, was originally running 14.x, the cluster is running the balance module in upmap mode (I have tried crush-compat in the past) Most OSD's are around the same & used give or take 0.x, however there is one OSD that is down a good few % and a few that are above average by 1 or 2 %, I have been trying to get the balance to fix this. I have tried running a manual osdmaptool command on an export of my map, but it lists no fixed however does display the underfall OSD in it's output (overfull 3,4,5,6,7,8,9,10,11,12,13,14,15,18,19,20 underfull [36]) The debug output is just lots of: 2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 trying 2.55 2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 2.55 [12,3,7,6,33,34,30,35,21,18] -> [12,3,7,6,33,34,30,35,21,16] 2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 will try adding new remapping pair 18 -> 16 for 2.55 NOT selected osd 2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 stddev 528.667 -> 528.667 2020-05-05T06:15:39.172+0000 7f3dfb0c3c40 10 Overfull search osd.7 target 170.667 deviation 9.33327 Is there anything I can to try and balance the overfull onto the underful OSDs to balance out the last bit.

3 years, 11 months

1
1
0 0

Elasticsearch Sync module bug ?

by Luca Cervigni

Hello, I have a two zones cluster. Default zone + a ES sync zone where I would like to index metadata and do searches via the ES tier gateway. Struggling to get this service running (documentation is bad and very, very minimal) I thought that in the end I made it work. The requests are getting correctly to the ES zone, I can see the payload sent to ES with debug rgw = 20, ES returns OK then an error is trigger, apparently a JSON parsing error taking the ES search and giving it back to the user. I tried with Nautilus, Octopus, and different versions of ES, from 6 to 7. I have no possibilities of downgrades as our production cluster is on nautilus. Here the logs: https://pastebin.com/hnj8YysK I would think this is a bug, but would like your opinion. If not a bug how shall I solve this? I am using a handmade S3 tool to query as boto is not supported anymore, obo just does not work, and boto3 cannot send any arbitrary requests anymore. Also I would like to know if those Sync modules are still supported or not. The documentation is very poor and if they still are in the roadmap, it should be improved. Thanks Cheers, Luca Cervigni Pawsey Supercomputing centre

3 years, 11 months

1
0
0 0

RGW crashed

by Zhenshi Zhou

Hi all, I deployed a multi-site cluster in order to sync object from an old cluster to a brand new cluster. It seems good cause I can see the data syncing. However, when I check the cluster health, it shows warn messages "2 daemons have recently crashed". I get the crash info by 'sudo ceph crash info $id': { "os_version_id": "7", "utsname_release": "3.10.0-957.27.2.el7.x86_64", "os_name": "CentOS Linux", "entity_name": "client.rgw.ceph-node7", "timestamp": "2020-05-09 15:17:59.482502Z", "process_name": "radosgw", "utsname_machine": "x86_64", "utsname_sysname": "Linux", "os_version": "7 (Core)", "os_id": "centos", "utsname_version": "#1 SMP Mon Jul 29 17:46:05 UTC 2019", "backtrace": [ "(()+0xf5f0) [0x7f32b1bdf5f0]", "(RGWCoroutine::set_sleeping(bool)+0xc) [0x555eeb1351ac]", "(RGWOmapAppend::flush_pending()+0x2d) [0x555eeb13acad]", "(RGWOmapAppend::finish()+0x10) [0x555eeb13acd0]", "(RGWDataSyncShardCR::stop_spawned_services()+0x2b) [0x555eeb0a185b]", "(RGWDataSyncShardCR::incremental_sync()+0x72a) [0x555eeb0a9baa]", "(RGWDataSyncShardCR::operate()+0x9d) [0x555eeb0ab33d]", "(RGWCoroutinesStack::operate(RGWCoroutinesEnv*)+0x60) [0x555eeb136520]", "(RGWCoroutinesManager::run(std::list<RGWCoroutinesStack*, std::allocator<RGWCoroutinesStack*> >&)+0x236) [0x555eeb137196]", "(RGWCoroutinesManager::run(RGWCoroutine*)+0x78) [0x555eeb138098]", "(RGWRemoteDataLog::run_sync(int)+0x1cf) [0x555eeb08851f]", "(RGWDataSyncProcessorThread::process()+0x46) [0x555eeb1e71a6]", "(RGWRadosThread::Worker::entry()+0x115) [0x555eeb1b6195]", "(()+0x7e65) [0x7f32b1bd7e65]", "(clone()+0x6d) [0x7f32b10e188d]" ], "utsname_hostname": "ceph-node7", "crash_id": "2020-05-09_15:17:59.482502Z_b80d7bee-faa0-4d2f-9d86-a1b3f4d4802e", "ceph_version": "14.2.8" } AND { "os_version_id": "7", "utsname_release": "3.10.0-957.27.2.el7.x86_64", "os_name": "CentOS Linux", "entity_name": "client.rgw.ceph-node7", "timestamp": "2020-05-10 16:23:13.375063Z", "process_name": "radosgw", "utsname_machine": "x86_64", "utsname_sysname": "Linux", "os_version": "7 (Core)", "os_id": "centos", "utsname_version": "#1 SMP Mon Jul 29 17:46:05 UTC 2019", "backtrace": [ "(()+0xf5f0) [0x7f409f42e5f0]", "(RGWCoroutine::set_sleeping(bool)+0xc) [0x55e3f45e01ac]", "(RGWOmapAppend::flush_pending()+0x2d) [0x55e3f45e5cad]", "(RGWOmapAppend::finish()+0x10) [0x55e3f45e5cd0]", "(RGWDataSyncShardCR::stop_spawned_services()+0x2b) [0x55e3f454c85b]", "(RGWDataSyncShardCR::incremental_sync()+0x72a) [0x55e3f4554baa]", "(RGWDataSyncShardCR::operate()+0x9d) [0x55e3f455633d]", "(RGWCoroutinesStack::operate(RGWCoroutinesEnv*)+0x60) [0x55e3f45e1520]", "(RGWCoroutinesManager::run(std::list<RGWCoroutinesStack*, std::allocator<RGWCoroutinesStack*> >&)+0x236) [0x55e3f45e2196]", "(RGWCoroutinesManager::run(RGWCoroutine*)+0x78) [0x55e3f45e3098]", "(RGWRemoteDataLog::run_sync(int)+0x1cf) [0x55e3f453351f]", "(RGWDataSyncProcessorThread::process()+0x46) [0x55e3f46921a6]", "(RGWRadosThread::Worker::entry()+0x115) [0x55e3f4661195]", "(()+0x7e65) [0x7f409f426e65]", "(clone()+0x6d) [0x7f409e93088d]" ], "utsname_hostname": "ceph-node7", "crash_id": "2020-05-10_16:23:13.375063Z_9e70a0c0-929e-445f-b4cd-8d29e909fe2f", "ceph_version": "14.2.8" } So I fetch and check the file "ceph-client.rgw.ceph-node7.log". The log has huge amount of errors like: -732> 2020-05-09 23:17:53.476 7f328b7ff700 0 RGW-SYNC:data:sync:shard[98]:entry[harbor-registry:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4620.94:23]:bucket[harbor-registry:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4620.94:23]:inc_sync[harbor-registry:f70a5 eb9-d88d-42fd-ab4e-d300e97094de.4620.94:23]: ERROR: lease is not taken, abort AND -723> 2020-05-09 23:17:56.388 7f328b7ff700 5 RGW-SYNC:data:sync:shard[88]:entry[harbor-registry:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4620.94:13]:bucket[harbor-registry:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4620.94:13]: incremental sync on bucket fa iled, retcode=-125 AND -215> 2020-05-09 23:17:58.809 7f328b7ff700 5 RGW-SYNC:data:sync:shard[10]:entry[pf2-harbor-swift:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4608.101:113]:bucket[pf2-harbor-swift:f70a5eb9-d88d-42fd-ab4e-d300e97094de.4608.101:113]: full sync on bucket failed, retcode=-125 AND 2020-05-09 23:18:24.048 7f4085867700 1 robust_notify: If at first you don't succeed: (110) Connection timed out 2020-05-09 23:18:24.048 7f4083863700 0 ERROR: failed to distribute cache for shubei.rgw.log:datalog.sync-status.shard.f70a5eb9-d88d-42fd-ab4e-d300e97094de.5 2020-05-09 23:28:49.181 7f407e859700 1 heartbeat_map reset_timeout 'RGWAsyncRadosProcessor::m_tp thread 0x7f407e859700' had timed out after 600 2020-05-10 03:12:01.905 7f409708a700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror And finally it crashed. I'm not sure where the problem is. Were the crashes caused by the network? Thanks

3 years, 11 months

1
0
0 0

octopus cluster deploy with cephadm failed on bootstrap

by Zhenshi Zhou

Hi all, I'm deploying a new octopus cluster using cephadm, follow docs <https://ceph.io/ceph-management/introducing-cephadm/>. However it failed on the bootstrap step. According to the logs, key generating failed because of the lack of directory and file. Did I miss something? here is the logs: [root@ceph-mon1 ~]# cephadm bootstrap --mon-ip 172.24.202.119 INFO:cephadm:Verifying podman|docker is present... INFO:cephadm:Verifying lvm2 is present... INFO:cephadm:Verifying time synchronization is in place... INFO:cephadm:Unit ntpd.service is enabled and running INFO:cephadm:Repeating the final host check... INFO:cephadm:podman|docker (/usr/bin/docker) is present INFO:cephadm:systemctl is present INFO:cephadm:lvcreate is present INFO:cephadm:Unit ntpd.service is enabled and running INFO:cephadm:Host looks OK INFO:root:Cluster fsid: 828e6b7a-91c2-11ea-8644-005056885573 INFO:cephadm:Verifying IP 172.24.202.119 port 3300 ... INFO:cephadm:Verifying IP 172.24.202.119 port 6789 ... INFO:cephadm:Mon IP 172.24.202.119 is in CIDR network 172.24.202.0/24 INFO:cephadm:Pulling latest docker.io/ceph/ceph:v15 container... INFO:cephadm:Extracting ceph user uid/gid from container image... INFO:cephadm:Creating initial keys... INFO:cephadm:Creating initial monmap... INFO:cephadm:Creating mon... INFO:cephadm:Waiting for mon to start... INFO:cephadm:Waiting for mon... INFO:cephadm:Assimilating anything we can from ceph.conf... INFO:cephadm:Generating new minimal ceph.conf... INFO:cephadm:Restarting the monitor... INFO:cephadm:Setting mon public_network... INFO:cephadm:Creating mgr... INFO:cephadm:Wrote keyring to /etc/ceph/ceph.client.admin.keyring INFO:cephadm:Wrote config to /etc/ceph/ceph.conf INFO:cephadm:Waiting for mgr to start... INFO:cephadm:Waiting for mgr... INFO:cephadm:mgr not available, waiting (1/10)... INFO:cephadm:mgr not available, waiting (2/10)... INFO:cephadm:mgr not available, waiting (3/10)... INFO:cephadm:mgr not available, waiting (4/10)... INFO:cephadm:mgr not available, waiting (5/10)... INFO:cephadm:Enabling cephadm module... INFO:cephadm:Waiting for the mgr to restart... INFO:cephadm:Waiting for Mgr epoch 4... INFO:cephadm:Setting orchestrator backend to cephadm... INFO:cephadm:Generating ssh key... INFO:cephadm:Non-zero exit code 22 from /usr/bin/docker run --rm --net=host -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph-mon1 -v /var/log/ceph/828e6b7a-91c2-11ea-8644-005056885573:/var/log/ceph:z -v /tmp/ceph-tmp8ydndiho:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp9yl7hx5u:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph docker.io/ceph/ceph:v15 cephadm generate-key INFO:cephadm:/usr/bin/ceph:stderr Error EINVAL: Traceback (most recent call last): INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/cephadm/module.py", line 1438, in _generate_key INFO:cephadm:/usr/bin/ceph:stderr with open(path, 'r') as f: INFO:cephadm:/usr/bin/ceph:stderr FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpxvc18jh3/key' INFO:cephadm:/usr/bin/ceph:stderr INFO:cephadm:/usr/bin/ceph:stderr During handling of the above exception, another exception occurred: INFO:cephadm:/usr/bin/ceph:stderr INFO:cephadm:/usr/bin/ceph:stderr Traceback (most recent call last): INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command INFO:cephadm:/usr/bin/ceph:stderr return self.handle_command(inbuf, cmd) INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command INFO:cephadm:/usr/bin/ceph:stderr return dispatch[cmd['prefix']].call(self, cmd, inbuf) INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call INFO:cephadm:/usr/bin/ceph:stderr return self.func(mgr, **kwargs) INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda> INFO:cephadm:/usr/bin/ceph:stderr wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs) INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper INFO:cephadm:/usr/bin/ceph:stderr return func(*args, **kwargs) INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/cephadm/module.py", line 1443, in _generate_key INFO:cephadm:/usr/bin/ceph:stderr os.unlink(path) INFO:cephadm:/usr/bin/ceph:stderr FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpxvc18jh3/key' INFO:cephadm:/usr/bin/ceph:stderr Traceback (most recent call last): File "./cephadm", line 4579, in <module> r = args.func() File "./cephadm", line 1122, in _default_image return func() File "./cephadm", line 2521, in command_bootstrap cli(['cephadm', 'generate-key']) File "./cephadm", line 2409, in cli ).run(timeout=timeout) File "./cephadm", line 2142, in run self.run_cmd(), desc=self.entrypoint, timeout=timeout) File "./cephadm", line 837, in call_throws raise RuntimeError('Failed command: %s' % ' '.join(command)) RuntimeError: Failed command: /usr/bin/docker run --rm --net=host -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph-mon1 -v /var/log/ceph/828e6b7a-91c2-11ea-8644-005056885573:/var/log/ceph:z -v /tmp/ceph-tmp8ydndiho:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp9yl7hx5u:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph docker.io/ceph/ceph:v15 cephadm generate-key OS: CentOS Linux release 7.8.2003 (Core) Kernel: 3.10.0-514.6.1.el7.x86_64 Docker version: 19.03.8 cephadm: Using recent ceph image ceph/ceph:v15 ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable) Thanks

3 years, 11 months

1
1
0 0

Cluster rename procedure

by Anthony D'Atri

I’ve inherited a couple of clusters with non-default (ie, not “ceph”) internal names, and I want to rename them for the usual reasons. I had previously developed a full list of steps - which I no longer have access to. Anyone done this recently? Want to be sure I’m not missing something. * Nautilus, CentOS 7, RGW and RBD * Rename OSD mountpoints with mount —move * Rename systemd resources / mounts? * Rename /var/lib/ceph/{mon,osd} directories * Rename ceph*conf files on backend and client systems * Rename keyrings — just the filenames? * Rename log files * Ajust `ceph config` paths for admin socket, keyring, logs, mgr/mds/mon data, osd journal, rgw_data * Restart daemons * Ensure /var/run/ceph sockets are appropriately named Thanks — aad

3 years, 11 months

2
1
0 0

2024

2023

2022

2021

2020

2019

ceph-users