From various search results I read that disabling cephx can help. The slides at
https://static.linaro.org/connect/san19/presentations/san19-120.pdf
also recommend some BlueStore cache settings changes:
[osd]
bluestore_cache_autotune = 0
bluestore_cache_kv_ratio = 0.2
bluestore_cache_meta_ratio = 0.8
bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
bluestore_cache_size_hdd = 536870912   # BlueStore in-memory cache size (512 MiB) for HDD-backed OSDs
osd_min_pg_log_entries = 10
osd_max_pg_log_entries = 10
osd_pg_log_dups_tracked = 10
osd_pg_log_trim_min = 10
But none of this changed much. The problem seems to be mostly with small
block sizes: when I ran the same test with a 128k or even 64k block size,
the results were much better.
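For reference, the small-block vs. large-block comparison could be reproduced with a fio job file along these lines (the directory path is a placeholder; this assumes fio with the libaio engine is available on a client node):

```ini
; hypothetical fio job file: compare 4k vs 128k random writes
; on a filesystem mounted from an RBD
[global]
; placeholder mount point of one of the mapped RBDs
directory=/mnt/rbd0
size=1G
direct=1
ioengine=libaio
rw=randwrite
iodepth=16
runtime=60
time_based

[small-4k]
bs=4k

[large-128k]
; run only after the 4k job finishes
stonewall
bs=128k
```

Comparing the reported IOPS and completion latencies between the two jobs would quantify how much worse the small-block case is.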
Any suggestions?
Thanks and Regards,
Athreya
On Tue, Nov 10, 2020 at 8:51 PM <athreyavc(a)gmail.com> wrote:
Hi,
We have recently deployed a Ceph cluster with:
12 OSD nodes (16 cores + 200 GB RAM + 30 disks of 14 TB each), running CentOS 8
3 monitor nodes (8 cores + 16 GB RAM), running CentOS 8
We are using Ceph Octopus with RBD block devices.
We have three Ceph client nodes (16 cores + 30 GB RAM, running CentOS 8)
across which the RBDs are mapped and mounted, 25 RBDs on each client node.
Each RBD is 10 TB and formatted with an EXT4 file system.
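As a back-of-the-envelope sanity check on the sizing above (assuming 3x replication, the default for replicated pools; adjust if using erasure coding or a different pool size):

```python
# Capacity arithmetic for the cluster described above.
# Assumes 3x replication (an assumption, not stated in the thread).

osd_nodes = 12
disks_per_node = 30
disk_tb = 14

raw_tb = osd_nodes * disks_per_node * disk_tb   # total raw capacity
usable_tb = raw_tb / 3                          # usable at 3x replication

clients = 3
rbds_per_client = 25
rbd_tb = 10
provisioned_tb = clients * rbds_per_client * rbd_tb  # thin-provisioned RBD total

print(raw_tb, usable_tb, provisioned_tb)  # 5040 1680.0 750
```

So the 750 TB of provisioned RBD images fits comfortably within the ~1680 TB usable, which rules out capacity pressure as a cause of the latency.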
On the network side, we have a 10Gbps active/passive bond on all the Ceph
cluster nodes, including the clients. Jumbo frames are enabled and the MTU is 9000.
This is a new cluster and its health reports OK, but we see high I/O wait
during writes.
From one of the clients (sar output):
15:14:30    CPU   %user  %nice  %system  %iowait  %steal   %idle
15:14:31    all    0.06   0.00     1.00    45.03    0.00   53.91
15:14:32    all    0.06   0.00     0.94    41.28    0.00   57.72
15:14:33    all    0.06   0.00     1.25    45.78    0.00   52.91
15:14:34    all    0.00   0.00     1.06    40.07    0.00   58.86
15:14:35    all    0.19   0.00     1.38    41.04    0.00   57.39
Average:    all    0.08   0.00     1.13    42.64    0.00   56.16
and the system load is very high:
top - 15:19:15 up 34 days, 41 min, 2 users, load average: 13.49, 13.62, 13.83
From 'atop', one of the CPUs shows this:
CPU | sys 7% | user 1% | irq 2% | idle 1394% | wait 195% | steal 0% | guest 0% | ipc initial | cycl initial | curf 806MHz | curscal ?%
On the OSD nodes, we don't see much disk utilization.
RBD caching values are default.
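For what it's worth, the client-side cache settings could be pinned explicitly in the [client] section of ceph.conf; the values below are the documented defaults, shown here only as a starting point for tuning, not as recommendations:

```ini
[client]
rbd_cache = true
# stay in writethrough mode until the first flush is seen
rbd_cache_writethrough_until_flush = true
# per-image cache size, 32 MiB (default)
rbd_cache_size = 33554432
# dirty-data limit before writeback, 24 MiB (default)
rbd_cache_max_dirty = 25165824
```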
Are we overlooking some configuration item?
Thanks and Regards,
At
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io