[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10

7 Aug 2020

Yeah, there are cases where enabling it will improve performance, as 
rocksdb can then use the page cache as a (potentially large) secondary 
cache beyond the block cache and avoid hitting the underlying devices 
for reads.  Do you have a lot of spare memory for page cache on your OSD 
nodes? You may be able to improve the situation with 
bluefs_buffered_io=false by increasing the osd_memory_target, which 
should give the rocksdb block cache more memory to work with directly.  
One downside is that we currently double-cache onodes in both the 
rocksdb cache and the bluestore onode cache, which hurts us when memory 
is limited.  We have some experimental work that might help in this area by 
better balancing bluestore onode and rocksdb block caches but it needs 
to be rebased after Adam's column family sharding work.
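
To get a rough answer to the spare-memory question above, here is a minimal sketch, assuming Linux OSD nodes, that summarizes /proc/meminfo-style text. The helper names and the SAMPLE values are hypothetical and not part of Ceph; on a real node you would pass in the contents of /proc/meminfo instead of SAMPLE.

```python
# Hypothetical sample in /proc/meminfo format (values in kB as the kernel prints them).
SAMPLE = """\
MemTotal:       131072000 kB
MemFree:         2097152 kB
MemAvailable:   52428800 kB
Cached:         41943040 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def page_cache_summary(text):
    """Return total, available, and page-cache memory in GiB."""
    info = parse_meminfo(text)
    kb_per_gib = 1024 * 1024
    return {
        "total_gib": info["MemTotal"] / kb_per_gib,
        "available_gib": info["MemAvailable"] / kb_per_gib,
        "cached_gib": info["Cached"] / kb_per_gib,
    }


print(page_cache_summary(SAMPLE))
```

If "available_gib" is small compared with the sum of osd_memory_target across the OSDs on the node, there is probably little room left for the page cache to act as a secondary cache.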

The reason we had to disable bluefs_buffered_io again was that we had 
users with certain RGW workloads where the kernel started swapping large 
amounts of memory on the OSD nodes despite seemingly having free memory 
available.  This caused huge latency spikes and IO slowdowns (even 
stalls).  We never noticed it in our QA test suites and it doesn't 
appear to happen with RBD workloads as far as I can tell, but when it 
does happen it's really painful.
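
One way to watch for the swapping described above is to compare the kernel's swap counters over time. This is a minimal sketch, assuming Linux nodes where /proc/vmstat exposes the pswpin and pswpout counters (pages swapped in and out since boot). The function names and the two sample snapshots are hypothetical; on a real node you would read /proc/vmstat twice, some seconds apart.

```python
# Hypothetical snapshots in /proc/vmstat format, taken some time apart.
BEFORE = "nr_free_pages 500000\npswpin 100\npswpout 250\n"
AFTER = "nr_free_pages 480000\npswpin 100\npswpout 4250\n"


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Return pages swapped in and out between two snapshots."""
    b, a = parse_vmstat(before), parse_vmstat(after)
    return {name: a[name] - b[name] for name in ("pswpin", "pswpout")}


print(swap_delta(BEFORE, AFTER))
```

A steadily growing pswpout delta on an OSD node that still reports free memory would match the behavior described here.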

Mark

On 8/6/20 6:53 AM, Manuel Lausch wrote:
...
  Hi,

 I found the reason for this behavior change.
 With 14.2.10 the default value of "bluefs_buffered_io" was changed from
 true to false.
 https://tracker.ceph.com/issues/44818

 After setting this to true, my problems seem to be solved.

 Regards
 Manuel

 On Wed, 5 Aug 2020 13:30:45 +0200
 Manuel Lausch <manuel.lausch(a)1und1.de> wrote:

  Hello Vladimir,

 I just tested this with a single-node test cluster with 60 HDDs (3 of
 them with bluestore without separate wal and db).

 With 14.2.10, I see a lot of read IOPS on the bluestore OSDs while
 snaptrimming. With 14.2.9 this was not an issue.

 I wonder if this would explain the huge number of slow ops on my big
 test cluster (44 nodes, 1056 OSDs) while snaptrimming. I cannot test a
 downgrade there, because there are no packages of older releases
 available for CentOS 8.

 Regards
 Manuel
  _______________________________________________
 ceph-users mailing list -- ceph-users(a)ceph.io
 To unsubscribe send an email to ceph-users-leave(a)ceph.io