Hi All,

I've just completed a small investigation into this issue.

Short summary for Abutalib's benchmark scenario in my environment:

- when "vstarted" cluster has WAL/DB at NVMe drive - master is ~65% faster  (130 MB/s vs. 75MB/s)

- in spinner-only setup (WAL/DB are at the same spinner) - octopus is 70 % faster than master ( 88MB/s vs. 52 MB/s)

Looks like the culprit is bluestore_max_blob_size_hdd, which now defaults to 64K (commit a8733598eddf57dca86bf002653e630a7cf4db6e); the lower default causes the performance drop in this case.
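To double-check what a running OSD actually uses, something like the following should work (a sketch, assuming a vstart cluster driven from the build directory and an OSD named osd.0):

$ ./bin/ceph daemon osd.0 config get bluestore_max_blob_size_hdd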

Setting it back to 512K yields better performance on master: 113 MB/s vs. 88 MB/s.
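For reference, one way to try this on a vstart cluster (a sketch; 524288 = 512K, and the new value may only take effect for writes after an OSD restart):

$ ./bin/ceph config set osd bluestore_max_blob_size_hdd 524288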

But I presume these numbers are tightly bound to this benchmark's 4MB-chunk write pattern; a different chunk size will most probably produce quite different results, as could be checked with the run sketched below. Hence reverting bluestore_max_blob_size_hdd to 512K by default is IMO still questionable.
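For example, rerunning the bench with a 64K op size instead of the default 4MB (a sketch using rados bench's -b flag and the same pool "foo" as in Abutalib's steps below):

$ sudo ./bin/rados bench -p foo 100 write -b 65536 --no-cleanup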


I'm wondering whether the above results are in line with anybody else's....


Thanks,

Igor


On 8/4/2020 6:29 AM, Abutalib Aghayev wrote:
Hi all,

There appears to be a performance regression going from 15.2.4 to HEAD.  I first realized this when testing my patches to Ceph on an 8-node cluster, but it is easily reproducible on *vanilla* Ceph with vstart as well, using the following steps:

$ git clone https://github.com/ceph/ceph.git && cd ceph
$ ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_MANPAGE=OFF -DWITH_BABELTRACE=OFF -DWITH_MGR_DASHBOARD_FRONTEND=OFF && cd build && make -j32 vstart
$ MON=1 OSD=1 MDS=0 ../src/vstart.sh --debug --new --localhost --bluestore --bluestore-devs /dev/xxx
$ sudo ./bin/ceph osd pool create foo 32 32
$ sudo ./bin/rados bench -p foo 100 write --no-cleanup

With the old hard drive that I have (Hitachi HUA72201), I'm getting an average throughput of 60 MiB/s.  When I switch to v15.2.4 (git checkout v15.2.4), rebuild, and repeat the experiment, I get an average throughput of 90 MiB/s.  I've reliably reproduced a similar difference between 15.2.4 and HEAD by building release packages and running them on an 8-node cluster.
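That is, roughly (same build tree; then repeat the pool create and bench steps above):

$ git checkout v15.2.4
$ make -j32 vstart
$ MON=1 OSD=1 MDS=0 ../src/vstart.sh --debug --new --localhost --bluestore --bluestore-devs /dev/xxx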

Is this expected or is this a performance regression?

Thanks!
