Hi Mark,

Given that bisecting is a pain and takes forever, it would be great to automate it, as some other projects already do: https://chromium.googlesource.com/chromium/src/+/master/docs/speed/addressing_performance_regressions.md.  I realize that it is a nontrivial amount of work, but it is perhaps something to keep in mind for a GSoC project.
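
To sketch what that automation might look like: git can already drive the whole search itself via `git bisect run`.  Something along these lines (a rough sketch only -- /dev/xxx, the 75 MiB/s pass threshold splitting the 60 vs. 90 MiB/s results, and parsing the "Bandwidth (MB/sec):" summary line of rados bench are all assumptions to adjust for the actual setup):

#!/bin/bash
# bisect.sh -- driven by `git bisect run` from the ceph source root.
# Exit 0 = good (fast), 1 = bad (slow), 125 = skip this commit
# (e.g. it doesn't build).  Keep the script outside the tree so
# bisect checkouts don't touch it.
cd build || exit 125
make -j32 vstart || exit 125
MON=1 OSD=1 MDS=0 ../src/vstart.sh --debug --new --localhost \
    --bluestore --bluestore-devs /dev/xxx || exit 125
./bin/ceph osd pool create foo 32 32
# rados bench prints a "Bandwidth (MB/sec):" summary line; extract it.
bw=$(./bin/rados bench -p foo 100 write --no-cleanup |
     awk '/Bandwidth \(MB\/sec\)/ {print int($3)}')
../src/stop.sh
test "$bw" -ge 75

With that in place, the whole regression hunt reduces to:

$ cd ceph
$ git bisect start HEAD v15.2.4    # bad revision first, then good
$ git bisect run ../bisect.sh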

On Tue, Aug 4, 2020 at 9:29 AM Mark Nelson <mnelson@redhat.com> wrote:
Hi Abutalib,


Given that you are on HDDs, I'd look closely at Igor's bluestore disk
allocator work, though I'm not sure how many of his changes are active
right now.  In particular, take a look at the min_alloc_size change.


https://github.com/ceph/ceph/pull/33365

https://github.com/ceph/ceph/pull/34588
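
A quicker first check than a full bisect: pin the old 64 KiB HDD default
back on HEAD and re-run the benchmark; if the throughput recovers, the
min_alloc_size change is the likely culprit.  Roughly (assuming vstart's
-o option for extra config, and that min_alloc_size only takes effect
when the OSD is created, hence the --new):

$ MON=1 OSD=1 MDS=0 ../src/vstart.sh --debug --new --localhost \
      --bluestore --bluestore-devs /dev/xxx \
      -o bluestore_min_alloc_size_hdd=65536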


In the past I've found that the most reliable way to figure out these
kinds of issues is to do a git bisect and run the tests each time.  It's
a pain and can take forever, but it usually does a pretty good job of
pinpointing what's going on (even if it sometimes points out that it was
my fault due to a test error!).  I'd check out the allocator work first,
though.
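
For reference, the bisect bookkeeping itself is only a few commands; all
of the time goes into the rebuild-and-benchmark step in the middle
(assuming v15.2.4 as the last known good point):

$ git bisect start
$ git bisect bad HEAD          # the slow (~60 MiB/s) build
$ git bisect good v15.2.4      # the fast (~90 MiB/s) build
  ...for each commit git checks out: rebuild, re-run the rados
  bench test, then mark it...
$ git bisect good              # or: git bisect bad
$ git bisect reset             # done; back to HEAD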


Mark


On 8/3/20 10:29 PM, Abutalib Aghayev wrote:
> Hi all,
>
> There appears to be a performance regression going from 15.2.4 to
> HEAD.  I first realized this when testing my patches to Ceph on an
> 8-node cluster, but it is easily reproducible on *vanilla* Ceph with
> vstart as well, using the following steps:
>
> $ git clone https://github.com/ceph/ceph.git && cd ceph
> $ ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_MANPAGE=OFF \
>       -DWITH_BABELTRACE=OFF -DWITH_MGR_DASHBOARD_FRONTEND=OFF && \
>       cd build && make -j32 vstart
> $ MON=1 OSD=1 MDS=0 ../src/vstart.sh --debug --new --localhost \
>       --bluestore --bluestore-devs /dev/xxx
> $ sudo ./bin/ceph osd pool create foo 32 32
> $ sudo ./bin/rados bench -p foo 100 write --no-cleanup
>
> With the old hard drive that I have (Hitachi HUA72201), I'm getting an
> average throughput of 60 MiB/s.  When I switch to v15.2.4 (git
> checkout v15.2.4), rebuild, and repeat the experiment, I get an
> average throughput of 90 MiB/s.  I've reliably reproduced a similar
> difference between 15.2.4 and HEAD by building release packages and
> running them on an 8-node cluster.
>
> Is this expected or is this a performance regression?
>
> Thanks!
>
_______________________________________________
Dev mailing list -- dev@ceph.io
To unsubscribe send an email to dev-leave@ceph.io