Degradation of write-performance after upgrading to Octopus - ceph-users

4 Jun 2020

We have deployed a small test cluster consisting of three nodes. Each node is running a
mon/mgr and two osds (a Samsung PM983 3.84 TB NVMe split into two partitions), so six osds in
total. We started with Ceph 14.2.7 some weeks ago (upgraded to 14.2.9 later) and ran
different tests using fio against some rbd volumes in order to get an overview of what
performance we could expect. The configuration is unchanged from the defaults, except that
we set several debugging options to 0/0.

Yesterday we upgraded the whole cluster to Ceph 15.2.3 following the upgrade guidelines,
which worked without any problems. Nevertheless, when running the same tests as
before with Ceph 14.2.9, we are seeing some clear degradations in write performance
(alongside some performance improvements, which should also be mentioned).

Here are the results of concern (each with the relevant fio settings used; values are
Ceph 14.2.9 -> Ceph 15.2.3):

Test "read-latency-max"
(rw=randread, iodepth=64, bs=4k)
read_iops: 32500 -> 87000

Test "write-latency-max"
(rw=randwrite, iodepth=64, bs=4k)
write_iops: 22500 -> 11500

Test "write-throughput-iops-max"
(rw=write, iodepth=64, bs=4k)
write_iops: 7000 -> 14000

Test "usecase1"
(rw=randrw,
bssplit=4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/,4k/50:8k/20:16k/20:32k/5:64k/2:128k/:256k/,
rwmixread=1, rate_process=poisson, iodepth=64)
write_iops: 21000 -> 8500

Test "usecase1-readonly"
(rw=randread, bssplit=4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/, rate_process=poisson,
iodepth=64)
read_iops: 28000 -> 58000

The last two tests represent a typical use case on our systems. Therefore we are
especially concerned by the drop in performance from 21000 to 8500 write IOPS (about 60%)
after upgrading to Ceph 15.2.3.
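
To put all five results on the same scale, the relative change per test can be computed
from the figures above. This is just a small arithmetic sketch; the dictionary and
function names are made up for illustration, the numbers are the ones reported in this post.

```python
# IOPS per test as reported above: (Ceph 14.2.9, Ceph 15.2.3)
RESULTS = {
    "read-latency-max": (32500, 87000),
    "write-latency-max": (22500, 11500),
    "write-throughput-iops-max": (7000, 14000),
    "usecase1": (21000, 8500),
    "usecase1-readonly": (28000, 58000),
}


def relative_change(before, after):
    """Return the change from before to after in percent (negative means a drop)."""
    return (after - before) / before * 100.0


for name, (before, after) in RESULTS.items():
    change = relative_change(before, after)
    print(f"{name:28s} {before:6d} -> {after:6d}  {change:+7.1f}%")
```

For "usecase1" this gives roughly -59.5%, which is the "about 60%" mentioned above.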

We ran all tests several times; the values are averaged over all iterations and are fairly
consistent and reproducible. We even tried wiping the whole cluster, downgrading to Ceph
14.2.9 again, setting up a new cluster/pool, running the tests and upgrading to Ceph
15.2.3 again. The tests were performed on one of the three cluster nodes using a 50G
rbd volume, which had been prefilled with random data before each test run.
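
For anyone who wants to reproduce the runs, the following sketch shows how the per-test
settings listed above map to fio command lines. The per-test options are taken from this
post; the device path /dev/rbd0, ioengine=libaio and direct=1 are assumptions for
illustration only and are not stated here, so adjust them to your setup. The script only
prints the commands, it does not run fio.

```python
import shlex

# bssplit strings copied from the test descriptions above.
BSSPLIT_RW = ("4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/,"
              "4k/50:8k/20:16k/20:32k/5:64k/2:128k/:256k/")
BSSPLIT_RO = "4k/40:8k/5:16k/20:32k/5:64k/10:128k/10:256k/"

# Per-test fio settings as listed in the post.
TESTS = {
    "read-latency-max": {"rw": "randread", "iodepth": 64, "bs": "4k"},
    "write-latency-max": {"rw": "randwrite", "iodepth": 64, "bs": "4k"},
    "write-throughput-iops-max": {"rw": "write", "iodepth": 64, "bs": "4k"},
    "usecase1": {"rw": "randrw", "bssplit": BSSPLIT_RW, "rwmixread": 1,
                 "rate_process": "poisson", "iodepth": 64},
    "usecase1-readonly": {"rw": "randread", "bssplit": BSSPLIT_RO,
                          "rate_process": "poisson", "iodepth": 64},
}

# ASSUMPTIONS, not from the post: target device, I/O engine and direct I/O.
ASSUMED = {"filename": "/dev/rbd0", "ioengine": "libaio", "direct": 1}


def fio_argv(name):
    """Build the fio argument list for one named test."""
    opts = {**ASSUMED, **TESTS[name]}
    return ["fio", f"--name={name}"] + [f"--{k}={v}" for k, v in opts.items()]


for test_name in TESTS:
    print(shlex.join(fio_argv(test_name)))
```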

Have any changes been introduced with Octopus that could explain the observed changes in
performance?

What we already tried:

- Disabling rbd cache
- Reverting rbd cache policy to writeback (default in 14.2)
- Setting rbd io scheduler to none
- Deploying a fresh cluster starting with Ceph 15.2.3

Kernel is 5.4.38 … I don't know if any other system specs would be helpful besides
those already mentioned (since we are talking about a relative change in performance after
upgrading Ceph without any further changes). If so, please let us know.