On Mon, Aug 12, 2019 at 10:03 PM yangjun(a)cmss.chinamobile.com
<yangjun(a)cmss.chinamobile.com> wrote:
Hi Jason,
I was recently testing the RBD mirror feature (Ceph 12.2.8). My test environment is a
single-node cluster with 10 3T HDD OSDs + 800G PCIe SSD + BlueStore, and the
WAL and DB partition of each OSD is 30G.
The test result of a 100G image is as follows:
        journal disabled   journal enabled   decline
iops:   1000               877               12.3%
bw:     402MB/s            129MB/s           67%
Why does the bandwidth decline so much after enabling the journal on the RBD image? I
would very much appreciate any suggestions for optimization. Thank you very
much.
The use of the journal requires first writing to the journal and, once
committed, writing to the image (i.e. doubling the latency).
Therefore, the expected worst-case performance should be around 2x
slower [1]. A recent bug fix [2] in the master branch, which will be
backported to older releases, greatly improves small IO journal
performance -- because of the bug it was nearly 10x slower instead of
the expected 2x [3].
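
As a rough check of the figures quoted above against the 2x estimate, here is a small Python sketch. The helper names (slowdown, decline_percent) are made up for illustration, and the only inputs are the four numbers from the test result; it diagnoses nothing on its own.

```python
# Rough arithmetic on the numbers quoted in the test result above.
# The function names are hypothetical helpers, not part of any Ceph tool.

def slowdown(before: float, after: float) -> float:
    """How many times slower `after` is compared with `before`."""
    return before / after


def decline_percent(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Journal disabled vs. journal enabled, as reported in the question.
iops_off, iops_on = 1000, 877
bw_off, bw_on = 402, 129  # MB/s

print(round(slowdown(iops_off, iops_on), 2))       # 1.14
print(round(slowdown(bw_off, bw_on), 2))           # 3.12
print(round(decline_percent(iops_off, iops_on), 1))  # 12.3
print(round(decline_percent(bw_off, bw_on), 1))      # 67.9
```

On these numbers the IOPS result is well inside the 2x estimate (about 1.14x slower), while the bandwidth result is about 3.1x slower, i.e. beyond it.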
[1] https://www.slideshare.net/JasonDillaman/disaster-recovery-and-ceph-block-s…
[2] https://tracker.ceph.com/issues/40072
[3] https://youtu.be/ZifNGprBUTA?t=1687
--
Jason