Our goal is to put up a high-performance Ceph cluster that can handle 100 very active
clients, so for us, testing with iodepth=256 is actually fairly realistic.
That said, the run below shows that the problem also appears with iodepth=32:
[root@irviscsi03 ~]# fio --filename=/dev/rbd0 --direct=1 --rw=randwrite --bs=4k \
    --ioengine=libaio --iodepth=32 --numjobs=1 --time_based --group_reporting \
    --name=iops-test-job --runtime=120 --eta-newline=1
iops-test-job: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.7
Starting 1 process
fio: file /dev/rbd0 exceeds 32-bit tausworthe random generator.
fio: Switching to tausworthe64. Use the random_generator= option to get rid of this warning.
Jobs: 1 (f=1): [w(1)][2.5%][r=0KiB/s,w=20.5MiB/s][r=0,w=5258 IOPS][eta 01m:58s]
Jobs: 1 (f=1): [w(1)][4.1%][r=0KiB/s,w=41.1MiB/s][r=0,w=10.5k IOPS][eta 01m:56s]
Jobs: 1 (f=1): [w(1)][5.8%][r=0KiB/s,w=45.7MiB/s][r=0,w=11.7k IOPS][eta 01m:54s]
Jobs: 1 (f=1): [w(1)][7.4%][r=0KiB/s,w=55.3MiB/s][r=0,w=14.2k IOPS][eta 01m:52s]
Jobs: 1 (f=1): [w(1)][9.1%][r=0KiB/s,w=54.4MiB/s][r=0,w=13.9k IOPS][eta 01m:50s]
Jobs: 1 (f=1): [w(1)][10.7%][r=0KiB/s,w=53.4MiB/s][r=0,w=13.7k IOPS][eta 01m:48s]
Jobs: 1 (f=1): [w(1)][12.4%][r=0KiB/s,w=53.7MiB/s][r=0,w=13.7k IOPS][eta 01m:46s]
Jobs: 1 (f=1): [w(1)][14.0%][r=0KiB/s,w=55.7MiB/s][r=0,w=14.3k IOPS][eta 01m:44s]
Jobs: 1 (f=1): [w(1)][15.7%][r=0KiB/s,w=54.4MiB/s][r=0,w=13.9k IOPS][eta 01m:42s]
Jobs: 1 (f=1): [w(1)][17.4%][r=0KiB/s,w=51.6MiB/s][r=0,w=13.2k IOPS][eta 01m:40s]
Jobs: 1 (f=1): [w(1)][19.0%][r=0KiB/s,w=38.1MiB/s][r=0,w=9748 IOPS][eta 01m:38s]
Jobs: 1 (f=1): [w(1)][20.7%][r=0KiB/s,w=24.1MiB/s][r=0,w=6158 IOPS][eta 01m:36s]
Jobs: 1 (f=1): [w(1)][22.3%][r=0KiB/s,w=12.4MiB/s][r=0,w=3178 IOPS][eta 01m:34s]
Jobs: 1 (f=1): [w(1)][24.0%][r=0KiB/s,w=31.5MiB/s][r=0,w=8056 IOPS][eta 01m:32s]
Jobs: 1 (f=1): [w(1)][25.6%][r=0KiB/s,w=48.6MiB/s][r=0,w=12.4k IOPS][eta 01m:30s]
Jobs: 1 (f=1): [w(1)][27.3%][r=0KiB/s,w=52.2MiB/s][r=0,w=13.4k IOPS][eta 01m:28s]
Jobs: 1 (f=1): [w(1)][28.9%][r=0KiB/s,w=54.3MiB/s][r=0,w=13.9k IOPS][eta 01m:26s]
Jobs: 1 (f=1): [w(1)][30.6%][r=0KiB/s,w=52.6MiB/s][r=0,w=13.5k IOPS][eta 01m:24s]
Jobs: 1 (f=1): [w(1)][32.2%][r=0KiB/s,w=55.1MiB/s][r=0,w=14.1k IOPS][eta 01m:22s]
Jobs: 1 (f=1): [w(1)][33.9%][r=0KiB/s,w=34.3MiB/s][r=0,w=8775 IOPS][eta 01m:20s]
Jobs: 1 (f=1): [w(1)][35.5%][r=0KiB/s,w=52.5MiB/s][r=0,w=13.4k IOPS][eta 01m:18s]
Jobs: 1 (f=1): [w(1)][37.2%][r=0KiB/s,w=52.7MiB/s][r=0,w=13.5k IOPS][eta 01m:16s]
Jobs: 1 (f=1): [w(1)][38.8%][r=0KiB/s,w=53.9MiB/s][r=0,w=13.8k IOPS][eta 01m:14s]
... etc. (the same pattern of periodic dips repeats for the rest of the 120-second run)
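
For reference, the command line above corresponds to roughly the following fio job file (a sketch with the same parameters; the file name is hypothetical, and --eta-newline stays on the invocation since it is a command-line-only option):

; iops-test.fio (hypothetical name); run as: fio iops-test.fio --eta-newline=1
[iops-test-job]
; write directly to the mapped krbd device, bypassing the page cache
filename=/dev/rbd0
direct=1
rw=randwrite
bs=4k
ioengine=libaio
iodepth=32
numjobs=1
; run for a fixed 120 seconds rather than until the device is covered
time_based
runtime=120
group_reporting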
----- Original Message -----
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "Philip Brown" <pbrown(a)medata.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>
Sent: Monday, December 14, 2020 10:19:48 AM
Subject: Re: [ceph-users] performance degredation every 30 seconds
On Mon, Dec 14, 2020 at 12:46 PM Philip Brown <pbrown(a)medata.com> wrote:
Further experimentation with fio's --rw flag, setting it to rw=read and rw=randwrite in
addition to the original rw=randrw, indicates that the problem is tied to writes.
Possibly some kind of buffer-flush or cache-sync delay when using the rbd device, even
though fio specified --direct=1?
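
One way to rule write-back caching in or out would be to force a flush after every completed write; absolute throughput will drop, but if the periodic sawtooth disappears, deferred flushing is implicated. A sketch of such a run (untested here):

# same workload, but with an fsync after every write (--fsync=1); if
# deferred flushing causes the dips, the curve should flatten out
# (at a lower level) instead of sawtoothing
fio --filename=/dev/rbd0 --direct=1 --rw=randwrite --bs=4k \
    --ioengine=libaio --iodepth=32 --numjobs=1 --fsync=1 \
    --time_based --runtime=120 --group_reporting \
    --name=fsync-test-job --eta-newline=1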
It might be worthwhile testing with a more realistic I/O depth instead
of 256, in case you are hitting weird limits due to an untested corner
case. Does the performance still degrade with "--iodepth=16" or
"--iodepth=32"?