I am happy to say that this seems to have been the solution.
After running
ceph config set global rbd_cache false
I can now run the full 256 thread variant,
fio --direct=1 --rw=randwrite --bs=4k --ioengine=libaio --filename=/dev/rbd0 \
    --iodepth=256 --numjobs=1 --time_based --group_reporting --name=iops-test-job \
    --runtime=120 --eta-newline=1
and there is no longer a noticeable performance dip.
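For anyone who wants to repeat this test, the same options can be kept in a fio job file instead of one long command line. This is only a sketch: the file name iops-test-job.fio is my own choice, and the job parameters are copied from the command above. Note that --eta-newline is a command-line option rather than a job parameter, so it stays on the fio invocation.

```shell
# Write a fio job file mirroring the command-line options used above.
# The file name "iops-test-job.fio" is an assumption, not from the thread.
cat > iops-test-job.fio <<'EOF'
[iops-test-job]
direct=1
rw=randwrite
bs=4k
ioengine=libaio
filename=/dev/rbd0
iodepth=256
numjobs=1
time_based
runtime=120
group_reporting
EOF

# To run it (WARNING: random writes overwrite data on /dev/rbd0):
#   fio --eta-newline=1 iops-test-job.fio
```

Running the job file before and after changing rbd_cache should make the two results directly comparable.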
Thanks, Sebastian.
----- Original Message -----
From: "Sebastian Trojanowski" <sebcio.t(a)gazeta.pl>
To: "ceph-users" <ceph-users(a)ceph.io>
Sent: Tuesday, December 15, 2020 1:34:39 AM
Subject: [ceph-users] Re: performance degredation every 30 seconds
Hi,
check your rbd cache. It is enabled by default; for SSD/NVMe it is better to
disable it. It looks like your cache/buffers are filling up and need to be
flushed. This could be harming your environment.
BR,
Sebastian