Responses inline.
I have a last question. Why is the bench performed using 4 KiB writes?
Is there any reason to choose that over another value?
Yes, the mClock scheduler uses this as a baseline in order to estimate
costs for operations involving other block sizes.
This is again an internal implementation detail.
In my lab, I tested with various values. I mainly have two types of
disks: some Seagates and some Toshibas.
If I bench with 4 KiB, the Seagate gives around 2000 IOPS, while the
Toshiba is closer to 600.
If I bench with 128 KiB, I still get around 2000 IOPS for the Seagate,
but the Toshiba also benches at around 2000 IOPS. And in the rados
experiments I did, setting osd_mclock_max_capacity_iops_hdd to 2000 on
that lab setup gave me the best performance, with both Seagate and
Toshiba disks.
I would currently suggest setting osd_mclock_max_capacity_iops_hdd to
the values you measured with fio, as that is more realistic.
Like I mentioned, there are some improvements coming in this area that
would give users greater control over setting a realistic benchmark
value.
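
As a rough sketch of that workflow (not an official procedure), the
Python helpers below build a fio command line, read the write IOPS from
fio's JSON output, and build the matching "ceph config set" command.
The device path, OSD id, job name, queue depth, and runtime are
placeholders I made up for illustration; pick values that match your
own setup. Note that running fio with writes against a raw device
destroys the data on it, so only do this on a disk that is not in use.

```python
import json


def fio_randwrite_cmd(device, block_size="4k", runtime_s=60):
    """Build a fio argv for a random-write IOPS measurement.

    The job name, iodepth and runtime are illustrative assumptions.
    """
    return [
        "fio",
        "--name=osd-iops",
        f"--filename={device}",
        "--rw=randwrite",
        f"--bs={block_size}",
        "--direct=1",
        "--ioengine=libaio",
        "--iodepth=64",
        "--time_based",
        f"--runtime={runtime_s}",
        "--output-format=json",
    ]


def write_iops_from_fio_json(text):
    """Return the write IOPS of the first job in fio's JSON output."""
    return float(json.loads(text)["jobs"][0]["write"]["iops"])


def ceph_set_capacity_cmd(osd_id, iops):
    """Build the command that stores the measured IOPS for one OSD."""
    return [
        "ceph", "config", "set", f"osd.{osd_id}",
        "osd_mclock_max_capacity_iops_hdd", str(round(iops)),
    ]
```

The functions only build and parse; you would pass the argv lists to
something like subprocess.run() yourself, after checking them. Setting
the option per OSD (osd.N) rather than globally lets mixed disk models
each keep their own measured value.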
-Sridhar