Hi Luis,
The odd thing is that tests with the fio tool give me similar results on
all disks, and so does a 5-minute rados bench run. But the OSD bench that
runs at OSD startup, which mClock uses to configure
osd_mclock_max_capacity_iops_hdd, gives me a very big difference between
disks (600 vs 2200).
I am running Pacific on this test cluster.
Is there documentation anywhere on how this works? Or if anyone could
explain, that would be great.
I did not find any documentation on how the OSD benchmark works, only on
how to use it. But playing a little bit with it, it seems the results we
get are highly dependent on the block sizes we use. The same goes for rados
bench: the results depend, at least in my tests, on the block size we use,
which I found a little bit weird to be honest.
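To illustrate why the block size alone can move the numbers this much, here
is a minimal sketch of a toy HDD model in which every IO pays a fixed
positioning cost plus transfer time. The seek time and transfer rate are
made-up assumptions, not measurements from any real disk or from Ceph.

```python
# Toy model: each random IO costs a fixed positioning delay plus the time
# needed to transfer the block. Both constants are assumed values chosen
# only for illustration.
SEEK_SECONDS = 0.008             # assumed seek + rotational latency
TRANSFER_BYTES_PER_SEC = 150e6   # assumed sustained transfer rate


def model_iops(block_size: int) -> float:
    """IOs per second for random IOs of block_size bytes."""
    return 1.0 / (SEEK_SECONDS + block_size / TRANSFER_BYTES_PER_SEC)


def model_throughput(block_size: int) -> float:
    """Bytes per second for random IOs of block_size bytes."""
    return model_iops(block_size) * block_size


if __name__ == "__main__":
    for bs in (4096, 65536, 4194304):
        print(f"bs={bs:>8}  iops={model_iops(bs):8.1f}  "
              f"MB/s={model_throughput(bs) / 1e6:8.2f}")
```

In this model small blocks give the highest IOPS and the lowest throughput,
and large blocks the opposite, so an IOPS figure only means something
together with the block size it was measured at.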
OSD bench performs IOs at the objectstore level and the stats are reported
based on the response from those transactions. It performs either
sequential or random IOs (i.e. a random offset into an object) based on the
arguments passed to it. IIRC, if the number of objects and the object size
are provided, the write goes to a random offset into an object.
Therefore, the parameters passed determine whether the offsets are
sequential or random, and this obviously results in different measurements.
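For reference, a manual run has the form
`ceph tell osd.N bench [TOTAL_BYTES] [BYTES_PER_WRITE] [OBJ_SIZE] [NUM_OBJS]`.
Below is a small sketch that assembles both kinds of invocation. The helper
name `osd_bench_cmd` is hypothetical, the example values are arbitrary, and
the argument order is written from memory, so please check it against the
documentation for your release before relying on it.

```python
import shlex
from typing import List, Optional


def osd_bench_cmd(osd_id: int, total_bytes: int, bytes_per_write: int,
                  obj_size: Optional[int] = None,
                  num_objs: Optional[int] = None) -> List[str]:
    """Build the argument list for a manual OSD bench run.

    Requiring obj_size and num_objs together is this helper's own
    simplification, not a rule taken from Ceph.
    """
    if (obj_size is None) != (num_objs is None):
        raise ValueError("pass both obj_size and num_objs, or neither")
    cmd = ["ceph", "tell", f"osd.{osd_id}", "bench",
           str(total_bytes), str(bytes_per_write)]
    if obj_size is not None:
        # With object size and count given, offsets should be random (IIRC).
        cmd += [str(obj_size), str(num_objs)]
    return cmd


if __name__ == "__main__":
    # Only total bytes and block size.
    print(shlex.join(osd_bench_cmd(0, 12288000, 4096)))
    # Same, plus object size and number of objects.
    print(shlex.join(osd_bench_cmd(0, 12288000, 4096, 4194304, 100)))
```

Running the two variants on the same OSD with the same block size is a quick
way to see how much of the difference comes from the access pattern alone.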
And as mClock depends on that, it has a real impact on performance. On our
cluster we can reach much better performance if we tweak those values,
instead of letting the cluster do the measurements itself. And this looks
to impact certain disk vendors more than others.
As for choosing between fio and OSD bench to set
osd_mclock_max_capacity_iops_hdd, fio would be the better choice. OSD bench
has very limited capability and its result can be affected by the drive
characteristics and other external drive settings (e.g. caching). There are
some improvements planned to prevent setting unrealistic numbers for
osd_mclock_max_capacity_iops_[ssd|hdd] and give more control to the user to
set realistic values.
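If you go the fio route, here is a minimal sketch of reading the write IOPS
from fio's JSON report (`--output-format=json`) and turning it into a
`ceph config set` command for one OSD. The helper names are hypothetical and
the sample report is trimmed, with a made-up IOPS figure.

```python
import json
import shlex
from typing import List


def write_iops_from_fio(fio_json: str) -> float:
    """Sum the write IOPS over all jobs in a fio JSON report."""
    report = json.loads(fio_json)
    return sum(job["write"]["iops"] for job in report["jobs"])


def mclock_capacity_cmd(osd_id: int, iops: float,
                        device_class: str = "hdd") -> List[str]:
    """Build the command that overrides the mClock capacity of one OSD."""
    if device_class not in ("hdd", "ssd"):
        raise ValueError("device_class must be 'hdd' or 'ssd'")
    option = f"osd_mclock_max_capacity_iops_{device_class}"
    return ["ceph", "config", "set", f"osd.{osd_id}", option,
            str(round(iops))]


if __name__ == "__main__":
    # Trimmed sample report; the IOPS value is invented for illustration.
    sample = '{"jobs": [{"jobname": "randwrite", "write": {"iops": 312.4}}]}'
    iops = write_iops_from_fio(sample)
    print(shlex.join(mclock_capacity_cmd(3, iops)))
```

The sketch only prints the command, so the value can be sanity-checked
before anything is applied to the cluster.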
-Sridhar