Benoit, thanks for the update.
For the sake of completeness, one more experiment, please, if possible:
turn off the write cache on the HGST drives and measure the commit latency once again.
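
In case it helps, here is a rough sketch of how the write cache could be turned off on the HGST drives only. It assumes the model string reported by `lsblk` contains "HGST" (please verify this on your nodes). `select_disks` and `disable_write_cache` are helper names made up for this example, and it only prints the `hdparm` commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# Read "NAME MODEL..." lines on stdin (the format of `lsblk -dno NAME,MODEL`)
# and print the /dev path of every disk whose model contains the string $1.
# Assumption: the vendor name shows up in the MODEL column.
select_disks() {
    awk -v pat="$1" '{
        name = $1
        $1 = ""                      # keep only the model part in $0
        if (index($0, pat) > 0) print "/dev/" name
    }'
}

# Read device paths on stdin and disable their volatile write cache.
# Dry run by default: prints the commands. Set APPLY=1 to execute (needs root).
disable_write_cache() {
    while read -r disk; do
        [ -n "$disk" ] || continue
        if [ "${APPLY:-0}" = "1" ]; then
            hdparm -W0 "$disk"
        else
            echo "hdparm -W0 $disk"
        fi
    done
}

# Example (not executed here):
#   lsblk -dno NAME,MODEL | select_disks HGST | disable_write_cache
```

Afterwards, `hdparm -W /dev/sdX` (without a value) should report the current write-caching state, so the change can be checked before repeating the `rados bench write` run.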
Kind regards,
Igor
On 6/24/2020 3:53 PM, Benoît Knecht wrote:
> Thank you all for your answers, this was really helpful!
>
> Stefan Priebe wrote:
>> Yes, we have the same issues and switched to Seagate for those reasons.
>> You can fix at least a big part of it by disabling the write cache of
>> those drives - generally speaking, it seems the Toshiba firmware is
>> broken.
>> I was not able to find a newer one.
> Good to know that we're not alone :) I also looked for a newer firmware, to no
> avail.
>
> Igor Fedotov wrote:
>> Benoit, what are the write cache settings in your case?
>>
>> And do you see any difference after disabling it?
> Write cache is enabled on all our OSDs (including the HGST drives that don't
> have a latency issue).
>
> To see if disabling write cache on the Toshiba drives would help, I turned it
> off on all 12 drives in one of our OSD nodes:
>
> ```
> for disk in /dev/sd{a..l}; do hdparm -W0 $disk; done
> ```
>
> and left it on in the remaining nodes. I used `rados bench write` to create
> some load on the cluster, and looked at
>
> ```
> avg by (hostname) (ceph_osd_commit_latency_ms * on (ceph_daemon) group_left
> (hostname) ceph_osd_metadata)
> ```
>
> in Prometheus. The hosts with write cache _enabled_ had a commit latency around
> 145ms, while the host with write cache _disabled_ had a commit latency around
> 25ms. So it definitely helps!
>
> Mark Nelson wrote:
>> This isn't the first time I've seen drive cache cause problematic
>> latency issues, and not always from the same manufacturer.
>> Unfortunately it seems like you really have to test the drives you
>> want to use before deploying them to make sure you don't run into
>> issues.
> That's very true! Data sheets and even public benchmarks can be quite
> deceiving, and two hard drives that seem to have similar performance profiles
> can perform very differently within a Ceph cluster. Lesson learned.
>
> Cheers,
>
> --
> Ben