Dear Michael,
I don't have an explanation for your problem, unfortunately, but I was
surprised that you are seeing a drop in performance that this SSD
shouldn't have. Your SSDs (Samsung 870 EVO) should not get slower
on large writes. You can verify this in the post you've attached [1] or
here [3].
I am curious whether replacing them with other disks will improve the situation.
[3]
https://www.anandtech.com/show/16480/the-samsung-870-evo-ssd-1tb-4tb-review…
Best
Ken
On 21.02.23 08:53, Michael Wodniok wrote:
Hi all,
while digging around debugging why our (small: 10 hosts/~60 OSDs) cluster is so slow even
while recovering, I found out that one of our key issues is some SSDs with SLC cache (in our
case Samsung SSD 870 EVO), which we just recycled from other use cases in the hope of
speeding up our mainly HDD-based cluster. We know it's a little bit random which objects
get accelerated when the SSDs are not used as a cache tier.
However, the opposite was the case. These SSDs are only fast while operating within
their SLC cache, which is only several gigabytes in a multi-TB SSD [1]. When doing a big
write or a backfill onto these SSDs we got really low I/O rates (around 10 MB/s, even with
4M objects).
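For reference, one quick way to see the raw write rate of a single OSD is the built-in
OSD bench (this is just a sketch, not the exact procedure we used; osd.12 is a placeholder
ID). The command below writes 1 GiB in 4 MiB chunks, roughly matching our 4M objects:

    # write 1 GiB in 4 MiB blocks directly through the OSD
    ceph tell osd.12 bench 1073741824 4194304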
But it got even worse. Disclaimer: this is my view as a user; maybe a more technically
involved person can correct me. The cause seems to be the mclock scheduler, which
measures the IOPS an OSD is able to do. As measured in the blog post [2], this is usually a
good thing, since some profiling is done and queuing is handled differently. But in our case
osd_mclock_max_capacity_iops_ssd was very low for most of the corresponding OSDs - though
not for all of them. I assume it depends on when the mclock scheduler measured the IOPS
capacity; see the example below for how to check the stored values.
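The per-OSD capacity values that mclock stored can be looked up in the monitor config
database, for example like this (osd.12 is just a placeholder ID):

    # list any per-OSD capacity values the mclock benchmark has stored
    ceph config dump | grep osd_mclock_max_capacity_iops

    # show the value a single running OSD is currently using
    ceph config show osd.12 osd_mclock_max_capacity_iops_ssd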
That led to broken scheduling where backfills ran at low speed while the SSD itself showed
nearly no disk usage, because it was operating in its cache again and could have worked
faster. The issue could be solved by switching back to the wpq scheduler for the affected
OSDs. That scheduler seems to just queue up I/Os without throttling them for having reached
the measured maximum IOPS. Now we still see a bad I/O situation because of the slow SSDs,
but at least they are operating at their maximum (with the typical settings like
osd_recovery_max_active and osd_recovery_sleep* tuned).
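For anyone wanting to try the same, switching a single OSD back to wpq looks roughly like
this (a sketch, not the exact commands from our cluster; osd.12 is a placeholder ID, and
osd_op_queue cannot be changed at runtime, so the OSD has to be restarted - shown here for
a plain systemd, non-cephadm deployment):

    # use the wpq scheduler instead of mclock for this OSD
    ceph config set osd.12 osd_op_queue wpq

    # the change only takes effect after a restart of the OSD
    systemctl restart ceph-osd@12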
We are going to replace the SSDs with hopefully more consistently performing ones (even if
their peak performance is not as good).
I hope this may help somebody in the future who is stuck in low-performance recoveries.
Refs:
[1]
https://www.tomshardware.com/reviews/samsung-870-evo-sata-ssd-review-the-be…
[2]
https://ceph.io/en/news/blog/2022/mclock-vs-wpq-testing-with-background-ops…
Happy Storing!
Michael Wodniok
--
Michael Wodniok M.Sc.
WorNet AG
Bürgermeister-Graf-Ring 28
82538 Geretsried
Simply42 und SecuMail sind Marken der WorNet AG.
http://www.wor.net/
Handelsregister Amtsgericht München (HRB 129882)
Vorstand: Christian Eich
Aufsichtsratsvorsitzender: Dirk Steinkopf
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io