Hi Mehmet,
thanks for your response. I somehow mixed up the versions: we encountered the problems
when updating from octopus -> pacific, not nautilus -> pacific. I will
nevertheless try out your suggestions tomorrow.
As far as I can tell, without snapshots our latencies are not optimal (op_w_latency at
around 30-40 ms, with occasional peaks of 1-2 s), but they are somewhat stable.
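For reference, op_w_latency and friends can be read from the OSD perf counters, e.g.
(osd.12 is just an example id; the second command assumes jq is installed):

  ceph osd perf                    # per-OSD commit/apply latency in ms
  ceph tell osd.12 perf dump | jq '.osd.op_w_latency'   # avgcount/sum/avgtime for writes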
Greetings,
Jan
From: ceph(a)elchaka.de <ceph(a)elchaka.de>
Date: Saturday, 1 April 2023 at 01:12
To: ceph-users(a)ceph.io <ceph-users(a)ceph.io>, Jan-Tristan Kruse
<j.kruse(a)profihost.ag>
Subject: Re: [ceph-users] Re: avg apply latency went up after update from octopus to
pacific
Hello Jan,
I had the same on two clusters going from nautilus to pacific.
On both it helped to run
ceph tell osd.* compact
If that does not help, I would go for recreating the OSDs...
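For reference, there is also an offline alternative for a single OSD: stop it first,
then compact its RocksDB with ceph-kvstore-tool (the path below is the default
non-containerized data path for a hypothetical osd.12):

  systemctl stop ceph-osd@12
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
  systemctl start ceph-osd@12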
Hth
Mehmet
On 31 March 2023 at 10:56:42 CEST, j.kruse(a)profihost.ag wrote:
Hi,
we have a very similar situation. We updated from nautilus -> pacific (16.2.11) and saw
a rapid increase in commit_latency and op_w_latency (>10 s on some OSDs) after a few
hours. We also have an almost exclusively RBD workload.
After deleting old snapshots we saw an improvement, and after recreating snapshots the
numbers went up again. Without snapshots the numbers still slowly get higher, but not as
fast as before with existing snapshots. We also use SAS-connected NVMe SSDs.
Changing bluefs_buffered_io made no difference. We compacted the RocksDB of a single OSD
yesterday, and funnily enough this is now the OSD with the highest op_w_latency. I
generated a perf graph for this single OSD and can generate more, but I'm not sure how to
share this data with you...?
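In case anyone wants to capture something comparable, a perf graph for one OSD can be
made with the usual perf + flame graph recipe, roughly like this (osd id 12 and the
paths are placeholders; assumes perf and the FlameGraph scripts from
https://github.com/brendangregg/FlameGraph):

  perf record -g -p $(pgrep -f 'ceph-osd.*--id 12 ') -- sleep 30   # ~30 s of stacks
  perf script > osd.12.stacks
  ./stackcollapse-perf.pl osd.12.stacks | ./flamegraph.pl > osd.12.svg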
I saw in the thread that Boris redeployed all OSDs. Could that be a more permanent
solution, or is this also just temporary (like deleting the snapshots)?
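For completeness, on a cephadm-managed cluster redeploying a single OSD would look
roughly like this (osd id 12, hostname and device are placeholders; non-cephadm setups
would use ceph-volume instead):

  ceph orch osd rm 12 --replace     # drain the OSD and mark it for replacement
  ceph orch osd rm status           # watch the drain progress
  ceph orch device zap host01 /dev/nvme0n1 --force   # wipe the device for redeployment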
Greetings,
Jan
________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io