Hi all,
I am having trouble getting consistent RBD latency on our cluster for KVM virtual machines that attach their images via the KRBD driver. When measuring with tools like rbd perf image iotop, we regularly see latency spike from around 1-2 ms to over 100 ms. This seems to kill SQL performance on our Windows VMs. I essentially have two questions:
1) Am I missing something in my configuration that is needed to get consistently low latency to the VM guests?
2) When measuring the disks, sequential IO seems to show higher latency than random IO. Is this expected, or is there a way to tune this with the KRBD driver?
Configuration:
3 x MON/MGR nodes
12 x OSD nodes (24 x HDD, 2 x NVMe for DB and WAL)
KVM clients attaching the RBD images via KRBD
1 pool w/ 16384 PGs
Ceph version 14.2.1
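
For reference, here is a rough sketch of the PG-per-OSD arithmetic implied by the figures above. It rests on two assumptions that are not stated in the list: that the 24 HDDs are per OSD node (so 12 x 24 = 288 OSDs), and that the pool is replicated with size 3. The function name pgs_per_osd is just an illustrative helper, not anything from Ceph.

```python
# Rough PG-per-OSD estimate for the cluster described above.
# ASSUMPTIONS (not confirmed in the post): 24 HDDs per OSD node,
# one OSD per HDD, and a replicated pool with size 3.

def pgs_per_osd(pg_num: int, replica_size: int, osd_count: int) -> float:
    """Average number of PG replicas each OSD hosts (hypothetical helper)."""
    return pg_num * replica_size / osd_count

osd_nodes = 12        # from the configuration list
hdds_per_node = 24    # assumption: the "24 x HDD" figure is per node
replica_size = 3      # assumption: replicated pool, size 3
pg_num = 16384        # from the configuration list

osd_count = osd_nodes * hdds_per_node
ratio = pgs_per_osd(pg_num, replica_size, osd_count)

print(f"{osd_count} OSDs, ~{ratio:.1f} PG replicas per OSD")
```

If either assumption is off (for example an erasure-coded pool, or 24 HDDs in total rather than per node), the ratio changes accordingly, so treat the number only as a starting point for discussion.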
Ceph.conf: