Hi Roman,
On Thu, Jan 9, 2020 at 2:51 PM Roman Penyaev <rpenyaev(a)suse.de> wrote:
> First thing that catches my eye is that for small blocks there is no big
> difference at all, but as the block increases, crimsons iops starts to
> decline. Can it be the transport issue? Can be tested as well.
This is a known issue with Seastar's POSIX-based network stack. As
Kefu pointed out, even large payloads are retrieved from the kernel
in multiple small, fixed-size chunks. That's a matter of how the
internal interfaces were shaped. My personal impression is that their
design favors the native stack / DPDK while avoiding differentiated
behavior among the stacks (likely so as not to surprise developers).
Moreover, crimson-osd adds an extra memcpy on top of Seastar to
reconcile those tiny chunks into a flat buffer.
Here is a more detailed gist:
https://gist.github.com/rzarzynski/a1d67dc39b0ef4d49cb522179b1f3c89.
There are branches (for both Seastar and crimson) with PoC for
the "input buffer factory" that targets those issues. Performance
comparison is here:
https://gist.github.com/rzarzynski/ad0aaa80b26603bc1a803ce0d209ac87.
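To make the chunking and the extra copy concrete, here is a toy Python
sketch. It is not Seastar or crimson code: the stream is an in-memory
`io.BytesIO`, `CHUNK_SIZE` is a made-up value, and both function names are
hypothetical. The second function only illustrates the general idea of
reading into a buffer supplied by the caller, which is my assumption about
what an "input buffer factory" is after; the actual PoC lives in the
branches and gists referenced above.

```python
import io

CHUNK_SIZE = 8192  # hypothetical fixed chunk size, NOT Seastar's actual value


def read_chunked_then_flatten(stream, length, chunk_size=CHUNK_SIZE):
    """Read `length` bytes in fixed-size chunks, then copy them into one buffer."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("stream ended early")
        chunks.append(chunk)
        remaining -= len(chunk)
    # The extra copy: every payload byte is moved once more to get a flat buffer.
    flat = b"".join(chunks)
    return flat, len(chunks)


def read_into_preallocated(stream, length):
    """Read `length` bytes straight into a flat buffer owned by the caller."""
    buf = bytearray(length)
    view = memoryview(buf)
    filled = 0
    reads = 0
    while filled < length:
        n = stream.readinto(view[filled:])
        if not n:
            raise EOFError("stream ended early")
        filled += n
        reads += 1
    return buf, reads


payload = bytes(range(256)) * 256  # 64 KiB test payload
flat, n_chunks = read_chunked_then_flatten(io.BytesIO(payload), len(payload))
direct, n_reads = read_into_preallocated(io.BytesIO(payload), len(payload))
print(n_chunks, n_reads)
```

With the first approach the number of read calls grows with the payload
size and the data is copied a second time; with the second the payload
lands in its final buffer directly.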
Also, when narrowing the comparison to async-msgr vs crimson-msgr
(with ibf), I wouldn't expect too much of a difference. In read tests
we're observing pretty similar IPC for both crimson-osd and msgr-worker-n
(single-thread profiling). The thing that might change a lot is the native
stack. In Intel's testing it significantly (up to 30-40% IIRC) improved
the IPC of crimson-msgr. Combined with good SPDK support in Seastore,
it might render the POSIX stack (and thus the need for ibf) somewhat
obsolete.
Quick note on saturation: please be careful when judging it with
top, pidstat or even perf stat. In contrast to the legacy OSD, Seastar
busy-waits for a while before going to sleep. This greatly exaggerates
the CPU utilisation for modest workloads. And yes, **crimson is all about
computational efficiency**. We're much more interested in cycles/op than
in raw IOPS, to be honest.
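A back-of-the-envelope illustration of why utilisation misleads here,
as a small Python sketch. The model is deliberately crude and every
number in it is invented for the example: `idle_poll_fraction` is a
hypothetical knob for the share of otherwise idle time spent polling,
not a Seastar setting.

```python
def cycles_per_op(total_cycles, ops):
    """Efficiency metric: CPU cycles spent per completed operation."""
    return total_cycles / ops


def observed_utilisation(ops_per_sec, cyc_per_op, cpu_hz, idle_poll_fraction):
    """CPU utilisation a tool like top would show for one core.

    useful part : cycles actually spent on operations
    polling part: share of the remaining time burnt by busy-waiting
    """
    useful = min(ops_per_sec * cyc_per_op / cpu_hz, 1.0)
    return useful + (1.0 - useful) * idle_poll_fraction


# Invented workload: 10k ops/s at 100k cycles/op on a 2 GHz core.
sleeping = observed_utilisation(10_000, 100_000, 2e9, idle_poll_fraction=0.0)
polling = observed_utilisation(10_000, 100_000, 2e9, idle_poll_fraction=0.8)
efficiency = cycles_per_op(total_cycles=1e9, ops=10_000)
print(sleeping, polling, efficiency)
```

Both runs do the same useful work at the same cycles/op, yet the polling
one looks far closer to saturation.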
Regards,
Radek