On 04/04/2021 20:24, Joe Mario wrote:
Hi Loïc:
Looking further, there is something in those files. It's just one small
cacheline, but there is something there. I guess I'm not used to seeing c2c
being run on a laptop, nor am I used to seeing so few samples, or even so
little in the kernel.
And I apologize for my quick initial mistaken analysis.
Here's what it looks like is happening. Correct me if I'm wrong.
In your "without-sharding" version of Ceph, you had 8 threads in the
ceph_test_c2c binary all contending for the same lock located at offset 0 in
a cacheline.
And then, in the "with-sharding" version of Ceph, you changed it so that each
thread would act on its own copy of the 4-byte lock.
Unfortunately, the "with-sharding" version likely didn't help, because all 8
of those locks are packed into the same cacheline, each lock 4 bytes after
the previous one.
If that is true, then you need to rewrite the code so that each lock sits by
itself in its own cacheline.
Is the above assumption correct?
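The layout described above can be sketched in a few lines of C++. This is
only an illustration, not the actual ceph_test_c2c code: the names
PackedLocks, PaddedLock, PaddedLocks and line_of are hypothetical, the 4-byte
lock is modelled as a std::atomic<std::uint32_t>, and a 64-byte cacheline is
assumed.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cachelines, the usual size on x86-64.
constexpr std::size_t kCacheLine = 64;
constexpr std::size_t kThreads = 8;

// Hypothetical stand-in for the 4-byte lock discussed in the thread.
using Lock = std::atomic<std::uint32_t>;

// "with-sharding" as described: one lock per thread, packed back to back.
struct PackedLocks {
    Lock lock[kThreads];
};

// One lock per cacheline: alignas pads each element out to kCacheLine bytes.
struct alignas(kCacheLine) PaddedLock {
    Lock lock;
};

struct PaddedLocks {
    PaddedLock shard[kThreads];
};

// Cacheline index of element i in an array whose elements are `stride` bytes
// apart, assuming the array itself starts on a cacheline boundary.
constexpr std::size_t line_of(std::size_t stride, std::size_t i) {
    return (i * stride) / kCacheLine;
}

// Packed: even the last lock still lands in cacheline 0 (false sharing).
static_assert(line_of(sizeof(Lock), kThreads - 1) == 0,
              "packed locks all share one cacheline");

// Padded: lock i lands in cacheline i, so no two threads share a line.
static_assert(sizeof(PaddedLock) == kCacheLine,
              "each padded lock fills exactly one cacheline");
static_assert(line_of(sizeof(PaddedLock), kThreads - 1) == kThreads - 1,
              "padded locks each get their own cacheline");
```

With the packed layout every thread's write invalidates the line the other
seven threads are using, which is what c2c reports as contention on a single
cacheline even though the locks are logically separate.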
Yes, absolutely right. I changed the variable to be 128-byte aligned[0]. Is
that OK? Maybe there is a constant somewhere that provides this number (the
number of bytes needed to be "cache aligned") so it is not hard-coded? The
output is uploaded in ceph-c2c-jmario-2021-04-04-22-13.tar.gz and hopefully
looks better.
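On the question of a constant: C++17 defines
std::hardware_destructive_interference_size in <new>, but not every standard
library ships it, so a fallback is usually needed. The sketch below is one
possible way to use it, not what the commit in [0] does. The names
kDestructiveSize and ShardLock are hypothetical, and the fallback value of 64
is an assumption that fits typical x86-64 hardware.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <new>

// Use the standard constant when the library provides it; otherwise fall
// back to an assumed 64-byte cacheline.
#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kDestructiveSize =
    std::hardware_destructive_interference_size;
#else
constexpr std::size_t kDestructiveSize = 64;  // assumption, not measured
#endif

// Hypothetical per-thread lock that occupies a whole alignment unit, so two
// instances in an array never share a cacheline.
struct alignas(kDestructiveSize) ShardLock {
    std::atomic<std::uint32_t> value{0};
};

static_assert(alignof(ShardLock) == kDestructiveSize,
              "ShardLock starts on an interference boundary");
static_assert(sizeof(ShardLock) % kDestructiveSize == 0,
              "array elements keep that alignment");
```

A hard-coded 128 is also safe on hardware with 64-byte lines: it is a
multiple of the line size, so each lock still gets a line to itself, at the
cost of some extra padding.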
[0] https://lab.fedeproxy.eu/ceph/ceph/-/commit/54f3a4d0ece0e817bf9308617040f8b…
P.S. no need to apologize: I'm grateful for the quick feedback on a Sunday :-)
Also, I'm going to update the "run_c2c_ceph.sh" script I gave you.
The "-g" flag isn't working as it should (known issue).
Joe
On Sun, Apr 4, 2021 at 12:21 PM Loïc Dachary <loic(a)dachary.org> wrote:
I uploaded the /boot/config-5.10.0-5-amd64 file to dropbox.redhat.com: it
looks like perf is compiled in:
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
and all the options below are =y (no =n).
Maybe a module should be loaded?
--
Loïc Dachary, Artisan Logiciel Libre