Hi Dustin,
This is an issue that will happen regardless of pubsub configuration.
Tracked here:
Yuval
On Sun, May 31, 2020 at 11:00 AM Yuval Lifshitz <ylifshit(a)redhat.com> wrote:
Hi Dustin,
Did you create a pubsub zone [1] in your cluster?
(note that this is currently not supported in Rook, so it had to be done
manually).
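If not, a rough sketch of the manual setup follows (the zonegroup, zone, and endpoint names below are placeholders for illustration only; see the docs linked in [1] for the full procedure):

```shell
# Create a second zone in the existing zonegroup with the pubsub tier type.
# Names "us", "us-east", "us-east-pubsub" and the endpoint are placeholders.
radosgw-admin zone create \
  --rgw-zonegroup=us \
  --rgw-zone=us-east-pubsub \
  --endpoints=http://rgw-pubsub-host:8000 \
  --sync-from-all=0 \
  --sync-from=us-east \
  --tier-type=pubsub

# Commit the period so the zone change takes effect:
radosgw-admin period update --commit
```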
Yuval
[1]
https://docs.ceph.com/docs/master/radosgw/pubsub-module/#pubsub-zone-config…
On Fri, May 29, 2020 at 7:16 PM Dustin Guerrero <2140378(a)gmail.com> wrote:
Hey all,
We’ve been running some benchmarks against Ceph which we deployed using
the Rook operator in Kubernetes. Everything seemed to scale linearly until
a point where I see a single OSD receiving much higher CPU load than the
other OSDs (nearly 100% saturation). After some investigation we noticed a
ton of pubsub traffic in the strace coming from the RGW pods like so:
[pid 22561] sendmsg(77, {msg_name(0)=NULL,
msg_iov(3)=[{"\21\2)\0\0\0\10\0:\1\0\0\10\0\0\0\0\0\10\0\0\0\0\0\0\20\0\0-\321\211K"...,
73}, {"\200\0\0\0pubsub.user.ceph-user-wwITOk"..., 314},
{"\0\303\34[\360\314\233\2138\377\377\377\377\377\377\377\377", 17}],
msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL|MSG_MORE <unfinished …>
I’ve checked the other OSDs and only this single OSD receives these messages,
so I suspect it’s creating a bottleneck. Does anyone have an idea why these
are being generated, or how to stop them? The pubsub sync module doesn’t
appear to be enabled, and our benchmark is only doing simple gets/puts/deletes.
We’re running Ceph 14.2.5 (Nautilus).
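For reference, this is roughly how I checked whether a pubsub zone or sync module was configured (a sketch; the exact output depends on your multisite layout):

```shell
# List zones to see whether any pubsub zone exists alongside the default:
radosgw-admin zone list

# Inspect the zonegroup; a zone with "tier_type": "pubsub" means the
# pubsub sync module is active and events are generated for every op.
radosgw-admin zonegroup get | grep -A 2 tier_type
```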
Thank you!
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io