Anthony asked about the 'use case'. Well, I haven't gone into details
because I worried it wouldn't help much. From a ceph perspective, the
sandbox layout goes like this: 4 pretty much identical old servers,
each with 6 drives, and a smaller server just running a mon to break
ties. Usual front-side lan, separate back-side networking setup. Each
of the servers is running a few vms, all more or less identical for
the test case. Each of the vms is backed by an rbd via user-space
librbd through libvirt (not kernel-mapped). Each rbd belongs to a pool
that is entirely local to the chassis, presently replicated across 3
of the osds. One of the littler vms runs a mon+mgr per chassis.

Of course what's important is that there's a pool that spans the
chassis and does all the usual things that userland ceph is good at,
but for these tests I just unplugged all that. So: run any process
that involves a bunch of little writes -- like installing a package or
updating an initramfs -- and be ready to sit for a long time.

All the drives are 7200 rpm SATA spinners. CPUs are not overloaded
(fewer vms than cores), no swapping, memory left over. All write-back
caching, virtio drives. Ceph Octopus latest, though it's no better
than Nautilus performance-wise in this case. Ubuntu LTS/focal/20.04, I
think. Checked all the networking stats: no dropped packets, no
overflowing buffers, and anyhow there shouldn't be any important
traffic on the front side, and only ceph owns the back end. No ceph
problems reported: all pgs active, nothing misplaced, no erasure-coded
pools.
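For anyone who wants to reproduce the pain, here's a minimal sketch
(the helper name, file sizes, and write count are my own invention,
not anything from this cluster) that times the same fsync-heavy
small-write pattern that a package install or update-initramfs
generates, from inside a guest:

```python
import os
import tempfile
import time

# Time many small fsync'd writes, roughly the I/O pattern of a package
# install or an initramfs update. Sizes and counts are arbitrary
# choices for illustration.
def small_sync_writes(path, count=200, size=4096):
    buf = b"x" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    t0 = time.monotonic()
    try:
        for _ in range(count):
            os.write(fd, buf)
            os.fsync(fd)  # force each little write all the way down
        elapsed = time.monotonic() - t0
    finally:
        os.close(fd)
    return elapsed / count  # average seconds per synchronous write

with tempfile.NamedTemporaryFile() as tmp:
    avg = small_sync_writes(tmp.name)
    print(f"avg sync-write latency: {avg * 1000:.2f} ms")
```

Running it once inside a vm and once directly on the chassis would
separate the rbd/librbd path from everything the guest filesystem
adds on top.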
So, there's a tiny novel, thanks for sticking with it!
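A back-of-envelope budget shows why those little writes hurt so much
on spinners. Every number below is an assumption for illustration (and
it ignores anything BlueStore's deferred-write path might absorb), not
a measurement from my cluster:

```python
# Rough latency budget for one small synchronous write to a 3-replica
# pool on 7200 rpm SATA drives. All figures are assumed, not measured.
ROTATIONAL_MS = 0.5 * 60_000 / 7200  # avg rotational latency ~ 4.2 ms
SEEK_MS = 8.0                        # typical average seek, assumed
NET_RTT_MS = 0.2                     # back-side network round trip, assumed

def replicated_write_ms():
    # The primary osd ships the write to the secondaries and acks the
    # client only after every replica has committed. Commits happen in
    # parallel, so with identical spinners the cost is one seek plus
    # half a rotation, plus a network round trip to the secondaries.
    disk_commit_ms = SEEK_MS + ROTATIONAL_MS
    return NET_RTT_MS + disk_commit_ms

per_write = replicated_write_ms()
print(f"~{per_write:.1f} ms per small write, "
      f"so roughly {1000 / per_write:.0f} serialized sync writes/s")
```

If an initramfs update issues a few thousand small writes largely
serialized behind fsync, a dozen milliseconds apiece adds up to the
minute-plus I'm seeing, even with nothing else wrong.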
On 6/29/20 11:12 PM, Anthony D'Atri wrote:
>> Thanks for the thinking. By 'traffic' I mean: when a user space rbd
>> write has as a destination three replica osds in the same chassis
>
> eek.
>
>> does the whole write get shipped out to the mon and then back
>
> Mons are control-plane only.
>
>> All the 'usual suspects' like lossy ethernets and miswirings, etc.
>> have been checked. It's actually painful to sit and wait while
>> 'update-initramfs' can take over a minute when the vm is
>> chassis-local to the osds getting the write info.
>
> You have shared almost none of your hardware or use-case. We know
> that you're doing convergence, with unspecified CPU, memory, and
> drives. We also don't know how heavy your colocated compute workload
> is. Since you mention update-initramfs, I'll guess that your workload
> is VMs with RBD volumes attached via libvirt/QEMU, with unspecified
> RBD cache configuration. We also know nothing of your network setup
> and saturation.
>
> I have to suspect that either you're doing something fundamentally
> wrong, or you should just set up a RAID6 volume and carve out LVMs.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io