That's pretty much the advice I've been giving
people since the Inktank days. It costs more and is lower density, but the design is
simpler, you are less likely to under-provision CPU, less likely to run into memory
bandwidth bottlenecks, and you have less recovery to do when a node fails.
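For a sense of scale on the recovery point, here's a quick back-of-the-envelope sketch;
the drive count, drive size, fill level, and recovery throughput are all numbers I've made
up for illustration, not figures from anyone's cluster:

    # Rough sketch: data to re-replicate when one OSD node dies, and how
    # long that takes at a given recovery rate. All inputs are assumptions.
    drives_per_node = 10        # NVMe drives in the failed node
    drive_size_tb   = 7.68      # usable TB per drive
    fill_ratio      = 0.70      # how full the cluster runs
    recovery_gbps   = 10.0      # aggregate recovery throughput, Gbit/s
                                # (usually throttled well below line rate)

    to_recover_tb = drives_per_node * drive_size_tb * fill_ratio
    hours = to_recover_tb * 8e12 / (recovery_gbps * 1e9) / 3600
    print(f"~{to_recover_tb:.0f} TB to re-replicate, ~{hours:.0f} hours "
          f"at {recovery_gbps} Gbit/s")

Halve the drive count per node and you halve the data the rest of the cluster has to
absorb after a failure.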
Agreed. I’ve seen very clever presentations extolling the benefits of pinning within,
say, a 4-socket server — OSDs to cores to HBAs to NICs to NUMA domains. A lot of the
diagrams start looking like 4x 1-socket servers in the same chassis, but with more work.
With today’s networking, _maybe_ a super-dense NVMe box needs 100Gb/s where a less-dense
one is probably fine with 25Gb/s. And of course PCIe lanes.
https://cephalocon2019.sched.com/event/M7uJ/affordable-nvme-performance-on-…
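The 100Gb/s-vs-25Gb/s question is mostly napkin math; here's a tiny sketch with an
assumed spec-sheet read rate per drive (real per-OSD throughput lands well below that
once replication and CPU overhead are paid):

    # Napkin math: aggregate raw NVMe bandwidth vs NIC capacity per node.
    # The per-drive figure is an assumed spec-sheet value, not a measurement.
    drive_read_gbs = 3.0                         # GB/s per NVMe drive (assumed)
    for drives, nic_gbps in ((4, 25), (10, 100)):
        flash_gbps = drives * drive_read_gbs * 8    # GB/s -> Gbit/s
        print(f"{drives:2d} drives: ~{flash_gbps:.0f} Gbit/s of raw flash "
              f"behind a {nic_gbps} Gbit/s NIC")

Either way the flash can outrun the NIC on paper, so the real question is how much of it
a node actually serves under Ceph, which is why the less-dense box usually gets away with
25Gb/s.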
Especially now with how many NVMe drives you can fit
in a single 1U server!
I’ve seen 10 for a conventional server, though depending on CPU choice some of those
options don’t treat dual PSUs as redundant. EDSFF, the “ruler” form factor, shows a lot
of promise in this space, especially once the drives are available from more than one
manufacturer. With TLC flash and the right Epyc P CPU it seems like a killer OSD node
for RBD use. And for some object / RGW use cases, QLC drives start to look like a
viable alternative to HDDs.
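The PCIe lane budget usually isn't the hard part on a single-socket Epyc either; a rough
sketch with typical lane counts (x4 per drive, x16 for the NIC; these are assumptions,
not any specific server's layout):

    # Rough PCIe lane budget for a single-socket NVMe OSD node.
    # Per-device lane counts are typical values, not a specific board layout.
    lanes_total     = 128     # a single-socket Epyc exposes 128 PCIe lanes
    drives          = 10
    lanes_per_drive = 4       # x4 per NVMe/EDSFF drive
    nic_lanes       = 16      # one 100GbE NIC at x16
    used = drives * lanes_per_drive + nic_lanes
    print(f"{used} of {lanes_total} lanes used, "
          f"{lanes_total - used} left over for boot media, HBAs, etc.")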