Hi all,
We use OpenStack + Ceph (Hammer) in production.
Hammer is soooooo 2015.
There are 22 OSDs on a host, and 11 OSDs share one SSD
for their journals.
I can’t imagine a scenario in which this strategy makes sense; the documentation and books
are quite clear on why this is a bad idea. Assuming that your OSDs are HDDs and the
journal devices are SATA SSDs, the journals are going to be a bottleneck, and you’re going
to wear through them quickly. If you have a read-mostly workload, colocating each journal
on its OSD’s own drive would be safer.
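
To put rough numbers on the bottleneck claim, here is a back-of-the-envelope sketch in Python. It relies on the fact that a FileStore OSD (the Hammer-era backend) writes every incoming write to its journal before the data disk, so a shared journal SSD sees the combined write traffic of all its OSDs. The 11:1 ratio comes from the post; the two throughput figures are purely illustrative assumptions, not measurements of this cluster, and should be replaced with the real drive specs.

```python
# Back-of-the-envelope check of a shared journal SSD.
# Only the 11:1 ratio comes from the post; both throughput
# figures are ASSUMED placeholders, not measured values.

OSDS_PER_JOURNAL_SSD = 11   # from the post: 11 OSDs share one SSD
HDD_WRITE_MBPS = 100.0      # assumption: sustained write rate of one HDD-backed OSD
SSD_WRITE_MBPS = 450.0      # assumption: sustained write rate of one SATA SSD


def journal_demand_mbps(osds: int, hdd_mbps: float) -> float:
    """Write rate the SSD must absorb if every OSD writes at full HDD speed."""
    return osds * hdd_mbps


def per_osd_ceiling_mbps(osds: int, ssd_mbps: float) -> float:
    """Write rate each OSD can sustain when the SSD is shared evenly."""
    return ssd_mbps / osds


demand = journal_demand_mbps(OSDS_PER_JOURNAL_SSD, HDD_WRITE_MBPS)
ceiling = per_osd_ceiling_mbps(OSDS_PER_JOURNAL_SSD, SSD_WRITE_MBPS)

print(f"Peak journal demand on one SSD: {demand:.0f} MB/s")
print(f"Assumed SSD write capacity:     {SSD_WRITE_MBPS:.0f} MB/s")
print(f"Per-OSD write ceiling:          {ceiling:.1f} MB/s")
print(f"Journal is the bottleneck:      {demand > SSD_WRITE_MBPS}")
```

Under these assumed figures the SSD would need to absorb more than twice what it can sustain, and it also takes the write wear of all 11 OSDs. A faster journal device or a lower OSD-to-journal ratio changes the result, so treat the output as a sanity check only.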
I also suspect that something is amiss with your CRUSH topology that is preventing
recovery, and/or you actually have multiple overlapping failures.