Hi Patrick,
The actual recommendation is to use a replicated pool
for the default data pool. Regular hard drives are fine for the storage device.
But would it not be a better recommendation to put the default pool on SSDs?
* If you expect to have many small files on CephFS on HDD, seeking can be a huge
bottleneck for balancing/recovery/scrubbing. For example, with 500 million small files,
recovery can easily take 2 months. My understanding is that CephFS creates at least 2 RADOS
objects per file: one for the file contents and one for the inode info. Putting half of
those objects on SSD could cut the seek bottleneck roughly in half. And from my other post,
each such inode object seems to be only ~512 bytes (though in Loïc's post it's 32 KiB;
I'm not sure why, maybe a different minimum object size?).
* If you do not expect to have many small files, inode information should be so small that
it doesn't hurt to put it on SSD either.
So it seems like a sane recommendation either way.
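
To put rough numbers on the two cases above, here is a minimal back-of-the-envelope sketch in Python. It takes my understanding (2 RADOS objects per file, one of them a small inode object in the default data pool) at face value, and tries both per-object sizes mentioned above (512 bytes and 32 KiB). The replication factor of 3 and the function names are my own assumptions for illustration, not anything taken from Ceph.

```python
# Rough estimate only. Assumptions (not confirmed by the Ceph docs):
#   - each file produces 2 RADOS objects: 1 data object + 1 small inode object
#   - the inode object is either ~512 bytes or ~32 KiB on disk
#   - the default data pool is replicated with size=3 (hypothetical choice)

TIB = 2**40


def objects_per_pool(num_files: int) -> dict:
    """Split the per-file objects between the HDD data pool and the SSD default pool."""
    return {"hdd_data_objects": num_files, "ssd_inode_objects": num_files}


def ssd_raw_bytes(num_files: int, bytes_per_object: int, replicas: int = 3) -> int:
    """Raw SSD capacity consumed by one small inode object per file, after replication."""
    return num_files * bytes_per_object * replicas


if __name__ == "__main__":
    num_files = 500_000_000  # the 500 million small files from the example above

    split = objects_per_pool(num_files)
    total = sum(split.values())
    print(f"objects in total: {total:,}, of which on SSD: {split['ssd_inode_objects']:,}")

    for label, size in (("512 B", 512), ("32 KiB", 32 * 1024)):
        raw = ssd_raw_bytes(num_files, size)
        print(f"{label} per inode object: {raw / TIB:.2f} TiB raw on SSD")
```

The two per-object sizes differ by a factor of 64, so it is probably worth measuring the real on-disk size on your own cluster before sizing the SSDs.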
Related thread I posted:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VKVENC3VP3L…
What do you think?
Would it make sense to recommend this in the Ceph docs for all filesystems, even those
whose data pools are replicated (non-EC) on HDD?