Hi Cephers,
I have a number of questions after reading up on bluestore,
min_alloc_size, and the impact of writing small files through CephFS
with erasure coding.
In a setup using VMware ESX with its VMFS6 and its 1 MB block size on an
iSCSI LUN mapped from a (replicated) Ceph RBD image:
1. Wouldn't it be better to reduce the RBD object size to 1 MB as well?
2. When a file smaller than 4 KB is written to a filesystem inside a
virtual machine (and passes through all the layers below it), will the
consumed space be 4 KB? In other words, does the 4 MB object size of RBD
combine many small files into one big object?
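If reducing the object size does turn out to help, it can be set per image at creation time. A minimal sketch, assuming a pool named rbd and an image named vmware-lun (both placeholder names):

```shell
# Create a 1 TB image whose objects are 1 MB instead of the default 4 MB.
# --object-size must be a power of two between 4 KB and 32 MB.
rbd create rbd/vmware-lun --size 1T --object-size 1M

# Verify the object size of the new image.
rbd info rbd/vmware-lun
```

The object size is fixed at creation, so an existing LUN would have to be copied into a new image to change it.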
When using CephFS and erasure coding:
1. I assume using a 4 KB min_alloc_size_hdd would reduce wasted space
but increase fragmentation, as Igor wrote.
2. What is the official way to deal with fragmentation in bluestore? Is
there a defrag tool available or planned?
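For reference, this is the setting I mean; a minimal ceph.conf fragment (note this value is baked into an OSD when it is created, so it only takes effect on freshly deployed OSDs):

```ini
[osd]
# Only applied at OSD creation time; existing OSDs keep the value
# they were built with and must be redeployed to change it.
bluestore_min_alloc_size_hdd = 4096
```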
From a performance perspective: my cluster runs on good old filestore
with NVMe journals, and I am about to migrate to bluestore.
1. With a MaxIOSize of 512 KB in VMware, wouldn't
bluestore_prefer_deferred_size_hdd = 524288 give me filestore-like
behavior? My aim is to keep write latency at filestore levels, because
we run a lot of databases.
2. Are there any tradeoffs doing this?
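For clarity, this is the change I have in mind; a sketch of a ceph.conf fragment, assuming the bluestore WAL/DB would sit on the NVMe devices that currently hold the filestore journals:

```ini
[osd]
# Writes at or below this size are first committed to the (NVMe-backed)
# WAL and acknowledged, then flushed to the HDD later, similar to
# filestore journaling. Larger values mean more data is written twice.
bluestore_prefer_deferred_size_hdd = 524288
```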
Regards,
Dennis