RBD snapshots very slow - Dev

20 Mar 2020

Hi.

For a long time I was under the impression that clones are as efficient
in BlueStore as snapshots.

But today I finally decided to test it and ... I discovered that this
impression was utterly wrong :) RBD copies the whole 4 MB object even
when only a small 4 KB block within it is modified in the child image.
On my all-NVMe cluster this results in 40 (40!!!) random write iops
(bs=4k, iodepth=1) on a fresh RBD clone, which is terrible.
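
To put rough numbers on this, here is a back-of-the-envelope sketch in
Python. It uses only the figures above (4 MB objects, 4 KB writes, 40
iops at iodepth=1); the function names are made up for illustration, and
it ignores everything else that happens on the write path.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB RBD object, as stated above
WRITE_SIZE = 4 * 1024          # 4 KB client write, as stated above


def copy_amplification(object_size: int, write_size: int) -> float:
    """Bytes copied into the child per byte the client actually wrote,
    for the first write that touches an object of the clone."""
    return object_size / write_size


def latency_ms(iops: float) -> float:
    """Average per-write latency implied by an iops figure at iodepth=1
    (only one request is in flight, so latency = 1 / iops)."""
    return 1000.0 / iops


if __name__ == "__main__":
    print(copy_amplification(OBJECT_SIZE, WRITE_SIZE))  # 1024.0
    print(latency_ms(40))                               # 25.0
```

So the first 4 KB write to each object drags along about 1024 times as
much data, and 40 iops at iodepth=1 means roughly 25 ms per write.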

Question of the day: is it possible to reimplement RBD clones using
"sparse objects"? As I understand it, support for sparse objects
themselves is already there. So maybe librbd could write only the
modified part to the child image when writing, and read the "holes"
from the parent when reading?
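
A toy model of what I mean, in Python. This is not librbd code and does
not use any Ceph API; the class and method names are hypothetical, and
it assumes block-aligned 4 KB writes to keep things short. The child
object stores only the blocks that were written, and reads of anything
else fall through to the parent.

```python
BLOCK = 4096  # assumed write granularity for this sketch


class SparseChildObject:
    """Toy model of one child object that keeps only written blocks."""

    def __init__(self, parent: bytes):
        if len(parent) % BLOCK != 0:
            raise ValueError("parent size must be a multiple of BLOCK")
        self.parent = parent  # read-only parent object data
        self.blocks = {}      # block index -> bytes written in the child

    def _check(self, index: int) -> None:
        if not 0 <= index < len(self.parent) // BLOCK:
            raise IndexError("block index out of range")

    def write_block(self, index: int, buf: bytes) -> None:
        """Store only the modified block; nothing is copied from the parent."""
        self._check(index)
        if len(buf) != BLOCK:
            raise ValueError("this sketch only handles whole-block writes")
        self.blocks[index] = bytes(buf)

    def read_block(self, index: int) -> bytes:
        """Return child data if present, otherwise read the hole from the parent."""
        self._check(index)
        if index in self.blocks:
            return self.blocks[index]
        start = index * BLOCK
        return self.parent[start:start + BLOCK]

    def stored_bytes(self) -> int:
        """Bytes actually held by the child object."""
        return len(self.blocks) * BLOCK


if __name__ == "__main__":
    parent = b"P" * (4 * 1024 * 1024)       # one 4 MB parent object
    child = SparseChildObject(parent)
    child.write_block(3, b"C" * BLOCK)      # one 4 KB write in the clone
    print(child.read_block(3)[:1])          # data from the child
    print(child.read_block(4)[:1])          # hole, served from the parent
    print(child.stored_bytes())             # bytes stored in the child
```

With this layout a 4 KB write costs 4 KB in the child instead of 4 MB,
at the price of an extra lookup (child first, then parent) on reads.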

-- 
Vitaliy Filippov