On Sat, Jun 27, 2020 at 6:02 PM Ning Yao <zay11022(a)gmail.com> wrote:
I found that in the cephfs kernel module (fs/ceph/file.c), the function ceph_fallocate returns
-EOPNOTSUPP when mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE).
Recently, we have been trying to use cephfs, but we need the fallocate syscall so that
writes to a file cannot fail once space has been reserved. However, the cephfs kernel module
does not support this right now. Can anyone explain why it is not implemented?
It used to be supported (as in it wasn't rejected with EOPNOTSUPP),
but never actually worked. There is no easy way to preallocate/reserve
space in the cluster without explicitly zeroing it. If the space isn't
actually reserved, subsequent writes could fail with ENOSPC which would
violate POSIX and break applications, so we chose to disable it.
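For illustration only, here is a rough userspace sketch of what an
application could do when fallocate() is rejected: try the plain
preallocation first and, on EOPNOTSUPP, fall back to explicitly writing
zeros. The helper name reserve_space() is hypothetical, and the fallback
assumes a new or empty file, since it overwrites the range. Whether
zero-filling gives a hard guarantee against a later ENOSPC on a given
cluster is also an assumption, not something this thread establishes.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `len` bytes at the start of `fd`.
 * Only safe for a new/empty file, because the fallback overwrites
 * the range [0, len) with zeros.
 * Returns 0 on success, -1 on error with errno set.
 */
static int reserve_space(int fd, off_t len)
{
    char buf[65536];
    off_t done = 0;

    /* mode 0 is the plain "preallocate and extend" request */
    if (fallocate(fd, 0, 0, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;

    /* Filesystem refused preallocation: write zeros explicitly. */
    memset(buf, 0, sizeof(buf));
    while (done < len) {
        size_t chunk = sizeof(buf);
        ssize_t n;

        if ((off_t)chunk > len - done)
            chunk = (size_t)(len - done);
        n = pwrite(fd, buf, chunk, done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            /* no progress; report as an I/O error */
            errno = EIO;
            return -1;
        }
        done += n;
    }
    /* flush so that an out-of-space condition shows up here */
    return fsync(fd);
}
```

The zero-fill path is of course much slower than a real preallocation,
which is the cost of not having a way to reserve space in the cluster.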
We also found that ceph-fuse supports the fallocate syscall, but it ends up with poor write
performance compared with the cephfs kernel mount. There is a large performance gap under the fio.cfg below:
ceph-fuse is broken in the same way and it should be disabled there