There is a desperate need for something like DKMS packaging for the CephFS kernel client. This is especially true for HPC use cases.  I don't know of a single site that doesn't need to mount multiple Lustre and GPFS filesystems with strict (older) kernel version requirements to keep those filesystems supported.
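
To make the idea concrete, a dkms.conf for an out-of-tree build of the CephFS client modules might look roughly like the sketch below. No such package exists today; the package name, version, and source layout here are purely illustrative.

    # Hypothetical dkms.conf for an out-of-tree CephFS kernel client build.
    # Package name, version, and source layout are illustrative only.
    PACKAGE_NAME="cephfs-client"
    PACKAGE_VERSION="15.2.0"

    # libceph.ko (shared messenger/OSD code) plus the ceph.ko filesystem module,
    # assuming the source copy mirrors the kernel tree layout.
    BUILT_MODULE_NAME[0]="libceph"
    BUILT_MODULE_LOCATION[0]="net/ceph"
    DEST_MODULE_LOCATION[0]="/kernel/net/ceph"
    BUILT_MODULE_NAME[1]="ceph"
    BUILT_MODULE_LOCATION[1]="fs/ceph"
    DEST_MODULE_LOCATION[1]="/kernel/fs/ceph"

    # Standard out-of-tree module build against the running kernel's headers.
    MAKE[0]="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build modules"
    CLEAN="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build clean"
    AUTOINSTALL="yes"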

Sent from my iPad

On Mar 31, 2020, at 12:47 PM, Robert LeBlanc <robert@leblancnet.us> wrote:


Thanks for the info. We were running our own ior and mdtest profiles, and unfortunately libcephfs won't help us for our real applications, but it's a great data point. Getting the performance kinks worked out of Ceph FUSE would allow us to stay on a stable kernel while running a newer client, keep us from having to reboot boxes (and lose long-running jobs) when we fill up our filesystem and have to deal with corrupted metadata, and allow us to mount Ceph only in the containers that actually need the storage. Those seem like some really good features to me.
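
For reference, the two client paths we're comparing look roughly like this (the monitor host, mount point, and credential paths are placeholders):

    # Kernel client: tied to whatever fs/ceph ships with the running kernel
    mount -t ceph mon1.example.com:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

    # FUSE client: userspace, can be upgraded independently of the kernel
    ceph-fuse -n client.admin /mnt/cephfs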

Sent from a mobile device, please excuse any typos.

On Tue, Mar 31, 2020, 4:44 AM Mark Nelson <mnelson@redhat.com> wrote:
Hi Robert,


I have some interest in this area.  I noticed that the kernel client was
performing poorly with sequential reads while testing the io500
benchmark a couple of months ago.  I ended up writing a libcephfs
backend for ior/mdtest so we could run io500 directly against libcephfs,
which ended up performing similarly in most tests and significantly
faster for sequential reads.  Ilya and I have done some work since then
trying to debug why the kernel client is slow in this case; all we've
been able to track down so far is that it may be related to aggregate
iodepth, with fewer kworker threads doing work than expected.  The next
step is probably to look at kernel lock statistics.
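
For anyone who wants to poke at the userland path outside of ior, a minimal sketch of a sequential read through libcephfs looks like this (the test file path is made up, the default ceph.conf location is assumed, and most error handling is omitted):

    /* build: cc -o cephfs_read cephfs_read.c -lcephfs */
    #include <stdio.h>
    #include <fcntl.h>
    #include <cephfs/libcephfs.h>

    int main(void)
    {
        struct ceph_mount_info *cmount;
        char buf[4096];

        /* Create a mount handle and read the default ceph.conf. */
        ceph_create(&cmount, NULL);
        ceph_conf_read_file(cmount, NULL);   /* /etc/ceph/ceph.conf by default */

        /* Mount the filesystem root entirely in userspace -- no kernel client. */
        if (ceph_mount(cmount, "/") < 0) {
            fprintf(stderr, "ceph_mount failed\n");
            return 1;
        }

        /* Sequential read of a hypothetical test file, 4 KiB at a time. */
        int fd = ceph_open(cmount, "/io500/testfile", O_RDONLY, 0);
        if (fd >= 0) {
            int64_t off = 0;
            int n;
            while ((n = ceph_read(cmount, fd, buf, sizeof(buf), off)) > 0)
                off += n;
            ceph_close(cmount, fd);
        }

        ceph_unmount(cmount);
        ceph_release(cmount);
        return 0;
    }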


In any event, I would very much like to get a generic fast userland
client implementation working.  FWIW you can see my PR for mdtest/ior here:


https://github.com/hpc/ior/pull/217


Mark


On 3/30/20 12:14 AM, Robert LeBlanc wrote:
> On Sun, Mar 29, 2020 at 7:35 PM Yan, Zheng <ukernel@gmail.com> wrote:
>
>     On Mon, Mar 30, 2020 at 5:38 AM Robert LeBlanc <robert@leblancnet.us> wrote:
>     >
>     > We have been evaluating other cluster storage solutions, and one
>     > of them is just about as fast as Ceph but only uses FUSE. They
>     > mentioned that recent improvements in the FUSE code allow for
>     > performance similar to the kernel code. So I'm doing some tests
>     > comparing the CephFS kernel and FUSE clients, and that is not
>     > true in the Ceph case.
>     >
>
>     do they mention which improvement?
>
> I think it was in regard to removing some serial code paths. I don't
> know if it was async work or parallel thread work. And "recent" as in
> kernel 4.9 recent.
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
_______________________________________________
Dev mailing list -- dev@ceph.io
To unsubscribe send an email to dev-leave@ceph.io