@Haomai,

      Does HAVE_IBV_EXP still work with any RNIC in the current Ceph repository?

 

 

@Nasution:

       I have never used the options below:

ms_async_rdma_roce_ver = 0   #RoCEv1, all nodes with same networks. Should I use RoCEv2?
ms_async_rdma_local_gid = fe80:0000:0000:0000:****:****:****:****   #should I use  0000:0000:0000:0000:0000 :****:****:**** one? 

        To use RDMA, you may need to:
         1) Set “ulimit -l” to unlimited.
         2) For an RNIC that supports SRQ:
               a. The configuration below should be OK:
                          ms_async_rdma_device_name = mlx5_bond_0
                          ms_cluster_type = async+rdma
                          ms_public_type = async+posix
               b. If you need to differentiate between RoCEv1 and RoCEv2, configure “ms_async_rdma_gid_idx”.
                  Reference: https://github.com/ceph/ceph/pull/31517/commits/b971cff51a9179c02f85a27cc191731a18e39876
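
        Putting items 2a and 2b together, a minimal ceph.conf sketch might look like the one below. The GID index value 3 is only a placeholder, not a recommendation: the index that maps to RoCEv1 or RoCEv2 differs per host and per port, so it has to be looked up on each node (for example, on Mellanox setups, with the show_gids script or under /sys/class/infiniband/<device>/ports/<port>/gid_attrs/types/).

```ini
[global]
# RDMA device name, taken from the configuration above
ms_async_rdma_device_name = mlx5_bond_0
# RDMA on the cluster network only; the public network stays on TCP
ms_cluster_type = async+rdma
ms_public_type = async+posix
# Placeholder value (assumption): use the GID table index whose type
# matches the RoCE version you want on this host
ms_async_rdma_gid_idx = 3
```

        I have not tested this exact combination, so please treat it as a starting point rather than a known-good configuration.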

 

From: Lazuardi Nasution <mrxlazuardin@gmail.com>
Sent: Thursday, September 10, 2020 12:23 AM
To: Liu, Changcheng <changcheng.liu@intel.com>
Subject: Ceph with RDMA

 

Hi,

 

I'm reading your post regarding Ceph with RDMA. Have you solved your problem? I'm trying the same approach, but I'm currently facing a problem: some OSDs go down automatically shortly after they come up because they get no heartbeat reply, even on a newly installed cluster. I'm using the following RDMA-related configuration.

 

[global]
.......
ms_async_rdma_device_name = mlx5_bond_0
ms_cluster_type = async+rdma
ms_public_type = async+posix
#/rbd does not support rdma
ms_async_rdma_polling_us = 0
ms_async_rdma_roce_ver = 0   #RoCEv1, all nodes with same networks. Should I use RoCEv2?
ms_async_rdma_local_gid = fe80:0000:0000:0000:****:****:****:****   #should I use  0000:0000:0000:0000:0000 :****:****:**** one? 

[mgr]
ms_type = async+posix

 

I have put "LimitMEMLOCK" in the OSD systemd unit file (the OSD is the only daemon that failed to start without it). Would you mind sharing your configuration of a working Ceph with RDMA? Am I missing something?

 

Best regards,