> 2) I'll confirm with my colleague whether the cluster network is really used in 14.2.4. We also hit a similar problem recently, even when using the TCP async messenger.
[Changcheng]:
1) The problem should already be solved in 14.2.4. We hit it in 14.2.1.
2) I'll try to look into your problem when I have time (I'm working on
other things at the moment). There should be no problem when both the
public and cluster networks are unified on the RDMA device; see the
sketch below.
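
For reference, a minimal sketch of the unified messenger settings (the
device name below is a placeholder for your RDMA NIC):

  [global]
  # one messenger type for all traffic; no public/cluster split
  ms_type = async+rdma
  ms_async_rdma_device_name = mlx5_0
  # note: no cluster_network and no ms_public_type/ms_cluster_type override
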
On 23:22 Wed 30 Oct, Liu, Changcheng wrote:
> I'm working on the master branch with a two-node cluster deployed. Data is transferred over RDMA.
> [admin@server0 ~]$ sudo ceph daemon osd.0 perf dump AsyncMessenger::RDMAWorker-1
> {
>     "AsyncMessenger::RDMAWorker-1": {
>         "tx_no_mem": 0,
>         "tx_parital_mem": 0,
>         "tx_failed_post": 0,
>         "tx_chunks": 26966,
>         "tx_bytes": 52789637,
>         "rx_chunks": 26916,
>         "rx_bytes": 52812278,
>         "pending_sent_conns": 0
>     }
> }
>
> The only difference is that I don't differentiate the public and cluster networks in my cluster.
> You could try making both the public and cluster networks use RDMA.
> Note:
> 1) If both public and cluster networks use RDMA, we can't place them on different subnetworks. This is a feature limitation; I'm planning to solve it in the future.
> 2) I'll confirm with my colleague whether the cluster network is really used in 14.2.4. We also hit a similar problem recently, even when using the TCP async messenger.
>
> Below is my cluster's ceph configuration.
> I've also attached the systemd patch used on my side.
> [admin@server0 ~]$ cat /etc/ceph/ceph.conf
> [global]
> cluster = ceph
> fsid = 24280750-d4f7-4d4f-89e4-f95b8fab87ff
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> osd pool default size = 2
> osd pool default min size = 2
> osd pool default pg num = 64
> osd pool default pgp num = 128
>
> osd pool default crush rule = 0
> osd crush chooseleaf type = 1
>
> mon_allow_pool_delete=true
> osd_pool_default_pg_autoscale_mode=off
>
> ms_type = async+rdma
> ms_async_rdma_device_name = mlx5_0
>
> mon_initial_members = server0
> mon_host = 172.16.1.4
>
> [mon.rdmarhel0]
> host = server0
> mon addr = 172.16.1.4
> [admin@server0 ~]$
>
> B.R.
> Changcheng
>
> On 13:07 Wed 30 Oct, Mason-Williams, Gabryel (DLSLtd,RAL,LSCI) wrote:
> > 1. The current problem is that it is still sending data over the
> >    Ethernet instead of InfiniBand.
> > 2. [global]
> > fsid=xxxx
> > mon_initial_members = node1, node2, node3
> > mon_host = xxx.xx.xxx.ab,xxx.xx.xxx.ac, xxx.xx.xxx.ad
> > auth_cluster_required = cephx
> > auth_service_required = cephx
> > auth_client_required = cephx
> > public_network = xxx.xx.xxx.0/24
> > cluster_network = xx.xxx.0.0/16
> > ms_cluster_type = async+rdma
> > ms_type = async+rdma
> > ms_public_type = async+posix
> > [mgr]
> > ms_type = async+posix
> > 3. The ceph cluster is deployed using ceph-deploy. Once it is up, all
> >    of the daemons are turned off, the RDMA cluster config is
> >    distributed, and then the daemons are turned back on. The ulimit is
> >    set to unlimited; LimitMEMLOCK=infinity is set on ceph-disk@.service,
> >    ceph-mds@.service, ceph-mon@.service, ceph-osd@.service and
> >    ceph-radosgw@.service, as well as PrivateDevices=no on
> >    ceph-mds@.service, ceph-mon@.service and ceph-radosgw@.service (the
> >    drop-in sketch below shows an equivalent override). The Ethernet MTU
> >    is set to 1000.
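> >
> >    For reference, a minimal sketch of an equivalent systemd drop-in
> >    (the path is systemd's standard override location; the unit name and
> >    restart target are illustrative):
> >
> >    # /etc/systemd/system/ceph-osd@.service.d/override.conf
> >    [Service]
> >    # let the RDMA stack lock (pin) memory for registered buffers
> >    LimitMEMLOCK=infinity
> >    # the messenger needs the real /dev/infiniband devices, not a private /dev
> >    PrivateDevices=no
> >
> >    # apply with: sudo systemctl daemon-reload && sudo systemctl restart ceph-osd@0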
> > __________________________________________________________________
> >
> > From: Liu, Changcheng <changcheng.liu@intel.com>
> > Sent: 30 October 2019 12:24
> > To: Mason-Williams, Gabryel (DLSLtd,RAL,LSCI)
> > <gabryel.mason-williams@diamond.ac.uk>
> > Cc: dev@ceph.io <dev@ceph.io>
> > Subject: Re: RMDA Bug?
> >
> > 1. What problem do you hit when using RDMA in 14.2.4? Do any logs
> >    show the error?
> > 2. What's your ceph.conf?
> > 3. How do you deploy the ceph cluster? RDMA needs to lock some memory,
> >    so some system configuration has to be changed to meet this
> >    requirement (see the sketch below).
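> >
> >    As a rough sketch of what I mean (the "ceph" user and the unlimited
> >    values are just examples; adjust to your deployment):
> >
> >    # check the locked-memory limit in the daemon's environment
> >    ulimit -l
> >
> >    # for non-systemd launches, raise it in /etc/security/limits.conf:
> >    #   ceph  soft  memlock  unlimited
> >    #   ceph  hard  memlock  unlimited
> >
> >    # for systemd-managed daemons, set LimitMEMLOCK=infinity in the unit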
> > On 11:21 Wed 30 Oct, Gabryel Mason-Williams wrote:
> > > Liu, Changcheng wrote:
> > > > On 07:31 Mon 28 Oct, Mason-Williams, Gabryel (DLSLtd,RAL,LSCI) wrote:
> > > > > I am using ceph version 12.2.8
> > > > > (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable).
> > > > >
> > > > > I have not checked the master branch. Do you think this is an
> > > > > issue in luminous that has been removed in later versions?
> > > > I haven't hit the problem on the master branch. Ceph/RDMA changed
> > > > a lot from luminous to the master branch.
> > > >
> > > > Is the below configuration really needed in luminous/ceph.conf?
> > > > > ms_async_rdma_local_gid = xxxx
> > > > On the master branch, this parameter is not needed at all.
> > > > B.R.
> > > > Changcheng
> > > > >
> > > __________________________________________________________________
> > >
> > > Thanks, the issue of the OSDs falling over seems to have gone away
> > > after updating to Nautilus 14.2.4. However, I am still unable to get
> > > it to communicate properly over RDMA, even after removing
> > > ms_async_rdma_local_gid.
> >
> >
>
>
> From 40fa0d7096364b410e8242c46967029fb949876a Mon Sep 17 00:00:00 2001
> From: Changcheng Liu <changcheng.liu@aliyun.com>
> Date: Tue, 23 Jul 2019 18:50:57 +0800
> Subject: [PATCH] rdma systemd: grant access to /dev and unlimit mem
>
> Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
>
> diff --git a/systemd/ceph-fuse@.service.in b/systemd/ceph-fuse@.service.in
> index d603042b12..ff2e9072f6 100644
> --- a/systemd/ceph-fuse@.service.in
> +++ b/systemd/ceph-fuse@.service.in
> @@ -12,6 +12,7 @@ ExecStart=/usr/bin/ceph-fuse -f --cluster ${CLUSTER} %I
> LockPersonality=true
> MemoryDenyWriteExecute=true
> NoNewPrivileges=true
> +LimitMEMLOCK=infinity
> # ceph-fuse requires access to /dev fuse device
> PrivateDevices=no
> ProtectControlGroups=true
> diff --git a/systemd/ceph-mds@.service.in b/systemd/ceph-mds@.service.in
> index 39a2e63105..0e58dfeeea 100644
> --- a/systemd/ceph-mds@.service.in
> +++ b/systemd/ceph-mds@.service.in
> @@ -14,7 +14,8 @@ ExecReload=/bin/kill -HUP $MAINPID
> LockPersonality=true
> MemoryDenyWriteExecute=true
> NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
> ProtectControlGroups=true
> ProtectHome=true
> ProtectKernelModules=true
> diff --git a/systemd/ceph-mgr@.service.in b/systemd/ceph-mgr@.service.in
> index c98f6378b9..682c7ecef3 100644
> --- a/systemd/ceph-mgr@.service.in
> +++ b/systemd/ceph-mgr@.service.in
> @@ -18,7 +18,8 @@ LockPersonality=true
> MemoryDenyWriteExecute=false
>
> NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
> ProtectControlGroups=true
> ProtectHome=true
> ProtectKernelModules=true
> diff --git a/systemd/ceph-mon@.service.in b/systemd/ceph-mon@.service.in
> index c95fcabb26..51854fad96 100644
> --- a/systemd/ceph-mon@.service.in
> +++ b/systemd/ceph-mon@.service.in
> @@ -21,7 +21,8 @@ LockPersonality=true
> MemoryDenyWriteExecute=true
> # Need NewPrivileges via `sudo smartctl`
> NoNewPrivileges=false
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
> ProtectControlGroups=true
> ProtectHome=true
> ProtectKernelModules=true
> diff --git a/systemd/ceph-osd@.service.in b/systemd/ceph-osd@.service.in
> index 1b5c9c82b8..06c20d7c83 100644
> --- a/systemd/ceph-osd@.service.in
> +++ b/systemd/ceph-osd@.service.in
> @@ -16,6 +16,8 @@ LockPersonality=true
> MemoryDenyWriteExecute=true
> # Need NewPrivileges via `sudo smartctl`
> NoNewPrivileges=false
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
> ProtectControlGroups=true
> ProtectHome=true
> ProtectKernelModules=true
> diff --git a/systemd/ceph-radosgw@.service.in b/systemd/ceph-radosgw@.service.in
> index 7e3ddf6c04..fe1a6b9159 100644
> --- a/systemd/ceph-radosgw@.service.in
> +++ b/systemd/ceph-radosgw@.service.in
> @@ -13,7 +13,8 @@ ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name client.%i --setuser ce
> LockPersonality=true
> MemoryDenyWriteExecute=true
> NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
> ProtectControlGroups=true
> ProtectHome=true
> ProtectKernelModules=true
> diff --git a/systemd/ceph-volume@.service b/systemd/ceph-volume@.service
> index c21002cecb..e2d1f67b85 100644
> --- a/systemd/ceph-volume@.service
> +++ b/systemd/ceph-volume@.service
> @@ -9,6 +9,7 @@ KillMode=none
> Environment=CEPH_VOLUME_TIMEOUT=10000
> ExecStart=/bin/sh -c 'timeout $CEPH_VOLUME_TIMEOUT /usr/sbin/ceph-volume-systemd %i'
> TimeoutSec=0
> +LimitMEMLOCK=infinity
>
> [Install]
> WantedBy=multi-user.target
> --
> 2.17.1
>
> _______________________________________________
> Dev mailing list -- dev@ceph.io
> To unsubscribe send an email to dev-leave@ceph.io