Hi,
I think ad_select is only relevant in the scenario below, i.e. where you have more than
one port-channel being presented to the Linux bond. In the diagram further down, there
are two port-channels, one from each switch, but on the Linux side all the ports involved
are slaves in the same bond. In your scenario it sounds like you have just one switch
with one port-channel to one bond on Linux, so I doubt ad_select has any impact. The main
thing will be the xmit-hash-policy on both the switches and Linux. FWIW, I use layer3+4
on Linux and something very close to that on my S series switches, and both 10G links get
used pretty well. (The sketch just below shows that setting; the diagram and explanation
after it were lifted from a stackexchange thread.)
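For reference, setting that up with iproute2 looks roughly like this. A sketch only: the
interface names are examples, and making it persistent depends on your distro:

  # create an LACP (802.3ad) bond that hashes on L3/L4 headers
  ip link add bond0 type bond mode 802.3ad xmit_hash_policy layer3+4
  # slaves have to be down before they can be enslaved
  ip link set eth0 down && ip link set eth0 master bond0
  ip link set eth1 down && ip link set eth1 master bond0
  ip link set bond0 up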
  .-----------.   .-----------.
  |  Switch1  |   |  Switch2  |
  '-=-------=-'   '-=-------=-'
    |       |       |       |
    |       |       |       |
  .-=----.--=---.---=--.----=-.
  | eth0 | eth1 | eth2 | eth3 |
  |---------------------------|
  |           bond0           |
  '---------------------------'
Where each switch has its two ports configured in a PortChannel, the Linux end with the
LACP bond will negotiate two Aggregator IDs:
Aggregator ID 1
  - eth0 and eth1
Aggregator ID 2
  - eth2 and eth3
And each switch will have a view completely separate from the other.
Switch 1 will think:
  Switch 1
    PortChannel 1
      - port X
      - port Y
Switch 2 will think:
  Switch 2
    PortChannel 1
      - port X
      - port Y
From the Linux system with the bond, only one Aggregator will be used at a given time,
and it will fail over depending on ad_select.
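You can see this from the Linux side in /proc/net/bonding/bond0, which reports an
Aggregator ID per slave plus the currently active one. Heavily trimmed example output
(the IDs match the diagram above):

  $ cat /proc/net/bonding/bond0
  ...
  802.3ad info
  Active Aggregator Info:
          Aggregator ID: 1
  ...
  Slave Interface: eth0
  Aggregator ID: 1
  ...
  Slave Interface: eth2
  Aggregator ID: 2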
So assuming Aggregator ID 1 is in use, and you pull eth0's cable out, the default
behaviour is to stay on Aggregator ID 1.
However, Aggregator ID 1 only has 1 cable, and there's a spare Aggregator ID 2 with 2
cables - twice the bandwidth!
If you use ad_select=count or ad_select=bandwidth, the bond instead fails over to the
Aggregator with the most ports (count) or the most total bandwidth (bandwidth).
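I believe the kernel only lets you change ad_select while the bond is down (or at
creation time), so with iproute2 it's along these lines (a sketch):

  # set at creation time
  ip link add bond0 type bond mode 802.3ad ad_select bandwidth
  # or on an existing bond, which has to be taken down first
  ip link set bond0 down
  ip link set bond0 type bond ad_select count
  ip link set bond0 up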
Note that LACP mandates that an Aggregator's ports must all be the same speed and duplex,
so I believe you could configure one Aggregator with 1 Gbps ports and one Aggregator with
10 Gbps ports, and have intelligent selection depending on whether you have 20/10/2/1
Gbps available.
From: mhnx <morphinwithyou@gmail.com>
Sent: 28 June 2021 18:46
To: Marc 'risson' Schmitt <risson@cri.epita.fr>
Cc: Ceph Users <ceph-users@ceph.io>
Subject: [ceph-users] Re: Nic bonding (lacp) settings for ceph
Thanks for the answer.
I'm leaning towards ad_select=bandwidth because we use the OSD nodes as RGW gateways,
for VMs and for different applications.
I have separate cluster (10+10GbE) and public (10+10GbE) networks.
I tested stable, bandwidth and count. Results are clearly good with bandwidth; count is
the worst option.
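While testing, the values the driver is actually using can be read back from sysfs; a
quick sketch (bond0 is an example name):

  cat /sys/class/net/bond0/bonding/ad_select
  cat /sys/class/net/bond0/bonding/xmit_hash_policy
  grep -i aggregator /proc/net/bonding/bond0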
But I wonder if the bandwidth calculation has any effect on network delay? If it does,
I will return to stable. I don't know, but when I think about it, if the bonding driver
needs to calculate bandwidth and make a decision every time, that should add some CPU
load and delay. If it has no such effect, then bandwidth will improve distribution.
Now I know that I have to use layer3+4, but I still couldn't decide on ad_select.
Bandwidth or stable?
Can we discuss it please?
On Mon, 28 Jun 2021 at 20:15, Marc 'risson' Schmitt <risson(a)cri.epita.fr>
wrote:
Hi,
On Sat, 26 Jun 2021 16:47:19 +0300
mhnx <morphinwithyou(a)gmail.com> wrote:
> I've changed ad_select to bandwidth and both NICs are in use now, but
> the layer2 hash prevents dual-NIC usage between two nodes (because
> layer2 uses only the MAC).
As I understand it, setting ad_select to bandwidth is only going to be
useful if you have several link aggregates in the same bond, like when
you are connected in LACP to multiple (non-stacked) switches.
> People advise using layer2+3 for best performance, but it has no
> effect on OSDs because the MAC and IP are the same.
> I've tried layer3+4 to split by ports instead of MAC and it works. But
> I don't know what the effect will be, and also my switch is layer2.
We are setting layer3+4 on both our servers and our switches.
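On the Linux side, one way to make that persistent is via module options; a minimal
sketch, assuming the bonding driver is loaded as a module (these options then apply to
every bond the module creates; netplan or NetworkManager setups express the same thing
differently):

  # /etc/modprobe.d/bonding.conf
  options bonding mode=802.3ad xmit_hash_policy=layer3+4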
Regards,
--
Marc 'risson' Schmitt
CRI - EPITA
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io