Hi,
On Thu, Feb 6, 2020 at 18:56, Mike Christie <mchristi(a)redhat.com> wrote:
On 02/05/2020 07:03 AM, Gesiel Galvão Bernardes wrote:
On Sun, Feb 2, 2020 at 00:37, Gesiel Galvão Bernardes
<gesiel.bernardes(a)gmail.com> wrote:
Hi,
I was only now able to continue with this. Below is the requested
information. Thanks in advance.
Hey, sorry for the late reply. I just got back from PTO.
esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
naa.6001405ba48e0b99e4c418ca13506c8e
   Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
   Storage Array Type: VMW_SATP_ALUA
   Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
   Path Selection Policy: VMW_PSP_MRU
   Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
   Path Selection Policy Device Custom Config:
   Working Paths: vmhba68:C0:T0:L0
   Is USB: false
........
Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER
Are you sure you are using tcmu-runner 1.4? Is that the actual daemon
version running? Did you by any chance install the 1.4 rpm without the
daemon being restarted afterwards? The error code above is returned in
1.3 and earlier.
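For readers following along, the status line quoted above can be decoded with a small sketch like the one below. It assumes the line has the same "H:/D:/P:/Valid sense data" layout as the excerpt, and the lookup tables are deliberately incomplete: they list only a few standard SCSI codes relevant to ALUA failover. The function name decode_status is made up for illustration.

```python
import re

# Partial lookup tables (standard SCSI values); only a few codes are listed.
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEY = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x04, 0x0A): "LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION",
    (0x04, 0x0B): "LOGICAL UNIT NOT ACCESSIBLE, TARGET PORT IN STANDBY STATE",
}

HEX = r"(0x[0-9a-fA-F]+)"
PATTERN = re.compile(
    rf"H:{HEX}\s+D:{HEX}\s+P:{HEX}\s+Valid sense data:\s+{HEX}\s+{HEX}\s+{HEX}"
)


def decode_status(line):
    """Decode host/device/plugin status and sense data from one log line.

    Returns None if the line does not match the assumed layout.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    host, device, plugin, key, asc, ascq = (int(g, 16) for g in match.groups())
    return {
        "host": host,
        "plugin": plugin,
        "device_status": DEVICE_STATUS.get(device, hex(device)),
        "sense_key": SENSE_KEY.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }


if __name__ == "__main__":
    line = "Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER"
    print(decode_status(line))
```

Read this way, the line is a CHECK CONDITION with sense NOT READY and 0x4/0xa, i.e. the logical unit reporting an asymmetric access state transition.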
You are probably hitting a combo of 2 issues.
We had only listed ESX 6.5 in the docs you probably saw, and in 6.7 the
value of action_OnRetryErrors defaulted to on instead of off. You should
set this back to off.
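To check the current value without reading the config string by eye, a minimal sketch such as the following could be used. It assumes the "Storage Array Type Device Config" value has been copied out of the esxcli output as a plain string; parse_device_config is a hypothetical helper, not part of any VMware tooling.

```python
import re


def parse_device_config(config):
    """Collect every key=value pair in a SATP device config string.

    Nested groups such as {TPG_id=1,TPG_state=ANO} are flattened into the
    same dictionary; that is good enough for a quick check.
    """
    return dict(re.findall(r"(\w+)=(\w+)", config))


if __name__ == "__main__":
    # Config string taken from the device listing earlier in this thread.
    config = (
        "{implicit_support=on; explicit_support=off; explicit_allow=on; "
        "alua_followover=on; action_OnRetryErrors=on; "
        "{TPG_id=1,TPG_state=ANO}}"
    )
    settings = parse_device_config(config)
    if settings.get("action_OnRetryErrors") == "on":
        print("action_OnRetryErrors is on; it should be set to off")
```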
You should also upgrade to the current version of tcmu-runner, 1.5.x. It
should fix the issue you are hitting: non-IO commands like INQUIRY,
RTPG, etc. are now executed while failing over/back, so you would no
longer hit the problem where path initialization and path testing IO
fails, causing the path to be marked as failed.
I updated tcmu-runner to 1.5.2 and changed action_OnRetryErrors to off,
but the problem continues 😭
Attached is vmkernel.log.
When you stopped the iSCSI gw at around 2020-02-09T01:51:25.820Z, how
many paths did your device have? Did:
esxcli storage nmp path list -d your_device
report only one path? Did
esxcli iscsi session connection list
show an iSCSI connection to each gw?
From the logs, it looks like when you brought the gw down, we lost the
only path we had. We then went into an all-paths-down state, so IO could
not execute. It looks like the gw was brought back up at the end of the
log and the path seems to have been added back.