SCSI events that can trigger ESX server to fail a LUN over to another path (1003433)
- 0/1 0x0 0x0 0x0 - DID_NO_CONNECT
When the fabric returns the DID_NO_CONNECT status, the ESX host has detected that a target is no longer present. The DID_NO_CONNECT status can be caused by a fabric switch failure, a disconnected physical cable, or a zoning change that no longer allows the ESX host to see the array.
Note: This is the only sense code that affects both Active/Active and Active/Passive arrays. The other sense codes trigger a failover only on Active/Passive arrays.
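The code notation used in this article can be split into numeric fields for scripting. The following is a minimal Python sketch, assuming the `a/b` prefix is device status/host status and the three hexadecimal values are sense key, ASC, and ASCQ. That reading is inferred from the entries in this article, not stated by it. `ScsiCode` and `parse_code` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class ScsiCode(NamedTuple):
    """Numeric fields of one code entry (field meanings are an assumption)."""
    device_status: int
    host_status: int
    sense_key: int
    asc: int
    ascq: int


# Matches entries such as "- 0/1 0x0 0x0 0x0 - DID_NO_CONNECT".
# The leading "- " bullet and the trailing description are optional.
_CODE_RE = re.compile(
    r"^\s*(?:-\s*)?([0-9a-fA-F]+)/([0-9a-fA-F]+)"
    r"\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)


def parse_code(text: str) -> ScsiCode:
    """Parse the numeric part of a code entry into a ScsiCode."""
    match = _CODE_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognizable code entry: {text!r}")
    # int(..., 16) accepts both "2" and "0x2".
    return ScsiCode(*(int(group, 16) for group in match.groups()))


if __name__ == "__main__":
    print(parse_code("- 0/1 0x0 0x0 0x0 - DID_NO_CONNECT"))
```

Keeping the fields numeric makes it straightforward to compare an entry against values extracted from a log file.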
- 2/0 0x3 0x4 0x3 - MEDIUM ERROR - LOGICAL UNIT NOT READY
2/0 0x5 0x4 0x3 - ILLEGAL REQUEST - LOGICAL UNIT NOT READY
The medium error and illegal request indicate that the LUN is not in a ready state. Manual intervention is required on the array to correct this issue.
- 2/0 0x2 0x4 0xa - NOT_READY - LUN IS NOT READY AND TARGET PORT IS IN TRANSITION
The "LUN is not ready and target port is in transition" error can occur under the following conditions:
- The ownership of a LUN is being transitioned between storage processors on an active/passive array.
- The LUN is in a transitional state. For example, when a LUN is being created for the first time.
ESX 3.5 introduced a new failover condition that allows the host to recognize when an EMC Clariion SP is hung and to issue additional commands to verify its status. When this SCSI code is captured, the ESX host queries the peer SP to see if the original one is alive. If the peer SP cannot get a response, a failover is initiated and the SP is marked as hung/dead.
Note: The 0x7 0x0 0x0 0x0 code can also appear for other arrays; however, this does not necessarily mean that the storage controller is offline or not functioning. A storage initiator error can occur for a number of reasons and should be investigated as a separate issue with our support team if you feel these messages are causing a problem in your environment. In the vmkernel.log file, you may see errors similar to:
H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0
Note: There is a known issue with some Emulex firmware that can result in this host code being returned. For more information, see When using Emulex HBAs, SCSI commands fail with the status: Storage Initiator Error (1029456).
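When reviewing many log entries, it can help to extract the status fields programmatically. The following is a minimal Python sketch that parses the status portion of a vmkernel.log line like the example above. It assumes that `H`, `D`, and `P` are the host, device, and plugin status, and that the three values after "Valid sense data:" are the sense key, ASC, and ASCQ. `StatusLine` and `parse_status` are hypothetical names, and real log lines may carry additional text that this pattern ignores.

```python
import re
from typing import NamedTuple, Optional, Tuple


class StatusLine(NamedTuple):
    host: int
    device: int
    plugin: int
    # (sense key, ASC, ASCQ), or None when the line carries no sense data.
    sense: Optional[Tuple[int, int, int]]


_HEX = r"(0x[0-9a-fA-F]+)"
_STATUS_RE = re.compile(
    rf"H:{_HEX}\s+D:{_HEX}\s+P:{_HEX}"
    rf"(?:\s+Valid sense data:\s+{_HEX}\s+{_HEX}\s+{_HEX})?"
)


def parse_status(line: str) -> Optional[StatusLine]:
    """Return the status fields found in a log line, or None if absent."""
    match = _STATUS_RE.search(line)
    if match is None:
        return None
    host, device, plugin = (int(match.group(i), 16) for i in (1, 2, 3))
    sense = None
    if match.group(4) is not None:
        sense = tuple(int(match.group(i), 16) for i in (4, 5, 6))
    return StatusLine(host, device, plugin, sense)


if __name__ == "__main__":
    print(parse_status("H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0"))
```

The function returns `None` for lines without a status triplet, so it can be applied to every line of a log file and the results filtered afterwards.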
This code is specific to LSI-based arrays (IBM FAStT, IBM DS4000 series, SUN StorageTek) and implies that a request has been made to the non-owning storage controller. Because AVT (Auto-Volume Transfer) is disabled, the ESX host handles the condition and fails over to the other controller.
ESXi 5.0 introduced a new sense code to deal with the Permanent Device Loss (PDL) condition. LOGICAL UNIT NOT SUPPORTED is usually set when a LUN is no longer available or is unmapped.
2/0 0x2 0x4 0x1 - NOT READY - LOGICAL UNIT IS IN PROCESS OF BECOMING READY
IBM FAStT LSI-based arrays use this sense code during a Non-Disruptive Firmware Upgrade to tell the host to stop using each SP in turn.
If the ESX host receives a SCSI code other than those listed in this article, a failover does not occur.
2/0 0x6 0x2a 0x6 - UNIT ATTENTION - ASYMMETRIC ACCESS STATE CHANGED
This sense code is specific to ALUA environments configured with alua_followover enabled. The path changed from optimized to unoptimized, so a failover occurs.
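The codes that appear with explicit values in this article can be collected into a lookup table. The following is a minimal Python sketch, not a description of how ESX implements failover. It assumes each key is (device status, host status, sense key, ASC, ASCQ), includes only the codes whose values are printed in this article, and models only the Active/Active versus Active/Passive distinction from the DID_NO_CONNECT note. Array-specific conditions, such as the alua_followover setting, are not modeled. `FAILOVER_CODES` and `failover_trigger` are hypothetical names.

```python
from typing import Optional, Tuple

Code = Tuple[int, int, int, int, int]

# Only codes whose numeric values are printed in this article.
FAILOVER_CODES = {
    (0x0, 0x1, 0x0, 0x00, 0x0): "DID_NO_CONNECT",
    (0x2, 0x0, 0x3, 0x04, 0x3): "MEDIUM ERROR - LOGICAL UNIT NOT READY",
    (0x2, 0x0, 0x5, 0x04, 0x3): "ILLEGAL REQUEST - LOGICAL UNIT NOT READY",
    (0x2, 0x0, 0x2, 0x04, 0xA): "NOT_READY - LUN IS NOT READY AND TARGET PORT IS IN TRANSITION",
    (0x2, 0x0, 0x2, 0x04, 0x1): "NOT READY - LOGICAL UNIT IS IN PROCESS OF BECOMING READY",
    (0x2, 0x0, 0x6, 0x2A, 0x6): "UNIT ATTENTION - ASYMMETRIC ACCESS STATE CHANGED",
}

# Per the note on DID_NO_CONNECT, it is the only code that affects
# both Active/Active and Active/Passive arrays.
ALL_ARRAY_TYPES = {(0x0, 0x1, 0x0, 0x00, 0x0)}


def failover_trigger(code: Code, active_passive: bool) -> Optional[str]:
    """Return the code's label if it is a listed failover trigger, else None."""
    label = FAILOVER_CODES.get(code)
    if label is None:
        return None  # unlisted codes do not cause a failover
    if code in ALL_ARRAY_TYPES or active_passive:
        return label
    return None


if __name__ == "__main__":
    print(failover_trigger((0x0, 0x1, 0x0, 0x00, 0x0), active_passive=False))
    print(failover_trigger((0x2, 0x0, 0x2, 0x04, 0xA), active_passive=True))
```

A table keyed on the full tuple keeps the check exact: a code that differs in any field, such as the 0x6 0x29 0x0 sense data in the log example, does not match.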
For more information about ALUA environments, see: