SCSI events that can trigger ESX server to fail a LUN over to another path (1003433)
Several conditions trigger the ESX server to fail over to another available path:
0/1 0x0 0x0 0x0 - DID_NO_CONNECT
When the fabric returns the DID_NO_CONNECT status, the ESX host has detected that a target is no longer present. The DID_NO_CONNECT status is caused by a fabric switch failure, a disconnected physical cable, or a zoning change that no longer allows the ESX host to see the array.
Note: This is the only sense code that affects both Active/Active and Active/Passive arrays. The other sense codes only trigger fail over on Active/Passive arrays.
2/0 0x2 0x4 0x3 - NOT READY - LOGICAL UNIT NOT READY
2/0 0x5 0x4 0x3 - ILLEGAL REQUEST - LOGICAL UNIT NOT READY
Both codes indicate that the LUN is not in a ready state (sense key 0x2 is NOT READY and sense key 0x5 is ILLEGAL REQUEST). Manual intervention is required on the array to correct this issue.
2/0 0x2 0x4 0xa - NOT_READY - LUN IS NOT READY AND TARGET PORT IS IN TRANSITION
This error can occur under the following conditions:
The ownership of a LUN is being transitioned between storage processors on an active/passive array.
The LUN is in a transitional state. For example, when a LUN is being created for the first time.
0/7 0x0 0x0 0x0 - INTERNAL ERROR - DID_ERROR (Storage Initiator Error)
ESX 3.5 introduced a failover condition that allows the host to recognize when an EMC CLARiiON storage processor (SP) is hung and to issue additional commands to verify its status. When this SCSI code is captured, the ESX host queries the peer SP to see whether the original SP is alive. If the peer SP cannot get a response, a failover is initiated and the original SP is marked as hung/dead.
Note: The 0/7 0x0 0x0 0x0 code can also appear with other arrays; however, this does not necessarily mean that the storage controller is offline or not functioning. A storage initiator error can occur for a number of reasons and should be investigated as a separate issue with our support team if you believe these messages are causing a problem in your environment. In the vmkernel.log file, you may see errors similar to:
H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0
Note: There is a known issue with some Emulex firmware that can result in this host code being returned. For more information, see When using Emulex HBAs, SCSI commands fail with the status: Storage Initiator Error (1029456).
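To compare a vmkernel.log entry against the codes in this article, the status fragment can be pulled apart programmatically. The following is a minimal sketch, not a supported tool. It assumes the field layout of the sample line above (H = host status, D = device status, P = plugin status), and it assumes that the "d/h" prefix used in this article is device status followed by host status. The names parse_status and kb_notation are hypothetical.

```python
import re
from typing import NamedTuple, Optional

HEX = r"(0x[0-9a-fA-F]+)"

# Pattern derived from the sample log fragment in this article, not from a
# formal log specification. The sample shows "Valid sense data"; accepting
# "Possible sense data" as well is an assumption.
STATUS_RE = re.compile(
    rf"H:{HEX}\s+D:{HEX}\s+P:{HEX}\s+"
    rf"(?:Valid|Possible) sense data:\s+{HEX}\s+{HEX}\s+{HEX}"
)


class ScsiStatus(NamedTuple):
    host: int       # H: host (initiator/fabric) status
    device: int     # D: device status (0x2 = CHECK CONDITION)
    plugin: int     # P: plugin status
    sense_key: int
    asc: int        # additional sense code
    ascq: int       # additional sense code qualifier


def parse_status(line: str) -> Optional[ScsiStatus]:
    """Return the status fields found in a log line, or None if absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return ScsiStatus(*(int(group, 16) for group in match.groups()))


def kb_notation(status: ScsiStatus) -> str:
    """Format a status as 'device/host sense-key ASC ASCQ' (assumed order)."""
    return (
        f"{status.device:x}/{status.host:x} "
        f"0x{status.sense_key:x} 0x{status.asc:x} 0x{status.ascq:x}"
    )


if __name__ == "__main__":
    sample = "H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0"
    print(kb_notation(parse_status(sample)))  # prints: 2/0 0x6 0x29 0x0
```

The plugin status is parsed but not used in the formatted string, because the codes listed in this article do not include it.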
2/0 0x5 0x94 0x1 - ILLEGAL REQUEST - SCSI_ASC_INVALID_REQ_DUE_TO_CURRENT_LU_OWNERSHIP
This code is specific to LSI-based arrays (IBM FastT, IBM DS4000 series, SUN StorageTek) and implies that a request has been made to the non-owning storage controller. Because AVT (Auto-Volume Transfer) is disabled, the ESX host handles the condition and fails over to the other controller.
2/0 0x5 0x25 0x0 - ILLEGAL REQUEST - LOGICAL UNIT NOT SUPPORTED
ESXi 5.0 introduced a new sense code to handle Permanent Device Loss (PDL) conditions. LOGICAL UNIT NOT SUPPORTED is usually set when a LUN is no longer available or is unmapped.
If the ESX host receives a SCSI code other than those listed above, a failover does not occur.
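The list above can be summarized as a lookup table. The sketch below is illustrative only and does not reproduce the ESX implementation. It encodes the codes from this article as (device status, host status, sense key, ASC, ASCQ) tuples and applies the rule that only DID_NO_CONNECT affects Active/Active arrays. The tuple order, the array-type strings, and the function name triggers_failover are assumptions made for the example.

```python
from typing import Dict, NamedTuple, Tuple

Code = Tuple[int, int, int, int, int]  # (device, host, sense key, ASC, ASCQ)


class Condition(NamedTuple):
    name: str
    active_active: bool  # True if it also triggers failover on Active/Active


# Codes taken from this article. Array-specific conditions (DID_ERROR on
# EMC Clariion, the 0x94 code on LSI-based arrays) are listed without
# checking the array vendor, which a real implementation would need to do.
FAILOVER_CODES: Dict[Code, Condition] = {
    (0, 1, 0x0, 0x0, 0x0): Condition("DID_NO_CONNECT", True),
    (2, 0, 0x2, 0x4, 0x3): Condition("LOGICAL UNIT NOT READY", False),
    (2, 0, 0x5, 0x4, 0x3): Condition("LOGICAL UNIT NOT READY", False),
    (2, 0, 0x2, 0x4, 0xA): Condition(
        "LUN NOT READY, TARGET PORT IN TRANSITION", False),
    (0, 7, 0x0, 0x0, 0x0): Condition("DID_ERROR", False),
    (2, 0, 0x5, 0x94, 0x1): Condition(
        "SCSI_ASC_INVALID_REQ_DUE_TO_CURRENT_LU_OWNERSHIP", False),
    (2, 0, 0x5, 0x25, 0x0): Condition("LOGICAL UNIT NOT SUPPORTED", False),
}


def triggers_failover(code: Code, array_type: str) -> bool:
    """Return True if the article lists this code as a failover trigger.

    array_type is "active/active" or "active/passive" (hypothetical labels).
    """
    condition = FAILOVER_CODES.get(code)
    if condition is None:
        return False  # any code not listed does not cause a failover
    if array_type == "active/active":
        return condition.active_active
    return True


if __name__ == "__main__":
    print(triggers_failover((0, 1, 0, 0, 0), "active/active"))         # True
    print(triggers_failover((2, 0, 0x2, 0x4, 0x3), "active/active"))   # False
    print(triggers_failover((2, 0, 0x2, 0x4, 0x3), "active/passive"))  # True
```

Note that for DID_ERROR the table only marks the code as a candidate; as described above, the host first queries the peer SP before initiating a failover.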