Virtual machines stop responding when any LUN on the host is in an all-paths-down (APD) condition (1016626)
Solution
The issue is resolved in ESX/ESXi 4.1 Update 1 and the fix has also been included with ESXi 5.0.
This issue is resolved in the patch release for ESX 4.0. For more information, see VMware ESX 4.0, Patch ESX400-200912401-BG: Updates vmkernel, vmklinux, tools, CIM, and perftools (1016291).
Notes:
- With ESX/ESXi 4.1 Update 1 and ESX/ESXi 4.0 Update 3, you no longer have to make the modification to the advanced setting. Virtual machines that are not associated with the APD Volume(s) do not become unresponsive upon a rescan.
- When unpresenting a LUN containing a datastore, follow the instructions in Removing a LUN containing a datastore from VMware ESXi/ESX 4.x (1029786). If the issue still persists, contact VMware Technical Support.
Workaround
ESX/ESXi 4.x can list all of the LUNs it detects, as well as the state of these LUNs. If none of the paths to a storage device are in the ACTIVE state, then ESX/ESXi considers the device to be in an all-paths-down state. If an all-paths-down state does exist, then this is likely the issue causing LUNs to be unresponsive, either for a limited period of time or permanently, when a rescan occurs. For more information, see Identifying disks when working with VMware ESX (1014953).
If virtual machines are not responding on an ESX/ESXi 4.0 host, determine if an all-paths-down condition exists by executing:
# esxcfg-mpath --list-paths --device <device naa> | grep state
or
# esxcfg-mpath --list-paths --device <device mpx> | grep state
where:
- <device naa> is the Network Addressing Authority (NAA) unique address for the full storage device
- <device mpx> is the identifier if a NAA ID is not available
Note: For information about using the command line with ESXi, see Tech Support Mode for Emergency Support (1003677).
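The check described above can be scripted. The following is a minimal sketch, not a VMware-supplied tool: the function name `apd_status` is hypothetical, and it assumes that each path in the `esxcfg-mpath` listing is reported on a line containing the word "state" followed by a value such as active or dead. The exact output layout may differ between releases, so verify it on your host before relying on the result.

```shell
#!/bin/sh
# Hypothetical helper: classify a path listing read from stdin.
# Assumption: every path appears on a line containing the word "state"
# (matched case-insensitively) together with its value, for example
# "State: active" or "State: dead".
apd_status() {
    awk '
        {
            line = tolower($0)
            if (line !~ /state/) next      # skip lines that are not path states
            total++
            gsub(/[^a-z]+/, " ", line)     # keep letters only, then split into words
            n = split(line, w, " ")
            for (i = 1; i <= n; i++) {
                if (w[i] == "active") { active++; break }
            }
        }
        END {
            if (total == 0)      print "UNKNOWN"   # no state lines were found
            else if (active > 0) print "OK"        # at least one path is active
            else                 print "APD"       # no path is active
        }
    '
}

# Example usage on a host (replace <device naa> with the real identifier):
#   esxcfg-mpath --list-paths --device <device naa> | apd_status
```

A result of APD means that none of the listed paths is active, which matches the all-paths-down definition given earlier. UNKNOWN means that no state lines were found, for example because the device identifier is wrong.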
Starting with ESX/ESXi 4.0 Update 1, you can set an advanced configuration option on all hosts in the vCenter Server cluster to reduce rescan times and to prevent virtual machines from becoming unresponsive. By default, this option is disabled.
Caution: Not every all-paths-down condition is permanent. Some all-paths-down conditions, such as those that occur briefly during a network re-configuration, are transient. Enabling this option can cause devices in a transient all-paths-down state to become unavailable. It is recommended to disable this option after the rescan operation completes.
To enable this option, execute:
# esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD
To disable and reset to the default value without requiring downtime, execute:
# esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
To check the value of this option, execute:
# esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD
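Because the caution above recommends disabling the option once the rescan completes, the enable, rescan, and disable sequence can be wrapped so that the reset is not forgotten. This is a rough sketch under stated assumptions: the function name `with_fail_open_if_apd` and the `ADVCFG` override variable are hypothetical, and the adapter name in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical wrapper: enable FailVolumeOpenIfAPD only while a command runs.
# ADVCFG can be overridden for testing; it defaults to the real tool.
ADVCFG=${ADVCFG:-esxcfg-advcfg}
OPTION=/VMFS3/FailVolumeOpenIfAPD

with_fail_open_if_apd() {
    # Enable the option; give up if it cannot be set.
    "$ADVCFG" -s 1 "$OPTION" || return 1

    # Run the caller's command (for example a rescan) and remember its status.
    rc=0
    "$@" || rc=$?

    # Always reset the option to its default value of 0 afterwards.
    "$ADVCFG" -s 0 "$OPTION" || return 1
    return "$rc"
}

# Example usage (vmhba1 is an assumed adapter name):
#   with_fail_open_if_apd esxcfg-rescan vmhba1
#   esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD   # confirm the value is back to 0
```

The wrapper returns the exit status of the wrapped command, so a failed rescan is still reported after the option has been reset.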
To minimize the amount of time that the virtual machines are unresponsive, apply patch:
Note: This does not apply to ESX/ESXi 4.0 Update 2 and 4.1, because the patch is integrated in these versions.
With ESX 4.1 Update 1 and ESX 4.0 Update 3, you no longer have to make the modification to the advanced setting. Virtual machines that are not associated with the APD Volume(s) do not become unresponsive upon a rescan.

