Intermittent NFS APDs on VMware ESXi 5.5 U1 (2076392)
- Intermittent APDs for NFS datastores are reported, with consequent potential blue screen errors for Windows virtual machine guests and read-only file systems in Linux virtual machines.
Note: NFS volumes include VSA datastores.
- During and after the APD condition, the array still responds to ping, netcat tests succeed, and there is no evidence of a physical network or NFS storage array issue.
- The NFS storage array logs and traces do not indicate an issue.
- Hosts that are not running ESXi 5.5 U1 continue to work and can read and write to the NFS share.
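The netcat test mentioned above amounts to checking that a TCP connection to the NFS service can be opened. A minimal Python sketch of an equivalent check follows. It assumes the array serves NFS on the standard TCP port 2049; the function name tcp_port_open and the 3-second timeout are illustrative choices, not part of any VMware tooling.

```python
import socket


def tcp_port_open(host, port=2049, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 2049 is the standard NFS port; this is an assumption about the array.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, tcp_port_open("192.168.1.1") checks the array address used in the sample log entries. With this issue the check would be expected to succeed even while the datastore is in the APD state, which is what distinguishes it from a network or array outage.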
vobd.log files (located at /var/log/) contain entries similar to:
Note: These log entries use a volume named 12345678-abcdefg0 as an example:
YYYY-04-01T14:35:08.074Z: [APDCorrelator] 9413898746us: [vob.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
YYYY-04-01T14:35:08.075Z: [APDCorrelator] 9414268686us: [esx.problem.storage.apd.start] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down state.
YYYY-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect
YYYY-04-01T14:36:55.274Z: [vmfsCorrelator] 9521467867us: [esx.problem.vmfs.nfs.server.disconnect] 192.168.1.1/NFS-DS1 12345678-abcdefg0-0000-000000000000 NFS-DS1
YYYY-04-01T14:37:28.081Z: [APDCorrelator] 9553899639us: [vob.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
YYYY-04-01T14:37:28.081Z: [APDCorrelator] 9554275221us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [12345678-abcdefg0] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
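To check a host for these entries, the log can be scanned for the APD start and timeout events. The following Python sketch is one way to do that. It assumes every relevant line follows the layout of the sample entries above (timestamp, correlator in brackets, uptime in microseconds, event ID in brackets, message); the names LINE_RE, ID_RE, and apd_events are hypothetical and introduced here only for illustration.

```python
import re

# Layout assumed from the sample entries:
# <timestamp>: [<correlator>] <uptime>us: [<event id>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+): \[(?P<correlator>\w+)\] (?P<uptime_us>\d+)us: "
    r"\[(?P<event>[\w.]+)\] (?P<message>.*)$"
)
# The datastore identifier appears as "identifier [<id>]" in APD messages.
ID_RE = re.compile(r"identifier \[(?P<ident>[^\]]+)\]")


def apd_events(lines):
    """Yield (timestamp, event, identifier) for each APD start/timeout line."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or ".storage.apd." not in m.group("event"):
            continue  # not a correlator line, or not an APD event
        ident = ID_RE.search(m.group("message"))
        yield m.group("ts"), m.group("event"), ident.group("ident") if ident else None


# Two of the sample entries from this article, used as input.
sample = [
    "YYYY-04-01T14:35:08.074Z: [APDCorrelator] 9413898746us: "
    "[vob.storage.apd.start] Device or filesystem with identifier "
    "[12345678-abcdefg0] has entered the All Paths Down state.",
    "YYYY-04-01T14:36:55.274Z: No correlator for vob.vmfs.nfs.server.disconnect",
]
events = list(apd_events(sample))
```

On a live host, the same function can be fed the lines of a vobd.log file from /var/log/ instead of the sample list. Lines that do not match the assumed layout, such as the "No correlator" entry, are skipped.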
This article describes a specific issue. If you experience all the symptoms described in the Symptoms section, see the sections listed herein. If you experience some but not all of these symptoms, your issue is not related to this article. Remember to evaluate the storage to ensure that NFS storage is configured according to VMware best practices and that the network connections between hosts and SAN are not routed.
If you are unable to upgrade, VMware recommends using ESXi 5.5 GA with all appropriate security patches. For more information on patching ESXi 5.5 GA for the Heartbleed vulnerability, see Resolving OpenSSL Heartbleed for ESXi 5.5 - CVE-2014-0160 (2076665).
Note: Use the esxcli software vib list command to list all applied patches on a host.
For more information, see Restarting the Management agents on an ESXi or ESX host (1003490).
- Virtual machines stop responding when any LUN on the host is in an all-paths-down (APD) condition (1016626)