This article explains the Lost access to volume messages that ESXi logs and the heartbeat mechanism that triggers them.
VMFS datastores are monitored through heartbeats, which each host issues as write operations to the VMFS volume approximately once every 3 seconds. Each ESXi host accessing a VMFS datastore expects a heartbeat write I/O to complete within an 8-second window. If the heartbeat I/O does not complete within that window, the I/O is timed out and a subsequent heartbeat I/O is issued. If the heartbeat I/O still has not completed within a total window of 16 seconds, the datastore is marked offline and hostd logs a Lost access to volume message to reflect this behavior.
After a VMFS datastore is marked offline, ESXi issues heartbeat I/O to the datastore approximately once every second until connectivity is restored. When a heartbeat I/O completes, the datastore is marked back online and host I/O is allowed to continue.
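The timing rules described above can be summarized in a small model. This is only an illustrative sketch of the thresholds stated in this article, not how ESXi implements heartbeating. The function name, the state labels, and the handling of the exact boundary values are assumptions made for the example.

```python
# Illustrative model of the heartbeat thresholds described in this article.
# All names here are hypothetical; this is not ESXi code.

HEARTBEAT_INTERVAL_S = 3        # heartbeat writes are issued about every 3 seconds
IO_TIMEOUT_S = 8                # a single heartbeat I/O is expected within 8 seconds
OFFLINE_THRESHOLD_S = 16        # total window before the datastore is marked offline
RECOVERY_PROBE_INTERVAL_S = 1   # probe interval while the datastore is offline


def datastore_state(seconds_without_completed_heartbeat: float) -> str:
    """Classify a datastore by how long its heartbeat I/O has been outstanding.

    Assumption: the boundary values (exactly 8 s or 16 s) are treated as
    already exceeding the window; the article does not specify this.
    """
    if seconds_without_completed_heartbeat < IO_TIMEOUT_S:
        return "online"
    if seconds_without_completed_heartbeat < OFFLINE_THRESHOLD_S:
        return "heartbeat timed out, retrying"
    return "offline"


def next_heartbeat_delay(state: str) -> int:
    """Return the approximate delay before the next heartbeat I/O is issued."""
    if state == "offline":
        return RECOVERY_PROBE_INTERVAL_S
    return HEARTBEAT_INTERVAL_S
```

For example, a heartbeat that has been outstanding for 10 seconds falls in the retry window, while one outstanding for 20 seconds corresponds to the point at which the Lost access to volume message is logged.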
Symptoms:
- Virtual machines display as inaccessible.
- In the /var/log/hostd.log file, you see entries similar to:
2015-07-02T02:00:11.675Z [4F1E1B70 info 'Vimsvc.ha-eventmgr'] Event 205 : Lost access to volume 54f89e21-4427e506-b968-a0369f519998 (228.154.ds3) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
2015-07-02T02:00:37.055Z [4F480B70 info 'Vimsvc.ha-eventmgr'] Event 210 : Successfully restored access to volume 54f89e21-4427e506-b968-a0369f519998 (example datastore) following connectivity issues.
- In the /var/log/vobd.log file, you see entries similar to:
2015-07-02T02:00:11.673Z: [vmfsCorrelator] 115715089142us: [esx.problem.vmfs.heartbeat.timedout] 54f89e21-4427e506-b968-a0369f519998 example datastore
2015-07-02T02:00:37.054Z: [vmfsCorrelator] 115740470730us: [esx.problem.vmfs.heartbeat.recovered] 54f89e21-4427e506-b968-a0369f519998 example datastore
- In the /var/log/vmkernel.log file, you see entries similar to:
2015-07-02T02:00:11.282Z cpu10:36273)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3444736 gen 549 stampUS 115704005679 uuid 5592d754-21d7d8a7-0a7e-a0369f519998 jrnl <FB 779600> drv 14.60] on vol 'example datastore'
2015-07-02T02:00:37.054Z cpu26:32873)HBX: 258: Reclaimed heartbeat for volume 54f89e21-4427e506-b968-a0369f519998 (example datastore): [Timeout] Offset 3444736
- In vCenter Server, you see an event similar to:
Lost access to volume 54f89e21-4427e506-b968-a0369f519998 (example datastore) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
Note: The preceding log excerpts are only examples. Dates, times, and environment-specific values such as volume UUIDs and datastore names vary depending on your environment.
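To gauge how long each loss of access lasted, the timedout and recovered entries in vobd.log can be paired by volume UUID. The following is a minimal sketch that assumes the line layout matches the vobd.log excerpts shown above. The function name and the regular expression are written for this example and are not part of any VMware tool, and lines in other layouts are skipped.

```python
import re
from datetime import datetime

# Matches vobd.log heartbeat events in the layout shown in this article.
# Assumption: other ESXi versions may format these lines differently.
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z: \[vmfsCorrelator\] \d+us: "
    r"\[esx\.problem\.vmfs\.heartbeat\.(timedout|recovered)\] (\S+)"
)


def heartbeat_outages(lines):
    """Return (volume_uuid, seconds) for each timedout/recovered pair."""
    pending = {}   # volume UUID -> time of the first unrecovered timeout
    outages = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # not a heartbeat event, or an unexpected layout
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        event, uuid = match.group(2), match.group(3)
        if event == "timedout":
            pending.setdefault(uuid, stamp)
        elif uuid in pending:
            start = pending.pop(uuid)
            outages.append((uuid, (stamp - start).total_seconds()))
    return outages


# The two vobd.log example lines from this article.
SAMPLE = [
    "2015-07-02T02:00:11.673Z: [vmfsCorrelator] 115715089142us: "
    "[esx.problem.vmfs.heartbeat.timedout] "
    "54f89e21-4427e506-b968-a0369f519998 example datastore",
    "2015-07-02T02:00:37.054Z: [vmfsCorrelator] 115740470730us: "
    "[esx.problem.vmfs.heartbeat.recovered] "
    "54f89e21-4427e506-b968-a0369f519998 example datastore",
]

if __name__ == "__main__":
    for uuid, seconds in heartbeat_outages(SAMPLE):
        print(f"{uuid}: access lost for about {seconds:.1f} seconds")
```

On a host, the same function could be applied to the lines of /var/log/vobd.log after copying the file to a system with Python available. Frequent or long outages for the same volume typically point toward the storage path rather than the host.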