Race condition on ESXi 6.0 and 6.5 causing disruption to VMs or hostd management service

Article ID: 317641

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • On ESXi 6.0 and 6.5, a VM fails to start or consolidate with the error: File system specific implementation of LookupAndOpen[file] failed.
  • In the /var/run/log/vmkernel.log file, you see entries similar to:
           "CBT Disconnect Node failed for [...] with Busy" 
2017-09-20T06:12:40.917Z cpu69:68776 opID=c721f607)FDS: 586: Enabling IO coalescing on driver 'deltadisks' device 'c0060fe-ISOM-HTML-19-sesparse.vmdk'
2017-09-20T06:12:41.088Z cpu69:68776 opID=c721f607)DD: 726: Destroying cow device c0060fe-ISOM-HTML-19-sesparse.vmdk failed: Busy
  • In the /var/run/log/hostd.log file, you see entries similar to:
2017-09-20T06:12:41.088Z info hostd[ABC2B70] [Originator@6876 sub=DiskLib opID=6e44f205-bb-4683 user=vpxuser:vpxuser] DISKLIB-CHAINESX : ChainESXCloseSubChain: failed to unlink '/vmfs/devices/deltadisks/c0060fe-ISOM-HTML-19-sesparse.vmdk' : error 16
     
NOTE: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
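
To check a host for the symptoms listed above, the log signatures can be searched for with a short script. The following is a minimal Python sketch, not a VMware-provided tool. The regular expressions are assumptions derived from the example excerpts, and the names SIGNATURES, find_signatures, and scan_file are hypothetical. Actual log wording may differ between builds.

```python
import re

# Patterns derived from the example excerpts in this article (assumed, not official).
SIGNATURES = {
    "cbt_disconnect_busy": re.compile(r"CBT Disconnect Node failed for .* with Busy"),
    "cow_destroy_busy": re.compile(r"Destroying cow device (\S+) failed: Busy"),
    "unlink_error_16": re.compile(
        r"ChainESXCloseSubChain: failed to unlink '([^']+)' : error 16"
    ),
}


def find_signatures(lines):
    """Return (signature_name, line) pairs for each line that matches a signature."""
    hits = []
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((name, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one log file, for example a copy of vmkernel.log or hostd.log."""
    with open(path, errors="replace") as handle:
        return find_signatures(handle)
```

For example, calling scan_file on copies of vmkernel.log and hostd.log taken from the host returns the matching lines. A "Busy" entry in vmkernel.log and an "error 16" entry in hostd.log that refer to the same delta disk would be consistent with this issue.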

Cause

The child disk inherits descriptors of files that the parent opened with exclusive locks, which causes a race condition.

Resolution

This issue is resolved in ESXi 6.5 patch release ESXi650-201710401 and ESXi 6.0 patch release ESXi600-201909001. To download the patches, go to Customer Connect.

Workaround:
To work around this issue, use either of the following options:
1. Restart the ESXi management agents.
2. Delete and re-provision the VMs from Horizon View.
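
Option 1 could be scripted as shown in the minimal Python sketch below. It assumes the script runs in the ESXi Shell and uses the commonly documented init scripts /etc/init.d/hostd and /etc/init.d/vpxa. The names AGENT_COMMANDS and restart_management_agents are hypothetical, and the dry_run default is an assumption chosen so that nothing is restarted by accident.

```python
import subprocess

# Init scripts commonly used to restart the hostd and vpxa agents from the
# ESXi Shell. Verify that they exist on your build before running for real.
AGENT_COMMANDS = [
    ["/etc/init.d/hostd", "restart"],
    ["/etc/init.d/vpxa", "restart"],
]


def restart_management_agents(dry_run=True):
    """Restart hostd, then vpxa. With dry_run=True, only report the commands."""
    handled = []
    for command in AGENT_COMMANDS:
        if not dry_run:
            # check=True raises CalledProcessError if a restart fails.
            subprocess.run(command, check=True)
        handled.append(" ".join(command))
    return handled
```

Calling restart_management_agents() with the default dry_run=True only returns the command strings. Passing dry_run=False runs them, which briefly disconnects the host from vCenter Server while the agents restart.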

Additional Information

Impact/Risks:
None