VMware ESXi 6.0, Patch ESXi600-201611401-BG: Updates esx-base, vsanhealth, vsan VIBs (2146985)

Details

Release date: Nov 22, 2016

Patch Category: Bugfix
Patch Severity: Critical
Build: For build information, see KB 2146984.
Host Reboot Required: Yes
Virtual Machine Migration or Shutdown Required: Yes
Affected Hardware: N/A
Affected Software: N/A
VIBs Included:
  • VMware:esx-base:6.0.0-2.52.4600944
  • VMware:vsanhealth:6.0.0-3000000.3.0.2.52.4527730
  • VMware:vsan:6.0.0-2.52.4590152
PRs Fixed: 1330822, 1469999, 1493315, 1573217, 1594773, 1595237, 1603125, 1619422, 1621662, 1629888, 1645433, 1646568, 1650027, 1652234, 1654155, 1655064, 1656412, 1662046, 1663640, 1667416, 1668717, 1670367, 1672017, 1674528, 1676797, 1677228, 1680966, 1681955, 1686092, 1689017, 1689846, 1690686, 1693509, 1694463, 1694997, 1698685, 1700411, 1703090, 1704706, 1705219, 1719673, 1725122, 1732575, 1745380, 1753167
Related CVE numbers: N/A
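
The VIB identifiers listed under VIBs Included follow a vendor:name:version pattern. As a minimal sketch, the Python below splits those identifiers and reports which patch VIBs are not at the listed version. The `installed` mapping is a hypothetical input (for example, name and version pairs collected from the output of `esxcli software vib list`), and the comparison is an exact string match, so a host that already runs a later patch is also reported.

```python
from typing import Dict, List, NamedTuple


class VibId(NamedTuple):
    vendor: str
    name: str
    version: str


# Identifiers copied from the "VIBs Included" list of this article.
PATCH_VIBS = [
    "VMware:esx-base:6.0.0-2.52.4600944",
    "VMware:vsanhealth:6.0.0-3000000.3.0.2.52.4527730",
    "VMware:vsan:6.0.0-2.52.4590152",
]


def parse_vib_id(text: str) -> VibId:
    """Split a 'vendor:name:version' identifier into its three fields."""
    parts = text.strip().split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected vendor:name:version, got {text!r}")
    return VibId(*parts)


def vibs_not_at_patch_level(installed: Dict[str, str]) -> List[str]:
    """Return names of patch VIBs whose installed version differs from the patch.

    `installed` maps VIB name to version string (hypothetical input format).
    """
    expected = [parse_vib_id(item) for item in PATCH_VIBS]
    return [vib.name for vib in expected if installed.get(vib.name) != vib.version]
```

For example, a host that reports only esx-base at 6.0.0-2.52.4600944 would still have vsanhealth and vsan flagged by `vibs_not_at_patch_level`.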

Solution

Summaries and Symptoms

This patch updates the esx-base, vsanhealth, and vsan VIBs to resolve the following issues:

  • When you ping the local host in a Windows VM, the VM displays negative ping times.

  • Attempts to perform a vMotion for a VM with large memory size might fail with an Admission check failure. The issue occurs due to failure in increasing the memory reservation for regular swap bitmap page pool for the copy-on-write (COW) blocks. You can see an error log similar to the following in the vmware.log file:

    nnnn-nn-nnT14:54:39.nnnZ| vcpu-0| W110: MigratePlatformSaveMemory: Admission check failed for memory resource

  • The hostd service might stop responding and fail due to a memory leak in vmkctl, which causes hostd to consume more memory. An error message similar to the following is displayed:

    Memory exceeds hard limit

  • Attempts to collect the output of the Python SMX provider scripts after you install vSphere 6.0 Update 1 (Build 3029758) on a BL920s Gen8 (8 Blade partition) server display a backtrace similar to the following:

    Traceback (most recent call last):
    File "SMXBaseServerFT.py", line 105, in
    assocs = conn.Associators( computerSystem, ResultClass = 'SMX_OperatingSystem', AssocClass = 'SMX_RunningOS')
    File "/build/mts/release/bora-3029758/bora/build/esx/release/vmvisor/sys/lib/python2.7/site-packages/pywbem/cim_operations.py", line 696, in Associators
    File "/build/mts/release/bora-3029758/bora/build/esx/release/vmvisor/sys/lib/python2.7/site-packages/pywbem/cim_operations.py", line 173, in imethodcall
    pywbem.cim_operations.CIMError: (0, "The web server returned a bad status line: ''''")


    For more information, see KB 2147525.

  • The lsu hpsa plugin does not support the Get disk location option in the HBA mode for the HP Smart Array Controller.

  • Windows 2016 Hyper-V VMs might stop responding when booted up on an AMD Opteron system.

  • Support is enabled for virtual machines with VM DirectPath I/O (passthrough) PCIe devices to use a total of more than 32GB of the 64-bit Memory-mapped I/O (MMIO) space. The required MMIO space is now user configurable. For more details, see KB 2142307.

  • In the vSphere Web Client, the Hardware Status tab under the Monitor option does not capture proper data for the Sensors, Alerts and warnings, and System event log fields.

  • An ESXi host joined to an Active Directory domain that has a large number of users and groups might become unresponsive, and you cannot connect to the host by using SSH or the vSphere Client. SSH and DCUI console access might also become unresponsive. The issue occurs because the likewise cache file consumes all the available space.

  • When the VAAI plugins are attached to VVol Protocol Endpoint (PE) LUNs, the XCOPY SCSI commands do not send the VVol details properly in the source and target device designator fields.

    The issue occurs because PEs are the input and output endpoints for the VVols. With a VAAI plugin attached to the PE, it is not possible to pass the VVol details of both the source and the target devices during an XCOPY from VMFS to VVol or from VVol to VVol.

    Note: Some of the parameters associated with multi-segment XCOPY support are passed through plugin options. To continue to issue multi-segment XCOPY, you need to retain those claim rules. In such cases, the plugin options are used even if the plugin is not attached.

  • The vSAN Health check plug-in might report the Component metadata health test as Failed. For more information, see KB 2145347.

  • The web GUI for the vSAN datastore summary displays the values on the vSAN Capacity page as zero when all the hosts in the cluster have a VM on a non-vSAN datastore.

  • Attempts to perform memory hot add on virtual machines configured with a virtual graphics processing unit (vGPU) fixed passthrough device might fail. The resulting error message does not reflect the actual cause of the failure. An error message similar to the following is seen:

    "the hot-plug operation failed" ; "invalid configuration for device '0'"

  • An ESXi host might disconnect from the vCenter Server and might not respond through the DCUI or the vSphere Client when hostd attempts to query the available disk for VMFS. The issue occurs due to a deadlock between some hostd threads, which causes the hostd service to stop responding.

  • ESXi 6.0 virtual machines might fail to power on during an HA failover because the virtual machine files cannot be locked.

  • After you perform a vMotion, the vCenter Server Data Object VirtualMachineRuntimeInfo property bootTime value is reset to Unset.

  • When you use the MegaCLI LSI RAID controllers, the openwsmand service might stop responding due to a memory leak in the ESXi WSMAN agent (Openwsman).

  • Attempts to perform Storage vMotion in an environment where the backend storage is VMAX, which supports hardware acceleration (XCOPY), might fail. The issue occurs due to an incorrect timeout value for SCSI commands after failing over from XCOPY.

  • An ESXi host where the vvold daemon is not in a running state might time out when attempting to update the vCenter Server GUID to vvold, and the host disconnects from the vCenter Server. You can see error messages similar to the following in the hostd.log file:

    nnnn-nn-nnT19:38:53.550Z verbose hostd[nnnnnnnn] [Originator@6876 sub=Default opID=nnnnnnnn-nnnnnnnn-nn-nn-eb-d8be user=vpxuser] AdapterServer: target='vim.VasaVvolManager:ha-vasa-manager', method='updateVasaClientContext'
    nnnn-nn-nnT19:38:53.551Z info hostd[nnnnnnnn] [Originator@6876 sub=Vimsvc.TaskManager opID=nnnnnnnn-nnnnnnnn-nn-nn-eb-d8be user=vpxuser] Task Created : haTask--vim.VasaVvolManager.updateVasaClientContext-511597
    nnnn-nn-nnT19:38:53.551Z verbose hostd[nnnnnnnn] [Originator@6876 sub=PropertyProvider opID=nnnnnnnn-nnnnnnnn-nn-nn-eb-d8be user=vpxuser] RecordOp ADD: recentTask["haTask--vim.VasaVvolManager.updateVasaClientContext-511597"], ha-taskmgr. Applied change to temp map.

  • Attempts to perform vMotion fail if the physical NICs of the Link Aggregation Group (LAG) uplinks are moved to normal distributed switch uplinks.

  • An ESXi host might stop responding and display a purple screen with backtrace similar to the following:

    #0 EtherswitchInitLookupCtx (etherswitch=etherswitch@entry=0xnnnnnnnnnnnn,
    #1 0xnnnnnnnnnnnnnnnn in EtherswitchPortDispatch (ps=<optimized out>,
    #2 0xnnnnnnnnnnnnnnnn in Portset_Input (pktList=0xnnnnnnnnnnnn,
    #3 Port_InputResume (port=port@entry=0xnnnnnnnnnnnn, prev=prev@entry=0x0,
    #4 0xnnnnnnnnnnnnnnnn in Port_Input (port=0xnnnnnnnnnnnn,
    #5 0xnnnnnnnnnnnnnnnn in TcpipTxDispatch (dispatch=<optimized out>,
    #6 0xnnnnnnnnnnnnnnnn in TcpipDispatchLoop (numProcessed=0xnnnnnnnnnnnn,
    #7 TcpipDispatch (wdt=wdt@entry=0x0, private=0xnnnnnnnnnnnn,
    #8 0xnnnnnnnnnnnnnnnn in TcpipDispatchWorld (private=<optimized out>)
    #9 0xnnnnnnnnnnnnnnnn in CpuSched_StartWorld (
    #10 0xnnnnnnnnnnnnnnnn in CpuSched_ArchStartWorld (destWorld=0xnnnnnnnnnnnn,

  • Repeatedly changing the filter policy on a VM with many disks results in a memory leak, which might cause the hostd service to fail.

  • When the ESXi host is in lockdown mode in a vSAN environment, the SIMS (Storage Infrastructure Management Service) is unable to connect to the hostd service. The issue might cause the vSAN management and the health service to fail.

  • The hostd service might stop responding if multiple threads change the StatsMetadata object simultaneously. You will see a backtrace similar to the following in the hostd coredump:

    #8 Statssvc::StatsMetadata::IsCounterEnabled (this=0xnnnnnnnn, counterId=-nnnnnnnnnn) at bora/vim/hostd/statssvc/statsMetadata.cpp:2251
    #9 0xnnnnnnnn in Statssvc::StatsMetadata::GetAllCountersInt (this=0xnnnnnnnn, counterIds=...) at
    bora/vim/hostd/statssvc/statsMetadata.cpp:2080

  • After you upgrade from ESXi 5.0 Update 3 to ESXi 6.0 Update 2, Windows VMs might display a blue screen on reboot. The issue is observed in CPUs that support xsave instructions.

  • During vMotion of the NSX Edge VM, the ESXi host might fail with a purple screen that contains messages similar to the following:

    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]MCSLockWithFlagsWork@vmkernel#nover+0x1 stack: 0xnnnnnnnnnnnn, 0x0,
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DVFilterRefFilter@com.vmware.vmkapi#v2_3_0_0+0x44 stack: 0x0, 0x4180
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DVFilterCheckpointGetSize@com.vmware.vmkapi#v2_3_0_0+0x78 stack: 0xb
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]VMotionRecv_GetDVFilterState@esx#nover+0x26 stack: 0xnnnnnnnnnnnn, 0
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]VMotionRecv_ExecHandler@esx#nover+0x12b stack: 0x14d, 0xnnnnnnnnnnnn
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]VMotionRecv_Helper@esx#nover+0x170 stack: 0x0, 0x77dc, 0x0, 0x0, 0x0
    cpu14:2999181)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

  • When you apply a filter policy with VM-specific filtering rules through port-specific configuration, you might encounter a DVS port error in which the virtual NIC disconnects and cannot be reconnected.

  • When you perform multiple VM operations, the Log Facility might fail due to thread name corruption. As a result, logging fails.

  • Attempts to run esxcli command might fail with an error message similar to the following in the hostd.log file:

    Error on command storage filesystem list. Error was Invalid name specified for MetaStructure: 'FilesystemVolume'

  • NSX Edge fails to send OSPF Hello packets when IGMP snooping is enabled on the vSphere Distributed Switch, which causes the OSPF adjacency to go down.

  • In an environment with Nutanix NFS storage, the secondary FT virtual machine fails to take over when the primary FT virtual machine is down. After you apply this patch, you can configure the RenameRPCTimeout parameter by running the following command:

    esxcfg-advcfg -s 20 /NFS/RenameRPCTimeout

    Note: As the host profile does not capture the RenameRPCTimeout parameter, the value of RenameRPCTimeout is not persistent in stateless environments.
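
As a minimal sketch of the RenameRPCTimeout step above, the Python below only builds the command lines. It does not run them, so it can be used from any workstation to generate text for the ESXi Shell or an SSH session. The set command mirrors the one shown in this article. The read-back command uses the `-g` option of esxcfg-advcfg to confirm the value after a reboot, which matters in stateless environments where the setting does not persist. The helper names are hypothetical.

```python
import shlex
from typing import List

# Option path exactly as given in this article.
OPTION_PATH = "/NFS/RenameRPCTimeout"


def set_command(value: int) -> List[str]:
    """Build the command that sets the option, mirroring the one shown above."""
    return ["esxcfg-advcfg", "-s", str(int(value)), OPTION_PATH]


def get_command() -> List[str]:
    """Build the command that reads the current value back."""
    return ["esxcfg-advcfg", "-g", OPTION_PATH]


def as_shell(argv: List[str]) -> str:
    """Render an argument list as a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print(as_shell(set_command(20)))
    print(as_shell(get_command()))
```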

  • A virtual machine might get killed with signal 9 when the BufferCacheAlloc causes the VMX thread to wait, and the watchdog detects the condition and kills the thread. An error message similar to the following is logged in the vmkernel.log file:

    The Kernel killed us because the FiltLibUpcall didn't complete for 120secs.
    nnnn-nn-nnT10:47:11.647Z cpu13:nnnnnnnnnn)WARNING: FiltModS: FiltMod_CheckUpcallThreads:839: Upcall thread 1001008528 did not signal liveness within last 120000 ms, suspect hang up. Killing associated cartel nnnnnnnnnn.


    This patch resolves the issue by providing the "IoFilterWatchdogTimeout" config option that allows you to either disable the watchdog by setting the value to 0 or to set it to a value between 120 seconds and 3600 seconds.
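
The accepted values for IoFilterWatchdogTimeout described above can be expressed as a small check. This is a sketch only: it assumes that the 120 to 3600 second range is inclusive at both ends, which the article does not state explicitly, and the function names are hypothetical.

```python
WATCHDOG_DISABLED = 0        # 0 disables the watchdog
WATCHDOG_MIN_SECONDS = 120   # lower bound given in this article
WATCHDOG_MAX_SECONDS = 3600  # upper bound given in this article


def is_valid_watchdog_timeout(seconds: int) -> bool:
    """Return True for 0 (disabled) or a value within 120..3600 seconds.

    Assumption: both range bounds are inclusive.
    """
    if seconds == WATCHDOG_DISABLED:
        return True
    return WATCHDOG_MIN_SECONDS <= seconds <= WATCHDOG_MAX_SECONDS


def describe_watchdog_timeout(seconds: int) -> str:
    """Summarize what a given setting would mean under the rule above."""
    if not is_valid_watchdog_timeout(seconds):
        return "invalid"
    if seconds == WATCHDOG_DISABLED:
        return "watchdog disabled"
    return f"watchdog fires after {seconds} seconds"
```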

  • When using vSANSparse snapshot, the ESXi host might stop responding and display a purple screen with the following backtrace:

    @BlueScreen: Spin count exceeded (vSANsparseCache-0xnnnnnnnnnnnn) - possible deadlock with PCPU 15
    Code start: 0xnnnnnnnnnnnn VMK uptime: 3:06:50:20.789
    Saved backtrace from: pcpu 15 SpinLock spin out NMI
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]rb_tree_iterate@vmkernel#nover+0x25 stack: 0xnnnnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]vSANSparseUpdateCache@com.vmware.vSAN#0.0.0.1+0xc8 stack: 0xnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]vSANSparseGWECb@com.vmware.vSAN#0.0.0.1+0x14d8 stack: 0xnnnnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DOMClient_GetWrittenExtentsCb@com.vmware.vSAN#0.0.0.1+0x1b stack: 0x
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DOMClientSendResponse@com.vmware.vSAN#0.0.0.1+0x44d stack: 0x0
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DOMOperationSendResponseToBackend@com.vmware.vSAN#0.0.0.1+0x19 stack
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]DOMOperationDispatch@com.vmware.vSAN#0.0.0.1+0xdc stack: 0xnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]vSANServerMainLoop@com.vmware.vSANutil#0.0.0.1+0x2bb stack: 0xnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]WorldletBHHandler@vmkernel#nover+0x51e stack: 0xnnnnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]BH_Check@vmkernel#nover+0xe1 stack: 0xnnnnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]CpuSchedIdleLoopInt@vmkernel#nover+0x182 stack: 0x8000
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]CpuSchedDispatch@vmkernel#nover+0x16b5 stack: 0xnnnnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]CpuSchedWait@vmkernel#nover+0x240 stack: 0x0
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]WorldWaitInt@vmkernel#nover+0x28e stack: 0x2001
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]UserObj_Poll@<None>#<None>+0x106 stack: 0xnnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]LinuxFileDesc_Ppoll@<None>#<None>+0x2c2 stack: 0xnnnnnnnn
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]User_LinuxSyscallHandler@<None>#<None>+0x25a stack: 0x3
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]User_LinuxSyscallHandler@vmkernel#nover+0x1d stack: 0x10b
    0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]gate_entry_@vmkernel#nover+0x0 stack: 0x0


  • The hostd service might stop responding and generate a core dump under /var/core. Error messages similar to the following might be displayed in the hostd.log file:

    Memory exceeds hard limit. Panic
    Lookupvm: Hostctl error retrieving VmkVm Obj for VM

  • The SCSI Unmap command issued from the guest OS to a disk with IO filters attached might result in data loss due to a race condition.

  • An ESXi host might fail to list the Active Directory Domain users for assigning permissions in the permission tab if the DNS name of the Host is different from the DNS name of the Active Directory Domain. An error message similar to the following is logged in the hostd.log file:

    Error accessing directory: Can't bind to LDAP server for domain

  • Attempts to open multiple virtual machines with VMRC connections might fail. An error message similar to the following might be displayed:

    Unable to connect to the MKS: Connection terminated by server.

    The issue occurs when vsandevicemonitord runs using the Init process memory resource pool instead of its own memory space. vsandevicemonitord might affect the available memory for other processes running in the Init memory resource pool.

  • Attempts to apply a Host Profile with the scratch configuration pointing to an NFS datastore might fail with an error message similar to the following if the NFS datastore is not manually mounted first:

    Cannot apply the host configuration

    The issue occurs because the scratch partition is located on an NFS datastore. Attempts to update the scratch location result in the error because the scratch location is not available on the ESXi host when host profiles configure the setting.

  • SCSI operations might fail with out-of-memory messages when the storage heap of the host is exhausted. You can see error messages similar to the following in the /var/log/vmkernel.log file:

    vmkernel.6:nnnn-nn-nnT21:27:33.888Z cpu0:nnnnn)WARNING: ScsiPeriodicProbe: 1125: Failed to issue device reclaim request to helper queue: Out of memory
    vmkernel.6:nnnn-nn-nnT21:27:33.888Z cpu0:nnnnn)WARNING: ScsiPeriodicProbe: 461: Failed to issue probe request to helper queue: Out of memory


    The issue might cause the ESXi host to stop responding and disconnect from the vCenter Server.

  • When a driver or module calls for memory allocation, the ESXi host might fail and display a purple screen with the following messages due to a PCPU lockup:

    cpu14:32793)WARNING: Heartbeat: 796: PCPU 9 didn't have a heartbeat for 50 seconds; *may* be locked up.
    cpu14:32793)World: 9729: PRDA 0xnnnnnnnnnnnn ss 0x0 ds 0x10b es 0x10b fs 0x0 gs 0x13b
    cpu14:32793)World: 9731: TR 0xnnnn GDT 0xnnnnnnnnnnnn (0x402f) IDT 0xnnnnnnnnnnnn (0xfff)
    cpu14:32793)World: 9732: CR0 0xnnnnnnnn CR3 0xnnnnnnnnn CR4 0xnnnnn
    ....
    cpu9:33339)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]MemNode_NUMANodeMask2MemNodeMask@vmkernel#nover+0x5b stack: 0xnnnnnn
    cpu9:33339)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]MemDistributeNUMAPolicy@vmkernel#nover+0x27a stack: 0xnn
    cpu9:33339)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]MemDistribute_Alloc@vmkernel#nover+0x299 stack: 0xnnnnnn
    ....
    ....
    ....
    ....
    cpu9:2778012)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]User_LinuxSyscallHandler@vmkernel#nover+0x1d stack: 0x0
    cpu9:2778012)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn]gate_entry_@vmkernel#nover+0x0 stack: 0x0

  • A vSAN cluster might show some components with metadata in an invalid state due to a race condition.

  • The Virtual Desktop Infrastructure (VDI) cluster might reach close to 100% disk utilization of the vSAN datastore under the following circumstances:

    • The vSAN 6.2 or later cluster is using All Flash
    • The Deduplication and Compression option is On
    • The disk group size is greater than 1.6TB
    • The disk group has greater than 16TB of deletes
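
The four circumstances above must all hold together. As a sketch, they can be written as a single predicate. The `DiskGroupState` type and its field names are hypothetical and exist only for illustration. The thresholds are the ones stated in the list.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DiskGroupState:
    """Hypothetical summary of a vSAN disk group, for illustration only."""
    vsan_version: Tuple[int, int]   # for example (6, 2)
    all_flash: bool
    dedup_compression_on: bool
    disk_group_size_tb: float
    deleted_tb: float


def matches_high_utilization_conditions(state: DiskGroupState) -> bool:
    """Return True only when all four listed circumstances apply."""
    return (
        state.vsan_version >= (6, 2)          # vSAN 6.2 or later
        and state.all_flash                   # All Flash cluster
        and state.dedup_compression_on        # Deduplication and Compression on
        and state.disk_group_size_tb > 1.6    # disk group larger than 1.6TB
        and state.deleted_tb > 16             # more than 16TB of deletes
    )
```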

  • The relog tasks on an SSD (vSAN cache disk) might get stuck when they encounter an INVALID Log Index. The issue causes a buildup of the LLOG log and raises log congestion, which results in increased input and output latencies.

  • An ESXi host with CBT enabled VMs might stop responding and display a purple screen with a backtrace similar to the following due to a race condition when the VMX is exited:

    #0 List_RemoveNoReinit (fileDesc=0xnnnnnnnnnnnn, fhID=nnnnnnnnn, openFlags=<value optimized out>, failedOpen=<value optimized out>) at bora/vmkernel/public/list.h:390
    #1 List_Remove (fileDesc=0xnnnnnnnnnnnn, fhID=nnnnnnnnnn, openFlags=<value optimized out>, failedOpen=<value optimized out>) at bora/vmkernel/public/list.h:415
    #2 FSSClearObjectOpenState (fileDesc=0xnnnnnnnnnnnn, fhID=nnnnnnnnn, openFlags=<value optimized out>, failedOpen=<value optimized out>) at bora/vmkernel/filesystems/fsSwitch.c:5879
    #3 FSSCloseFile (fileDesc=0xnnnnnnnnnnnn, fhID=nnnnnnnnn, openFlags=<value optimized out>, failedOpen=<value optimized out>) at bora/vmkernel/filesystems/fsSwitch.c:3306
    .....
    .....
    .....
    #15 0x0000nnnnnnnnnnnn in Syscall_InstructionHandler ()
    #16 0x0000nnnnnnnnnnnn in gate_entry ()
    #17 0x000000000cd1e840 in ?? ()
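
Several of the issues above are identified by a fixed message in vmware.log, hostd.log, or vmkernel.log. As a minimal sketch, the Python below searches log lines for substrings quoted in this article. The labels are hypothetical names chosen for illustration, and a match only suggests that a symptom may be present; it does not confirm the root cause.

```python
from typing import Dict, Iterable, List, Tuple

# Substrings copied from the log samples in this article; labels are hypothetical.
SIGNATURES: Dict[str, str] = {
    "vMotion admission check failure":
        "MigratePlatformSaveMemory: Admission check failed for memory resource",
    "hostd memory limit": "Memory exceeds hard limit",
    "storage heap exhaustion": "to helper queue: Out of memory",
    "esxcli MetaStructure error":
        "Invalid name specified for MetaStructure: 'FilesystemVolume'",
    "Active Directory bind failure": "Can't bind to LDAP server for domain",
    "IO filter watchdog kill": "did not signal liveness within last",
}


def scan_log(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, label) pairs for every known signature found."""
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        for label, needle in SIGNATURES.items():
            if needle in line:
                hits.append((number, label))
    return hits
```

The function accepts any iterable of strings, so an open log file can be passed directly, for example `scan_log(open("vmkernel.log", errors="replace"))` for a copy of the log on a workstation.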

Patch Download and Installation

The typical way to apply patches to ESXi hosts is through the VMware vSphere Update Manager. For details, see Installing and Administering VMware vSphere Update Manager.
 
ESXi hosts can be updated by manually downloading the patch ZIP file from the VMware download page and installing the VIB by using the esxcli software vib command. Additionally, the system can be updated using the image profile and the esxcli software profile command. For details, see the vSphere Command-Line Interface Concepts and Examples and the vSphere Upgrade Guide.
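
As a sketch of the manual path described above, the Python below builds the esxcli command lines for a downloaded patch ZIP. It does not run them. The depot path and profile name are placeholders supplied by the caller, not values from this article, and the default of `update` for the vib subcommand is an assumption, because the article names only the esxcli software vib command. Because this patch requires a host reboot and virtual machine migration or shutdown, such commands would normally be run with the host in maintenance mode.

```python
from typing import List


def vib_command(depot_zip: str, subcommand: str = "update") -> List[str]:
    """Build an 'esxcli software vib' command for an offline depot ZIP.

    Assumption: 'update' is the default; 'install' is also accepted.
    """
    if subcommand not in ("install", "update"):
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    return ["esxcli", "software", "vib", subcommand, "-d", depot_zip]


def list_profiles_command(depot_zip: str) -> List[str]:
    """Build the command that lists the image profiles contained in a depot."""
    return ["esxcli", "software", "sources", "profile", "list", "-d", depot_zip]


def profile_command(depot_zip: str, profile: str) -> List[str]:
    """Build an 'esxcli software profile update' command for one image profile."""
    return ["esxcli", "software", "profile", "update", "-d", depot_zip, "-p", profile]
```

Listing the profiles first avoids guessing a profile name: the value passed to `profile_command` should be one of the names that the depot itself reports.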
