VMware ESXi 6.5, Patch Release ESXi650-201712401-BG: Updates esx-base, esx-tboot, vsan, and vsanhealth VIBs

Article ID: 326736

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Release date: December 19, 2017

Patch Category: Bugfix
Patch Severity: Critical
Host Reboot Required: Yes
Virtual Machine Migration or Shutdown Required: Yes
Affected Hardware: N/A
Affected Software: N/A
VIBs Included:
  • VMware_bootbank_esx-base_6.5.0-1.36.7388607
  • VMware_bootbank_esx-tboot_6.5.0-1.36.7388607
  • VMware_bootbank_vsan_6.5.0-1.36.7388608
  • VMware_bootbank_vsanhealth_6.5.0-1.36.7388609
PRs Fixed 1783062, 1815818, 1819261, 1827373, 1836648, 1840215, 1850351, 1852059, 1854779, 1854783, 1857401, 1860060, 1861934, 1862770, 1862779, 1862788, 1862904, 1866866, 1868025, 1872523, 1873082, 1874634, 1875240, 1875822, 1876225, 1876475, 1879433, 1880323, 1881658, 1882524, 1884219, 1886521, 1888478, 1888629, 1890411, 1892149, 1893288, 1897585, 1897677, 1898523, 1898578, 1901119, 1901772, 1902357, 1903311, 1903950, 1904797, 1905710, 1905910, 1906177, 1906786, 1906991, 1907016, 1907182, 1907583, 1908230, 1908258, 1910263, 1910545, 1911717, 1911856, 1912461, 1913503, 1914201, 1915026, 1915223, 1915833, 1916214, 1918679, 1918776, 1919066, 1919921, 1919975, 1921752, 1923052, 1923504, 1924484, 1924918, 1925390, 1927474, 1928114, 1928654, 1930348, 1930485, 1930887, 1931012, 1931041, 1931370, 1933592, 1934061, 1934578, 1935680, 1937601, 1937655, 1938138, 1938633, 1939159, 1939253, 1940162, 1940983, 1941848, 1941907, 1942110, 1942710, 1943412, 1943691, 1947295, 1947402, 1948167, 1950696, 1951538, 1951999, 1952088, 1952879, 1953124, 1953369, 1954156, 1954878, 1955202, 1956175, 1956541, 1957407, 1957703, 1958491, 1958552, 1958901, 1958953, 1959242, 1959245, 1959258, 1959290, 1959307, 1959904, 1960102, 1960155, 1960189, 1960233, 1960902, 1961621, 1961622, 1961628, 1961631, 1961632, 1961635, 1961897, 1963001, 1963015, 1964660, 1964893, 1964899, 1965834, 1968463, 1968467, 1974986, 1976257, 1976485, 1976769, 1980910, 1982155, 1982652, 1989658, 1995733, 1997379, 2001389, 1678559
Related CVE numbers: N/A


Environment

VMware vSphere ESXi 6.5

Resolution

Summaries and Symptoms

This patch updates the esx-base, esx-tboot, vsan and vsanhealth VIBs to resolve the following issues:

  • Distributed Object Manager processing operations take too much time when an object has concatenated components. This can cause delays for other operations, which leads to high I/O latency.

  • With this fix, the VMW_SATP_LOCAL Storage Array Type Plug-In (SATP) provides multi-path support for local devices. The ad-hoc creation of claim rules to claim multiple paths to local devices is no longer necessary. Multi-path support for local devices is not applicable for hard disk drives that work in the 4K native mode and must not be used in such configurations. The new feature supports the fixed and most recently used (MRU) path selection policies (PSP), but does not support the Round Robin PSP (PSP_RR).
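
    For example, after applying the patch you can check which SATP and path selection policy a local device uses with standard ESXCLI commands. This is only an illustrative sketch, and the device identifier below is a placeholder:

    # List the claim rules registered for the local SATP
    esxcli storage nmp satp rule list -s VMW_SATP_LOCAL
    # Show the SATP and PSP currently applied to a local device (placeholder identifier)
    esxcli storage nmp device list -d mpx.vmhba0:C0:T0:L0
    # Switch the device to the MRU policy; the Round Robin PSP remains unsupported for this SATP
    esxcli storage nmp device set -d mpx.vmhba0:C0:T0:L0 -P VMW_PSP_MRU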

  • This fix introduces a new Native Multipathing Plugin (NMP) Storage Array Type Plug-In (SATP) claim rule for Huawei XSG1 arrays to achieve optimal performance: it sets the SATP to VMW_SATP_ALUA, the PSP to VMW_PSP_RR, and selects tpgs_on as the default.

  • ESXCLI now provides the command esxcli vsan debug evacuation precheck, which runs preflight checks on ESXi hosts before they enter maintenance mode.
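
    Because this article does not list the command's options, a safe way to explore the new namespace is through its built-in help, for example:

    # Show the options accepted by the new evacuation precheck command
    esxcli vsan debug evacuation precheck --help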

  • Heavy resync traffic in a vSAN cluster might cause performance issues. This fix implements thresholds to balance resync traffic and virtual machine I/O traffic fairly.

  • Previously, logins by Active Directory domain user accounts ending with a dollar sign were treated as machine accounts by default, so logins to vCenter Server and ESXi hosts failed.

  • With this fix, the first reboot of vSAN clusters with deduplication enabled might take up to ten times less time to complete, because of a change in the hmap reads process from one at a time to multiple in parallel.

  • Frequent enable and disable calls from a virtual machine to the VMXNET3 device might cause an ESXi host to fail with a purple screen due to the high consumption of heap memory for generating events to the VMkernel. This fix introduces a node in the VMkernel Sys Info Shell (VSISH) at /config/Net/intopts/Vmxnet3DevEnableDelay and limits the number of enable and disable calls to 33 per second. If a workload needs more than 33 enable and disable calls per second, you can tune this parameter.
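
    A minimal sketch of tuning this node through the VSISH shell on the host; the value is purely illustrative, not a recommendation:

    # Read the current delay setting for VMXNET3 enable and disable events
    vsish -e get /config/Net/intopts/Vmxnet3DevEnableDelay
    # Adjust the limit if a workload legitimately needs more than 33 calls per second
    vsish -e set /config/Net/intopts/Vmxnet3DevEnableDelay <value>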

  • When the storage where a virtual machine resides enters the All-Paths-Down (APD) state, the VM might become inaccessible while its vmx process continues to run. In this state, a manual restart of the hostd service allows the ESXi host to transition to maintenance mode even though a VM is powered on and the vmx process is still running.

  • Hostd might fail during a recompose operation for virtual desktop pools, because of an internal routine that attempts to free a string constant on a rarely executed path.

  • Hostd might run out of memory due to vNIC link flaps: if multiple virtual machines connected to a distributed virtual switch generate frequent vNIC flaps, the large number of events posted to hostd might exceed its memory limit.

  • If you use Internet Explorer 11 to open a virtual machine console via the VMware Host Client, the console might lose connectivity and go black. This issue is not reported for browsers other than Internet Explorer 11 or for clients other than the VMware Host Client.

  • If you observe that an ESXi host becomes unresponsive to vCenter Server, the vpxa process shuts down and restarts intermittently, and a dump file of the vmx process is created under VM folders, this might be due to a failure of the vmx process while processing the message event [msg.mks.noGPUResourceFallback] Hardware GPU resources are not available. The virtual machine uses software rendering. The vmx process passes a message event with an invalid string to hostd, so the agent cannot create a proper response to the GetChange request from vpxa. As a result, vpxa gets an error when deserializing the SOAP response body and opts to end the task, which fails the vmx process.

  • An intermittent network loss might cause Active Directory domains to go offline and fail user authentication. This fix attempts to reconnect the domains when an offline state is suspected and to bring them back online.

  • The compliance check of a Host Profile might time out if the profile has many IP ranges defined in the firewall rule sets, because the check might treat multiple IP address ranges as invalid IP addresses and take a long time to finish.

  • If you provision more than one network adapter card to an ESXi host, the Single Root I/O Virtualization (SR-IOV) feature, which allows a single Peripheral Component Interconnect Express (PCIe) physical function to supply multiple virtual functions for use by guest operating systems, might not work as expected. When interpreting the module parameter that lists the number of virtual functions to create for each physical function, the kernel might match parameter values from the list to physical functions in an unexpected order. This fix makes changes to the VMkernel Device Manager (vmkdevmgr) so that the physical functions are always considered and matched to max_vfs parameter values in PCI SBDF address order.
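
    As an illustration, the per-physical-function counts are passed through the driver's max_vfs module parameter; the driver name (ixgbe) and the counts below are assumptions for the example only:

    # Request 8 virtual functions on the first physical function and 4 on the second (assumed driver)
    esxcli system module parameters set -m ixgbe -p "max_vfs=8,4"
    # Verify the configured values; a host reboot is required for the change to take effect
    esxcli system module parameters list -m ixgbe | grep max_vfs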

  • A vmklinux driver that creates millions of heap chunks might hog the CPU while heap stats are collected. If you use the vm-support command for heap stats collection, this might cause the ESXi host to fail with a purple screen and the panic message TLB invalidation timeout. For example, @BlueScreen: PCPU 40 locked up. Failed to ack TLB invalidate.

  • If you use localcli commands or the sfcb-vmware_raw method in an attempt to retrieve chassis information, the calls might fail with a core dump file such as sfcb-vmware_raw-zdump.000 and make system monitoring unavailable.

  • If the initialization of a session of the Virtual Volumes storage provider (VASA) takes a long time after a reboot of an ESXi host, an attempted removal of a Virtual Volumes datastore might fail due to active bindings. This fix removes the active bindings after the initialization of a VASA session.

  • An SNMP walk might cause the snmpd process to stop responding on hardware servers with more than 128 CPUs. As a result, you must restart the snmpd process.

  • Long-running virtual machines with VMware vSphere Fault Tolerance (FT) might power off during failover. A memory leak during checkpointing causes the failure, and because FT relies on checkpointing, the issue affects VMs with FT enabled.

  • You might not see the SFCBD diagnostics file cim-diagnostic.sh.txt although the daemon runs, and SFCBD diagnostics might not be available when a vm-support bundle is collected. The SFCBD diagnostics output collection task is fixed in this release.

  • ESXi hosts might fail if you delete VMkernel port groups with VMware vSphere vMotion enabled in an NSX for vSphere environment, and all settings revert after a reboot.

  • After a reboot of an ESXi host, you might not see datastores on NexSan E60 Fibre Channel arrays, because the datastores might have failed to mount automatically.

  • This fix resolves cases where VMware vSphere Distributed Switch might not pass OSPFv3 multicast packets with IPsec with Security Associations and Authentication Headers.

  • Due to a memory corruption, an ESXi host might fail with an error @BlueScreen: Spin count exceeded - possible deadlock with PCPU XXX with a backtrace similar to:

    PShare_RemoveHint@vmkernel

    VmMemCow_PShareRemoveHint@vmkernel

    VmMemCowPFrameRemoveHint@vmkernel

    VmMemCowPShareFn@vmkernel

    VmAssistantProcessTasks@vmkernel

    CpuSched_StartWorld@vmkernel

    Before the host fails with a purple diagnostic screen, you might see similar logs:

    04:40:54.761Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc795 status: "Invalid address" (bad0026)

    04:40:54.763Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7b5 status: "Invalid address" (bad0026)

    04:40:54.764Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7d5 status: "Invalid address" (bad0026)

    04:40:54.765Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc7f5 status: "Invalid address" (bad0026)

    04:40:54.766Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc815 status: "Invalid address" (bad0026)

    04:40:54.768Z cpu21:14392120)WARNING: UserMem: 14034: vmx-vthread-6: vpn 0xa00bc835 status: "Invalid address" (bad0026)

  • In configurations with multiple storage arrays on one HBA using a vmklinux driver, loss of connection to one of the arrays might stall I/O operations of the other arrays and even cause the ESXi host to lose connectivity to VMFS volumes.

  • If you migrate a virtual machine with a policy defined by the vSphere APIs for IO Filtering (VAIO), you might not be able to modify the Limit - IOPs value.

  • An ESXi host might fail with a purple diagnostic screen due to a spinlock in the Content Based Read Cache (CBRC) feature, if you try to run a recompute operation on a Virtual Machine Disk (VMDK) online. The issue is unlikely to happen if you do the recompute operation offline.

  • An ESXi host might fail with a purple diagnostic screen or become unavailable during standard operations with Microsoft Cluster Service (MSCS) such as resource failover and node failover, due to a race condition that depends on a narrow timing window.

  • SMBIOS strings in some processors might contain ISO Latin-1 strings that are treated as UTF-8 strings, which might cause errors and fail the installation of ESXi. With this fix, you can install ESXi on machines where the SMBIOS has ISO Latin-1 strings.

  • The device configuration setting Is Shared Clusterwide, which can be configured through the ESXCLI command set, might not persist across reboots and cause host profile compliance issues, specifically for hosts with SAN boot LUN devices. This fix makes the device configuration setting persistent across reboots.
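
    The setting is typically managed through the ESXCLI storage namespace, as in the sketch below; the device identifier is a placeholder and the option name should be verified against your ESXCLI version:

    # Mark a boot LUN as not shared across the cluster (placeholder device identifier)
    esxcli storage core device setconfig -d naa.0123456789abcdef --shared-clusterwide=false
    # Check the currently stored value for the device
    esxcli storage core device list -d naa.0123456789abcdef | grep -i "shared clusterwide"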

  • Domain Join might fail with an error STATUS_INVALID_NETWORK_RESPONSE, if SMBv2 is enabled on an ESXi host and SMBv1 is disabled on Active Directory.

  • ESXi hosts might lose connectivity due to an I/O exception in a driver of Cisco Unified Computing System Virtual Interface Card Fibre Channel over Ethernet Host Bus Adapters (Cisco VIC FCoE HBA) that might cause the ESXi host to lose all paths on the associated HBA.

  • Even after you disable the coredump target warning by setting UserVars.SuppressCoredumpWarning=1 in Advanced Settings, at every reboot of an ESXi host you might still see a warning similar to No coredump target has been configured. Host core dumps cannot be saved.
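
    The advanced option can also be set from the command line with a standard ESXCLI call, for example:

    # Suppress the coredump target warning
    esxcli system settings advanced set -o /UserVars/SuppressCoredumpWarning -i 1
    # Confirm the current value
    esxcli system settings advanced list -o /UserVars/SuppressCoredumpWarning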

  • After you upgrade an ESXi host, you might be unable to search for Active Directory users or groups and assign them permissions.

  • Virtual machines on ESXi 6.5 hosts with disks larger than 256 GB on VMFS5 or larger than 2.5 TB on VMFS6 might become unresponsive during I/O operations and, in some cases, might cause the host to hang and become unresponsive as well. Such VMs require a host reboot to restart. This problem is due to a synchronization issue that causes a deadlock while VMFS metadata from multiple I/O contexts is processed.

  • During snapshot consolidation of virtual machines with multiple disks that have different absolute paths but share the same name, if the parent of any delta disk is inaccessible, the delta disk can get reparented to a wrong parent and cause disk chain corruption, which in turn, might cause a shutdown of the virtual machines.

  • If you reboot an NFS 4.1 server, write requests from ESXi hosts might contain old metadata that might lead to an NFS4ERR_STALE_STATEID error on the NFS server and cause guest VMs to lose connectivity.

  • In the vSphere Web Client, if you change your selection of a virtual device node for the CD/DVD drive from SATA (0:0) to IDE (0:0), or from IDE (0:0) to SATA (0:0), the change in the settings might not take effect on the actual configuration of the CD/DVD drive.

  • This fix prevents the ESXi SNMP agent from polling v1 or v2c notification communities at configuration time.

  • In a high availability environment, a storage controller failover might cause the virtual machines hosted on an NFS 4.1 datastore to become unresponsive.

  • vSAN Health Check might report the warning Controller firmware is VMware certified without first notifying you of the prerequisite that, before performing this check, you must install a vendor tool that provides the list of supported firmware versions for the given controller. This fix updates the warning to provide instructive information.

  • If vSAN Encryption is enabled, messages for firewall configuration changes that are related to internal processes and are not important might start to pile up on ESXi hosts in /var/run/log/vobd.log and, as a result, overload the tasks and events logs of the vCenter Server. You can ignore messages like Firewall configuration has changed. Operation 'addIP4' for rule set vsanEncryption succeeded or Firewall configuration has changed. Operation 'removeIP4' for rule set vsanEncryption succeeded.
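
    To confirm that the entries you see are only these benign rule set changes, you can filter the log mentioned above with a plain grep:

    # Show only the vsanEncryption firewall rule set messages in the VOB log
    grep vsanEncryption /var/run/log/vobd.log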

  • A stateless ESXi host booted by vSphere Auto Deploy with a Host Profile that contains configuration for a VMware vSphere Distributed Switch, an Active Directory permission, and enabled Stateless Caching to USB might fail to perform stateless caching to the USB device.

  • Fault Tolerance secondary virtual machines might fail to take over if a file rename operation takes longer than 10 seconds, which is the timeout limit. With this fix, the timeout for file rename operations can be configured.

  • An ESXi host might fail with a purple diagnostic screen due to a race condition in the NFSv4.1 client.

  • You might see that all virtual machines on an ESXi host have their input/output operations per second (IOPS) limited to the lowest IOPS limit among the virtual machines deployed on an NFSv3 datastore, which overrides any higher IOPS limit set for other VMs on the same NFS datastore.

  • You might not be able to complete the Enter Maintenance Mode task on an ESXi host and see a vim.fault.Timedout error even if there are no pending operations or powered on virtual machines on the host.

  • A host in a vSAN cluster might fail with a purple diagnostic screen due to a race condition in the vSAN sparse module.

  • If you use the ESXCLI command esxcli network nic down -n <vmnic> to disable a vmnic, the SNMP link status might continue to show that the NIC is down even after you enable the NIC with the command esxcli network nic up -n <vmnic>.
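
    For reference, the sequence that triggers the stale SNMP status looks like the following; the vmnic name is a placeholder:

    # Administratively disable, then re-enable, a physical NIC (placeholder name)
    esxcli network nic down -n vmnic1
    esxcli network nic up -n vmnic1
    # The actual link state can be cross-checked here while SNMP still reports the NIC as down
    esxcli network nic list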

  • When the master host in a preferred fault domain goes down, the backup master in the secondary fault domain assumes the role of the cluster master and the witness host goes out of the cluster for a brief period, during which vSAN objects might become inaccessible and cause temporary lack of connectivity of virtual machines.

  • Delays in processing data in deferred processing mode from lower layers in the local log-structured object manager (LSOM) might cause occasional latency and temporary throughput drops in writes.

  • Atomic test and set (ATS) locking in VMFS6 volumes might cause a temporary I/O latency, because of a rule that tries to compare a modified test image with the on-disk image during unlock, which might lead to multiple but harmless ATS errors.

  • You might see a partial or full black screen in a Horizon View session when the guest operating system wakes up from display sleep.

  • In large Active Directory environments, the Likewise agent might duplicate Linux user or group IDs and cause an ESXi host to lose connectivity to Active Directory. You might see error messages similar to Conflict discovered for UID <#>. User had this UID at time 2017/04/20 17:50:48, but now (2017/04/20 17:50:51) user has the UID. Please check that these users are not currently conflicting in Active Directory. This could also happen (safely) if the UIDs were swapped between these users.

  • When you run the command esxcli vsan debug object list, for inaccessible objects you might see no value for the fields Type, Path, Group UUID and Directory Name. With this fix, these fields can have a valid value if there are active components in the inaccessible objects.

  • You might see an error message similar to memory admission check failure if you attempt a memory hot add on a virtual machine configured with fixed passthrough, which limits the assignment of physical PCI devices to VMs and prevents the use of vSphere vMotion, while the actual problem is that the fast suspend and resume feature is not supported for such VMs. This fix provides a correct error message.

  • Attempts to join Active Directory domains might fail intermittently with error LW_ERROR_KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN: Client not found in Kerberos database. The issue is fixed by adding retry logic in JoinDomain() in case of KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN and LW_ERROR_PASSWORD_MISMATCH with a sleep of 30 seconds.

  • Resync of data on a vSAN cluster might take a long time and consume unnecessary disk space if you change the storage policy of virtual machines or objects repeatedly within a short period: if a policy is changed while vSAN is recreating the object layout based on the previous change and resyncing is not complete, the resync load on the cluster might double.

  • This fix optimizes concurrent delete and write jobs during vSAN deduplication to prevent a drop in ESXi host performance.

  • You might see a vSAN alarm from a VMkernel Observations (VOBs) event stating The storage capacity of the coredump targets is insufficient, even though there might be sufficient capacity.

  • During an NFSv4.1 mount operation, if a step of the mount fails, specifically the RECLAIM_COMPLETE NFSv4.1 request, no error is reported and the datastore appears as mounted but displays as read-only.

  • In ESXi hosts with 10 gigabit network adapters, if the hardware activation of an RX queue fails and in the next load balancing cycle some of the filters for the hardware activation of this queue are changed, this might cause the host to fail with a purple diagnostic screen.

  • During the reboot of a vSAN enabled ESXi host, the host screen displays the message VSAN: Initializing SSD: <...> Please wait... and the message does not change although processes run in the background, so it might seem that initialization hangs. With this fix, periodic status messages in the vmkernel.log are available to monitor the background work.

  • When you try to clone a virtual machine with digest disks, and those disks are in use by the Content Based Read Cache (CBRC), the digest disk might be cloned as thick provisioned instead of thin provisioned. As a result, the cloning operation might fail, because the VVol storage array does not support thick provisioning.

  • Hostd might fail with a core dump similar to hostd-worker-zdumpXXX in /var/core/ if a server runs for a long time, because the memory usage might exceed the hard limit.

  • An SFCB process might run out of memory and fail with a core dump similar to sfcb-vmware_bas-zdump in /var/core as it might exceed memory limits. This fix increases the sfcb-vmware_raw resource pool.

Patch Download and Installation

The typical way to apply patches to ESXi hosts is through VMware vSphere Update Manager. For details, see the Installing and Administering VMware vSphere Update Manager documentation.
 
ESXi hosts can be updated by manually downloading the patch ZIP file from the VMware download page and installing the VIB by using the esxcli software vib command. Additionally, the system can be updated using the image profile and the esxcli software profile command. For details, see the vSphere Command-Line Interface Concepts and Examples and the vSphere Upgrade Guide.
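
A typical offline-bundle installation with these commands looks like the sketch below; the datastore path, bundle file name, and image profile name are placeholders that should be taken from the actual download:

    # List the image profiles contained in the downloaded offline bundle (placeholder path)
    esxcli software sources profile list -d /vmfs/volumes/datastore1/ESXi650-201712401.zip
    # Update the individual VIBs from the bundle
    esxcli software vib update -d /vmfs/volumes/datastore1/ESXi650-201712401.zip
    # Or apply a complete image profile reported by the first command, then reboot the host
    esxcli software profile update -d /vmfs/volumes/datastore1/ESXi650-201712401.zip -p <image-profile-name>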