To address the scenarios in which the would block condition occurs, VMware introduced changes in both VMware vSphere and NSX for vSphere.
VMware vSphere
- In VMware vSphere 6.0 Update 1, the host scan operation by EAM fails, but the NSX host preparation status incorrectly shows as green because the returned error code 99 is not handled. In releases with a fix for this issue, EAM correctly reports that the host scan failed and raises a vibNotInstalled state. The following example message from the /var/log/eam.log file on the affected ESXi host shows the error code 99.
INFO | 2015-02-25 11:41:32,221 | host-4 | VcPatchManager.java | 356 | Scan result on VcHostSystem(host-383):
ScanResult {
errorCode = 99,
vibUrl = 'https://172.16.224.232/bin/vdn/vibs/5.5/vxlan.zip',
responseXml = <unset>,
bulletins = [],
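When reviewing eam.log for this symptom, it can help to list each scanned host together with the error code it returned. The following is a minimal Python sketch, assuming log entries follow the layout of the excerpt above. The function name find_scan_errors is hypothetical, and treating 0 as "no error" is an assumption, not something the log format documents.

```python
import re

# Patterns assume the layout of the sample entry above.
HOST_RE = re.compile(r"Scan result on VcHostSystem\((host-\d+)\):")
CODE_RE = re.compile(r"errorCode\s*=\s*(-?\d+)")

def find_scan_errors(log_text):
    """Return (host, error_code) pairs for scan results with a non-zero code.

    Assumption: an errorCode of 0 means the scan succeeded.
    """
    results = []
    current_host = None
    for line in log_text.splitlines():
        host_match = HOST_RE.search(line)
        if host_match:
            # Remember which host the following ScanResult block belongs to.
            current_host = host_match.group(1)
            continue
        code_match = CODE_RE.search(line)
        if code_match and current_host is not None:
            code = int(code_match.group(1))
            if code != 0:
                results.append((current_host, code))
            current_host = None
    return results

sample = """INFO | 2015-02-25 11:41:32,221 | host-4 | VcPatchManager.java | 356 | Scan result on VcHostSystem(host-383):
ScanResult {
errorCode = 99,
vibUrl = 'https://172.16.224.232/bin/vdn/vibs/5.5/vxlan.zip',
"""
print(find_scan_errors(sample))  # [('host-383', 99)]
```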
VMware NSX for vSphere
A workaround and new NSX Manager log messages are implemented in VMware NSX for vSphere 6.1.5 and VMware NSX for vSphere 6.2.0:
The NSX workaround consists of detection logic in which NSX Manager checks the preparation status of a host after it is rebooted. If the host is found in the would block state, NSX Manager takes corrective action by resetting the vSphere Distributed Switch properties, then the VTEP properties, and finally the port properties on the ESXi host. This corrective action repairs the VXLAN configuration without a second host reboot.
In addition, NSX Manager provides an alert that the would block condition was detected and resolved through the following log messages:
- VXLAN opaque data is missing on host host-351, repushing the opaque data.
Note: This log indicates that the remediation has started.
- Reseted the vxlan opaque properties on host host-351.
Note: This log indicates that the remediation has completed.
- Not detected VXLAN opaque data missing on host-351, skip repush the opaque data.
Note: This log indicates that the ESXi host has not encountered any issues on the reboot.
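The three messages above can be mapped to a remediation state when searching collected NSX Manager logs. The following is a minimal Python sketch that matches only the message text quoted above. The state labels and the classify function are hypothetical names chosen for illustration.

```python
import re

# Each pattern is built from one of the three log messages quoted above.
PATTERNS = [
    ("remediation_started",
     re.compile(r"VXLAN opaque data is missing on host (host-\d+), repushing")),
    ("remediation_completed",
     re.compile(r"Reseted the vxlan opaque properties on host (host-\d+)")),
    ("no_issue_detected",
     re.compile(r"Not detected VXLAN opaque data missing on (host-\d+), skip repush")),
]

def classify(line):
    """Return (state, host_id) for a recognized message, or None otherwise."""
    for state, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return state, match.group(1)
    return None

print(classify("Reseted the vxlan opaque properties on host host-351."))
```

A host that shows remediation_started without a later remediation_completed entry is a candidate for closer inspection.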
Some virtual machines may start running on the ESXi host before NSX Manager finishes the detection and remediation described above. These virtual machines remain disconnected even after the ESXi host is repaired and must be reconnected to the network.
After the remediation process completes, you need to resync the message bus using the REST API.
Notes: Before performing the steps, ensure that:
- You have basic authorization with the NSX Manager web credentials such as the admin user, or any vCenter user granted NSX privileges.
- Headers Content-type: application/xml and Accept: application/xml are used.
For more information on how to make API calls to the NSX Manager, see the Using the NSX REST API section in the VMware NSX for vSphere API Guide.
To resync the message bus, use the following REST API call:
Request:
POST https://NSX_Manager_IP/api/2.0/nwfabric/configure?action=synchronize
Request Body:
<nwFabricFeatureConfig>
<featureId>com.vmware.vshield.vsm.messagingInfra</featureId>
<resourceConfig>
<resourceId>{HOST/CLUSTER MOID}</resourceId>
</resourceConfig>
</nwFabricFeatureConfig>
Note: To further troubleshoot a would block condition, collect the logs immediately after the message bus is resynced.
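The resync request can also be assembled programmatically. The following is a minimal Python sketch using only the standard library. The function name build_resync_request, the credentials, and the host-351 MOID are placeholders for illustration. The sketch builds the request but does not send it, because sending requires a reachable NSX Manager and certificate trust that is configured separately.

```python
import base64
import urllib.request
from xml.sax.saxutils import escape

def build_resync_request(nsx_manager, resource_id, username, password):
    """Build (but do not send) the message bus resync POST request."""
    body = (
        "<nwFabricFeatureConfig>"
        "<featureId>com.vmware.vshield.vsm.messagingInfra</featureId>"
        "<resourceConfig>"
        f"<resourceId>{escape(resource_id)}</resourceId>"
        "</resourceConfig>"
        "</nwFabricFeatureConfig>"
    )
    # Basic authorization with NSX Manager credentials, as required above.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{nsx_manager}/api/2.0/nwfabric/configure?action=synchronize",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )

# Placeholder values; replace with the real NSX Manager address and host or cluster MOID.
request = build_resync_request("NSX_Manager_IP", "host-351", "admin", "password")
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen once the NSX Manager certificate is trusted by the client.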
If the NSX Agent has already been deployed, attempt the following solutions in sequence to install the NSX VIBs:
Note: VIBs for NSX 6.1.4 and earlier can be downloaded from:
https://vsm-ip/bin/vdn/vibs/5.5/vxlan.zip
- If you see Resolve in the Installation Status column, click Resolve and then refresh your browser window. For more information, see the Prepare Hosts on the Primary NSX Manager section of the Cross-vCenter NSX Installation Guide.
- Perform a standard VIB installation with Update Manager or with the esxcli command.
For more information on how to install a VIB on an ESXi host, see Downloading and installing async drivers in VMware ESXi 5.x and ESXi 6.0.x (2005205). Also, see Installing patches on an ESXi 5.x/6.x host from the command line (2008939).
Note: After the above commands are run:
- Reboot each ESXi host where the force install succeeded (that is, where the VIBs were not skipped).
- On the NSX Installation tab, click Resolve on the cluster so that NSX detects that the VIBs are now correctly installed.
If the NSX Agent has not been deployed:
- Remove the ESXi host from the cluster.
- Reboot the host.
- Add the ESXi host back to the cluster.
What is fixed in vSphere 6.0 Update 2
- In vCenter Server 6.0U2 and ESXi 5.5P8, for a live NSX VIB installation on new (not yet prepared) hosts, EAM now reports a partial installation (that is, the NSX VIBs have been installed on the running image but not on the boot disk). In vSphere versions without this change, an error condition such as another VIB (for example, the ixgbe driver VIB) can prevent the NSX VIBs from being copied to the boot disk, yet EAM still reports a successful NSX installation / Host Ready status.
- A new esxupdate error code, with a message to reboot the host immediately, is reported when a live VIB installation fails because of jumpstart plugins, rc scripts, or init.d scripts.
- A new error code with the Please reboot the host immediately to discard the unfinished message is reported when a live NSX VIB installation fails. The failed NSX VIB is not reported as installed. The new error code and message are reported to the NSX user interface (UI).
Detect if alt-bootbank is not up to date and do a force install
To detect whether the alt-bootbank is out of date, run the following command with the --dry-run option. If the alt-bootbank is not up to date, the output reports the VIBs as installed (instead of skipped):
esxcli software vib install --no-live-install --force --dry-run --depot /path/vxlan.zip
Key:
--force | -f
Bypasses checks for package dependencies, conflicts, obsolescence, and acceptance levels. Really not recommended unless you know what you are doing. Use of this option will result in a warning being displayed in the vSphere Client.
--no-live-install
Forces an install to /altbootbank even if the VIBs are eligible for live installation or removal. This causes the installation to be skipped on PXE-booted hosts.
--dry-run
Performs a dry-run only. Report the VIB-level operations that would be performed, but do not change anything in the system.
--depot | -d
Specifies full remote URLs of the depot index.xml or server file path pointing to an offline bundle .zip file.
For more information, see the esxcli software section of the vSphere Command-Line Interface Documentation.
On any ESXi host where the alt-bootbank is not up to date, run the same command without the --dry-run option:
esxcli software vib install --no-live-install --force --depot /path/vxlan.zip
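When the dry run is repeated across many hosts, the installed versus skipped decision can be scripted. The following is a minimal Python sketch, assuming the command output contains "VIBs Installed:" and "VIBs Skipped:" fields with comma-separated VIB names. The sample text and the VIB name in it are illustrative and were not captured from a real host, and the function names are hypothetical.

```python
def parse_vib_fields(output):
    """Parse 'Key: value' lines of the command output into a dict of lists."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return fields

def altbootbank_needs_install(dry_run_output):
    """Return True if the dry run reports at least one VIB as installed.

    Assumption: an empty 'VIBs Installed' field means every VIB was skipped,
    so the alt-bootbank is already up to date.
    """
    return bool(parse_vib_fields(dry_run_output).get("VIBs Installed"))

# Illustrative output only; the VIB name is a made-up placeholder.
sample_outdated = """Installation Result
   Message: Dryrun only, host not changed.
   Reboot Required: true
   VIBs Installed: Example_bootbank_example-vib_1.0
   VIBs Removed:
   VIBs Skipped:
"""
print(altbootbank_needs_install(sample_outdated))  # True
```

Hosts for which the check returns True are the ones that need the command rerun without --dry-run, followed by a reboot.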