The way you simulate an HA failover depends on the version of vSphere deployed in the environment.
Select a method below depending on your installed version of vSphere.
For a vSphere 4.x environment where you are running HA based on AAM and have two redundant NICs for the Management Network, you can physically disconnect the patch cables that uplink these physical NICs.
Alternatively, you can issue a command to your switch software to disable the ports. This simulates a host isolation event, since vCenter Server can no longer communicate with the host. Each host in the cluster runs the AAM agent, which is designed to monitor the uptime of neighboring hosts in the cluster. If the primary host of the cluster detects that the host you have disconnected is isolated, it restarts that host's virtual machines on the surviving hosts in the cluster. Ensure that your HA cluster settings have the appropriate Host Isolation Response setting, as this type of host outage is considered to be a Network Isolation.
In vSphere 5.x, HA is provided by the Fault Domain Manager (FDM) agent deployed on each of the HA cluster hosts. FDM uses both Network and Datastore Heartbeats to determine the availability of a host and to determine the type of host failure, whether that is a physically failed host or a Network Isolation type of failure. The FDM agents on secondary hosts report uptime information to the primary host's FDM agent. The primary host communicates with vCenter Server to report the uptime of itself and all secondary hosts.
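The way the two heartbeat types combine to distinguish failure types can be sketched as a small shell function. This is a deliberately simplified model for illustration, not the FDM implementation: the function name `classify_host`, its 0/1 arguments, and the state labels are all assumptions, and the real agent applies additional checks and timeouts.

```shell
#!/bin/sh
# Simplified, hypothetical model of how a primary host might classify a
# secondary host from its heartbeats. Not the actual FDM logic.
#   $1 = network heartbeat   (1 = received, 0 = missing)
#   $2 = datastore heartbeat (1 = seen,     0 = missing)
classify_host() {
    net_hb="$1"
    ds_hb="$2"
    if [ "$net_hb" -eq 1 ]; then
        # Network heartbeats are arriving: the host is reachable.
        echo "live"
    elif [ "$ds_hb" -eq 1 ]; then
        # No network heartbeat, but the host still updates the datastore:
        # it is running but cut off from the management network.
        echo "isolated-or-partitioned"
    else
        # Neither heartbeat is present: treat the host as failed.
        echo "failed"
    fi
}

classify_host 1 1
classify_host 0 1
classify_host 0 0
```

In this model, pulling the management network cables corresponds to the second case, while a hard power off corresponds to the third.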
As an example scenario, consider two ESXi/ESX hosts, each with two vmnics in a redundant NIC team serving Management Network traffic. The hosts share a single datastore. You want the virtual machines to fail over to the surviving host in the cluster.
To prepare the environment for failover simulation:
None. Ensure no datastores are selected from the available list, and select Select only from my preferred datastores.
Isolated. This causes all virtual machines to register and restart on the surviving host.
As mentioned in Methods 1 and 2, disconnecting the network to forcibly disrupt communication between primary and secondary hosts is an option in simulating HA failover. However, to simulate a power-outage or hardware fault type of failure, hard power off the host physically or by using a remote management application such as KVM, DRAC, iLO, or RAS.
Note: Use of this method may require re-installation of ESXi/ESX if the kernel module is not properly disabled/re-enabled. When disabling the kernel module for the physical NIC, you lose all remote management through the ESXi Service Console, and can only remotely manage the host through KVM, DRAC, iLO, or RAS. Be sure to have physical access to the host if a remote management application is not available.
Method 4 also simulates a network isolation, but does so by disabling the physical NIC (vmnic) driver module in the VMkernel instead of physically disconnecting a patch cable or interrupting connectivity at the physical switch layer.
First, determine which module the physical NIC is using by following the article below.
Determining Network/Storage firmware and driver version in ESXi
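As a quick complement to the article, the driver name can also be read from the third column of `esxcli network nic list`. The sketch below wraps that lookup in a small helper. The function name `nic_driver` is hypothetical, and the sample table is illustrative only (abbreviated columns, not captured from a real host); the exact column set may differ between ESXi versions, so verify the output on your host.

```shell
#!/bin/sh
# Print the driver (third column) for a given vmnic, reading
# `esxcli network nic list` output from stdin.
nic_driver() {
    awk -v nic="$1" '$1 == nic { print $3 }'
}

# On a host, you would run:
#   esxcli network nic list | nic_driver vmnic0

# Illustrative sample input, abbreviated and not from a real host:
sample='Name    PCI Device     Driver  Link  Speed  Duplex
vmnic0  0000:02:00.00  e1000   Up    1000   Full
vmnic1  0000:02:01.00  igb     Up    1000   Full'

printf '%s\n' "$sample" | nic_driver vmnic0
```

The name printed here is the value to substitute for module_name in the next step.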
Next, disable the module by running one of the following commands:
esxcli system module set --enabled=false --module=module_name
esxcfg-module -d module_name
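Because a module that is not re-enabled can leave the host unmanageable, it can help to write down the disable and re-enable commands as a pair before running either. The helper below is a hypothetical sketch that only prints the commands for review and does not execute them; the function name `module_cmd` and the example module name `e1000` are assumptions, and the printed commands use the `esxcli system module set --enabled=<true|false> --module=<name>` form.

```shell
#!/bin/sh
# Print (do not run) the esxcli command to disable or enable a module.
#   $1 = action: disable | enable
#   $2 = module name, as reported for the physical NIC
module_cmd() {
    case "$1" in
        disable) flag=false ;;
        enable)  flag=true ;;
        *)
            echo "usage: module_cmd disable|enable module_name" >&2
            return 1
            ;;
    esac
    echo "esxcli system module set --enabled=$flag --module=$2"
}

# Example with an assumed module name:
module_cmd disable e1000
module_cmd enable e1000
```

Keep the printed enable command at hand on the remote management console (KVM, DRAC, iLO, or RAS) so that the module can be restored once the test is complete.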