This article provides information on resolving vCLS health issues so that DRS functions correctly in the cluster.
Symptoms:
Starting with vSphere 7.0 Update 1, vSphere DRS for a cluster depends on the health of vSphere Cluster Services (vCLS). vCLS configures a quorum of vCLS system VMs on the cluster. These VMs are necessary to maintain the health of the cluster services. If vCLS health is impacted because these VMs are unavailable in a cluster, vSphere DRS is not functional in that cluster until the vCLS VMs are brought back up.
The following operations could fail if performed while DRS is not functional. Note also that on a new DRS-enabled cluster, these operations are not available until the first vCLS VM is deployed and powered on in that cluster.
- A new workload VM placement/power-on.
- Host selection for a VM that is migrated from another cluster/host within the vCenter.
- A migrated VM could be powered on on a host that was not selected by DRS.
- Placing a host into maintenance mode might get stuck if the host has any powered-on VMs.
- Invocation of DRS APIs such as ClusterComputeResource.placeVm() and ClusterComputeResource.enterMaintenanceMode() fails with an InvalidState fault.
- Configuration of Workload Management, Supervisor Cluster and Tanzu Kubernetes Cluster will fail.
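Automation that calls the DRS APIs listed above may need to tolerate InvalidState until the vCLS VMs are back up. The following is a minimal sketch of one way to do that, not a VMware-provided method. The InvalidState class here is a local stand-in for the fault your client library raises, and call_when_drs_ready, its parameters, and the retry counts are hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class InvalidState(Exception):
    """Local stand-in for the InvalidState fault (assumption: a real
    client library supplies its own exception type for this fault)."""


def call_when_drs_ready(
    operation: Callable[[], T],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying while it raises InvalidState.

    `operation` is any zero-argument callable that wraps a DRS API call
    (hypothetical; supply your own). The last InvalidState is re-raised
    once `attempts` calls have been made.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except InvalidState:
            if attempt == attempts:
                raise  # DRS is still not functional; let the caller decide
            sleep(delay_seconds)  # give the vCLS VMs time to come back up
    raise AssertionError("unreachable")
```

Retrying only helps when the vCLS VMs are expected to recover on their own. If they stay unavailable, the call keeps failing and the underlying vCLS health issue has to be resolved first.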
Note: If DRS is not enabled on such a cluster, then the vSphere Cluster health will be in the degraded state. In the vSphere Client UI, you see an error similar to:
vSphere DRS functionality was impacted due to unhealthy state vSphere Cluster Services caused by the unavailability of vSphere Cluster Service VMs. vSphere Cluster Service VMs are required to maintain the health of vSphere DRS.
For more information, see vSphere Cluster Services (vCLS) in vSphere 7.0 Update 1.