vSphere Cluster Services (vCLS) Known Issues/Corner Cases

Article ID: 316510


Products

VMware vCenter Server

Issue/Introduction

The list below includes known issues and corner cases to be aware of, but it is not exhaustive. New issues and corner cases will be added periodically.

Environment

VMware vCenter Server 8.0.x
VMware vCenter Server 7.0.x

Resolution

  • Performing any unsupported operation on vCLS VMs, such as configuring FT, DRS rules, or HA overrides on these VMs, cloning them, or moving them under a resource pool or vApp, can impact the health of vCLS for that cluster, resulting in DRS becoming non-functional.
    • Supported operations on vCLS VMs include migrating them to different hosts or datastores and attaching tags or custom attributes to them.
  • Because vCLS VMs are deployed as soon as the first host is added to a new cluster, any test scripts that validate an empty cluster in a greenfield deployment should be changed so that vCLS VMs are excluded from cluster capacity checks.
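For example, a validation script that counts the VMs in a supposedly empty cluster could filter out vCLS VMs by name. This is a minimal sketch; the name-matching pattern is an assumption (vCLS VM names typically begin with "vCLS") and may need adjusting for your environment:

```python
import re

# vCLS VMs are typically named "vCLS (1)", "vCLS-xxxx", etc.; matching on the
# "vCLS" prefix is an assumption -- verify the naming in your environment.
VCLS_NAME = re.compile(r"^vCLS\b")

def workload_vms(vm_names):
    """Return the VM names that are not vCLS system VMs."""
    return [name for name in vm_names if not VCLS_NAME.match(name)]

def cluster_is_empty(vm_names):
    """Treat a cluster as empty if it contains only vCLS VMs."""
    return not workload_vms(vm_names)
```

A capacity check would then operate on `workload_vms(...)` rather than the raw inventory list.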
  • In a DRS-enabled cluster, any operation that invokes the DRS algorithm will fail until the first vCLS VM for the cluster is powered on.
    • The following operations may fail while DRS is non-functional:
      • A new workload VM placement/power-on. 
      • Host selection for a VM that is migrated from another cluster/host within the vCenter Server. 
      • A migrated VM could power on on a host not selected by DRS.
      • Placing a host into Maintenance Mode might get stuck if the host has any powered-on VMs.
      • Invocations of DRS APIs such as ClusterComputeResource.placeVm() and ClusterComputeResource.enterMaintenanceMode() will fail with an InvalidState fault.
    • The following operations do not depend on vCLS health, so users can perform them independent of vCLS VM deployment:
      • Resource pool creation. 
      • DRS configuration (such as automation level, overrides etc.). 
      • Addition/editing of VM/Host rules. 
      • vSphere with Tanzu Supervisor Cluster configuration.
  • Because vCLS VMs are deployed right after the first hosts are added to the cluster, these VMs could be placed on the hosts' local datastores if shared storage is not yet configured for the cluster. These VMs are not automatically migrated when shared storage is later configured. VMware recommends manually migrating the vCLS VMs to a shared datastore after shared storage is configured for the cluster. If they are not migrated, the vCLS VMs are not protected against host failures by HA.
  • Do not update the hardware version of vCLS VMs. They are kept at hardware version 11 so that they remain backward compatible with vSphere 6.5.
  • All vCLS VMs for a datacenter inside a vCenter Server are stored in a specific folder named vCLS. Do not rename or delete this folder; doing so could prevent new vCLS VMs from being created for clusters, impacting the health of vCLS.
  • In an HA-enabled cluster configured with the Slot Policy admission control, fewer workload VMs than expected might power on, because vCLS VMs consume slots.
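The effect on Slot Policy clusters can be illustrated with a simple slot calculation. This is a sketch of the arithmetic only (each powered-on VM, including a vCLS VM, consumes one slot), not vCenter Server's actual admission-control code:

```python
def available_workload_slots(total_slots, failover_slots, vcls_vm_count):
    """Slots left for workload VMs after the HA failover reservation and
    the slots consumed by powered-on vCLS VMs are subtracted.
    """
    return max(0, total_slots - failover_slots - vcls_vm_count)

# Example: a cluster with 40 slots reserving 10 for failover. With 3 vCLS
# VMs powered on, only 27 workload VMs can power on instead of 30.
```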
  • CPU and memory consumed by vCLS VMs are not displayed on the VM summary page in the vSphere Client, as these are small VMs.
  • When downsizing a cluster (reducing the number of hosts), more vCLS VMs than necessary might be left running. In this situation, some or all of these vCLS VMs could also reside on the same host.
  • Orphaned vCLS VMs might appear as workload VMs in the Hosts and Clusters view, because EAM does not delete orphaned VMs as part of cleanup when a host containing an orphaned VM is added to the cluster.
    • Manually unregister these VMs from the vCenter Server inventory.
  • Deleting a cluster or removing hosts without first placing the hosts into Maintenance Mode might leave some vCLS VMs in an orphaned state. When such hosts are added back to clusters, these VMs might conflict with new vCLS VMs. It is recommended to add hosts using one of the following methods to avoid this issue:
    1. Add the host to the vCenter Server inventory as a standalone host and then move it into the cluster.
    2. Power off all the VMs running on the host and then add the host.
  • vCLS VMs cannot be evacuated when a host enters Quarantine Mode triggered by Proactive HA.
  • When DPM is configured on a cluster, a host running a vCLS VM cannot be placed into standby mode, even when no workload VMs are running on it.
  • vCLS VMs cannot run on a host where VT-x is disabled. Because the software MMU is deprecated starting with vSphere 6.7, vCLS VMs require VT-x (or AMD-V) to be enabled along with nested page table support.
  • If you use an esxcli command to put a host into Maintenance Mode while the host is managed by a cluster in a running vCenter Server and has vCLS VMs, the Maintenance Mode task, and therefore the CLI execution, will hang. The workaround is to power off the vCLS VMs after running the command, either by logging in to the vSphere Client or the ESXi Host Client, or through esxcli in a new session. The power-off will succeed because the host is in the Entering Maintenance Mode state, and any new power-on operation on a host in that state will fail.
  • In deployments that use per-VM licensing, such as vSphere for ROBO, vCLS VMs are shown in the licensing UI as counted toward the licensed VMs. However, these VMs are not actually counted against the license, as they are system VMs.
  • ESXi 6.5 hosts with AMD Opteron Generation 3 (Greyhound) processors cannot join Enhanced vMotion Compatibility (EVC) AMD REV E or AMD REV F clusters on a vCenter Server 7.0 Update 1 system. The CPU baseline for AMD processors of the ESX agent virtual machines includes the POPCNT and SSE4A instructions, which prevents ESXi 6.5 hosts with AMD Opteron Generation 3 (Greyhound) processors from enabling EVC mode AMD REV E or AMD REV F on a vCenter Server 7.0 Update 1 system.
  • VMDK removed or corrupted: when a VMDK is removed from a vCLS VM, the VM becomes orphaned. In this scenario, vCLS VMs are not automatically recreated, because VMs can enter the orphaned state for various reasons, including vSphere HA management of VMs. To recover, manually delete the orphaned vCLS VMs; new VMs will then be created.

Special considerations

  • When a host fails in an HA-enabled cluster and that host was running vCLS VMs, HA powers these VMs on on a different host, provided shared storage is configured for the cluster. In certain cases, ESX Agent Manager might also try to power on these VMs, resulting in some failed tasks, but the VMs will be powered on to maintain vCLS health. You can ignore these task failures.
  • If you need to downgrade vCenter Server to an older version that does not support vCLS, clean up the vCLS VMs by manually deleting them after the downgrade completes successfully.