Troubleshooting vSphere with Tanzu (TKGS) Supervisor Control Plane VMs

Article ID: 323407

Products

VMware

Issue/Introduction

vSphere with Tanzu provisions three Supervisor Control Plane VMs, which act as the control plane for the Supervisor Cluster. You may occasionally need to troubleshoot these VMs, for example to follow the steps in a KB article or to test networking.

This KB covers the following:
-Networking details for the Supervisor Control Plane VMs.
-How to SSH into the Supervisor Control Plane VMs.
-How Supervisor Control Plane VMs correlate to EAM agencies in vCenter.

Resolution

Network Diagram of vSphere with Tanzu

vSphere With Tanzu Network Diagram v1.png

 

How to SSH into Supervisor Control Plane VMs

First, SSH into vCenter as root and run /usr/lib/vmware-wcp/decryptK8Pwd.py:

root@vcsa-01a [ ~ ]# /usr/lib/vmware-wcp/decryptK8Pwd.py
Read key from file

Connected to PSQL

Cluster: domain-c8:ac691e20-bbff-481f-a67c-3b85d0b5a1d5
IP: 192.168.110.101
PWD: PgfFTPFtW/3FEfyqGuyVPYOJQMwNhOq21t2WiPijbH4BTrkHVgJqJpRETXaoZlchRZ20hGxWtGXKVMwFR4YWJ6Bh5i8L9Do9nkzJG9AlKlwwR24WbqTfaoDUM8kNLhxE1hIPzQoDgL7oQkSRnJRcBz+UPvd82Jc8buR9nsceMbg=
------------------------------------------------------------

root@vcsa-01a [ ~ ]#


The script prints the password for the Supervisor Control Plane VMs, along with the IP address to connect to. We recommend SSH-ing from vCenter, because doing so also confirms that vCenter can reach the management network of the Supervisor Control Plane (SV) VM.

The IP address in the script output is always the floating IP (FIP). If etcd is down on the Supervisor Cluster, the FIP is not assigned, and you must instead SSH to the actual IP address (eth0) of a Supervisor Control Plane VM.

When SSH-ing to the FIP, you may get an error saying that the host key has changed. This is normal: the FIP "floats" between nodes, so the host key presented changes each time it moves. To work around the error, delete the entry for the FIP from /root/.ssh/known_hosts, or delete that file entirely.
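The steps above can be sketched as two small shell helpers to run on vCenter. This is a minimal sketch, not a VMware-provided tool: the function names list_supervisor_fips and forget_host_key are hypothetical, the parsing assumes the script output keeps the "Cluster:" / "IP:" layout shown above, and it assumes awk and ssh-keygen are available in the vCenter shell.

```shell
#!/bin/bash
# Hypothetical helpers (not shipped with vCenter); names are illustrative.

# Read decryptK8Pwd.py output on stdin and print "<cluster> <ip>" for each
# cluster. Assumes the "Cluster:" / "IP:" line layout shown above.
# The PWD line is deliberately not printed.
list_supervisor_fips() {
    awk '
        $1 == "Cluster:" { cluster = $2 }
        $1 == "IP:"      { print cluster, $2 }
    '
}

# Remove the stored SSH host key for a host, so the next connection is not
# rejected after the FIP has moved to another node.
# $1 = host or IP, $2 = known_hosts file (defaults to root's file).
forget_host_key() {
    local host="$1"
    local known_hosts="${2:-/root/.ssh/known_hosts}"
    # Nothing to do if the file does not exist yet.
    [ -f "$known_hosts" ] || return 0
    ssh-keygen -R "$host" -f "$known_hosts" >/dev/null 2>&1
}

# Example session on vCenter (the IP is the one from the output above;
# the root login is an assumption, adjust if your environment differs):
#   /usr/lib/vmware-wcp/decryptK8Pwd.py | list_supervisor_fips
#   forget_host_key 192.168.110.101
#   ssh root@192.168.110.101
```

Using ssh-keygen -R rather than editing the file by hand also handles known_hosts files whose host names are stored hashed.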

 

PLEASE NOTE: On the Supervisor Control Plane VM you have permissions that can permanently damage the cluster. If VMware Support finds evidence that a customer made changes to the Supervisor Cluster from the SV VM, they may mark the cluster as unsupported and require you to redeploy the entire vSphere with Tanzu solution. Use this session only to test networking, view logs, and run kubectl logs/get/describe commands. Do not deploy, delete, or edit anything from this session without the express direction of a KB or VMware Support.
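One way to stay within these limits is to route kubectl calls through a wrapper that passes only the read-only verbs named above. The following is a minimal sketch under stated assumptions: kubectl_ro is a hypothetical helper, not a VMware-provided tool, and it assumes kubectl is on the PATH of the session. It is a guard against typos, not a security boundary.

```shell
# Hypothetical guard: allow only the read-only kubectl verbs named in the
# note above (get, describe, logs) and refuse everything else.
kubectl_ro() {
    case "$1" in
        get|describe|logs)
            kubectl "$@"
            ;;
        *)
            echo "kubectl_ro: refusing '${1:-<none>}'; only get, describe, and logs are allowed" >&2
            return 1
            ;;
    esac
}

# Example read-only checks:
#   kubectl_ro get nodes -o wide
#   kubectl_ro get pods -A
```

The wrapper inspects only the first argument, so the verb must come first; a call such as kubectl_ro -n kube-system get pods is refused and should be written as kubectl_ro get pods -n kube-system.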

 

Supervisor Control Plane VMs and their EAM Agency/VM ID Information

Each Supervisor Control Plane VM has a corresponding EAM (ESX Agent Manager) agency.

The EAM agency is shown in the Notes section when viewing the VM. In the example below, the Notes field reads EAM Agency: vmware-vsc-apiserver-p69z67

image.png

The VM ID can be found in the URL when the VM is selected.
In the example below, the VM ID is vm-13007.

image.png

You can use the EAM agency and the VM ID to correlate the Supervisor Control Plane VMs with errors in the vCenter logs under /var/log/vmware/vpxd, /var/log/vmware/eam, and /var/log/vmware/wcp.
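The correlation described above can be sketched as a single grep over those three directories. This is a minimal sketch: find_agency_refs is a hypothetical helper name, and the agency and VM ID are the example values from this article. It searches plain-text logs only, so rotated logs that have been compressed are not covered.

```shell
# Hypothetical helper: print log lines that mention an EAM agency or a VM ID.
# $1 = EAM agency name, $2 = VM ID, remaining args = directories to search
# (defaults to the three vCenter log directories named above).
find_agency_refs() {
    local agency="$1" vm_id="$2"
    shift 2
    if [ "$#" -eq 0 ]; then
        set -- /var/log/vmware/vpxd /var/log/vmware/eam /var/log/vmware/wcp
    fi
    # -r recurse, -n show line numbers, -F literal strings, -w whole words
    # (so vm-13007 does not also match vm-130071).
    grep -rnFw -e "$agency" -e "$vm_id" -- "$@"
}

# Example, run on vCenter:
#   find_agency_refs vmware-vsc-apiserver-p69z67 vm-13007
```

Each output line is prefixed with the file name and line number, which makes it easier to line up events across the vpxd, eam, and wcp logs by timestamp.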

IMPORTANT NOTE: EAM agencies can be deleted manually from the web client via Menu -> Administration -> vCenter Server Extensions -> vSphere ESX Agent Manager -> Configure. Deleting an EAM agency will DELETE the corresponding Supervisor Control Plane VM, and a new one will be created. THIS IS NOT A VALID TROUBLESHOOTING METHOD FOR CUSTOMERS. Do not delete EAM agencies without the EXPRESS permission of a VMware Support engineer. Depending on the versions in use and the existing health of the Supervisor Cluster, deleting an agency can render the entire cluster unrecoverable. If VMware Support finds evidence that a customer manually deleted an EAM agency, they may mark the cluster as unsupported and require you to redeploy the entire vSphere with Tanzu solution.