This article provides procedures for replacing a single node from a backup in a 3-node NSX-T Manager cluster within a VCF environment.
The NSX cluster's health is reported as degraded while an NSX Manager node is faulty, and this state can block several VCF operations.
In this procedure, the 3-node NSX-T manager cluster has a single node down. In this example, there are 3x NSX-T manager nodes in the MGMT cluster:
If you do not know the admin and root passwords, follow the instructions in the password chapter of the VMware Cloud Foundation Operations and Administration guide to retrieve them from the SDDC Manager inventory. If you want to change these two passwords, do so after restoring the NSX-T Manager VMs, using the SDDC Manager password update function.
Note: Make sure you download the OVA image before you start and check the md5sum of the OVA file as well. Section 2.2.4 lists the procedures for determining the specific OVA to download.
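The checksum comparison mentioned in the note can be scripted. The following is a minimal sketch, assuming a Linux workstation with GNU md5sum available; the function name verify_md5 is hypothetical, and the expected digest must be taken from the download page for your specific OVA.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a file's MD5 digest with a published value.
# Usage: verify_md5 <file> <expected-md5>
verify_md5() {
  local file="$1" expected="$2" actual
  [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(md5sum "$file" | awk '{print $1}')"
  # Published digests are sometimes uppercase, so normalize before comparing.
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (computed $actual)" >&2
    return 1
  fi
}
```

For example: verify_md5 nsx-unified-appliance-2.4.2.1.0.14374085.ova <published md5>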
2.2.1. Power off faulty NSX-T manager VM
2.2.2. Delete faulty NSX-T manager VM
Note: The steps for deleting the manager differ depending on whether the node is primary or secondary.
First, obtain the UUID of the faulty NSX-T Manager. SSH into an operational NSX-T Manager:
ssh admin@vi1nsxmanager3
Issue the get cluster status command and record the UUID of the faulty NSX-T Manager:
get cluster status
Issue the detach command:
detach node <uuid>
For example: detach node 77a01dab-58f1-4a86-8134-0bfc3e0c40d9
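Because the UUID is copied by hand from the get cluster status output, a quick format check on a workstation can catch a truncated or broken paste before the detach command is issued. This is only a sketch; the function names is_uuid and detach_cmd are hypothetical, and the check confirms the shape of the value, not that it belongs to the faulty node.

```shell
#!/usr/bin/env bash
# Succeed if the argument looks like a canonical 8-4-4-4-12 hex UUID.
is_uuid() {
  [[ "$1" =~ ^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$ ]]
}

# Print the NSX CLI detach command for a UUID, or fail if it is malformed.
detach_cmd() {
  is_uuid "$1" || { echo "not a UUID: $1" >&2; return 1; }
  printf 'detach node %s\n' "$1"
}
```

For example, detach_cmd 77a01dab-58f1-4a86-8134-0bfc3e0c40d9 prints the command line to paste into the NSX CLI session.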
After removing one node from the cluster and before adding a new one, run the "get cluster status" command again and verify that the services are UP on the remaining nodes.
The new node cannot be added from the NSX-T manager UI. The NSX-T manager VMs are in the MGMT cluster, but the NSX-T manager cluster only knows about the VI WLD vCenter, not the MGMT domain vCenter, so the add nodes wizard will not allow adding a node to the MGMT cluster.
For example: 2.4.2.1.0.14374085
nsx-unified-appliance-2.4.2.1.0.14374085.ova, click Next
For example:
fqdn=vi1nsxmanager2.dellrack1.vmware.corp
role=nsx-manager-nsx-controller
gateway=172.17.110.1
ipv4=172.17.110.24
netmask=255.255.255.0
dns=172.17.110.251
domain=dellrack1.vmware.corp
ntp=172.17.110.251
ssh=enabled (checked)
allowroot=disabled (unchecked)
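A mistyped address in the deployment properties is easy to miss until the node fails to join. As a sanity check, the sketch below tests whether the ipv4 and gateway values fall in the same subnet for the given netmask. The function names are hypothetical, and the check assumes plain dotted-quad values without leading zeros.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if <ip> and <gateway> share the same network under <netmask>.
# Usage: same_subnet <ip> <gateway> <netmask>
same_subnet() {
  local ip gw mask
  ip="$(ip_to_int "$1")"
  gw="$(ip_to_int "$2")"
  mask="$(ip_to_int "$3")"
  [ $(( ip & mask )) -eq $(( gw & mask )) ]
}
```

With the example values, same_subnet 172.17.110.24 172.17.110.1 255.255.255.0 succeeds.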
ssh admin@vi1nsxmanager1
get cluster config | find Id:
get certificate api thumbprint
ssh admin@vi1nsxmanager2
join <vi1nsxmanager1 ip> cluster-id <uuid> thumbprint <thumbprint> username admin
For example:
join 172.17.110.23 cluster-id 3ca96913-7d42-4cce-a69c-365a7c52b545 thumbprint dd35cf8826bcb9cd6bd21deddb81a7447cc726fcfa393d71781d492a3302ca1e username admin
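The join command combines three values gathered in the earlier commands, so it can help to assemble it on a workstation and paste the result into the NSX CLI. The sketch below uses a hypothetical function name, build_join_cmd, and assumes the API thumbprint is a 64-character hexadecimal string, as in the example above.

```shell
#!/usr/bin/env bash
# Print the NSX CLI join command from its three inputs.
# Usage: build_join_cmd <existing-node-ip> <cluster-id> <api-thumbprint>
build_join_cmd() {
  local ip="$1" cluster_id="$2" thumbprint="$3"
  # Assumption: the thumbprint is 64 hex characters, matching the example.
  [[ "$thumbprint" =~ ^[0-9a-fA-F]{64}$ ]] || {
    echo "thumbprint is not 64 hex characters" >&2
    return 1
  }
  printf 'join %s cluster-id %s thumbprint %s username admin\n' \
    "$ip" "$cluster_id" "$thumbprint"
}
```

The printed line is run on the new node (vi1nsxmanager2 in this example), not on the node the values were read from.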
For example:
Issued By=CA
Issued To=vi1nsxmanager2.dellrack1.vmware.corp
curl -H 'Accept: application/json' -H 'Content-Type: application/json' \
--insecure -u 'admin:<admin password>' -X POST \
'https://<new nsx-t mgr fqdn or ip>/api/v1/node/services/http?action=apply_certificate&certificate_id=<certificate id>'
For example:
curl -H 'Accept: application/json' -H 'Content-Type: application/json' \
--insecure -u 'admin:VMware123!VMware123!' -X POST \
'https://vi1nsxmanager2.dellrack1.vmware.corp/api/v1/node/services/http?action=apply_certificate&certificate_id=24781ed5-7721-49bb-801d-cc8a4415d60e'
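The same request can be wrapped in a small script so the URL is built consistently and the admin password stays out of the shell history. This is a sketch with hypothetical function names (apply_cert_url, apply_cert); it relies on curl prompting for the password when -u is given a user name only.

```shell
#!/usr/bin/env bash
# Build the apply_certificate URL for a node and certificate ID.
apply_cert_url() {
  local host="$1" cert_id="$2"
  printf 'https://%s/api/v1/node/services/http?action=apply_certificate&certificate_id=%s\n' \
    "$host" "$cert_id"
}

# Send the POST request. With "-u admin" and no password, curl prompts
# for it interactively instead of reading it from the command line.
apply_cert() {
  curl -H 'Accept: application/json' -H 'Content-Type: application/json' \
    --insecure -u admin -X POST "$(apply_cert_url "$1" "$2")"
}
```

For example: apply_cert vi1nsxmanager2.dellrack1.vmware.corp 24781ed5-7721-49bb-801d-cc8a4415d60e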
Specific to VCF 4.0: If assigning the certificate fails because the certificate revocation list (CRL) could not be verified, please follow the steps in KB 78794 to address the problem. If you decide to disable CRL checking in order to assign the certificate, re-enable CRL checking once the certificate has been assigned.
2.2.7.2 If the certificate does not exist, follow the procedure in Replace Expired or Self-signed NSX-T Manager Certificates with VMCA-Signed Certificates.
For example:
echo | openssl s_client -no_ign_eof -showcerts -connect \
vi1nsxmanager2.dellrack1.vmware.corp:443 > nsx2.pem
For example:
openssl x509 -in nsx2.pem -noout -text | more
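The file written by openssl s_client contains session text and, with -showcerts, every certificate in the chain, with the server certificate first. To inspect only the server certificate, the first PEM block can be isolated before it is passed to openssl x509. The helper name first_pem_cert below is hypothetical; this is a sketch, not part of the documented procedure.

```shell
#!/usr/bin/env bash
# Print only the first PEM certificate block read from stdin.
first_pem_cert() {
  awk '/-----BEGIN CERTIFICATE-----/ { p = 1 }
       p { print }
       /-----END CERTIFICATE-----/ { if (p) exit }'
}
```

For example: first_pem_cert < nsx2.pem | openssl x509 -noout -subject -issuer -enddate
This prints the subject, issuer, and expiry date without paging through the full text output.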
This step is specific to VCF 4.0. As a final step, update the SSH keys that SDDC Manager saves for the NSX-T Managers. VMware offers a script that automates this process. Follow the Refresh SDDC Manager SSH Keys procedure documented in KB 79004.
When Cloud Foundation deploys NSX-T Manager, it creates a VM anti-affinity rule to prevent the VMs of the same NSX-T Manager cluster from running on the same host. In this step, you need to add the newly deployed replacement VM to the rule for this NSX-T Manager cluster.
Log in to the management domain vCenter Server, and select Menu > Hosts and Clusters.
In the Navigator pane, select the management cluster.
Select Configure > VM/Host Rules.
Finally, add the VM to the correct "separate virtual machine" rule. The rule for the management-domain NSX-T Manager cluster is named anti-affinity-rule-nsxt, while the rule for workload domains has the form "<NSXT Mgr VIP FQDN> - NSX-T Managers Anti Affinity Rule". Once you locate the rule, click edit, and add the newly deployed VM (e.g., vi1nsxmanager2) to it.