Duplicate VTEPs in ESXi hosts after rebooting vCenter Server

Article ID: 339061

Updated On:

Products

VMware NSX Networking

Issue/Introduction

This article describes how to remove the duplicate VTEPs that appear on VXLAN-prepared ESXi hosts after rebooting vCenter Server 6.0.

Symptoms:
Duplicate Virtual Tunnel End Points (VTEPs) are configured in VXLAN prepared ESXi hosts after rebooting vCenter Server 6.0.

Note: For additional symptom and log entries, see the Additional Information section.

Environment

VMware NSX for vSphere 6.1.x
VMware NSX for vSphere 6.2.x

Cause

While vCenter Server 6.0 is initializing after a service restart or a machine reboot, it may take longer than usual to query the ESXi host configuration. Because of this delay, NSX Manager receives an update indicating that the host config is NULL.

NSX Manager then removes the VTEP information associated with the ESXi hosts from its database, and the IP addresses are made available for reuse. Later, NSX Manager tries to correct the problem by creating a new VTEP on each affected ESXi host and may reuse the previous IP addresses. At this point, an ESXi host has more VTEPs than expected, and there may be VTEPs with duplicate IP addresses in the environment.
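The end state described above (too many VTEPs on a host, or one IP address shared by several VTEPs) can be checked mechanically once the VTEP vmknics and their addresses have been collected. The following is a minimal sketch, not part of the attached script. The function name `find_vtep_problems`, the `expected_per_host` parameter, and the sample inventory are all hypothetical; how the inventory is gathered from real hosts is outside its scope.

```python
from collections import defaultdict


def find_vtep_problems(vteps_by_host, expected_per_host=1):
    """Return (hosts with too many VTEPs, IPs assigned to more than one VTEP).

    vteps_by_host maps a host ID to a list of (vmknic, ip) tuples.
    expected_per_host is an assumption; set it to match the VXLAN
    teaming configuration of the cluster.
    """
    # Hosts carrying more VTEP vmknics than the configuration calls for.
    extra = {
        host: len(vteps)
        for host, vteps in vteps_by_host.items()
        if len(vteps) > expected_per_host
    }

    # Map each IP address to every (host, vmknic) pair that uses it.
    users = defaultdict(list)
    for host, vteps in vteps_by_host.items():
        for vmknic, ip in vteps:
            users[ip].append((host, vmknic))
    duplicates = {ip: owners for ip, owners in users.items() if len(owners) > 1}
    return extra, duplicates


# Hypothetical inventory: host-75 kept its old vmk2 and also received a
# new vmk3 that reuses the released address.
inventory = {
    "host-75": [("vmk2", "192.0.2.10"), ("vmk3", "192.0.2.10")],
    "host-27": [("vmk2", "192.0.2.11")],
}
extra, duplicates = find_vtep_problems(inventory)
print("Hosts with extra VTEPs:", extra)
print("Duplicate VTEP IPs:", duplicates)
```

Any host or address reported by such a check would be a candidate for the workaround in the Resolution section.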

Resolution

This issue is resolved in VMware NSX for vSphere 6.2.4 with vSphere 6.0 Update 3, available at VMware Downloads.

To work around this issue if you do not want to upgrade, remove the duplicate VTEPs:
  1. Place the affected ESXi host in maintenance mode.
  2. Delete all VTEP vmknics on the ESXi host.
  3. In the vSphere client, select the ESXi host and click Configuration > Networking > vSphere Distributed Switch.
  4. Select Manage Virtual Adapters for the VXLAN vDS and remove the vmkernel interface.
  5. Exit maintenance mode.
  6. Repeat Steps 1-5 for all affected ESXi hosts.
  7. Click Network & Security > Installation > Host Preparation, select the cluster that the host belongs to, and initiate a force-sync of VXLAN from the UI.
  8. Verify networking on virtual machines on the remediated ESXi hosts.

To prevent this issue from occurring:

Keep NSX Manager down while vCenter Server is restarted. Start NSX Manager only after vCenter Server is fully initialized and shows the correct network configuration for all hosts.

Use the attached script to determine when it is safe to start NSX Manager after a restart of vCenter Server.

  1. Download the 2144605_checkESXconfig.zip file attached to this Knowledge Base article.
  2. Copy the 2144605_checkESXconfig.py script to the ESXi host or to the VCSA using SCP and run the script:

    # python 2144605_checkESXconfig.py --server vCenter_server --user username --password password

    Notes:
    • This script connects to vCenter Server and queries the config for all ESXi servers in the vCenter Server environment.
    • The default security policy blocks outbound connections on port 443. If the script fails, modify the firewall under Security Profile to enable httpClient so that outbound connections on port 443 are allowed.

    You see this output if no ESXi host has a NULL config value:

    # python checkESXconfig.py --server myvcenter.corp.local --user [email protected] --password mycorppassword123
    API thread output:
    Hosts with NULL config : {}
    Hosts with NULL config.network : {}
    PC thread output:
    Hosts with NULL config : {}
    Hosts with NULL config.network : {}
    vCenter initialisation has completed, it is now ok to start the NSX Manager service.

    You see this output if one or more ESXi hosts have a NULL config value:

    # python checkESXconfig.py --server myvcenter.corp.local --user [email protected] --password mycorppassword123
    API thread output:
    Hosts with NULL config : {}
    Hosts with NULL config.network : {'vim.HostSystem:host-27': 1, 'vim.HostSystem:host-481': 1, 'vim.HostSystem:host-10': 1}
    PC thread output:
    Hosts with NULL config : {}
    Hosts with NULL config.network : {'vim.HostSystem:host-27': 1, 'vim.HostSystem:host-481': 1, 'vim.HostSystem:host-10'
    Waiting for 10 seconds
    Writing Hosts with NULL config to file ConfigIssueHosts.txt
    vCenter is still initialising, do not start the NSX Manager Service. Please check the output files generated for affected hosts and apply the workaround.


    Notes:
    • In this case, three ESXi hosts (host-27, host-481 and host-10) have NULL values.
    • The ConfigIssueHosts.txt output file contains the DNS names of these ESXi hosts.

  3. If you do not see hosts with a NULL value in the output, start the NSX Manager service.

    If you see host(s) with NULL values in the output:

    Open an SSH session to each affected ESXi host and restart the hostd and vpxa management services by running these commands:

    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart


    Note: This may cause the ESXi host to disconnect from vCenter Server briefly, but running virtual machines are not impacted.

Note: If you reboot vCenter Server 6.0 while ESXi hosts are in a Disconnected or Not Responding state, those ESXi hosts report a NULL config, which is expected. VMware recommends ensuring that all ESXi hosts are connected and responding in vCenter Server before rebooting.
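If the script output from step 2 is captured to a file or variable, the decision in step 3 can be made programmatically. The following is a minimal sketch that assumes the output keeps the "Hosts with NULL ..." line format shown above. The function names `hosts_with_null_config` and `safe_to_start_nsx_manager` are hypothetical and are not part of the attached script. Host IDs are extracted with a regular expression rather than by evaluating the dictionary text, so a truncated line is still handled.

```python
import re

# Matches both "Hosts with NULL config : {...}" and
# "Hosts with NULL config.network : {...}" and captures the value part.
NULL_LINE = re.compile(r"Hosts with NULL config(?:\.network)? : (.*)")
# Matches managed object references such as 'vim.HostSystem:host-27'.
HOST_ID = re.compile(r"vim\.HostSystem:(host-\d+)")


def hosts_with_null_config(output):
    """Collect the host IDs reported on any 'Hosts with NULL ...' line."""
    hosts = set()
    for line in output.splitlines():
        match = NULL_LINE.search(line)
        if match:
            hosts.update(HOST_ID.findall(match.group(1)))
    return hosts


def safe_to_start_nsx_manager(output):
    """True only when no host is reported with a NULL value."""
    return not hosts_with_null_config(output)


# Sample text copied from the output shown in step 2.
sample = """API thread output:
Hosts with NULL config : {}
Hosts with NULL config.network : {'vim.HostSystem:host-27': 1, 'vim.HostSystem:host-481': 1, 'vim.HostSystem:host-10': 1}
"""
print(sorted(hosts_with_null_config(sample)))
print(safe_to_start_nsx_manager(sample))
```

The message printed by the script itself remains the authoritative signal; a check like this is only a convenience for automation.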


Additional Information

You experience these additional symptoms:
When you run the show log command on the NSX Manager console, you see entries similar to:
  • In NSX for vSphere 6.2.x:

    INFO DCNPool-4 VdnHostInstallationServiceImpl:691 - Host host-75 has 0 vmknics

    INFO DCNPool-4 VdnHostInstallationServiceImpl:699 - vmknic vmk2 does not appear in the host host-75, remove it from database

    INFO TaskFrameworkExecutor-28 VdnHostInstallationServiceImpl:359 - Host host-75 found (new or vmknic missing). Initializing preparation

    INFO TaskFrameworkExecutor-6 PrepareVdsHostTask:212 - Creating vmknic on host host-75 pg dvportgroup-179 ipaddress xxx.xxx.xxx.xxx

  • In NSX for vSphere 6.1.x and later:

    ERROR ViInventoryThread ViManagedHostSystemObject:358 - no hostNetInfo
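To look for these entries in a saved copy of the NSX Manager log, a small pattern scan may be enough. The following is a minimal sketch that assumes the message text matches the entries quoted above. The symptom names in `PATTERNS` and the function `scan_log` are hypothetical labels chosen for this example.

```python
import re

# Patterns derived from the log entries quoted in this article.
PATTERNS = {
    "zero_vmknics": re.compile(r"Host (host-\d+) has 0 vmknics"),
    "vmknic_removed": re.compile(
        r"vmknic (vmk\d+) does not appear in the host (host-\d+), "
        r"remove it from database"
    ),
    "vmknic_created": re.compile(r"Creating vmknic on host (host-\d+)"),
    "no_host_net_info": re.compile(r"no hostNetInfo"),
}


def scan_log(lines):
    """Return a list of (symptom name, captured groups) for matching lines."""
    hits = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((name, match.groups()))
    return hits


# Sample lines copied from the entries above.
sample_log = [
    "INFO DCNPool-4 VdnHostInstallationServiceImpl:691 - Host host-75 has 0 vmknics",
    "INFO DCNPool-4 VdnHostInstallationServiceImpl:699 - vmknic vmk2 does not "
    "appear in the host host-75, remove it from database",
    "ERROR ViInventoryThread ViManagedHostSystemObject:358 - no hostNetInfo",
]
for name, groups in scan_log(sample_log):
    print(name, groups)
```

A host that shows "zero_vmknics" followed by "vmknic_removed" and then "vmknic_created" fits the sequence described in the Cause section.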

Note: This issue does not occur in vCenter Server 5.5 environments because vCenter Server 5.5 caches the host configuration in the database.

In rare cases, vCenter Server may restart in an uncontrolled fashion, for example when the vCenter Server machine crashes or runs out of disk space. In such cases it may not be possible to stop the NSX Manager service before the vCenter Server service restarts. If this scenario is of concern, the vCenter Server service and its dependent services can be set not to start automatically on OS boot.

This can be configured under Services in Windows or by using chkconfig on a vCSA.

Disable the automatic startup of vCenter Server services:
  • chkconfig -s vmware-eam off --level 0123456
  • chkconfig -s vmware-perfcharts off --level 0123456
  • chkconfig -s vmware-sps off --level 0123456
  • chkconfig -s vmware-vdcs off --level 0123456
  • chkconfig -s vmware-vpx-workflow off --level 0123456
  • chkconfig -s vmware-vsm off --level 0123456
  • chkconfig -s vsphere-client off --level 0123456
  • chkconfig -s vmware-vsan-health off --level 0123456
  • chkconfig -s vmware-vpxd off --level 0123456
Manually starting services after a reboot:
  • service-control --start vmware-vpxd
  • service-control --start vmware-eam
  • service-control --start vmware-perfcharts
  • service-control --start vmware-sps
  • service-control --start vmware-vdcs
  • service-control --start vmware-vpx-workflow
  • service-control --start vmware-vsm
  • service-control --start vsphere-client
  • service-control --start vmware-vsan-health
Re-enable the automatic startup of vCenter services:
  • chkconfig -s vmware-vpxd on --level 35
  • chkconfig -s vmware-eam on --level 35
  • chkconfig -s vmware-perfcharts on --level 35
  • chkconfig -s vmware-sps on --level 35
  • chkconfig -s vmware-vdcs on --level 35
  • chkconfig -s vmware-vpx-workflow on --level 35
  • chkconfig -s vmware-vsm on --level 35
  • chkconfig -s vsphere-client on --level 35
  • chkconfig -s vmware-vsan-health on --level 35
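Because the three lists above differ only in the command template and the position of vmware-vpxd, they can be generated from a single service list to avoid typing mistakes. The following is a minimal sketch that only builds the command strings and does not execute anything. The function names are hypothetical; the service names and command forms are copied from the lists above.

```python
# Service names copied from the lists above. vmware-vpxd is first, matching
# the manual-start and re-enable lists.
SERVICES = [
    "vmware-vpxd",
    "vmware-eam",
    "vmware-perfcharts",
    "vmware-sps",
    "vmware-vdcs",
    "vmware-vpx-workflow",
    "vmware-vsm",
    "vsphere-client",
    "vmware-vsan-health",
]


def disable_commands(services=SERVICES):
    """Disable automatic startup; vmware-vpxd goes last, as in the list above."""
    ordered = services[1:] + services[:1]
    return [f"chkconfig -s {name} off --level 0123456" for name in ordered]


def start_commands(services=SERVICES):
    """Start services manually after a reboot, vmware-vpxd first."""
    return [f"service-control --start {name}" for name in services]


def enable_commands(services=SERVICES):
    """Re-enable automatic startup."""
    return [f"chkconfig -s {name} on --level 35" for name in services]


for command in start_commands():
    print(command)
```

Review the generated lines against the lists above before running them on a vCSA.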
For more information, see:
  • Enabling trivia logging in VMware vCenter Server
  • Collecting diagnostic information for ESX/ESXi hosts and vCenter Server using the vSphere Web Client
  • Collecting diagnostic information for VMware NSX for vSphere 6.x

Attachments

2144605_checkESXconfig.zip