vSAN Health Service - Network Health - Unexpected vSAN cluster members

Article ID: 315548

Products

VMware vSAN

Issue/Introduction

This article explains the Network Health - Unexpected vSAN cluster members check in the vSAN Health Service and provides details on why it might report an error.


Environment

VMware vSAN 8.0.x
VMware vSAN 7.0.x
VMware vSAN 6.7.x
VMware vSAN 6.5.x
VMware vSAN 6.0.x

Resolution

Q: What does the Network Health - Unexpected vSAN cluster members check do?

This health check verifies that all ESXi hosts participating in vSAN are part of the same vSphere cluster. This matters because cluster-wide features such as Distributed Resource Scheduler (DRS) and vSphere High Availability (HA) cannot include ESXi hosts that are not part of the vSphere cluster, which can lead to operational issues.
 
This check compares the vSphere cluster members to the vSAN cluster members. If you only use the vCenter Server to manage vSAN, this check should never fail, as by definition a vSAN cluster and a vSphere cluster should in effect have all the same members.
 
However, if you have used the command line to change cluster membership at any time (for example, esxcli vsan cluster join), it is quite possible to create a misconfigured cluster in which an ESXi host that participates in vSAN is not part of the vSphere cluster. Another possible cause is the use of host profiles.

Q: What does it mean when it is in an error state?

Even though the ESXi host might not be part of the vSphere cluster, vSAN still uses the host to store data and service I/O. In other words, the datastore continues to function correctly.
 
ESXi hosts that are disconnected from their vCenter Server could show up in this way, and reconnecting them to the vCenter Server will resolve this health check issue.
 
However, when the cluster is in such a situation, it can give rise to operational hazards. As the ESXi host is not tracked as a part of the vSphere cluster, it is very easy to overlook the critical role the ESXi host plays in the availability and persistence of data on vSAN.
 
For example, inadvertently rebooting the ESXi host, repurposing it for another use, or simply placing it into maintenance mode may cause issues in the vSAN cluster and impact the availability of the virtual machines running on the cluster. The vSphere Web Client may not reflect that impact. No warning may be generated that the ESXi host is about to be repurposed for some other use, because vCenter Server does not recognize the ESXi host as part of the cluster.

Q: How does one troubleshoot and fix the error state?

To get an overall view of the vSAN cluster state, Ruby vSphere Console (RVC) commands such as vsan.cluster_info can help. This command displays all ESXi hosts that are participating in vSAN. Compare its output against the list of hosts in the vSphere cluster to determine which host is not included.
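
The comparison described above can be sketched as a small shell function run on an administrative workstation. This is only an illustration: the two input files are assumptions (one host name per line, gathered by hand from the vsan.cluster_info output and from the vCenter Server inventory), and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print hosts that participate in vSAN but are missing from the
# vSphere cluster. Both input files are assumed to hold one host name
# per line, using the same naming form (for example, both FQDNs).

unexpected_vsan_members() {
    # $1: file listing vSAN member hosts
    # $2: file listing vSphere cluster hosts
    # grep -F: fixed strings, -x: whole-line match, -v: invert,
    # -f: read the patterns (vSphere hosts) from a file.
    grep -Fxv -f "$2" "$1" | sort -u
}
```

For example, unexpected_vsan_members vsan-hosts.txt vsphere-hosts.txt prints every host that appears only in the vSAN list. An empty result suggests that the two memberships match.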
 
To get an individual host's view of the cluster, run this command on that host:

esxcli vsan cluster get
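
To make the output of that command easier to compare across hosts, the membership fields can be extracted with standard text tools. The sketch below assumes the output contains lines labelled "Sub-Cluster Member Count:" and "Sub-Cluster Member UUIDs:" (with the UUIDs comma-separated); the exact layout may differ between releases, so check it against real output first. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: read `esxcli vsan cluster get` output on stdin and extract
# the membership fields. The field labels are assumptions about the
# output layout and may need adjusting for a given release.

member_count() {
    # Print the value of the "Sub-Cluster Member Count" line.
    sed -n 's/^ *Sub-Cluster Member Count: *//p'
}

member_uuids() {
    # Print one member UUID per line, with surrounding spaces removed.
    sed -n 's/^ *Sub-Cluster Member UUIDs: *//p' |
        tr ',' '\n' |
        sed 's/^ *//; s/ *$//' |
        sed '/^$/d'
}
```

On a host, this could be used as esxcli vsan cluster get | member_uuids. If hosts report different member lists, the vSAN cluster may be partitioned rather than merely out of step with the vSphere cluster.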
 
Check why the ESXi host is part of the vSAN cluster but not part of the vSphere cluster. If the ESXi host was joined to the vSAN cluster in error using the command-line interface (CLI), the esxcli vsan cluster leave command can be used to remove the host from the vSAN cluster.
 
Before doing so, place the ESXi host into maintenance mode with the full data migration option so that all data is evacuated and data availability is preserved.
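
The order of operations (evacuate first, then leave) can be sketched as follows. This is a cautious illustration rather than a supported procedure: it assumes that the maintenance mode command accepts --vsanmode evacuateAllData for full data migration, and it only prints the commands unless DRY_RUN=0 is set. The run and remove_host_from_vsan helpers are hypothetical.

```shell
#!/bin/sh
# Sketch: evacuate a host and then remove it from the vSAN cluster.
# By default nothing is executed; each command is only printed.
# Set DRY_RUN=0 on the ESXi host to run the commands for real.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

remove_host_from_vsan() {
    # Step 1: enter maintenance mode with full data migration
    # (assumed flag value: evacuateAllData). Stop if this fails.
    run esxcli system maintenanceMode set --enable true --vsanmode evacuateAllData || return 1
    # Step 2: only after evacuation succeeds, leave the vSAN cluster.
    run esxcli vsan cluster leave
}
```

Calling remove_host_from_vsan with the default settings prints both commands in order, which allows the sequence to be reviewed before anything is changed on the host.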
 
If the ESXi host fails to leave the cluster, make a note of the reason. Also, make note of any warnings or errors that are created in the /var/log/vmkernel.log file when this operation is attempted. Contact VMware Global Support Services if the issue persists. For more information, see How to file a Support Request in Customer Connect (2006985).
 
If the lists of hosts in the vSphere cluster and the vSAN cluster are identical, restart the vsanmgmtd service on all hosts, and then retest vSAN health:
/etc/init.d/vsanmgmtd restart

Restarting this service is not expected to have any impact on the hosts.
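
When the cluster has many hosts, the restart can be scripted from a workstation. The sketch below makes several assumptions: SSH is enabled on every host, root login is permitted, and a hosts file (one host name per line) has been prepared by hand. It prints the commands by default and runs them only when DRY_RUN=0 is set. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: restart vsanmgmtd on every host listed in a file.
# Dry run by default; set DRY_RUN=0 to execute over SSH.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_vsanmgmtd_all() {
    hosts_file="$1"   # assumed format: one ESXi host name per line
    while IFS= read -r host; do
        [ -n "$host" ] || continue   # skip blank lines
        # ssh -n keeps ssh from consuming the rest of the hosts file.
        run ssh -n "root@$host" /etc/init.d/vsanmgmtd restart
    done < "$hosts_file"
}
```

After running restart_vsanmgmtd_all hosts.txt with DRY_RUN=0, retest vSAN health from vCenter Server.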


Additional Information

For more information on collecting VMware vSAN Logs, see Collecting vSAN support logs and uploading to VMware (2072796).

Also, see:
vSAN Health Service - Cluster Health - vSAN Health Service up-to-date
vSAN Health Service - Cluster Health - Advanced vSAN configuration in sync
vSAN Health Service - Network Health - Hosts disconnected from vCenter Server
vSAN Health Service - Network Health - vSAN Cluster Partition
vSAN Health Service - Network Health – Hosts with vSAN disabled
vSAN Health Service - Network Health - All hosts have a vSAN vmknic configured
vSAN Health Service - Network Health - All hosts have matching subnets
vSAN Health Service - Network Health - All hosts have matching multicast settings
vSAN Health Service - Network Health - Hosts small ping test (connectivity check) and Hosts large ping test (MTU check)
vSAN Health Service - Network Health - Hosts with connectivity issues
vSAN Health Service - Network Health – Multicast assessment based on other checks
vSAN Health Service - Data Health – vSAN Object Health
vSAN Health Service - Physical Disk Health - Metadata Health
vSAN Health Service - Physical Disk Health - Overall Disk Health
vSAN Health Service - Limits Health – Current Cluster Situation
vSAN Health Service - Limits Health – After one additional host failure
vSAN Health Service - Physical Disk Health - Disk Capacity
vSAN Health Service – Physical Disk Health – Software State Health
vSAN Health Service – Physical Disk Health – Component Metadata Health
vSAN Health Service - Physical Disk Health – Congestion
vSAN Health Service - Physical Disk Health – Memory pools
vSAN Health Service - vSAN HCL Health - Controller Release Support
vSAN Health Service – vSAN HCL Health – Controller Driver
vSAN Health Service - vSAN HCL Health – vSAN HCL DB up-to-date
vSAN Health Service - vSAN HCL Health – SCSI Controller on vSAN HCL
vSAN Health Service - Cluster Health – CLOMD liveness check
vSAN Health Service - Cluster Health - vSAN Health service installation
vSAN Health Check Information
vSAN Health Service - Network Health - Active Multicast connectivity check
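vSAN Health Service - Network Health - Unexpected vSAN cluster members (Simplified Chinese version)
Virtual SAN Health Service - Network Health - Unexpected vSAN cluster members (Japanese version)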