vSAN Health Service - Limits Health – After one additional host failure

Article ID: 327052

Updated On:

Products

VMware vSAN

Issue/Introduction

This article explains the Limits Health – After one additional host failure check in the vSAN Health Service and provides details on why it might report an error.

Environment

VMware vSAN 8.0.x
VMware vSAN 7.0.x
VMware vSAN 6.5.x
VMware vSAN 6.2.x
VMware vSAN 6.x
VMware vSAN 6.6.x
VMware vSAN 6.1.x
VMware vSAN 6.7.x
VMware vSAN 6.0.x

Resolution

Q: What does the Limits Health – After one additional host failure check do?

In addition to the basic limits health check, there is also a simulation of what cluster resources would look like after an ESXi host failure. If a single ESXi host fails, two things happen. First, the resources on that ESXi host (such as cache and capacity) are no longer available. Second, vSAN attempts to re-protect (rebuild) all components belonging to objects that are running with reduced redundancy because of the failure.

This health check simulates both of the actions described above. It assumes that the ESXi host with the most resources consumed fails, then calculates how many resources would be used on the remaining hosts in the cluster and how many would still be available.
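The aggregate calculation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the health service's actual implementation: it considers capacity alone, and the Host type, its field names, and the simulate_one_host_failure function are hypothetical names introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical record for illustration; not a vSAN API type.
    name: str
    capacity_gb: float  # total capacity the host contributes
    used_gb: float      # capacity currently consumed on the host


def simulate_one_host_failure(hosts):
    """Return (failed_host_name, projected_used_pct) for the remaining hosts.

    Assumption: the host with the most consumed capacity fails and all of
    its data is rebuilt on the surviving hosts, so the total used capacity
    is unchanged while the total available capacity shrinks.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to simulate a failure")

    failed = max(hosts, key=lambda h: h.used_gb)
    survivors = [h for h in hosts if h is not failed]

    remaining_capacity = sum(h.capacity_gb for h in survivors)
    if remaining_capacity <= 0:
        raise ValueError("surviving hosts contribute no capacity")

    # Aggregate view only: placement rules and fault domains are ignored.
    projected_used = sum(h.used_gb for h in hosts)
    return failed.name, 100.0 * projected_used / remaining_capacity


cluster = [
    Host("esx1", 1000, 600),
    Host("esx2", 1000, 500),
    Host("esx3", 1000, 500),
    Host("esx4", 1000, 400),
]
failed_name, used_pct = simulate_one_host_failure(cluster)
print(f"{failed_name} fails, projected utilization {used_pct:.1f}%")
```

In this example the cluster is 50% full before the failure. Losing esx1 removes a quarter of the capacity while the same 2000 GB must still be stored, so projected utilization rises to roughly 66.7%. A result above 100% would correspond to the error state of this check.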

Note: If there is already a failure in the cluster, this test will report on one additional failure. Therefore, this test reports on the results of the current failure and the additional failure that it introduces.

In vSphere 6.7 Update 3 and later releases, the name of this health check is changed to "Capacity Utilization".

Q: What does it mean when it is in an error state?

If this check reports that more than 100% of resources would be used after a host failure, re-protection would fail for some objects because not enough resources are available.

Note: This health check simulation is very simple. It only looks at cluster aggregate resources, so just like the basic limits check, it does not consider the distribution and placement rules.

However, this simple simulation does verify that a vSAN cluster has been configured with enough resources to operate safely after a failure and the subsequent re-protection. This test does not check for balance or fault domains, so these need to be considered independently of this test.

For example, a user may enforce an operational business policy to have no less than 25% free disk space under normal conditions and no less than 15% free disk space after one failure. This check can be used to implement such a policy and to verify that this is indeed the case.
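A policy like the one in this example could be checked with a short script such as the following. This is a sketch under stated assumptions: the per-host capacity and usage figures are supplied by hand, the check_free_space_policy function and its parameters are hypothetical, and the 25% and 15% defaults simply mirror the example policy above.

```python
def check_free_space_policy(capacities_gb, used_gb,
                            min_free_normal_pct=25.0,
                            min_free_after_failure_pct=15.0):
    """Evaluate a free-space policy now and after one simulated host failure.

    capacities_gb and used_gb are per-host lists in the same order.
    Assumption: the host with the most used capacity is the one that fails,
    and its data is rebuilt on the remaining hosts.
    """
    if len(capacities_gb) != len(used_gb) or len(capacities_gb) < 2:
        raise ValueError("need matching per-host lists for at least two hosts")

    total_capacity = sum(capacities_gb)
    total_used = sum(used_gb)
    free_now_pct = 100.0 * (1.0 - total_used / total_capacity)

    # Index of the host with the most capacity consumed.
    failed = max(range(len(used_gb)), key=lambda i: used_gb[i])
    remaining_capacity = total_capacity - capacities_gb[failed]
    free_after_pct = 100.0 * (1.0 - total_used / remaining_capacity)

    return {
        "free_now_pct": free_now_pct,
        "free_after_failure_pct": free_after_pct,
        "normal_ok": free_now_pct >= min_free_normal_pct,
        "after_failure_ok": free_after_pct >= min_free_after_failure_pct,
    }


result = check_free_space_policy([1000, 1000, 1000, 1000],
                                 [800, 700, 700, 700])
print(result)
```

With these sample figures the cluster meets the normal-conditions target (27.5% free) but would have only about 3.3% free after one host failure, so the second part of the policy is not met.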

Q: How does one troubleshoot and fix the error state?

There is no troubleshooting involved in this health check. It is primarily informational. If this health check fails, consider adding resources to the cluster so that a rebuild can succeed after a failure. If you believe the cluster should already have enough capacity to rebuild after a failure, check whether any components, such as disk drives, are in a failed state.


Additional Information

For more information on collecting VMware vSAN logs, see Collecting vSAN support logs and uploading to VMware (2072796).

Also, see:
vSAN Health Service - Cluster Health - vSAN Health Service up-to-date
vSAN Health Service - Cluster Health - Advanced vSAN configuration in sync
vSAN Health Service - Network Health - Hosts disconnected from vCenter Server
vSAN Health Service - Network Health - Unexpected vSAN cluster members
vSAN Health Service - Network Health - vSAN Cluster Partition
vSAN Health Service - Network Health – Hosts with vSAN disabled
vSAN Health Service - Network Health - All hosts have a vSAN vmknic configured
vSAN Health Service - Network Health - All hosts have matching subnets
vSAN Health Service - Network Health - All hosts have matching multicast settings
vSAN Health Service - Network Health - Hosts small ping test (connectivity check) and Hosts large ping test (MTU check)
vSAN Health Service - Network Health - Hosts with connectivity issues
vSAN Health Service - Network Health – Multicast assessment based on other checks
vSAN Health Service - Data Health – vSAN Object Health
vSAN Health Service - Physical Disk Health - Metadata Health
vSAN Health Service - Physical Disk Health - Overall Disk Health
vSAN Health Service - Limits Health – Current Cluster Situation
vSAN Health Service - Physical Disk Health - Disk Capacity
vSAN Health Service – Physical Disk Health – Software State Health
vSAN Health Service – Physical Disk Health – Component Metadata Health
vSAN Health Service - Physical Disk Health – Congestion
vSAN Health Service - Physical Disk Health – Memory pools
vSAN Health Service - vSAN HCL Health - Controller Release Support
vSAN Health Service – vSAN HCL Health – Controller Driver
vSAN Health Service - vSAN HCL Health – vSAN HCL DB up-to-date
vSAN Health Service - vSAN HCL Health – SCSI Controller on vSAN HCL
vSAN Health Service - Cluster Health – CLOMD liveness check
vSAN Health Service - Cluster Health - vSAN Health service installation
vSAN Health Check Information
vSAN Health Service - Network Health - Active Multicast connectivity check
vSAN Health Service - Limits Health – After one additional host failure (Simplified Chinese)
vSAN Health Service - Limits Health – After a host failure (Japanese)