Verifying your hardware is functioning correctly

Article ID: 306441

Products

VMware vSphere ESXi

Issue/Introduction

This document helps to ensure that your hardware is functioning as expected.
 
Faulty hardware can cause fresh installs, upgrades, or ESX/ESXi hosts to fail.


Symptoms:
  • Unable to install on certified hardware
  • Unable to upgrade on certified hardware
  • Receiving MCE errors in your log files
  • Failed to load vmkernel: 0xbad0013
  • ESX/ESXi host failed but is now back up
  • ESX/ESXi host fails repeatedly
  • ESX/ESXi host stops responding


Environment

VMware vSphere ESXi 5.0
VMware ESX Server 2.1.x
VMware ESX Server 3.0.x
VMware vSphere ESXi 6.0
VMware ESXi 4.1.x Embedded
VMware vSphere ESXi 6.7
VMware vSphere ESXi 5.5
VMware vSphere 7.0.x
VMware ESX 4.1.x
VMware ESXi 4.1.x Installable
VMware ESXi 3.5.x Installable
VMware vSphere ESXi 7.0.0
VMware ESX Server 3.5.x
VMware ESXi 4.0.x Installable
VMware ESXi 4.0.x Embedded
VMware ESX 4.0.x
VMware vSphere ESXi 6.5
VMware vSphere ESXi 5.1
VMware ESX Server 2.5.x
VMware ESXi 3.5.x Embedded
VMware ESX Server 2.0.x

Resolution

Run Hardware Diagnostic tests

Most servers ship with a hardware diagnostics CD, although some hardware vendors instead install a hidden utility partition on the hard drive.
 
Note: If you are not experienced with computers or have any concerns, please contact your hardware vendor.
 
You can diagnose hardware related problems on your server by booting from the diagnostic CD or choosing Diagnostics from the boot device list.
 
These diagnostic tools allow you to:
  • Check the hardware configuration and verify that it is functioning correctly.
  • Test individual hardware components.
  • Diagnose hardware-related problems.
  • Obtain a complete hardware configuration.
When testing, if a component failure is detected, make note of any error code(s) and contact the hardware vendor.

Note: A diagnostic can detect a hardware fault only if the fault occurs during the test, so run it for an extended period to catch intermittent faults.
 

Check your memory

Note: This process requires downtime on your ESX/ESXi host for up to 48 hours. In most cases, the vendor diagnostic utility mentioned above is sufficient for testing your hardware. VMware does not endorse or recommend any particular third-party utility. However, third-party options are available to test your memory.

To test your memory:

  1. Download memtest86+ from http://www.memtest.org/ .
  2. Extract the ISO image from the .gz or .zip archive.
  3. Burn the image to CD.
  4. Boot your ESX/ESXi host from the CD.
  5. Memtest86+ goes through each memory bank and checks for errors. Run the tool for several hours, at least until it starts pass 2, to ensure that the full suite of tests has been executed.

    Note: If memtest86+ does not run on your hardware, contact your vendor for their memory test utility.
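
Step 2 above can also be scripted. The following is a minimal Python sketch, assuming the download is a .gz archive that wraps a single ISO file or a .zip archive that contains one or more .iso files. The function name extract_iso and the file names in the usage note are hypothetical and are not part of memtest86+ or any VMware tool.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def extract_iso(archive, dest_dir):
    """Extract ISO image(s) from a .gz or .zip archive and return their paths."""
    archive = Path(archive)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    if archive.suffix == ".gz":
        # Assumption: "image.iso.gz" decompresses to a single file, "image.iso".
        target = dest_dir / archive.stem
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return [target]

    if archive.suffix == ".zip":
        with zipfile.ZipFile(archive) as zf:
            # Extract only the ISO members and skip readme files and the like.
            names = [n for n in zf.namelist() if n.lower().endswith(".iso")]
            zf.extractall(dest_dir, members=names)
        return [dest_dir / n for n in names]

    raise ValueError(f"unsupported archive type: {archive.name}")
```

For example, extract_iso("memtest.iso.gz", "iso_out") would write iso_out/memtest.iso, which can then be burned to CD as described in step 3. The archive name here is a placeholder; use the name of the file you actually downloaded.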
 
Ensure your server configuration conforms to Non-Uniform Memory Access (NUMA) specifications
 
Notes:
  • If you are not experienced with computers or have any concerns, please contact your hardware vendor.
  • Problems related to NUMA usually occur following a RAM upgrade or after an ESX/ESXi Server host installation.

You might see an error such as the following:

The BIOS reports that NUMA node 1 has no memory. This problem is either caused by a bad BIOS or a very unbalanced distribution of memory modules.

NUMA is a system where each processor has separate memory. The separate memory helps to avoid a performance hit when several processors attempt to address the same memory.
 
The main requirement is that a similar amount of memory is installed beside each processor. If the amount of memory installed beside each processor is not similar, it is unbalanced and you might experience performance problems. For more information, see ESX Server Memory Management on Systems with AMD Opteron Processors (1570).

More information on NUMA is also available in the Resource Management Guide.
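
The balance requirement described above can be illustrated with a short Python sketch. It takes the amount of memory installed beside each processor (one value per NUMA node) and reports whether the distribution looks balanced. The function name and the 25% tolerance are assumptions chosen for illustration only. VMware does not publish this threshold, and the sketch does not query a real host.

```python
def numa_memory_is_balanced(node_memory_mib, tolerance=0.25):
    """Return True if every NUMA node has memory and the node sizes are similar.

    node_memory_mib: list of memory sizes in MiB, one entry per NUMA node.
    tolerance: hypothetical limit on the relative gap between the largest
               and smallest node (0.25 means the gap may be at most 25%).
    """
    if not node_memory_mib:
        raise ValueError("at least one NUMA node is required")

    largest = max(node_memory_mib)
    smallest = min(node_memory_mib)

    # A node with no memory matches the "NUMA node 1 has no memory" error.
    if smallest <= 0:
        return False

    return (largest - smallest) / largest <= tolerance


print(numa_memory_is_balanced([8192, 8192]))   # True: identical nodes
print(numa_memory_is_balanced([16384, 0]))     # False: node 1 has no memory
print(numa_memory_is_balanced([16384, 4096]))  # False: heavily unbalanced
```

If a check like this fails after a RAM upgrade, redistribute the memory modules so that each processor has a similar amount, following your hardware vendor's population guidelines.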
 

Run the VMware CPU Identification Utility

Note: The VMware CPU Identification Utility is available only for ESX 2 and 3 and for ESXi 4.x and 5.x.
 
To ensure that your CPU(s) are being detected as expected you can use the VMware CPU Identification Utility. You can download the utility at VMware Shared Utilities. This tool helps you ensure that the ESX host is detecting and reporting your CPU(s) correctly.
 
After you download the VMware CPU Identification Utility, you can use the cpuid.iso image to create a bootable CD that aids in processor and feature identification. The tool displays Family/Model/Stepping information for the detected CPUs, and hexadecimal values for the CPU registers that identify specific CPU features. The tool then interprets the hexadecimal register values to indicate whether the CPUs support features such as 64-bit, SSE3, and NX/XD.
 
The following is sample output:
Reporting CPUID for 2 logical CPUs...
All CPUs are identical
 
Family: 0f Model: 04 Stepping: 1
 
ID1ECX ID1EDX ID81ECX ID81EDX
0x0000641d 0xbfebfbff 0000000000 0x20100000


Vendor : Intel
Processor Cores : 1
Brand String : " Intel(R) Xeon(TM) CPU 2.80GHz"
SSE Support : SSE1, SSE2, SSE3
Supports NX / XD : Yes
Supports CMPXCHG16B : Yes
Hyperthreading : Yes
Supports 64-bit Longmode : Yes
Supports 64-bit VMware : No
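
To show how the hexadecimal register values relate to the feature lines in the sample output, the following Python sketch decodes them by testing individual bits. It assumes that ID1ECX and ID1EDX are the ECX and EDX values of CPUID leaf 1 and that ID81EDX is the EDX value of CPUID leaf 0x80000001. The bit positions follow the standard x86 CPUID definitions. The function and table names are hypothetical, and this is not the utility's own code. The utility may apply additional checks (for example, for the Hyperthreading and 64-bit VMware lines) that are not reproduced here.

```python
# Feature name -> bit position, per the x86 CPUID definitions.
ID1_ECX_BITS = {"SSE3": 0, "CMPXCHG16B": 13}
ID1_EDX_BITS = {"SSE1": 25, "SSE2": 26, "Hyperthreading": 28}
ID81_EDX_BITS = {"NX / XD": 20, "64-bit Longmode": 29}


def decode_features(id1ecx, id1edx, id81edx):
    """Map each feature name to True or False based on its register bit."""
    features = {}
    for register, bits in (
        (id1ecx, ID1_ECX_BITS),
        (id1edx, ID1_EDX_BITS),
        (id81edx, ID81_EDX_BITS),
    ):
        for name, bit in bits.items():
            features[name] = bool((register >> bit) & 1)
    return features


# Register values taken from the sample output above.
features = decode_features(0x0000641D, 0xBFEBFBFF, 0x20100000)
for name, present in features.items():
    print(f"{name}: {'Yes' if present else 'No'}")
```

With the sample values, every feature in these tables decodes as present, which agrees with the Yes entries for SSE, NX / XD, CMPXCHG16B, Hyperthreading, and 64-bit Longmode in the sample output.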
 

If you are running a Fujitsu-Siemens BX630 Server System

Configure the BX630 system so that the CPU and memory counts are appropriate for a successful boot. On a two-CPU system with only one memory pair, disable CPU2 in the BIOS system setup. For instructions on accessing the BIOS, refer to the hardware manufacturer's documentation. This ensures that the VMkernel detects only one Opteron processor and one memory pair, which is the proper configuration. The system now boots successfully. After the ESX Server boots with CPU2 disabled, the NIC in the riser card is not visible in the MUI, has no lspci entries, and does not issue any PCI BIOS messages at boot. Turn on CPU2 after boot, and the NIC reappears.


Additional Information

Minimum system requirements for installing ESXi/ESX
A new installation of ESX does not boot on Fujitsu-Siemens BX630 Servers
Decoding Machine Check Exception (MCE) output after a purple screen error
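Verifying your hardware is functioning correctly (Japanese version)
Verifying your hardware is functioning correctly (Chinese version)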