Inaccurate Monitoring and Health Status seen in VI Client, vSphere Client, VirtualCenter, or vCenter Server (1010716)
What sort of sensors should I expect to see on my host? How do I know if anything is missing?
VMware added support for a Common Information Model (CIM) infrastructure that allows access to Asset (Inventory) information and Health Status information. Some server vendors started developing CIM providers for ESX/ESXi 3.5, ESX/ESXi 4.0/4.1, and ESXi 5.0/5.1.
Some information available from the hardware vendors may be incorrectly or incompletely displayed in the VMware Infrastructure (VI) Client, vSphere Client, VirtualCenter, or vCenter Server.
Due to the limitations of the IPMI standard, some vendors have implemented OEM-specific extensions to cover all hardware asset and status information available through their hardware sensors. Currently, ESX/ESXi does not support these extensions.
Beginning in ESXi 3.5, VMware has included support for this hardware information:
- Asset (Inventory) information
Asset information display errors are typically due to an older SMBIOS version, which results in processors, thread counts, or power supplies showing up incorrectly. SMBIOS version 2.5 is the minimum version required.
- Health Status monitoring information
Health status information display errors might occur because the vendor has not implemented all the necessary information or has implemented the CIM provider using an OEM-specific implementation. Some typical errors include:
- absence of memory node
- display of memory device temperature under temperature node
- display of Unknown status for the power node
- inaccurate reading – for example, 0 Watts, xyz Amps for power supplies
- absence of power supply failure alarm – in such an event, only a redundancy failure warning is displayed
- incorrect reading under Other node regarding system management module I2C err
Contact your hardware vendor if you encounter any of these unexpected health or asset information displays in your ESX/ESXi interface.
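The SMBIOS 2.5 minimum mentioned above can be checked with a simple numeric comparison. The following is a minimal sketch, assuming you have already obtained the SMBIOS version string from your vendor's tools or documentation. The function name smbios_meets_minimum is hypothetical and is not part of any VMware tool.

```python
def smbios_meets_minimum(version: str, minimum=(2, 5)) -> bool:
    """Return True if an SMBIOS version string such as "2.6" is at least `minimum`.

    The version is compared numerically as (major, minor), so "2.10" is
    correctly treated as newer than "2.5" (a plain string comparison would
    get this wrong).
    """
    fields = version.strip().split(".")[:2]
    parts = tuple(int(f) for f in fields)
    if len(parts) == 1:
        # Treat a bare major version such as "3" as "3.0".
        parts = (parts[0], 0)
    return parts >= tuple(minimum)


if __name__ == "__main__":
    for v in ("2.4", "2.5", "2.10", "3.0"):
        print(v, smbios_meets_minimum(v))
```

If the check fails, the asset display problems described above (incorrect processor, thread, or power supply information) are a likely consequence.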
The asset and health status support uses these underlying technology areas to retrieve the data:
- System management basic input/output system (SMBIOS)
- Intelligent platform management interface (IPMI)
- Local RAID cards
SMBIOS is an industry standard that defines a set of tables. Through these tables, the BIOS provides the operating system with basic asset information about what it detected during power on. This includes information about CPUs, cache, DIMMs, and PCI cards. This information is not used directly for tracking the health of the system, as the tables are only established once at boot and are not updated dynamically while the system is up and running.
IPMI provides a collection of sensors that monitor various aspects of the server. The exact set of sensors varies from server vendor to vendor. The IPMI specification defines a set of common sensor types. VMware supports these common sensor types natively. IPMI also allows server vendors to define their own vendor-specific sensors. VMware does not support these proprietary sensors directly. However, VMware works with its server partners to deliver add-on modules (CIM providers) that can expose these additional capabilities.
IPMI has two sets of sensors:
- Analog sensors – show an analog reading, for example, temperature in degrees C.
Typically analog sensors provide a set of thresholds which indicate healthy and unhealthy ranges. These readings are translated into health status in VI Client and vSphere Client.
- Discrete sensors – exist in one or more states, for example, a presence sensor.
Many discrete sensors map directly to health states. Some discrete sensors indicate the presence or absence of an entity, rather than a specific health state.
Server manufacturers can use analog or discrete sensors to instrument their systems. ESX/ESXi detects both types.
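The translation of an analog reading into a health status, as described above, can be illustrated with a short sketch. This is not VMware's implementation. The threshold field names follow the usual IPMI convention of non-critical and critical limits, and the status labels "normal", "warning", and "alert" are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    """Analog sensor thresholds; None means the sensor does not define that limit."""
    lower_critical: Optional[float] = None
    lower_noncritical: Optional[float] = None
    upper_noncritical: Optional[float] = None
    upper_critical: Optional[float] = None


def analog_health(reading: float, t: Thresholds) -> str:
    """Map a reading to an illustrative health label using its thresholds."""
    # Critical limits are checked first so they take priority over warnings.
    if t.lower_critical is not None and reading <= t.lower_critical:
        return "alert"
    if t.upper_critical is not None and reading >= t.upper_critical:
        return "alert"
    if t.lower_noncritical is not None and reading <= t.lower_noncritical:
        return "warning"
    if t.upper_noncritical is not None and reading >= t.upper_noncritical:
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Hypothetical CPU temperature sensor, in degrees C.
    cpu_temp = Thresholds(upper_noncritical=70.0, upper_critical=85.0)
    for value in (45.0, 72.0, 90.0):
        print(value, analog_health(value, cpu_temp))
```

A discrete presence sensor has no thresholds at all, which is why it cannot be mapped to a health state this way.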
Typical IPMI data includes:
- Temperature Sensors
- Voltage Sensors
- Battery Sensors – CMOS, and so forth
- Presence Sensors – not a health reading
- Power Supplies
- Current Power usage
- Overall memory health – includes detection of ECC error and is not reported per DIMM
- Chassis Intrusion
- Storage – Some servers include IPMI sensors for internal disk presence
Servers vary in behavior. Not all manufacturer-implemented behaviors are incorporated into the VI Client or vSphere Client health monitor. To understand all of your server’s possible behaviors:
- Review the data presented in the health monitor and cross reference it with the documentation provided by the server manufacturer before deploying a new type of server in production.
- Some systems support remote IPMI to configure the BIOS. If your system does, use third-party tools, such as ipmitool, to query the Baseboard Management Controller (BMC) for the known sensors and field replaceable units (FRUs) in your server. Take this information and verify that the health monitor accurately captures the expected sensors.
- Some newer systems support WS-Management for direct CIM access to the BMC. If your system does, use winrm or other WS-Management client tools to query the BMC to see what sensors are supported. Take this information and verify that the health monitor accurately captures the expected sensors.
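The verification step in the list above can be partly automated. The following is a minimal sketch, assuming you have saved a pipe-separated sensor listing from a tool such as ipmitool (sensor name in the first column) and have typed in the labels shown in the health monitor by hand. The sample listing, the sensor names, and both function names are made up for illustration.

```python
def parse_sensor_names(listing: str) -> set:
    """Extract sensor names from a pipe-separated listing (name is the first column)."""
    names = set()
    for line in listing.splitlines():
        if "|" not in line:
            continue  # skip blank lines and anything that is not a sensor row
        name = line.split("|", 1)[0].strip()
        if name:
            names.add(name)
    return names


def missing_from_monitor(bmc_names, monitor_labels) -> list:
    """Return BMC sensor names that do not appear inside any health monitor label.

    Matching is a case-insensitive substring test, because a client may
    prefix the sensor name with extra text. Adjust this to your server.
    """
    labels = [label.lower() for label in monitor_labels]
    return sorted(n for n in bmc_names if not any(n.lower() in l for l in labels))


# Made-up sample listing; real output varies by server and tool version.
SAMPLE = """\
CPU1 Temp        | 45.000     | degrees C  | ok
FAN1             | 5400.000   | RPM        | ok
PS1 Status       | 0x1        | discrete   | 0x0100
"""

if __name__ == "__main__":
    seen_in_client = ["System Board 1 CPU1 Temp", "Fan Device FAN1"]
    print(missing_from_monitor(parse_sensor_names(SAMPLE), seen_in_client))
```

Any name this reports is a sensor the BMC knows about that the health monitor does not show, which is a good starting point for a conversation with your hardware vendor.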
VMware natively supports LSI based MR and IR cards. Support includes these types of information:
- Card model
- Card firmware version – listed under Software Components
- Physical drives
- RAID volumes – includes RAID level, physical drives, current state, and so forth
VMware partners have developed CIM provider add-ons which support additional RAID cards.
Note: ESXi 5.0 no longer natively supports LSI-based MR and IR cards. If third-party CIM providers that implement the HHRC profile are installed, then the health information for the provider-supported cards is exposed.
Additional information is made available for these miscellaneous devices:
- Network Interface Card (NIC) MAC and firmware version
- ESX/ESXi firmware version and alternate boot bank version
Additional Information
For translated versions of this article, see:
- Español: Supervisión y estado de mantenimiento incorrectos en VI Client, vSphere Client, VirtualCenter o vCenter Server (2073134)
- 日本語: VI Client、vSphere Client、VirtualCenter、または vCenter Server で監視ステータスと健全性ステータスの表示が不正確である (2076985)
- 简体中文: VI Client、vSphere Client、VirtualCenter 或 vCenter Server 中显示的监控和运行状况状态不准确 (2077949)