ESXi host goes into not responding when doing IPMI related operations with log reporting IpmiIfcSdrReadRecordId: retry expired


Article ID: 317650


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

This article explains how to prevent an ESXi host from going into a not responding state when hostd crashes during IPMI-related operations.

Symptoms:
  • The ESXi host goes into a not responding state while IPMI-related operations are performed.
  • The hostd service may crash.
  • The hostd.log file contains entries similar to the following:

2019-07-17T16:30:09.223Z verbose hostd[A7C2B70] [Originator@6876 sub=PropertyProvider] RecordOp ASSIGN: summary.runtime, ha-root-pool. Sent notification immediately.
IpmiIfcSdrReadRecordId: data length mismatch req=19,resp=8
2019-07-17T16:30:11.554Z verbose hostd[A540B70] [Originator@6876 sub=PropertyProvider] RecordOp ASSIGN: guest.disk, 1. Sent notification immediately.
IpmiIfcSdrReadRecordId: retry expired.
IpmiIfcSdrReadRecordId: sensor not found, record id: 2219
IpmiIfcSdrReadRecordId: retry expired.

  • The vmkernel.log file contains entries similar to the following:

 2019-07-17T16:30:17.145Z cpu19:1166661)UserDump: 3024: hostd-worker: Dumping cartel 1166400 (from world 1166661) to file /var/core/hostd-worker-zdump.000 ...


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
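To check a host for the log signatures shown above, a minimal shell sketch might look like the following. The function names are hypothetical, and the log paths in the usage comments assume the default ESXi log locations; adjust them if your logs are stored elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print the number of "retry expired" IPMI SDR
# read failures found in the given hostd log file.
count_ipmi_sdr_errors() {
    grep -c 'IpmiIfcSdrReadRecordId: retry expired' "$1"
}

# Hypothetical helper: succeed (exit status 0) if the given vmkernel log
# records a hostd-worker core dump.
has_hostd_dump() {
    grep -q 'hostd-worker-zdump' "$1"
}

# Example usage on a host (paths assumed to be the defaults):
#   count_ipmi_sdr_errors /var/log/hostd.log
#   has_hostd_dump /var/log/vmkernel.log && echo "hostd-worker dump found"
```

A non-zero count together with a hostd-worker dump suggests the host may be hitting this issue, but it is not conclusive on its own.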
 


Cause

Every 90 seconds, ESXi retrieves sensor data from IPMI for the hardware health check. If another IPMI-specific operation runs at the same time, a race condition can occur.

Resolution

This issue is resolved in:

  • VMware vSphere ESXi 6.5 P06 ESXi650-202102001
  • VMware vSphere ESXi 6.7 P03 ESXi670-202008001
  • VMware vSphere ESXi 7.0.1 Update 1

To download, go to Customer Connect Downloads.


Workaround:
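To work around this issue, disable the CIM service in hostd by following these steps:

1. Stop hostd:
     /etc/init.d/hostd stop
2. Edit /etc/vmware/hostd/config.xml and locate the following section:
     <cimsvc>
        <path>libcimsvc.so</path>
        <enabled>true</enabled>
     </cimsvc>

   Change <enabled>true</enabled> to <enabled>false</enabled>.
3. Start hostd:
     /etc/init.d/hostd start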
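The edit in step 2 can also be scripted. The following is a minimal sketch, not an official procedure: the function name disable_cimsvc is hypothetical, and it assumes that the sed on the host supports in-place editing (-i) and that the <enabled> element appears on its own line inside the <cimsvc> block, as in the excerpt above. It keeps a backup copy of the file before changing it.

```shell
#!/bin/sh
# Hypothetical helper: set <enabled> to false inside the <cimsvc> block only.
# Other <enabled> elements elsewhere in the file are left untouched.
disable_cimsvc() {
    cfg="$1"
    # Keep a backup so the change can be reverted.
    cp "$cfg" "$cfg.bak" || return 1
    # Restrict the substitution to the lines between <cimsvc> and </cimsvc>.
    sed -i '/<cimsvc>/,/<\/cimsvc>/ s|<enabled>true</enabled>|<enabled>false</enabled>|' "$cfg"
}

# Example usage on a host, between stopping and starting hostd:
#   /etc/init.d/hostd stop
#   disable_cimsvc /etc/vmware/hostd/config.xml
#   /etc/init.d/hostd start
```

To revert the change, stop hostd, copy the .bak file back over config.xml, and start hostd again.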

Alternatively, avoid performing the following operations:
  • Using third-party tools to run IPMI-related operations such as fru get, fru list, sdr get, sdr list, sel clear, sel list, and sel get
  • Running scripts that call ESXCLI commands for IPMI-related operations, for example, esxcli hardware ipmi sel get





Additional Information

VMware Skyline Health Diagnostics for vSphere - FAQ

Impact/Risks:
1. Hardware health monitoring will stop working.
2. Sensor and SEL (System Event Log) data will not be available in the vSphere Client or the MOB (Managed Object Browser).
3. esxcli commands that retrieve sensor data will also not work.