ESXi/ESX host using Emulex 10Gb NIC cards fails with a purple diagnostic screen with the error: PCPU #: no heartbeat

Article ID: 310135

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • An ESXi/ESX host becomes unresponsive and displays a purple diagnostic screen indicating that a CPU did not receive a heartbeat.
  • The purple diagnostic screen is triggered by a non-maskable interrupt (NMI).
  • You see diagnostic information similar to:

    cpu7:xxxxxxx)@BlueScreen: PCPU 10: no heartbeat (2/2 IPIs received)
    cpu7:xxxxxxx)Code start: xxxxxxxxxxxxxx VMK uptime: 26:05:22:00.554
    cpu7:xxxxxxx)Saved backtrace from: pcpu 10 Heartbeat NMI
    cpu7:xxxxxxx)0x4122457dba78:[xxxxxxxxxxxxxx]Util_Udelay@vmkernel#nover+0x2d stack: 0x418000010000
    cpu7:xxxxxxx)0x4122457dbac8:[0x4180260e3e56]be_mpu_post_wrb_ring@<None>#<None>+0xed stack: 0x41001c401150
    cpu7:xxxxxxx)0x4122457dbb28:[0x4180260e0649]be_function_post_mcc_wrb@<None>#<None>+0x128 stack: 0x0
    cpu7:xxxxxxx)0x4122457dbb78:[xxxxxxxxxxxxxx]be_get_die_temperature@<None>#<None>+0xb7 stack: 0x41224b0e7000
    cpu7:xxxxxxx)0x4122457dbcc8:[xxxxxxxxxxxxxx]rate_timer_func@<None>#<None>+0x66b stack: 0x4122457dbd18
    cpu7:xxxxxxx)0x4122457dbd68:[xxxxxxxxxxxxxx]Timer_BHHandler@vmkernel#nover+0x226 stack: 0xfffc013000ffff
    cpu7:xxxxxxx)0x4122457dbde8:[xxxxxxxxxxxxxx]BH_Check@vmkernel#nover+0x98 stack: 0x2457dbe28
    cpu7:xxxxxxx)0x4122457dbed8:[xxxxxxxxxxxxxx]CpuSchedDispatch@vmkernel#nover+0x11ea stack: 0x0
    cpu7:xxxxxxx)0x4122457dbf48:[xxxxxxxxxxxxxx]CpuSchedWait@vmkernel#nover+0x242 stack: 0x410000000000
    cpu7:xxxxxxx)0x4122457dbf98:[xxxxxxxxxxxxxx]CpuSched_VcpuHalt@vmkernel#nover+0x14b stack: 0x418025bb8619
    cpu7:xxxxxxx)0x4122457dbfe8:[xxxxxxxxxxxxxx]VMMVMKCall_Call@vmkernel#nover+0x1af stack: 0x0


  • The ESXi/ESX host is using Emulex 10Gb NIC cards.
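The symptoms above can be checked mechanically against a saved copy of the purple screen text or a core dump log. The following is a minimal sketch, not a VMware tool: the function name `match_signature` and both regular expressions are illustrative assumptions, and the `be_` prefix is used only because the driver frames in the excerpt above share it.

```python
import re

# Heartbeat-loss banner as it appears in the excerpt above, for example
# "@BlueScreen: PCPU 10: no heartbeat (2/2 IPIs received)".
HEARTBEAT_RE = re.compile(r"PCPU\s+(\d+):\s+no heartbeat")

# Backtrace frames are printed as "]symbol@module". The be_ prefix is an
# assumption taken from the frames shown in this article.
EMULEX_FRAME_RE = re.compile(r"\](be_\w+)@")


def match_signature(log_text):
    """Return (pcpu_number, be_symbols) if the text matches, else None."""
    banner = HEARTBEAT_RE.search(log_text)
    if banner is None:
        return None
    symbols = EMULEX_FRAME_RE.findall(log_text)
    if not symbols:
        return None
    return int(banner.group(1)), symbols


# Three lines copied from the diagnostic output in this article.
SAMPLE = """\
cpu7:xxxxxxx)@BlueScreen: PCPU 10: no heartbeat (2/2 IPIs received)
cpu7:xxxxxxx)0x4122457dbb78:[xxxxxxxxxxxxxx]be_get_die_temperature@<None>#<None>+0xb7 stack: 0x41224b0e7000
cpu7:xxxxxxx)0x4122457dbcc8:[xxxxxxxxxxxxxx]rate_timer_func@<None>#<None>+0x66b stack: 0x4122457dbd18
"""

if __name__ == "__main__":
    print(match_signature(SAMPLE))
```

A match only suggests that the failure resembles the one described here. A heartbeat loss without `be_` frames in the backtrace likely has a different cause.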


Environment

VMware vSphere ESXi 5.1
VMware vSphere ESXi 5.0
VMware vSphere ESXi 5.5
VMware ESX 4.0.x
VMware ESX 4.1.x

Resolution

This is a known issue affecting Emulex 10Gb NICs with older firmware versions.

To resolve this issue, upgrade the 10Gb NIC firmware to version 4.6.166.6105 or later and use BE2NET driver version 4.6.166.9 or later. Both are available from the hardware vendor.
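To confirm whether a host already meets these minimums, compare the installed versions component by component rather than as strings. The sketch below assumes the driver and firmware versions are reported in the `key: value` form that `ethtool -i <vmnic>` typically prints, and that both version strings are purely numeric and dot-separated. The function names and the sample output are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '4.6.166.9' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum versions stated in this article.
MIN_FIRMWARE = parse_version("4.6.166.6105")
MIN_DRIVER = parse_version("4.6.166.9")


def parse_nic_info(output):
    """Parse 'key: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_nic(output):
    """Report whether the driver and firmware meet the stated minimums."""
    info = parse_nic_info(output)
    return {
        "driver_ok": parse_version(info["version"]) >= MIN_DRIVER,
        "firmware_ok": parse_version(info["firmware-version"]) >= MIN_FIRMWARE,
    }


# Hypothetical output; the version numbers are made up for illustration.
SAMPLE = """\
driver: be2net
version: 4.0.355.1
firmware-version: 4.1.334.0
bus-info: 0000:15:00.0
"""

if __name__ == "__main__":
    print(check_nic(SAMPLE))
```

Tuple comparison avoids the string-comparison trap in which "4.6.166.9" sorts after "4.6.166.6105". If the vendor reports a version with non-numeric suffixes, `parse_version` raises `ValueError` and the check must be done by hand.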

To work around this issue, perform these preemptive actions:

  1. Ensure that Spanning Tree Protocol is enabled on upstream switches. For more information, see STP may cause temporary loss of network connectivity when a failover or failback event occurs (1003804).

  2. For IBM servers, enable Management Channel Auto-Discovery (MCAD) in the BladeCenter AMM.

  3. Ensure that your physical switch firmware is up to date. For IBM blade servers, ensure that the Virtual Fabric Switch firmware is up to date.



Additional Information



STP may cause temporary loss of network connectivity when a failover or failback event occurs