Broadcom 5719/5720 NICs using tg3 driver become unresponsive and stop traffic in vSphere

Article ID: 344372

Products

VMware vSphere ESXi

Issue/Introduction

This article provides a resolution and a workaround for Broadcom 5719/5720 NICs that use the tg3 driver and become unresponsive and stop passing traffic in vSphere.

Symptoms:
  • When a system uses the tg3 driver with 1 Gb NICs, the /var/log/vmkernel or /var/log/messages log contains entries indicating that the NetQueue feature is enabled in the driver:

    T20:45:09.053Z cpu14:2091)<6>tg3 : vmnic3: RX NetQ allocated on 1
    T20:45:09.053Z cpu14:2091)<6>tg3 : vmnic3: NetQ set RX Filter: 1 [xx:xx:xx:xx:xx:xx 0]
    T20:45:44.054Z cpu7:2091)<6>tg3 : vmnic3: NetQ remove RX filter: 1
    T20:45:44.054Z cpu7:2091)<6>tg3 : vmnic3: Free NetQ RX Queue: 1


  • One or more NICs in the system stop functioning or responding, reporting partial or full loss of network connectivity to virtual machines or any other type of VMkernel networking (vMotion, management, NFS, iSCSI, etc).
  • The NICs impacted do not appear to be receiving CDP (Cisco Discovery Protocol) information from the upstream physical switch.
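
The log symptom above can be checked with a small filter. The following is a minimal sketch, not a VMware-provided tool: the function name tg3_netq_nics is hypothetical, and the log path passed to it is an assumption that depends on the release (the article names /var/log/vmkernel and /var/log/messages). It prints each vmnic for which the tg3 driver logged NetQueue activity.

```shell
# Hypothetical helper: list the vmnics for which tg3 logged NetQueue messages.
# Argument: path to a vmkernel or messages log file (path is an assumption).
tg3_netq_nics() {
    logfile="$1"
    # Match lines such as "tg3 : vmnic3: RX NetQ allocated on 1",
    # keep only the vmnic name, and print each name once.
    sed -n 's/.*tg3 *: \(vmnic[0-9][0-9]*\): .*NetQ.*/\1/p' "$logfile" | sort -u
}
```

For example, tg3_netq_nics /var/log/vmkernel would print vmnic3 for the sample log lines shown above. Empty output suggests that the driver in use is not logging NetQueue activity.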


Environment

VMware ESX 4.1.x
VMware vSphere ESXi 5.1
VMware vSphere ESXi 5.0
VMware ESX 4.0.x
VMware vSphere ESXi 6.0
VMware vSphere ESXi 5.5
VMware ESXi 4.1.x Embedded
VMware ESXi 4.1.x Installable

Cause

This issue occurs when Broadcom BCM5719 and BCM5720 NICs are used in the system.

Note: The aforementioned NIC models are the most commonly affected but this issue can occur with any NIC that uses the tg3 driver.

Resolution

This issue is resolved by updating the Broadcom driver:
  • For ESXi 5.5, download the latest Broadcom tg3 async driver, available at VMware Downloads.
  • For ESXi 5.0/5.1, this issue is resolved in Broadcom tg3 async driver version 3.129d.v50.1 and later, available at VMware Downloads.
  • For ESXi/ESX 4.x, this issue is resolved in Broadcom tg3 async driver version 3.129d.v40.1 and later, available at VMware Downloads.

For more information on updating network drivers, see the related articles listed at the end of this article.

Note: Confirm with your hardware vendor before upgrading to the driver versions mentioned in this article.

To work around this issue without upgrading the network driver, disable the NetQueue feature.

Note: NetQueue can only be disabled on third-party async versions of the tg3 driver. Inbox drivers are now included with ESXi 5.0 Update 2 and ESXi 5.1 and do not include the NetQueue feature. To see the various async and inbox driver versions for the Broadcom 5719/5720 adapter, see VMware Hardware Compatibility Guide.

NetQueue spreads the network load across multiple CPUs. Because a single CPU can handle approximately 3 Gb/s of network load, the feature provides no performance benefit for 1 Gb NICs.

Therefore, if there are no 10 Gb NICs on the host, you can disable NetQueue for the host by running these commands:
  • On ESXi 5.x, run this command:

    # esxcli system settings kernel set -s netNetqueueEnabled -v FALSE
    # reboot

  • On ESXi/ESX 4.x, run this command to verify the existing settings on the tg3 driver:

    # esxcfg-module -q | grep -E "^tg"

    or

    # esxcfg-advcfg -j netNetqueueEnabled
    netNetqueueEnabled = TRUE

If there are 10 Gb NICs on the host in addition to the tg3 NICs, then only disable NetQueue for the tg3 driver.
  • To disable NetQueue on ESXi/ESX for the tg3 driver, run this command:

    # esxcfg-module -s force_netq=0,0,0,0 tg3

  • To disable NetQueue for the host, run this command:

    # esxcfg-advcfg -k FALSE netNetqueueEnabled

  • To enable NetQueue for the host, run this command:

    # esxcfg-advcfg -k TRUE netNetqueueEnabled

Note: The number of zeroes (0) in the force_netq parameter array must equal the number of tg3 devices on your system. For example, the esxcfg-module command above (force_netq=0,0,0,0) applies to a host with 4 tg3 NICs. To count the tg3 NICs, run the esxcfg-nics --list command.
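
Counting the tg3 NICs and typing the matching number of zeroes by hand is error-prone, so the option string can be generated. The following is a minimal sketch: the function name build_force_netq is hypothetical, and it assumes that the driver name is the third whitespace-separated column of the esxcfg-nics --list output. Verify that assumption on your host before relying on it.

```shell
# Hypothetical helper: read `esxcfg-nics --list` output on stdin and print
# a force_netq option string with one zero per NIC bound to the tg3 driver.
build_force_netq() {
    # Assumption: column 3 of each NIC row holds the driver name.
    awk '$3 == "tg3" { n++ }
         END {
             if (n == 0) exit 1          # no tg3 NICs found: print nothing
             s = "force_netq=0"
             for (i = 1; i < n; i++) s = s ",0"
             print s
         }'
}
```

A possible use is to review the generated string first (esxcfg-nics --list | build_force_netq) and then pass it to the esxcfg-module -s command in place of the hand-typed value.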

To revert the change or to enable NetQueue for the tg3 driver, run this command:

# esxcfg-module -s "" tg3

After the changes are complete, reboot the host.


Additional Information

You can verify this issue by unloading and reloading the tg3 driver by running these commands:
  • To unload the driver:

    # vmkload_mod -u tg3

  • To reload the driver:

    # vmkload_mod tg3

To check if the setting is configured:
  1. View the contents of the esx.conf file by running this command:

    # cat /etc/vmware/esx.conf

  2. At the end of this file, ensure that you see an entry similar to:

    /vmkernel/module/tg3/options = "force_netq=0,0,0,0"
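
Because esx.conf can be long, filtering for the tg3 entry is quicker than reading the whole file. The following is a minimal sketch: the function name tg3_module_options is hypothetical, and the default path is the one given in this article.

```shell
# Hypothetical helper: print the tg3 module options line from esx.conf.
# Optional argument: path to the file (defaults to /etc/vmware/esx.conf).
tg3_module_options() {
    conf="${1:-/etc/vmware/esx.conf}"
    # grep exits non-zero when no tg3 options entry is present.
    grep '^/vmkernel/module/tg3/options' "$conf"
}
```

No output and a non-zero exit status indicate that no tg3 module options are recorded in the file.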

To verify the current NetQueue status after it is disabled, run this command:

# esxcli system settings kernel list | grep -i netqueue
netNetqueueEnabled Bool Enable/Disable NetQueue support. FALSE FALSE TRUE

Where the last three columns of the output are:

Configured = FALSE
Runtime = FALSE
Default = TRUE

FALSE in the Configured and Runtime columns indicates that NetQueue is disabled.
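
The three trailing values are easier to read when labeled. The following is a minimal sketch: the function name netqueue_state is hypothetical, and it assumes that the last three fields of the netNetqueueEnabled row are the configured, runtime, and default values, in that order.

```shell
# Hypothetical helper: read `esxcli system settings kernel list` output on
# stdin and label the last three fields of the netNetqueueEnabled row.
netqueue_state() {
    # Assumption: the trailing fields are Configured, Runtime, Default.
    awk '$1 == "netNetqueueEnabled" {
        printf "configured=%s runtime=%s default=%s\n", $(NF-2), $(NF-1), $NF
    }'
}
```

For the sample output above, esxcli system settings kernel list | netqueue_state would print configured=FALSE runtime=FALSE default=TRUE.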


You can also use the vSphere Client to make the configuration change:
  1. Click the host in vCenter Server.
  2. Click Configuration.
  3. Under Software, click Advanced Settings.
  4. Expand VMkernel in the list and click Boot.
  5. Scroll down to the setting named VMkernel.Boot.netNetqueueEnabled and deselect it to disable.
  6. Reboot the host.

In ESXi 5.0 and later versions, if you can identify the specific NIC that is malfunctioning, you can resolve the issue by forcing the link state down and then setting it back up at the OS level. Run these commands (to reset vmnic1 in this example):

localcli network nic down -n vmnic1
localcli network nic up -n vmnic1

This has an advantage over unloading and re-loading the driver because this only affects a specific NIC at a time and not all NICs using the tg3 driver.
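
The two localcli commands can be wrapped so that the NIC name is checked and the commands can be previewed before they run. The following is a minimal sketch: the function name bounce_nic, the DRY_RUN variable, and the 5-second pause between down and up are all assumptions for illustration and do not come from VMware documentation.

```shell
# Hypothetical helper: bring one NIC down and back up with localcli.
# By default it only prints the commands; set DRY_RUN=0 to execute them.
bounce_nic() {
    nic="$1"
    pause="${2:-5}"   # seconds to wait between down and up (assumed value)
    # Refuse anything that does not look like a vmnic name.
    case "$nic" in
        vmnic[0-9]*) ;;
        *) echo "usage: bounce_nic vmnicN [pause_seconds]" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "localcli network nic down -n $nic"
        echo "localcli network nic up -n $nic"
        return 0
    fi
    localcli network nic down -n "$nic" && sleep "$pause" && localcli network nic up -n "$nic"
}
```

Running bounce_nic vmnic1 prints the two commands for review. Running DRY_RUN=0 bounce_nic vmnic1 executes them. Do not run this from a session that depends on the NIC being reset.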

Note: For 10 Gb network adapters connecting to a 1 Gb switch, leave the switch interface set to auto-negotiate. Contact your switch vendor to verify that your switch supports 10 Gb network interfaces.

Determining Network/Storage firmware and driver version in ESXi 4.x and later
Installing async drivers on ESXi 4.x and ESX 4.x
How to download and install async drivers in ESXi 5.x/6.x