Network I/O performance regression after migrating from bnx2x driver to qfle3 driver

Article ID: 318438

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
After you upgrade from VMware ESXi 6.5 to 6.7, you experience the following symptoms:
  • The QLogic E3 NIC driver migrates from bnx2x to qfle3.
  • Network performance regresses, that is, CPU utilization is higher and, in some cases, throughput is lower.
OR

You see the following PSOD backtrace on the ESXi 6.7 host:

0x451a8f99bc20:[0x4180291533dd]FastSlab_AllocWithTimeout@vmkernel#nover+0x41 stack: 0x451a8f99bd68, 0x451a8f99bcb8, 0x451a8f99bd60, 0x430eba573560, 0x430eba6afda0
0x451a8f99bc70:[0x41802903c445]vmk_PageSlabAlloc@vmkernel#nover+0x22 stack: 0x451a00002400, 0x418000000000, 0x451a8f99bd00, 0x2400, 0x430eba573560
0x451a8f99bc80:[0x4180291cdfda]PktPageAlloc_AllocPages@vmkernel#nover+0x37 stack: 0x451a8f99bd00, 0x2400, 0x430eba573560, 0x430eba6affa0, 0x459aea403a30
0x451a8f99bd00:[0x4180292589cb]vmk_PktAllocPage@vmkernel#nover+0x10 stack: 0x430eba584c08, 0x4180299db709, 0xc04eac000, 0x2400, 0x430eba573560
0x451a8f99bd10:[0x4180299dbf51]qfle3_page_alloc_and_map@(qfle3)#<None>+0x22 stack: 0xc04eac000, 0x2400, 0x430eba573560, 0x26b, 0x430eba573560
0x451a8f99bd60:[0x4180299f05a5]qfle3_alloc_rx_sge_mbuf@(qfle3)#<None>+0x2e stack: 0x3, 0x1000, 0x0, 0x25bc, 0x430eba57e020
0x451a8f99bda0:[0x4180299f2164]qfle3_rxeof@(qfle3)#<None>+0x13d5 stack: 0x451a8f99be80, 0x418029226fbc, 0x1, 0x418029226ad3,
0x451a8f99bf00:[0x4180299f7d1e]qfle3_netpoll_rxcb@(qfle3)#<None>+0x21b stack: 0x8000000000001054, 0x4300d0db2680,


Environment

VMware vSphere ESXi 6.7
VMware vSphere ESXi 6.5

Cause

This issue occurs because the two drivers use different data path mechanisms: bnx2x supports queue-pairing mode, while qfle3 does not. As a result, qfle3 consumes more MSI-X vectors in the data path, which in turn causes higher system overhead. Without queue-pairing mode, network load balancing is also unable to consolidate traffic in NetQueue.

Resolution

To resolve this issue, upgrade the qfle3 driver to version 1.0.84.0 or later.

ESXi 6.5: https://customerconnect.vmware.com/web/vmware/details?downloadGroup=DT-ESXI65-QLOGIC-QFLE3-1.0.88.0&productId=614
ESXi 6.7: https://customerconnect.vmware.com/web/vmware/details?downloadGroup=DT-ESXI67-QLOGIC-QFLE3F-10680&productId=742
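
Before scheduling the upgrade, it can help to check whether a host already runs a fixed driver. The following is a minimal sketch of such a check, not an official tool: version_ge and qfle3_is_fixed are hypothetical helper names, and the sketch assumes that the installed version string has dotted numeric fields, optionally followed by a hyphen and a build suffix.

```sh
#!/bin/sh
# Minimum fixed qfle3 version named in this article.
MIN_QFLE3_VERSION="1.0.84.0"

# version_ge A B: succeed (exit 0) when dotted version A >= B.
# Hypothetical helper, not part of esxcli. Fields are compared
# numerically, left to right; missing fields count as 0.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}; fb=${b%%.*}
        # Drop the leading field (and its dot) from each string.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        fa=${fa:-0}; fb=${fb:-0}
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
    done
    return 0
}

# qfle3_is_fixed VERSION: strip an assumed "-build" suffix, then
# compare against the minimum fixed version.
qfle3_is_fixed() {
    version_ge "${1%%-*}" "$MIN_QFLE3_VERSION"
}
```

On a host, the installed version can usually be read from the Version column of `esxcli software vib list | grep qfle3` and passed to qfle3_is_fixed; a non-zero exit status then suggests that the driver predates 1.0.84.0.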

To work around this issue on VMware ESXi 6.x:

1. Disable the qfle3 driver and enable the bnx2x driver by running the following commands:

esxcli system module set --enabled=true --module=bnx2x
esxcli system module set --enabled=false --module=qfle3


2. Reboot the ESXi host.

Note: The VMware ESXi 6.7 image ships with both the bnx2x and qfle3 drivers.
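
After the reboot, you may want to confirm which driver each NIC is actually bound to. The sketch below filters the output of esxcli network nic list by driver name. nics_using_driver is a hypothetical helper, and it assumes the usual table layout of that command: two header lines, with the NIC name in the first column and the driver name in the third.

```sh
#!/bin/sh
# nics_using_driver DRIVER: read `esxcli network nic list` output on
# stdin and print the name of every NIC whose Driver column is DRIVER.
# Assumes two header lines and the driver name in the third column.
nics_using_driver() {
    awk -v drv="$1" 'NR > 2 && $3 == drv { print $1 }'
}

# Example usage on the host (not executed here):
#   esxcli network nic list | nics_using_driver bnx2x
#   esxcli network nic list | nics_using_driver qfle3
```

If the workaround took effect, the affected QLogic NICs should be listed for bnx2x and none should be listed for qfle3.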