Poor network performance or high network latency on Windows virtual machines

Article ID: 310350


Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
A virtual machine running a Windows 2003/2008/7 guest operating system may experience these symptoms if it has one virtual CPU and is under high CPU load, or if it has two or more virtual CPUs (regardless of load):
  • Poor network performance and/or high ping response times:
     
    • When receiving network traffic (regardless of the amount of data and type)
    • While under high CPU load, or sharing CPU resources with highly-utilized virtual machines
       
  • Observed throughput may decrease to 512 kB/s on Gigabit Ethernet. Timeouts and connectivity disruptions may also be observed.
  • Ping replies may take up to 20 seconds.
  • Sensitive services like database servers may perform poorly or time out.
  • The number of virtual and physical network cards has no effect on this issue.
  • This issue occurs with different virtual network adapter types (E1000, VMXNET2 and VMXNET3).
  • Measured performance results (generated with tools like iperf) may worsen when adding more virtual CPUs to the virtual machine.


Environment

VMware vSphere ESXi 5.5
VMware ESXi 3.5.x Installable
VMware ESXi 3.5.x Embedded
VMware vSphere ESXi 6.0
VMware ESXi 4.1.x Installable
VMware vSphere ESXi 5.0
VMware ESX 4.1.x
VMware ESX 4.0.x
VMware ESXi 4.1.x Embedded
VMware ESXi 4.0.x Embedded
VMware vSphere ESXi 6.5
VMware ESXi 4.0.x Installable
VMware vSphere ESXi 6.7
VMware ESX Server 3.5.x
VMware vSphere ESXi 5.1

Cause

There are three possible causes for this issue:
  • Power plan

    On Windows 2008 and 2008 R2, the power plan is set to Balanced by default. Microsoft has observed and confirmed that changing the power plan from Balanced to High Performance may increase overall performance. For more information, see General Guidelines for Improving Operating System Performance. Aggressive power saving plans can adversely affect performance, especially with latency-sensitive applications like web and database servers.
     
  • High CPU ready time (also referred to as "%RDY" or "%RDY time")

    Note: This information is simplified and its only purpose is to illustrate the cause of the described issue. It should not be referenced outside of the context of this document. Although the example we use here is sufficient to describe the cause of the stated issue, it does not claim to be technically correct in every detail due to the complexity of the CPU scheduling process for virtual machines.

    This example is based on these assumptions:
     
    • An ESXi 5.0 host with a hyper-threading enabled quad-core CPU, resulting in 4 physical and 8 logical CPUs
    • A Windows 2008 R2 virtual machine with 4 virtual CPUs
    • Three Windows 2003 virtual machines with 2 virtual CPUs each

    In this configuration, the ESXi host exposes 8 logical CPUs as 10 virtual CPUs to the virtual machines. In other words, the ESXi host is overcommitted. As long as the virtual machines are lightly utilized, the ESXi host is still able to provide all of them with the requested CPU time, and their performance will be as expected. However, if the load on multiple virtual machines increases, the ESXi host has to decide which virtual machine is served first with the currently available CPU time. It is important to note that the ESXi host serves a virtual machine with multiple virtual CPUs only when it can serve all of that virtual machine's virtual CPUs at once. Otherwise, a virtual machine with fewer virtual CPUs is served first. Although this is how the CPU scheduler is supposed to work, it can lead to situations where certain virtual machines wait an unreasonably long time for the requested CPU time. In such cases, you can observe degraded overall performance and increased response times. For more information about CPU scheduling and the meaning of the CPU %RDY time, see the VMware Technical Paper, Performance Troubleshooting for vSphere 4.1.
     
  • Receive Side Scaling (RSS)

    RSS is a mechanism which allows the network driver to spread incoming TCP traffic across multiple CPUs, resulting in increased multi-core efficiency and processor cache utilization. If the driver or the operating system is not capable of using RSS, or if RSS is disabled, all incoming network traffic is handled by only one CPU. In this situation, a single CPU can be the bottleneck for the network while other CPUs might remain idle.

    Note: To make use of the RSS mechanism, the hardware version of the virtual machine must be 7 or higher, the virtual network card must be set to VMXNET3, and the guest operating system must support RSS and be configured properly. On some systems, RSS has to be enabled manually. These operating systems are capable of using RSS:
     
    • Windows 2003 SP2 (enabled by default)
    • Windows 2008 (enabled by default)
    • Windows 2008 R2 (enabled by default)
    • Windows Server 2012 (enabled by default)
    • Linux 2.6.37 and newer (enabled by default)
For further information on Microsoft RSS, see Receive Side Scaling (RSS). For more information on Linux Receive Side Scaling, see RSS and multiqueue support in Linux driver for VMXNET3 (2020567).
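
The effect of RSS described above can be illustrated with a minimal Python sketch. It is a conceptual model only, not the VMXNET3 or Windows implementation: `pick_cpu`, `distribution`, and the sample traffic are hypothetical names and data, and `zlib.crc32` merely stands in for the hash function a real driver would apply to each connection.

```python
import zlib
from collections import Counter


def pick_cpu(flow, num_cpus, rss_enabled=True):
    """Return the index of the CPU that would process packets of this flow.

    flow is a (src_ip, src_port, dst_ip, dst_port) tuple. With RSS disabled,
    every flow lands on CPU 0. With RSS enabled, a hash of the flow selects
    the CPU, so all packets of one connection stay on the same CPU.
    zlib.crc32 is only a stand-in for the hash used by a real driver.
    """
    if not rss_enabled:
        return 0
    key = "|".join(str(part) for part in flow).encode("ascii")
    return zlib.crc32(key) % num_cpus


def distribution(flows, num_cpus, rss_enabled):
    """Count how many flows each CPU would handle."""
    counts = Counter(pick_cpu(f, num_cpus, rss_enabled) for f in flows)
    return [counts.get(cpu, 0) for cpu in range(num_cpus)]


# Hypothetical traffic: 64 client connections to one server port.
flows = [("10.0.0.%d" % (i % 16 + 1), 40000 + i, "10.0.1.10", 1433)
         for i in range(64)]

without_rss = distribution(flows, 4, rss_enabled=False)
with_rss = distribution(flows, 4, rss_enabled=True)
print("RSS disabled:", without_rss)
print("RSS enabled: ", with_rss)
```

With RSS disabled, the first CPU handles every connection while the others receive none, which is the single-CPU bottleneck described above.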

Resolution

Before proceeding with these steps, ensure that:
  • there are no problems in your external infrastructure like faulty hardware or possible misconfigurations (common configuration problems are IP conflicts, unintended traffic shaping, misconfigured trunk and EtherChannel ports)
  • the network is not congested
  • the network the ESXi host is on is stable and performs as expected
  • the virtual machines are configured with the VMXNET3 network adapter
  • the hardware drivers and firmware versions are recent
  • the BIOS is recent and configured appropriately
  • the virtual machine is running the latest version of VMware Tools (VMware Tools contains the drivers for the virtual hardware)
  • any security software, like intrusion detection/prevention systems or packet inspectors, has enough resources available and is configured correctly (check the logs for incorrectly filtered traffic or dropped packets)
After you have confirmed that your infrastructure is healthy and all components are configured correctly, check the power saving configuration. For virtual machines with more than one virtual CPU, also check if high CPU %RDY times have a negative impact on these virtual machines.

The final step is to check the RSS settings. Changing the RSS settings should only be done by trained network administrators. VMware also recommends confirming that all relevant applications (including the operating system) support changes to the RSS configuration.


Power plan

To ensure that the system takes advantage of the available resources, it is important to disable all power saving features while investigating performance issues. If the power saving configuration appears to be related to the performance problems, a customized power plan based on the performance and power saving requirements should be created. If you are unsure about which power saving configuration is recommended for your system, engage your hardware vendor.

To adjust the power plan settings on a Windows 2008 Server:
 
  1. Click Start, type powercfg.cpl, and press Enter.
  2. Ensure that the High performance option is selected.

    Note: Steps 3 through 6 are optional.
     
  3. Click Change plan settings.
  4. Click Change advanced power settings.
  5. To enable access to all settings, click Change settings that are currently unavailable.
  6. Browse the available settings and adjust as necessary.
  7. Click OK to confirm and close all windows.

    Note: Some changes might require a reboot of the guest system.
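
The active power plan can also be checked from a command prompt with powercfg /getactivescheme. The following Python sketch is one possible way to automate that check, assuming Python 3.7 or later is available in the guest. The helper names are hypothetical, the sample line is illustrative, and the GUID is the well-known identifier of the built-in High performance plan. A customized plan has its own GUID and is reported as not matching.

```python
import re
import subprocess

# Well-known GUID of the built-in "High performance" plan.
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"

GUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def active_scheme_guid(powercfg_output):
    """Extract the first GUID from powercfg output, or None if absent."""
    match = GUID_PATTERN.search(powercfg_output)
    return match.group(0).lower() if match else None


def is_high_performance(powercfg_output):
    """True if the reported plan is the built-in High performance plan."""
    return active_scheme_guid(powercfg_output) == HIGH_PERFORMANCE_GUID


def query_active_scheme():
    """Run powercfg inside the Windows guest and return its raw output."""
    result = subprocess.run(["powercfg", "/getactivescheme"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Illustrative output line for a guest still on the default Balanced plan.
sample = "Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)"
print(is_high_performance(sample))
```

Inside the guest, `is_high_performance(query_active_scheme())` performs the live check.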
     
Checking CPU %RDY times

To determine if a virtual machine is impacted by high CPU %RDY times, use one of these methods:
  • Count all virtual CPUs on a particular host or cluster, and divide by the number of logical CPUs. A result of one or higher means that the host or cluster is overcommitted and should be investigated. Values of four or higher are considered overloaded and must be investigated immediately.

    Notes:
    • The intent of this method is to quickly determine if a host is overcommitted, rather than determining if it is not. VMware recommends using esxtop to observe detailed host performance.
    • Although hyper-threading doubles the number of logical processors, it cannot provide the same performance as two physical processor cores. If it is likely that the host is overcommitted, calculate using the number of physical CPUs, rather than logical CPUs.
       
  • The esxtop command displays the values for the CPU %RDY time when run on the host with the affected virtual machines. For more information on how to use and interpret the output of esxtop, see the World Statistics section in the Interpreting esxtop Statistics Communities document. You can also run resxtop, which is provided with the vSphere Management Assistant (vMA).
  • The vm-support command provides the capability to create performance snapshots. For more information, see Collecting performance snapshots using vm-support (1967).
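
The first method above is simple arithmetic. As a minimal sketch, the following Python code applies it to the example host from the Cause section (one virtual machine with 4 virtual CPUs and three with 2 virtual CPUs each, on 8 logical and 4 physical CPUs). The function names are hypothetical, and the thresholds are the ones stated in the first method.

```python
def overcommit_ratio(vcpus_per_vm, cpus):
    """Total virtual CPUs divided by the host's logical (or physical) CPUs."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return sum(vcpus_per_vm) / cpus


def classify(ratio):
    """Apply the thresholds from the first method above."""
    if ratio >= 4:
        return "overloaded: investigate immediately"
    if ratio >= 1:
        return "overcommitted: investigate"
    return "not overcommitted by this measure"


# Example host from the Cause section.
vms = [4, 2, 2, 2]

logical = overcommit_ratio(vms, 8)   # 10 vCPUs / 8 logical CPUs
physical = overcommit_ratio(vms, 4)  # 10 vCPUs / 4 physical cores

print("per logical CPU: ", logical, "-", classify(logical))
print("per physical CPU:", physical, "-", classify(physical))
```

The ratio is 1.25 against logical CPUs and 2.5 against physical CPUs. Both indicate an overcommitted host, so the %RDY values in esxtop should be checked.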

To relieve an overcommitted host, use one of these methods:
  • Move the affected virtual machine to a host with more available resources
  • Move other virtual machines off the host
  • Decrease the number of virtual CPUs on the affected virtual machine
Note: Changing the CPU count might not be supported by the guest operating system. For more information, contact the operating system vendor.


Enabling and configuring Receive Side Scaling (RSS)

Before enabling RSS:
  • Ensure that the hardware version of the virtual machine is set to Version 7 or higher. For more information, see Virtual machine hardware versions (1003746).
  • Ensure that the virtual network adapter is set to VMXNET3 and that the operating system is supported by this adapter. For more information, see Choosing a network adapter for your virtual machine (1001805).
  • Ensure that RSS is enabled in the guest operating system. To verify this in a Windows guest operating system, open a command prompt and run the command:

    netsh int tcp show global

    The output indicates whether Receive-Side Scaling State is enabled or not.
     
  • Ensure that the network adapter in the virtual machine is configured to use RSS. To verify this in a Windows guest operating system:
     
    1. Open the Device Manager, navigate to Network adapters, and right-click the adapter you wish to enable RSS on.
    2. In the Properties window, click the Advanced tab, then click RSS in the list on the left side.
    3. Change the Value to Enabled and click OK to close the window. A reboot might be necessary for the changes to take effect.

      Note: Enabling/disabling the RSS feature interrupts the network connection on the adapter for several seconds. If you are accessing the system via a remote desktop session, ensure that you can access the system in another way in case an issue occurs that causes the network connection to not return.
By default, Windows uses up to four CPUs for RSS.
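
The netsh check above can be scripted. The following Python sketch looks for the Receive-Side Scaling State field in the command output. It assumes Python 3.7 or later and an English-language Windows guest (localized systems may label the field differently). The function names are hypothetical, and the sample output is abridged and illustrative.

```python
import subprocess


def rss_state(netsh_output):
    """Return the 'Receive-Side Scaling State' value in lowercase, or None."""
    for line in netsh_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "receive-side scaling state":
            return value.strip().lower()
    return None


def query_rss_state():
    """Run the netsh command inside the Windows guest and parse its output."""
    result = subprocess.run(["netsh", "int", "tcp", "show", "global"],
                            capture_output=True, text=True, check=True)
    return rss_state(result.stdout)


# Abridged, illustrative output of: netsh int tcp show global
sample = """\
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
"""

print(rss_state(sample))
```

A result of None means the field was not found, for example because the output format differs from the one assumed here.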
Note: TSO (TCP Segmentation Offload) is a feature of some NICs that offloads the packetization of data from the CPU to the NIC. In ESXi, TSO is enabled by default in the VMkernel. In virtual machines, it is supported only with the E1000, Enhanced VMXNET, and VMXNET3 virtual network adapters (not with the normal VMXNET adapter). TSO can improve performance even if the underlying hardware does not support TSO.

Additional Information

If you observe dropped packets in esxtop, see The output of esxtop shows dropped receive packets at the virtual switch (1010071).

Note: The preceding links were correct as of February 5, 2014. If you find a link is broken, provide feedback and a VMware employee will update the link.

For related information, see:
  • Choosing a network adapter for your virtual machine
  • Virtual machine hardware versions
  • Troubleshooting network performance issues in a vSphere environment
  • The output of esxtop shows dropped receive packets at the virtual switch
  • Collecting performance snapshots using vm-support in ESX and ESXi
  • RSS and multiqueue support in Linux driver for VMXNET3