Knowledge Base

Virtual machine CPU usage spikes and remains abnormally high after VMotion in a VMware DRS enabled cluster
Details
Solution
Starting with ESX Server 3.5 and VirtualCenter 2.5, VMware DRS caps the overhead memory of each virtual machine to control how fast that memory grows. After VMotion migrates a virtual machine, this cap is reset to a value computed for that specific virtual machine. If the virtual machine monitor then indicates that the virtual machine requires more overhead memory, VMware DRS raises the cap at a controlled rate (1 MB per minute by default) until the overhead memory reaches a steady state, provided that sufficient resources are available on the host.
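To give a feel for why the controlled growth rate matters, the following minimal sketch estimates how long a virtual machine might wait for its cap to catch up after a migration. It assumes the cap grows linearly at the stated rate and that the host has enough free resources. The function name, its parameters, and the example figures are hypothetical and are not taken from any VMware tool. Treating the growth margin as a simple addition to the reset cap is also an assumption.

```python
import math

def minutes_until_cap_covers(required_mb, cap_after_vmotion_mb,
                             rate_mb_per_min=1.0, growth_margin_mb=0.0):
    """Estimate the minutes until the overhead cap reaches the required overhead.

    Assumption: the cap starts at cap_after_vmotion_mb + growth_margin_mb and
    then rises linearly at rate_mb_per_min (1 MB per minute by default).
    """
    shortfall = required_mb - (cap_after_vmotion_mb + growth_margin_mb)
    if shortfall <= 0:
        return 0  # the cap already covers the required overhead
    return math.ceil(shortfall / rate_mb_per_min)

# Hypothetical figures: the VM needs 130 MB of overhead, the cap was reset to 100 MB.
print(minutes_until_cap_covers(130, 100))                      # 30
print(minutes_until_cap_covers(130, 100, growth_margin_mb=5))  # 25
```

Under these assumptions, a shortfall of a few tens of MB translates into a comparable number of minutes of constrained overhead memory, which is consistent with the CPU usage staying high for a while after the migration.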
Diagnosing the Issue
To diagnose the issue:
1. Log in to VirtualCenter with Virtual Infrastructure Client as an administrator.
2. Right-click your cluster from the inventory.
3. Click Edit Settings.
4. Disable VMware DRS.
5. Click OK and wait for 1 minute.
6. In the Virtual Infrastructure Client, note the virtual machine's CPU usage from the Performance tab and the virtual machine's memory overhead from the Summary tab.
7. Right-click your cluster from the inventory.
8. Click Edit Settings.
9. Re-enable VMware DRS.
10. Use VMotion to migrate a problematic virtual machine to another host.
11. Note the virtual machine CPU usage and memory overhead on the new host.
12. Disable VMware DRS on the cluster again, as described in steps 2 through 5, and wait for 1 minute.
13. Note the virtual machine CPU usage and memory overhead on the new host.
If the CPU usage of the virtual machine is higher in step 11 (after the VMotion migration) than in step 6 (the baseline with VMware DRS disabled), and it returns to a level similar to step 6 in step 13 (after VMware DRS is disabled again) together with an observable increase in overhead memory, the virtual machine is affected by the issue discussed in this article.
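The comparison of the three readings (steps 6, 11, and 13) can be sketched as a small check. This is only an illustration of the decision logic: the Reading type, the matches_issue function, and the 10% tolerance are hypothetical, and the values must be read manually from the Performance and Summary tabs. The sketch assumes that the "observable increase" in overhead memory is measured between step 11 and step 13.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    cpu_mhz: float      # CPU usage noted from the Performance tab
    overhead_mb: float  # memory overhead noted from the Summary tab

def matches_issue(step6, step11, step13, cpu_tolerance=0.10):
    """Return True if the three readings follow the pattern described above.

    cpu_tolerance is an arbitrary threshold for "similar to the baseline".
    """
    ceiling = step6.cpu_mhz * (1 + cpu_tolerance)
    cpu_spiked = step11.cpu_mhz > ceiling                   # higher after VMotion
    cpu_recovered = step13.cpu_mhz <= ceiling               # back near the baseline
    overhead_grew = step13.overhead_mb > step11.overhead_mb  # overhead was allowed to grow
    return cpu_spiked and cpu_recovered and overhead_grew

# Hypothetical readings for one virtual machine.
baseline = Reading(cpu_mhz=500, overhead_mb=120)
after_vmotion = Reading(cpu_mhz=1800, overhead_mb=100)
after_drs_disabled = Reading(cpu_mhz=520, overhead_mb=135)
print(matches_issue(baseline, after_vmotion, after_drs_disabled))  # True
```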
Working around the issue prior to VirtualCenter 2.5 Update 1
1. Log in to VirtualCenter with Virtual Infrastructure Client as an administrator.
2. Right-click your cluster from the inventory.
3. Click Edit Settings.
4. Ensure that VMware DRS is shown as enabled. If it is not enabled, select the check box to enable VMware DRS.
5. Click OK.
6. Click an ESX Server from the Inventory.
7. Click the Configuration tab.
8. Click Advanced Settings.
9. Click the Mem option.
10. Locate the Mem.VMOverheadGrowthLimit parameter.
11. Change the value of this parameter to 5 and click OK.
    Note: By default, this parameter is set to -1.
To fix multiple ESX Server hosts
1. Log on to the VirtualCenter Server Console as an administrator.
2. Make a backup copy of the vpxd.cfg file (typically located at C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg).
3. In the vpxd.cfg file, add the following configuration between the <vpxd> and the </vpxd> tags:
   <cluster>
     <VMOverheadGrowthLimit>5</VMOverheadGrowthLimit>
   </cluster>
   This configuration provides an initial growth margin, in MB, for virtual machine overhead memory. You can increase this value if doing so further improves virtual machine performance.
4. Restart the VMware VirtualCenter Server Service.
   Note: When you restart the VMware VirtualCenter Server Service, the new value for the overhead limit should be pushed down to all the clusters in VirtualCenter.
5. Log in to VirtualCenter with Virtual Infrastructure Client as an administrator.
6. Right-click your cluster from the inventory.
7. Click Edit Settings.
8. Disable VMware DRS.
9. Click OK. Wait for the DRS-disable task to complete.
10. Right-click your cluster from the inventory.
11. Click Edit Settings.
12. Enable VMware DRS.
13. Click OK.
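If you prefer to script the vpxd.cfg change rather than edit the file by hand, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not a VMware-provided tool. The function name is hypothetical, and the sample document is a minimal stand-in, so the <config> root element in it is an assumption about the file layout. The default parser discards XML comments, so run it against a copy of the file and review the result before replacing the original.

```python
import xml.etree.ElementTree as ET

def set_overhead_growth_limit(cfg_xml, limit_mb=5):
    """Return cfg_xml with <cluster><VMOverheadGrowthLimit> set inside <vpxd>."""
    root = ET.fromstring(cfg_xml)
    # iter() also yields the root itself, so this works whether <vpxd>
    # is the root element or nested below it.
    vpxd = next(root.iter("vpxd"), None)
    if vpxd is None:
        raise ValueError("no <vpxd> element found")
    cluster = vpxd.find("cluster")
    if cluster is None:
        cluster = ET.SubElement(vpxd, "cluster")
    limit = cluster.find("VMOverheadGrowthLimit")
    if limit is None:
        limit = ET.SubElement(cluster, "VMOverheadGrowthLimit")
    limit.text = str(limit_mb)  # reuse the element if it already exists
    return ET.tostring(root, encoding="unicode")

# Minimal stand-in for the real file contents.
sample = "<config><vpxd></vpxd></config>"
print(set_overhead_growth_limit(sample))
```

Because the function reuses existing <cluster> and <VMOverheadGrowthLimit> elements, running it a second time with a different value updates the setting instead of adding a duplicate entry. The VirtualCenter Server Service restart and the DRS disable and re-enable steps above are still required afterwards.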
Working around the issue if it persists after upgrading to VirtualCenter 2.5 Update 1
Note: The preceding steps also work; however, this method is easier to implement and applies to any ESX host that is added to the DRS cluster.
1. Log in to VirtualCenter with Virtual Infrastructure Client as an administrator.
2. Right-click your cluster from the inventory.
3. Click Edit Settings.
4. Select VMware DRS. If it is not enabled, enable it.
5. Click the Advanced Options button.
6. Add MemOverheadGrowth with a value of 4.
7. Click OK to close Advanced Options.
8. Click OK to close the cluster configuration.
A permanent fix for this behavior is included in VirtualCenter 2.5 Update 2.
Verifying the workaround
1. Log in to your ESX Server service console as root, either from an SSH session or directly from the console of the server.
2. Type less /var/log/vmkernel to view the VMkernel log, and look for entries similar to the following:
   vmkernel: 1:16:23:57.956 cpu3:1036)Config: 414: "VMOverheadGrowthLimit" = 5, Old Value: -1, (Status: 0x0)
   vmkernel: 1:08:05:22.537 cpu2:1036)Config: 414: "VMOverheadGrowthLimit" = 0, Old Value: -1, (Status: 0x0)
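When the log is long, it can be easier to pull out only the relevant entries. The following minimal sketch extracts the new and old values from lines in the format shown above. The regular expression is written against those two sample lines only, so other log formats may need adjustments, and the function name is hypothetical.

```python
import re

# Matches entries such as: "VMOverheadGrowthLimit" = 5, Old Value: -1
# The opening quote is optional to tolerate slightly damaged log excerpts.
PATTERN = re.compile(
    r'"?VMOverheadGrowthLimit"\s*=\s*(-?\d+),\s*Old Value:\s*(-?\d+)'
)

def growth_limit_changes(log_lines):
    """Return a list of (new_value, old_value) tuples found in log_lines."""
    changes = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            changes.append((int(match.group(1)), int(match.group(2))))
    return changes

sample_lines = [
    'vmkernel: 1:16:23:57.956 cpu3:1036)Config: 414: "VMOverheadGrowthLimit" = 5, Old Value: -1, (Status: 0x0)',
    'vmkernel: 1:08:05:22.537 cpu2:1036)Config: 414: "VMOverheadGrowthLimit" = 0, Old Value: -1, (Status: 0x0)',
]
print(growth_limit_changes(sample_lines))  # [(5, -1), (0, -1)]
```

To run it against the real log, pass an open file object instead of the sample list, for example growth_limit_changes(open("/var/log/vmkernel")), assuming a Python version that supports this code is available where the log is read.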
If the fix is unsuccessful, try the following:
1. Create a new cluster and move the ESX Server hosts to this cluster.
2. Check whether the fix has been applied successfully.
- KB Article: 1003638
- Updated: Aug 14, 2009
- Products:
  VMware ESX
  VMware ESXi
  VMware VirtualCenter
- Product Versions:
  VMware ESX 3.5.x
  VMware ESXi 3.5.x Embedded
  VMware ESXi 3.5.x Installable
  VMware VirtualCenter 2.5.x

