
A snapshot removal can stop a virtual machine for a long time (1002836)

Details

When a snapshot removal (consolidation) is in progress, no other tasks (such as power operations or vMotion migration) can be performed against the virtual machine. Several underlying tasks involved in a snapshot removal must run without interruption to ensure data integrity. The time required depends on the amount of snapshot delta to be committed.

This article outlines the actions performed against the virtual machine's snapshots.

Solution

For live consolidations, virtual machine activity (specifically disk writes) that occurs during the consolidation must also be committed. This delta information is kept in a temporary Consolidate Helper snapshot, which is committed at the end of the snapshot removal. For busy virtual machines, the volume of activity can tie up system resources for longer than usual, and the Consolidate Helper snapshot accumulates more delta as a result.
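
As a rough illustration of why a busy virtual machine accumulates more Consolidate Helper delta, the Python sketch below estimates an upper bound on the helper size. The function name, the throughput figures, and the simple linear model are assumptions made for illustration only; they are not taken from VMware documentation, and real growth can be lower when the guest rewrites the same blocks.

```python
def estimate_helper_delta_mb(snapshot_delta_mb, commit_rate_mb_s, guest_write_rate_mb_s):
    """Rough upper bound on helper delta gathered while the snapshot commits.

    Hypothetical model: the commit runs at a constant rate, and every
    guest write during that time lands in the Consolidate Helper.
    """
    commit_seconds = snapshot_delta_mb / commit_rate_mb_s
    return guest_write_rate_mb_s * commit_seconds


# Assumed figures: a 20 GB snapshot delta committed at 100 MB/s.
idle = estimate_helper_delta_mb(20480, 100, 0.5)  # lightly loaded guest
busy = estimate_helper_delta_mb(20480, 100, 20)   # write-heavy guest

print(f"idle guest: ~{idle:.1f} MB of helper delta")
print(f"busy guest: ~{busy:.1f} MB of helper delta")
```

Under these assumed numbers the write-heavy guest leaves far more data to commit in the final step, which is the step where the stun described below occurs.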

For example, consider a virtual machine with one virtual disk (disk.vmdk) and one snapshot:
  • disk.vmdk with extent disk-flat.vmdk
  • disk-000001.vmdk with extent disk-000001-delta.vmdk
If you choose to remove or consolidate the snapshot:
  1. An additional snapshot delta is created, the Consolidate Helper:

    • disk.vmdk with extent disk-flat.vmdk
    • disk-000001.vmdk with extent disk-000001-delta.vmdk
    • disk-000002.vmdk with extent disk-000002-delta.vmdk

    The virtual machine no longer writes to the first two disks. While the snapshot removal is in progress, all current writes are committed to the disk-000002-delta.vmdk extent file through disk-000002.vmdk.

  2. The VMware ESXi/ESX host's DiskLib API consolidates disk-flat.vmdk with disk-000001-delta.vmdk. Meanwhile, the virtual machine continues writing to disk-000002-delta.vmdk.

  3. After completing the consolidation of the snapshot, the ESXi/ESX host consolidates the Consolidate Helper disk-000002-delta.vmdk with disk-flat.vmdk.

    Prior to VMware ESX 3.5.0 patch ESX350-200804402-BG, virtual machines are stunned for the duration of the Consolidate Helper removal. In typical circumstances, this process completes almost immediately. However, virtual machines that have gathered a considerable amount of delta in the temporary snapshot are stunned for a noticeable or disruptive length of time, which can have adverse effects on guest applications or services.

    Subsequent versions of VMware ESXi/ESX perform this final consolidation using multiple, smaller, Consolidate Helpers to minimize or prevent guest operating system interruption.

  4. When all delta information recorded in disk-000002-delta.vmdk has been committed to disk-flat.vmdk, disk-000002-delta.vmdk and its descriptor file disk-000002.vmdk are removed from the datastore. The virtual machine continues from its base disk or selected point.
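
The four steps above can be sketched as a toy model in Python. This is only an illustration of the ordering described in this article, not VMware's implementation: each disk is represented as a dictionary mapping a block number to its data, and the names read_block and consolidate are hypothetical.

```python
def read_block(chain, block):
    """Return the newest data for a block, searching the chain top-down."""
    for layer in reversed(chain):
        if block in layer:
            return layer[block]
    return None


def consolidate(base, snapshot_delta, writes_during_commit):
    """Toy model of a live snapshot removal; returns the helper's block count."""
    # Step 1: create the Consolidate Helper. Guest writes issued during
    # the removal land here, not in the base or the snapshot delta.
    helper = {}
    for block, data in writes_during_commit:
        helper[block] = data

    # Step 2: commit the snapshot delta into the base disk.
    base.update(snapshot_delta)
    snapshot_delta.clear()

    # Step 3: commit the helper into the base disk. It goes last because
    # it holds the newest data.
    helper_blocks = len(helper)
    base.update(helper)

    # Step 4: the helper is removed; the VM continues from the base disk.
    helper.clear()
    return helper_blocks


base = {0: "a", 1: "b", 2: "c"}   # stands in for disk-flat.vmdk
snap = {1: "B"}                   # stands in for disk-000001-delta.vmdk
before = read_block([base, snap], 1)

helper_blocks = consolidate(base, snap, [(2, "C2"), (1, "B2"), (3, "D")])
print(before, base, helper_blocks)
```

The more blocks the helper holds at step 3, the longer that final commit takes, which is the point at which older releases stun the virtual machine.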
Note: Beginning in ESXi 5.0, snapshot stun times are logged. Each virtual machine's log file (vmware.log) contains messages similar to:

2013-03-23T17:40:02.544Z| vcpu-0| Checkpoint_Unstun: vm stopped for 403475568 us

In this example, the virtual machine was stunned for 403475568 microseconds, or about 403.5 seconds (1 second = 1 million microseconds).
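
To pull stun times out of a log and convert them to seconds, a short Python sketch such as the following may help. The regular expression is based solely on the sample message shown in this article, so other message variants may not match, and the function name stun_seconds is hypothetical.

```python
import re

# Pattern derived from the single sample line in this article (an assumption).
UNSTUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_seconds(log_lines):
    """Yield the stun duration in seconds for each matching log line."""
    for line in log_lines:
        match = UNSTUN_RE.search(line)
        if match:
            yield int(match.group(1)) / 1_000_000


sample = [
    "2013-03-23T17:40:02.544Z| vcpu-0| Checkpoint_Unstun: vm stopped for 403475568 us",
]
durations = list(stun_seconds(sample))

for seconds in durations:
    print(f"stunned for {seconds:.3f} s")
```

To scan a real log, pass an open file object for the virtual machine's vmware.log in place of the sample list.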

For older versions of VMware ESX, issues relating to virtual machines becoming unresponsive during consolidation have been resolved in:

Additional Information

For more information on snapshots, see Understanding virtual machine snapshots in VMware ESXi and ESX (1015180).


Update History

10/22/2013 - Added links to translations.
