When removing a snapshot virtual machines become unresponsive for over 30 minutes


Article ID: 343118


Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • Virtual machines become unresponsive for 30 minutes or much longer when removing a snapshot
  • In the vmware.log file, you see entries similar to:

    vmx| Checkpoint_Unstun: vm stopped for 3711025 us
    vmx| Checkpoint_Unstun: vm stopped for 574655 us
    vmx| Checkpoint_Unstun: vm stopped for 2191061 us
    Create snapshot smvi_2a175570-ed2f-....
    Operation completed
    Consolidate starts
    Intermediate snapshot taken, took 1.8s
    VM runs for 2 seconds, while consolidate of scsi0:0 is in progress
    Move to next disk, no more iterations for scsi0:0 are necessary, stunned for 0.6s
    Consolidate of scsi0:1 finished, another iteration is needed. Intermediate snapshot is deleted, and another is created. VM stunned for 2.7s.
    Done with scsi0:0. Moving to scsi0:1.
    Done with scsi0:1. Another iteration will be needed...


    For more information on the vmware.log file, see Locating virtual machine log files on an ESXi/ESX host (1007805).
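To gauge how long each stun lasted, the Checkpoint_Unstun entries can be converted from microseconds to seconds. The following is a minimal sketch, assuming the log lines follow the format shown in the sample above; the function name stun_durations_seconds is hypothetical and not part of any VMware tool.

```python
import re

# Matches entries such as: "vmx| Checkpoint_Unstun: vm stopped for 3711025 us"
UNSTUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_durations_seconds(log_text):
    """Return each reported stun duration, converted from microseconds to seconds."""
    return [int(m.group(1)) / 1_000_000 for m in UNSTUN_RE.finditer(log_text)]


# Sample lines copied from the symptom description.
sample = """\
vmx| Checkpoint_Unstun: vm stopped for 3711025 us
vmx| Checkpoint_Unstun: vm stopped for 574655 us
vmx| Checkpoint_Unstun: vm stopped for 2191061 us
"""

durations = stun_durations_seconds(sample)
for seconds in durations:
    print(f"{seconds:.3f} s")
```

For the three sample entries this reports stuns of roughly 3.7, 0.6, and 2.2 seconds. To analyze a real virtual machine, pass the contents of its vmware.log file instead of the sample string.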


Environment

VMware ESXi 4.1.x Embedded
VMware vSphere ESXi 5.0
VMware vSphere ESXi 5.1
VMware ESX 4.1.x
VMware ESXi 4.1.x Installable

Cause

This issue occurs if the virtual machine generates data faster than the snapshot can be consolidated. Asynchronous consolidation runs in iterations of increasing length. For example, it starts with a 5-minute run, then goes to 10 minutes, 20 minutes, 30 minutes, and so on. After 9 iterations, it goes to 60 minutes per cycle. During these attempts, consolidation is performed without stunning the virtual machine. After the maximum number of iterations is reached, a synchronous consolidation is forced, in which the virtual machine is stunned while the consolidation is performed.

Note: Snapshot consolidation time varies with the type of machine and with environmental variables, so it is difficult to estimate a completion time.
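The convergence problem can be illustrated with a toy model. This sketch is not how ESXi schedules consolidation. It assumes constant write and consolidation rates, and the names delta_sizes, write_mb_s, and consolidate_mb_s are hypothetical. Each pass consolidates the current delta while the virtual machine writes a new one, so the delta shrinks only if the write rate is lower than the consolidation rate.

```python
def delta_sizes(initial_mb, write_mb_s, consolidate_mb_s, max_iterations=10):
    """Toy model: size in MB of the delta left to consolidate at the start of each pass."""
    sizes = [initial_mb]
    for _ in range(max_iterations):
        pass_seconds = sizes[-1] / consolidate_mb_s  # time needed to consolidate the current delta
        sizes.append(pass_seconds * write_mb_s)      # data the VM writes while that pass runs
    return sizes


# Writes slower than consolidation: each delta is half the size of the previous one.
shrinking = delta_sizes(1000, write_mb_s=20, consolidate_mb_s=40, max_iterations=3)

# Writes faster than consolidation: each delta is 1.5 times the size of the previous one.
growing = delta_sizes(1000, write_mb_s=60, consolidate_mb_s=40, max_iterations=3)

print(shrinking)  # [1000, 500.0, 250.0, 125.0]
print(growing)    # [1000, 1500.0, 2250.0, 3375.0]
```

In the second case no number of asynchronous passes catches up, which under this simplified model corresponds to the situation in which the synchronous consolidation is eventually forced.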

Resolution

This is a performance limitation due to snapshot implementation.

This behavior is modified in these ESXi versions to prevent the synchronous consolidation and leave the affected virtual machine running from snapshots, if the asynchronous consolidate cannot be completed.


To work around this issue if you do not want to apply the patch or if you are running a different version:
  1. Shut down the virtual machine.
  2. Right-click the virtual machine and click Edit Settings.
  3. Click the Options tab.
  4. Under Advanced, click General.
  5. Click Configuration Parameters and add these parameters one at a time and check the result:

    • Set snapshot.maxIterations to 20 (or higher). The default value is 10. If the asynchronous consolidation cannot converge within the default 10 iterations, the virtual machine is stunned and a synchronous consolidation is performed, which may stun the virtual machine for a long time. If you increase snapshot.maxIterations to 20 or higher, it is possible that a suitable period for the synchronous consolidate is found before the iteration limit is reached.

    • Set snapshot.maxConsolidateTime to 60 (seconds). The default value is 6 seconds. With a value of 60, the consolidate helper sees an opportunity to perform the synchronous consolidate earlier, before the snapshot grows into a 30-minute issue after 10 iterations. Setting snapshot.maxConsolidateTime to 60 means that you can afford to have the virtual machine stunned for 60 seconds so that the synchronous consolidate can be performed within the iterations.

      Note: A coefficient of 2 is applied to maxConsolidateTime. A value of 60 corresponds to a stun of 120 seconds, and the default of 6 corresponds to a stun of 12 seconds.

    • Set snapshot.maxIterations to 0. Setting snapshot.maxIterations to 0 causes the virtual machine to stun and perform synchronous consolidate in the first iteration only. This may reduce the stun time.
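As an illustration of the settings above, the following minimal sketch renders two of the parameters as key = "value" lines, the style used in .vmx configuration files, and applies the coefficient of 2 described in the note. The helper names are hypothetical, and the sketch assumes the stun allowance is exactly twice the configured value, as the note states.

```python
STUN_COEFFICIENT = 2  # assumption taken from the note: maxConsolidateTime is doubled


def allowed_stun_seconds(max_consolidate_time):
    """Stun time in seconds implied by a snapshot.maxConsolidateTime value."""
    return STUN_COEFFICIENT * max_consolidate_time


def render_parameters(params):
    """Format parameters as key = "value" lines, one per parameter."""
    return "\n".join(f'{name} = "{value}"' for name, value in params.items())


# Values suggested by the workaround. The article advises adding them one at a time.
workaround = {
    "snapshot.maxIterations": 20,
    "snapshot.maxConsolidateTime": 60,
}

print(render_parameters(workaround))
print(allowed_stun_seconds(60))  # 120
print(allowed_stun_seconds(6))   # 12
```

The rendered lines are only a convenient way to review the values before entering them under Configuration Parameters while the virtual machine is shut down.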


Additional Information


Locating virtual machine log files on an ESXi/ESX host
Consolidating snapshots in ESX/ESXi 3.x and 4.x
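Japanese version: When removing a snapshot, Windows virtual machines become unresponsive for 30 minutes or more (スナップショットを削除すると、Windows 仮想マシンが 30 分以上反応しなくなる)
Chinese version: When removing a snapshot, virtual machines are unresponsive for more than 30 minutes (移除快照时,虚拟机在超过 30 分钟的时间内无响应)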
How to increase the time limit on snapshot consolidation