Interpreting virtual machine monitor and executable failures

Products

VMware vSphere ESXi

Issue/Introduction

This article provides a high-level overview of the components of a virtual machine and provides information about the errors that can be reported by the components.

Environment

VMware ESXi 3.5.x Installable
VMware vSphere ESXi 5.0
VMware vSphere ESXi 5.5
VMware ESX Server 3.5.x
VMware ESXi 4.0.x Embedded
VMware ESX 4.1.x
VMware ESXi 4.1.x Installable
VMware vSphere ESXi 5.1
VMware ESX 4.0.x
VMware ESXi 3.5.x Embedded
VMware ESXi 4.0.x Installable
VMware ESXi 4.1.x Embedded

Resolution

What is a virtual machine?

A virtual machine is a tightly isolated software container that can run its own operating system and applications as if it were a physical computer. A virtual machine behaves exactly like a physical computer and contains its own virtual (that is, software-based) CPU, RAM, hard disk, and network interface card (NIC).

What are the components of a virtual machine?

A virtual machine is composed of several processes, or userworlds, that run in the VMkernel. Together, these processes make up a group. The following is a summary of the components of a virtual machine:

Virtual Machine Executable (VMX) process - A process that runs in the VMkernel and is responsible for handling I/O to devices that are not critical to performance. The VMX is also responsible for communicating with user interfaces, snapshot managers, and the remote console.
Virtual Machine Monitor (VMM) process - A process that runs in the VMkernel and is responsible for virtualizing guest OS instructions and managing memory. The VMM passes storage and network I/O requests to the VMkernel, and passes all other requests to the VMX process. There is a VMM for each virtual CPU assigned to a virtual machine.
Mouse Keyboard Screen (MKS) process - A process that is responsible for rendering the guest video and handling guest operating system user input.

Interpreting a Virtual Machine Monitor (VMM) Process Error

If the VMM process experiences an error, the backtrace of the process is written to /var/log/messages on ESX/ESXi 4.x and earlier. On ESXi 5.x and later, it is written to /var/log/vmkernel.log.
The backtrace information looks similar to:

Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.942 cpu4:1256)WARNING: World: vm 1256: 6012: vmm1:ExchSRV01:vcpu-1:VMM64 fault 6: src=MONITOR rip=0xfffffffffc07696a regs=0xfffffffffc008ac0
Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.942 cpu4:1256)World: 6015: vmm group leader = 1255, members = 2
Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.942 cpu4:1256)Backtrace for current CPU #4, worldID=1256, ebp=0x37a3f7c
Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.942 cpu4:1256)0x37a3f7c:[0x63c1cb]World_VMMPanic+0xa7(0xdae840, 0x0, 0x0, 0x0, 0x0)
Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.943 cpu4:1256)0x37a3fa4:[0x63c1cb]World_VMMPanic+0xa7(0x29, 0x3eb72fd4, 0x7b00412f, 0x3eb1854c, 0x4020)
Nov 19 17:49:14 esx01 vmkernel: 3:01:10:57.943 cpu4:1256)0x37a3fdc:[0x61de1a]VMKCall+0x8a(0x29, 0x3eb72fd4, 0x82, 0x3eb667e0, 0x3eb1854c)

The following is an explanation of each section of the backtrace:

vm 1256: 6012:
This section of the event describes the virtual machine ID and the userworld ID that experienced the error. In this example, the userworld ID of the process that experienced the error is 6012, and the virtual machine ID is 1256.
vmm1:ExchSRV01:vcpu-1:
This section of the event describes the virtual CPU and the virtual machine name that experienced the error. In this example, the error occurred on virtual CPU 1, and the virtual machine name started with the text ExchSRV01 (the name may be truncated).
VMM64 fault 6: src=MONITOR rip=0xfffffffffc07696a
This section of the event describes the actual error experienced by the process. In this example, the VMM process experienced a VMM64 fault 6. These error messages are very specific and can be searched for in the knowledge base.
World: 6015: vmm group leader = 1255, members = 2
This section of the event describes the group of processes that make up the virtual machine.
Backtrace for current CPU #4, worldID=1256, ebp=0x37a3f7c
This section of the event describes the physical CPU that was running the instruction at the time of the error. If multiple virtual machines are experiencing similar errors, this section of the error message can potentially identify faulty hardware. For more information, see Determining if virtual machine and ESX host unresponsiveness is caused by hardware issues (1003560).
0x37a3f7c:[0x63c1cb]World_VMMPanic+0xa7(0xdae840, 0x0, 0x0, 0x0, 0x0)
This section of the event describes the functions that are called when the process experiences the error. Lines similar to the above repeat for each function that is called leading up to the error.
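
When several hosts or virtual machines need to be checked, the fields described above can be pulled out of the WARNING line with a short script. The following is a minimal sketch, not a VMware tool: the pattern is modeled only on the sample line in this article, other builds may format the line differently, and the names FAULT_RE and parse_vmm_fault are hypothetical. The field names follow the interpretation given above.

```python
import re

# Pattern modeled on the sample WARNING line in this article (an assumption;
# other ESX/ESXi builds may format the line differently).
FAULT_RE = re.compile(
    r"cpu(?P<pcpu>\d+):(?P<world_id>\d+)\)WARNING: World: "
    r"vm (?P<vm_id>\d+): (?P<userworld_id>\d+): "
    r"vmm(?P<vmm>\d+):(?P<vm_name>[^:]+):vcpu-(?P<vcpu>\d+):"
    r"(?P<fault>VMM\d* fault \d+): src=(?P<src>\w+) rip=(?P<rip>0x[0-9a-fA-F]+)"
)

def parse_vmm_fault(line):
    """Return the fields of a VMM fault line as a dict, or None if no match."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields to integers for easier comparison.
    for key in ("pcpu", "world_id", "vm_id", "userworld_id", "vmm", "vcpu"):
        fields[key] = int(fields[key])
    return fields
```

Grouping the parsed results by the pcpu field across several faults is one way to check whether the errors cluster on a single physical CPU.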

If your virtual machine experiences an error similar to the above, search for the error message within the knowledge base. If the error has not been documented in the knowledge base, submit a Support Request.

Interpreting a Virtual Machine Executable (VMX) Process Error

If the VMX process experiences an error, the backtrace of the process is documented in the virtual machine log file. The backtrace information looks similar to:

Jul 02 17:57:48.800: vmx| Msg_Post: Error
Jul 02 17:57:48.800: vmx| [msg.disk.invalidClusterDisk] VMware ESX cannot open the virtual disk, "/vmfs/volumes/483df5c2-c1afaa7a-e5e2-001d0967f2a4/thickFTVM4/thickFTVM4.vmdk" for clustering. Please verify that the virtual disk was created using the 'thick' option.
Jul 02 17:57:48.800: vmx| [msg.disk.noBackEnd] Cannot open the disk '/vmfs/volumes/483df5c2-c1afaa7a-e5e2-001d0967f2a4/thickFTVM4/thickFTVM4.vmdk' or one of the snapshot disks it depends on.
Jul 02 17:57:48.801: vmx| [msg.disk.configureDiskError] Reason: @&!*@*@(msg.disklib.INVALIDMULTIWRITER)Thin/TBZ disks cannot be opened in multiwriter mode..----------------------------------------
Jul 02 17:57:48.807: vmx| Checkpoint error 0, couldn't continue after checkpoint
Jul 02 17:57:52.414: vmx| Backtrace:
Jul 02 17:57:52.414: vmx| Backtrace[0] 0xffeeaec8 eip 0xc1abca0

When you have identified this information, search the Broadcom support portal for the main error to locate articles that pertain to your exact issue. If the error has not been documented within the knowledge base, collect diagnostic information from the VMware ESX host and submit a Support Request.
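
The bracketed message identifiers (for example, msg.disk.noBackEnd) are usually the most useful search terms. As a rough sketch, assuming the log lines follow the "timestamp: thread| text" layout shown in the sample above, the identifiers can be collected in order of first appearance. The names MSG_ID_RE and message_ids are hypothetical and are not part of any VMware tool.

```python
import re

# Matches identifiers written in square brackets, such as [msg.disk.noBackEnd].
MSG_ID_RE = re.compile(r"\[(msg\.[A-Za-z0-9_.]+)\]")

def message_ids(log_lines, thread="vmx"):
    """Collect unique bracketed message IDs logged by one thread, in order."""
    marker = f": {thread}| "  # layout assumed from the sample log lines
    found = []
    for line in log_lines:
        if marker not in line:
            continue
        for msg_id in MSG_ID_RE.findall(line):
            if msg_id not in found:
                found.append(msg_id)
    return found
```

For example, calling message_ids(open("vmware.log")) on a copy of the virtual machine log file would return the identifiers to search for, starting with the earliest one logged.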

Interpreting a Mouse Keyboard Screen (MKS) Process Error

If the MKS process experiences an error, the backtrace of the process is documented in the virtual machine log file. The backtrace information looks similar to:

Sep 28 11:15:02.928: mks| Msg_Post: Error
Sep 28 11:15:02.928: mks| [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (mks)
Sep 28 11:15:02.928: mks| ASSERT bora/devices/svga/svgaFifo.c:1176 bugNr=6985
Sep 28 11:15:02.928: mks| [msg.panic.haveLog] A log file is available in "vmware.log". [msg.panic.haveCore] A core file is available in "/vmfs/volumes/4a01a03d-c105cb45-b3c0-002219543490/Win2K8_64/vmware-vmx-debug-zdump.012". [msg.panic.requestSupport.withLogAndCore] Please request support and include the contents of the log file and core file. [msg.panic.requestSupport.vmSupport.vmx86]
Sep 28 11:15:02.930: mks| To collect data to submit to VMware support, run "vm-support".
Sep 28 11:15:02.930: mks| [msg.panic.response] We will respond on the basis of your support entitlement.
Sep 28 11:15:02.930: mks| ----------------------------------------
Sep 28 11:15:02.935: mks| MKS release: start, nesting 0
Sep 28 11:15:03.016: vmx| VTHREAD watched thread 1 "mks" died
Sep 28 11:15:03.423: vcpu-0| VTHREAD watched thread 0 "vmx" died

When you have identified this information, search the Broadcom support portal for the main error to locate articles that pertain to your exact issue. If the error has not been documented within the knowledge base, collect diagnostic information from the VMware ESX host and submit a Support Request.
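
In the MKS sample above, the main error is the ASSERT entry, and the "VTHREAD watched thread" entries show which threads stopped as a result. The following is a minimal sketch that extracts both, assuming the log text matches the sample in this article. The names ASSERT_RE, DIED_RE, and summarize_crash are hypothetical.

```python
import re

# Both patterns are modeled on the sample log in this article (an assumption).
ASSERT_RE = re.compile(r"ASSERT (\S+):(\d+)(?: bugNr=(\d+))?")
DIED_RE = re.compile(r'VTHREAD watched thread (\d+) "([^"]+)" died')

def summarize_crash(log_lines):
    """Return ASSERT locations and the names of threads reported as died."""
    summary = {"asserts": [], "died_threads": []}
    for line in log_lines:
        hit = ASSERT_RE.search(line)
        if hit:
            summary["asserts"].append(
                {"file": hit.group(1), "line": int(hit.group(2)), "bug": hit.group(3)}
            )
        died = DIED_RE.search(line)
        if died:
            summary["died_threads"].append(died.group(2))
    return summary
```

The source file and line number from the ASSERT entry, together with the bugNr value when present, make a precise search string for the knowledge base.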