Migration of Service VM (SVM) may cause ESXi host issues in VMware NSX for vSphere 6.x (2141410)

Symptoms

In a VMware NSX for vSphere 6.x environment, when a Service VM (SVM) is migrated (vMotion/SvMotion), you experience these symptoms:
  • Service is interrupted for the workload virtual machines to which the Service VM (SVM) provides data.
  • The ESXi host fails with a purple diagnostic screen.
  • The purple diagnostic screen contains backtraces similar to:

    @BlueScreen: #PF Exception 14 in world wwww:WorldName IP 0xnnnnnnnn addr 0x0
    PTEs:0xnnnnnnnn;0xnnnnnnnn;0x0;
    0xnnnnnnnn:[0xnnnnnnnn]VmMemPin_DecCount@vmkernel#nover+0x1b
    0xnnnnnnnn:[0xnnnnnnnn]VmMemPinUnpinPages@vmkernel#nover+0x65
    0xnnnnnnnn:[0xnnnnnnnn]VmMemPin_ReleaseMainMemRange@vmkernel#nover+0x6
    0xnnnnnnnn:[0xnnnnnnnn]P2MCache_ReleasePages@vmkernel#nover+0x2a
    0xnnnnnnnn:[0xnnnnnnnn]DVFilterVmciUnmapGuestPage@com.vmware.vmkapi#v2_2_0_0+0x34
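
To compare a captured purple-screen backtrace against the one shown above, a small script can help. The following is a minimal sketch, not a VMware tool: it assumes the backtrace has been saved as plain text, and the helper name matches_svm_migration_signature is hypothetical. It only checks that the five function names from this article appear in the same order.

```python
import re

# Function names taken from the backtrace shown in this article.
SIGNATURE_FRAMES = [
    "VmMemPin_DecCount",
    "VmMemPinUnpinPages",
    "VmMemPin_ReleaseMainMemRange",
    "P2MCache_ReleasePages",
    "DVFilterVmciUnmapGuestPage",
]

# Captures the symbol in a frame line such as
# "0x...:[0x...]VmMemPin_DecCount@vmkernel#nover+0x1b".
FRAME_RE = re.compile(r"\]([A-Za-z0-9_]+)@")


def matches_svm_migration_signature(backtrace_text: str) -> bool:
    """Return True if all signature frames appear, in order, in the text."""
    position = 0
    for symbol in FRAME_RE.findall(backtrace_text):
        if position < len(SIGNATURE_FRAMES) and symbol == SIGNATURE_FRAMES[position]:
            position += 1
    return position == len(SIGNATURE_FRAMES)
```

A match indicates only that the backtrace resembles the one in this article. It does not confirm the root cause.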

Cause

A Service VM (SVM) provides the plumbing required for special applications in workload virtual machines to access IO, generally networking, before the IO leaves the virtual machine through conventional means (for example, through the virtual NIC).

Examples of such Service VMs are:
  • NSX Guest Introspection VM
  • McAfee IDS/IPS/Firewall
  • Palo Alto Networks Firewall
  • Symantec IDS/IPS/Firewall
  • Trend Micro Deep Security

The specialized plumbing effectively pins the Service VM to the ESXi host. Therefore, the Service VM is deployed in a 1:1 relationship to the ESXi hosts in the cluster. When the SVM is migrated, the plumbing cleanup is not handled correctly, which causes this issue.
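
The 1:1 relationship can be checked against an inventory export. The following is a minimal sketch under stated assumptions: the host names and the SVM-to-host mapping are supplied by you (for example, from an inventory report), and the function name find_placement_violations is hypothetical.

```python
from collections import defaultdict


def find_placement_violations(hosts, svm_to_host):
    """Return the hosts that do not run exactly one SVM.

    hosts: iterable of ESXi host names in the cluster.
    svm_to_host: dict mapping SVM name -> name of the host it runs on.
    The result maps each violating host to the sorted list of SVMs on it.
    """
    svms_by_host = defaultdict(list)
    for svm, host in svm_to_host.items():
        svms_by_host[host].append(svm)
    return {
        host: sorted(svms_by_host.get(host, []))
        for host in hosts
        if len(svms_by_host.get(host, [])) != 1
    }
```

A host listed with two SVMs, or with none, suggests that an SVM has been moved away from the host it was deployed on.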

Resolution

This issue is resolved in:
To work around this issue if you do not want to upgrade, do not manually migrate the Service VM (SVM) to another ESXi host in the cluster (vMotion/SvMotion). If you need to move the SVM to another datastore (SvMotion), VMware recommends performing a cold migration instead.

If Distributed Resource Scheduler (DRS) is required, disable vMotion for the specific virtual machine through the vCenter Server Managed Object Browser (MOB):
  1. Open a web browser and type in the address https://<vcenter_ip>/mob/?moid.
  2. Under Methods > ServiceContent, click RetrieveServiceContent.
  3. Click Invoke Method in the top right-hand corner.
  4. Click the link to rootFolder.
  5. Click the link labeled DataCenters.
  6. Follow the link to the datacenter that contains the virtual machine in question.
  7. Follow the next link into the vmFolder which holds information on all the virtual machines.
  8. Under the childEntity section, locate the name of the virtual machine.

    Note: Record the vm-### identifier of the virtual machine that you want to restrict.

  9. Open another browser window and go to https://<vcenter_ip>/mob/?moid=AuthorizationManager&method=disableMethods.
  10. Fill out the form: change MOID to the ID of the virtual machine you recorded, set the method to MigrateVM_Task, set sourceId to VCMob, and set sessionScope to false.
  11. After completing the form, click Invoke Method. The result should be Method Invocation Result: void.
Note: This also disables manual migration of that virtual machine.
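
To avoid typing mistakes in steps 9 and 10, the URL and form values can be prepared in advance. The following is a minimal sketch, not an official procedure: it does not contact vCenter Server, the function names and the vm_moid key are hypothetical, and the vm-### pattern is assumed from the note in step 8. The method, sourceId, and sessionScope values are the ones listed in step 10.

```python
import re
from urllib.parse import urlencode

# Assumed format of the identifier recorded in step 8 (vm-###).
VM_MOID_RE = re.compile(r"^vm-\d+$")


def disable_methods_url(vcenter_host: str) -> str:
    """Build the MOB address used in step 9."""
    query = urlencode({"moid": "AuthorizationManager", "method": "disableMethods"})
    return f"https://{vcenter_host}/mob/?{query}"


def disable_methods_form(vm_moid: str, method: str = "MigrateVM_Task") -> dict:
    """Return the values to enter in the form in step 10."""
    if not VM_MOID_RE.match(vm_moid):
        raise ValueError(f"expected an identifier like vm-123, got {vm_moid!r}")
    return {
        "vm_moid": vm_moid,       # hypothetical key; the form shows this as MOID
        "method": method,
        "sourceId": "VCMob",
        "sessionScope": "false",
    }
```

For example, disable_methods_url("vcenter.example.com") returns the address from step 9 with the placeholder host filled in, and disable_methods_form("vm-123") returns the four values to enter by hand.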

To disable manual Storage vMotion, repeat the same process, but specify RelocateVM_Task instead of MigrateVM_Task as the method.

At this point, the Migrate VM option is no longer available in the vCenter Web User Interface (UI) or Client UI. DRS relies on these settings in the backend, so it is no longer able to migrate the Service VM.
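
One way to confirm the result is to compare the virtual machine's list of disabled methods with the two method names used in this article. The following is a minimal sketch: it assumes you have copied that list from the disabledMethod property on the virtual machine's MOB page (an assumption about where the change is visible), and the function name missing_migration_blocks is hypothetical.

```python
# Method names used in this article for vMotion and Storage vMotion.
REQUIRED_DISABLED = {"MigrateVM_Task", "RelocateVM_Task"}


def missing_migration_blocks(disabled_methods):
    """Return the migration methods that are NOT in the disabled list."""
    return sorted(REQUIRED_DISABLED - set(disabled_methods))
```

An empty result means both methods appear as disabled. Any name returned is a method that can still be invoked for that virtual machine.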

If an ESXi host is manually placed in maintenance mode, it powers off the Service VM but does not move it. A normal virtual machine that has gone through this process is not powered off. Instead, the ESXi host waits until the virtual machine is manually powered off.

Impact/Risks

If a Service VM is migrated, the ESXi host on which the Service VM is registered may fail with a purple diagnostic screen.

Even if the host does not crash, the specialized application which is accessing the IO provided through the Service VM will no longer function correctly, causing an outage.

Tags

PSOD, NSX, service VM, crash

Update History

11/9/2016 - Added product 6.5 Prerelease information.
