Tuesday, February 5, 2019

Unable to power on Nutanix CVM in ESXi 6.0 due to Virtual Nested Paging error

I was unable to 'google' a solution to this issue, and it was not a known condition within Nutanix's support team, so I'm writing this article as a potential life-saver for anyone who may run into this very specific error condition down the road.

Environment:
ESXi 6.0 on Nutanix NX-8035-G5 hosts
Acropolis 5.0.3.1

Condition:  
After running ESXi updates on an ESXi host within this cluster, the CVM (Controller VM) failed to power on with the following error message:

Power On virtual machine
The virtual machine cannot be powered on because virtual nested paging is not compatible with PCI passthru in VMware ESX 6.0.0.

*Note that it says 'passthru' instead of the proper spelling 'passthrough.'  This threw off google searches like whoa.

After I contacted Nutanix support, the support engineer initially thought that the VM itself was having PCI Passthrough issues, so he tried adjusting those settings, to no avail.  We also tried upgrading the VM hardware compatibility level of the CVM from 8 to 11, which also did not help.

It eventually occurred to me that we have the vhv.enable = "TRUE" setting enabled on the ESXi hosts within this cluster, as a remedy for vMotion issues and VMs with older hardware levels.  This is set within each ESXi host's /etc/vmware/config file.  After I deleted this line from that file, and without rebooting the host, the CVM immediately powered back on.  I confirmed this as the fix by powering down the VM, re-enabling the setting, and then attempting and failing to start the CVM again.
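To see whether a host has this setting before touching anything, a quick check like the one below may help. This is only a sketch: the function name show_vhv is made up for illustration, and it assumes the setting lives in /etc/vmware/config as described above.

```shell
# Print any nested-virtualization (vhv) lines found in a config file,
# with line numbers. Defaults to the host-level file named in this post.
# show_vhv is a hypothetical helper name, not a VMware tool.
show_vhv() {
    cfg="${1:-/etc/vmware/config}"
    grep -n -i '^[[:space:]]*vhv' "$cfg" || echo "no vhv setting in $cfg"
}

# Example (run on the ESXi host):
#   show_vhv
```

It only reads the file, so it should be safe to run on every host in the cluster.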

As I was concerned that disabling this host-level setting would impact some of our customers' VMs, I re-enabled this setting on the ESXi host and then negated it on the CVM itself.  So:

(on the ESXi host in question)
  1. Edit the .vmx file for the CVM.  Do not use the GUI, as I found that it did not save the modification.
    vi /vmfs/volumes/[path-to-nutanix-local-storage]/[path-to-cvm-home]/[cvm-name].vmx
    for example:
    vi /vmfs/volumes/59bb934a-d42388b0-3ae0-0cc47a9ba4c2/ServiceVM_Centos/ServiceVM_Centos.vmx
  2. If the file already contains vhv.enable = "TRUE", change it to "FALSE".  Otherwise, add vhv.enable = "FALSE" to the bottom of the file.
  3. Save the file.
  4. Start the VM.
  5. #winning
You should review this setting on all hosts within your cluster, as it's unlikely that you'll have vhv.enable set on just one host.  
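
If you would rather not hand-edit the .vmx file in vi, the edit from steps 1 through 3 could be scripted roughly as follows. Treat this as a sketch under assumptions, not a tested Nutanix or VMware procedure: disable_vhv is a made-up function name, the CVM is assumed to be powered off, and the .vmx path is whatever you found in step 1.

```shell
# Set vhv.enable = "FALSE" in a .vmx file: rewrite an existing entry,
# or append one if the key is absent. A .bak copy is kept alongside.
# disable_vhv is a hypothetical helper name, not a VMware tool.
disable_vhv() {
    vmx="$1"
    cp "$vmx" "$vmx.bak" || return 1
    if grep -q '^[[:space:]]*vhv\.enable[[:space:]]*=' "$vmx.bak"; then
        # Rewrite the existing entry, reading from the backup copy.
        sed 's/^[[:space:]]*vhv\.enable[[:space:]]*=.*/vhv.enable = "FALSE"/' "$vmx.bak" > "$vmx"
    else
        # Make sure the last line is terminated before appending.
        if [ -n "$(tail -c 1 "$vmx")" ]; then
            printf '\n' >> "$vmx"
        fi
        printf '%s\n' 'vhv.enable = "FALSE"' >> "$vmx"
    fi
}

# Example (run on the ESXi host, with the CVM powered off):
#   disable_vhv /vmfs/volumes/[path-to-nutanix-local-storage]/[path-to-cvm-home]/[cvm-name].vmx
```

The backup copy means you can put the original file back if the CVM still refuses to start.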
