2009-08-20 05:59 AM
I recently started using SMVI on my NFS-based VMware Infrastructure setup (details of my system configuration are included below). Unfortunately, since starting to use SMVI I've been experiencing issues with VMs powering off unexpectedly, though I should note it doesn't happen at the same time as the snapshot. The only relevant log entry I can see is in VMware Infrastructure Client (there's nothing in the NetApp syslog to indicate an issue):
Configuration file for vm-name cannot be found
This is followed shortly afterwards by
Virtual Machine vm-name is connected
Reviewing TR-3428*, section 14.3 it discusses a problem related to deleting VMware snapshots and a patch (ESX350-200808401-BG) for the problem. However, it then goes on to note:
"When this patch is in use, there is a condition where virtual machines running 3rd party virtual machine management agents may get powered off unexpectedly. In order to avoid this behavior, please consult the support organization of the management agent regarding virtual disk pooling interval tuning."
I haven't followed the steps in 14.3 to fully enable the patch, but am I right in thinking that even having ESX350-200808401-BG installed could lead to this problem?
Am I also right in thinking that HP CIM constitutes a 3rd party virtual machine management agent in this context? Or is this referring to SMVI itself?
Any clarification or experience on this would be much appreciated.
Summary of set up:
*NetApp and VMware Virtual Infrastructure 3 Storage Best Practices 4.5.2 (July 2009)
2009-08-20 08:40 PM
Hmm....if you're running ESX 3.5U4, that patch is definitely in place (it was rolled up into 3.5U3 I believe.....definitely rolled up in 3.5U4).
Are you running the ESX Host Utilities? They optimize your NFS settings and also place the necessary config line in /etc/vmware/config (which helps with snapshot issues in general on NFS datastores).
I'd probably try installing the ESX Host Utilities and go from there.
Poking around /var/log/vmware on the ESX service console might be helpful as well....would give you more verbose logs from the ESX perspective of what's going on.
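If it helps, here is a minimal sketch of one way to sift through those logs once they have been copied off the host. It simply reports lines that mention the VM name or a phrase from the VI Client event. The function names and the keyword list are my own illustrations, not part of any VMware or NetApp tool, and it assumes nothing about the log format beyond plain text.

```python
from pathlib import Path


def find_matching_lines(lines, keywords):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is a case-insensitive substring test; no log format is assumed.
    """
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, text in enumerate(lines, start=1):
        if any(k in text.lower() for k in lowered):
            hits.append((number, text.rstrip("\n")))
    return hits


def scan_log_dir(log_dir, keywords):
    """Scan every regular file under log_dir and collect matching lines.

    Returns a dict mapping file path to a list of (line_number, text) pairs.
    Undecodable bytes are replaced rather than raising an error.
    """
    results = {}
    for path in sorted(Path(log_dir).rglob("*")):
        if path.is_file():
            with path.open(errors="replace") as handle:
                hits = find_matching_lines(handle, keywords)
            if hits:
                results[str(path)] = hits
    return results


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python scan_logs.py <copied-log-dir> vm-name "cannot be found"
    if len(sys.argv) >= 3:
        for name, hits in scan_log_dir(sys.argv[1], sys.argv[2:]).items():
            for number, text in hits:
                print(f"{name}:{number}: {text}")
```

Searching for the VM name together with the time of the power-off should narrow things down quickly, since the entries either side of a match are usually the interesting ones.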
2009-08-21 02:22 AM
Since my first message I've done the following:
However, I can't see any lock files on my NFS datastore for any of my VMs. I presume it should be in the VM's main directory along with the .vmx and .nvram file etc.?
Sadly the host utilities don't currently support ESXi (only ESX).
2009-08-23 04:33 PM
Because ESX has no way of knowing where a LUN sits on a NetApp controller, SMVI sits in between as a "translator", mapping the ESX datastore/VM information to the NetApp controller's volume/LUN layout. This is what allows backups and restores to occur. So I am not sure your issue is SMVI related, but it's prudent not to rule it out at this point, of course.
On another note, we are running SMVI here and have no issues. I believe we are running 1.2 as well. What concerns me is that you are running a version of ESX that is not supported by the ESX Host Utilities. I would ask NetApp tech support whether it's OK to run SMVI without the ESX Host Utilities, or whether there could be some impact if they are not installed.
2009-10-18 05:16 AM
The issue turned out to be caused by a network misconfiguration that occurred at the same time SMVI was enabled. SMVI has been running fine since that was corrected.
Thanks for the advice and apologies for not updating this question sooner.