We use VMware ESX 3.5 Update 3 with a NetApp FAS2020 filer running Data ONTAP 7.2.4L1.
We have our VMDKs hosted on an NFS volume with dedupe/SIS enabled.
Until recently we were using the VMware Virtual Infrastructure Client to take snapshots of our VMs.
Now we have implemented NetApp's SMVI, but our VMs are experiencing timeouts or disconnections when snapshots are taken.
The process works fine until it instructs VMware to perform the last step, "deleting the snapshot". At this point the VM seems to lose connectivity for 5-10 seconds.
Is this normal or is there a fix for this?
I have followed the instructions in the VMware/NetApp Best Practices guide, particularly those relating to VMware and NFS volumes (not disabling "NFS.LockDisable", for example), and have also installed the latest VMware Tools on each VM.
Is anyone else experiencing this?
Note: I get the same problem when performing snapshots using the VMware client, so I am assuming it's a VMware/NFS misconfiguration somewhere. Apologies, my initial tests suggested it was only an issue when using the SMVI client.
I am seeing the exact same problem, but with a 3140 and FC LUNs. I believe the problem is more related to VMware not being able to snapshot so many VMs at once. I can take 5 of the same VMs that fail every time, make an SMVI job for just those 5, and they succeed every time. I suspect there is some load issue with ESX. I am working a case with NetApp and VMware (useless support) to try to figure this out. I'm not sure whether SMVI can be made to throttle the snapshots it sends to ESX, but spawning 20+ snapshots in ESX at one time causes several of them to time out almost instantaneously. I can't believe NetApp has not seen this more frequently. I do have ESX 3.5 U4 (new infrastructure) and the same VMware Tools, and the Windows guests have the latest MS VSS update 3 rollup.
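To make the "smaller jobs" workaround concrete, here is a rough Python sketch that splits a list of VM names into groups of five and handles one group at a time. It only illustrates the batching idea. `snapshot_batch` is a hypothetical placeholder for whatever actually triggers the job (it is not an SMVI or VMware API), and the VM names and batch size are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def batches(vm_names: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive groups of at most `size` VM names, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for name in vm_names:
        batch.append(name)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover VMs that did not fill a whole group
        yield batch


def run_in_batches(
    vm_names: Iterable[str],
    size: int,
    snapshot_batch: Callable[[List[str]], None],
) -> List[List[str]]:
    """Process one group at a time and return the groups in the order handled.

    `snapshot_batch` is a hypothetical hook, assumed to block until the
    snapshots for that group have finished before the next group starts.
    """
    done: List[List[str]] = []
    for group in batches(vm_names, size):
        snapshot_batch(group)
        done.append(group)
    return done
```

With 22 VMs and a batch size of 5, this gives four groups of five and one group of two, so ESX would never be asked to handle more than five snapshots at once. Whether five is the right number for a given host is something to find by testing.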