SMVI backups causing VM timeout/freeze

marcconeley

Hi Guys,

I am wondering if someone can help me with this.

We use VMware ESX 3.5 Update 3 with a NetApp FAS2020 filer running Data ONTAP 7.2.4L1.

We have our VMDKs hosted on an NFS volume with dedupe/SIS enabled.

Until recently we were using the VMware Virtual Infrastructure Client to make snapshots of our VMs.

Now we have implemented NetApp's SMVI, but we are experiencing time-outs or disconnections of our VMs when performing snapshots.

The process works fine until it instructs VMware to perform the last step, which is "deleting the snapshot". At this point the VM seems to experience 5-10 seconds of "disconnectivity".

Is this normal or is there a fix for this?

I have followed the instructions in the VMware/NetApp Best Practices guide, particularly those relating to VMware and NFS volumes (not disabling "NFS.LockDisable" for example), and have also installed the latest VMTools on each VM.

Is anyone else experiencing this?

Thanks,

M

Note: I get the same problem when performing snapshots using the VMware client. Therefore I am assuming it's a VMware/NFS misconfiguration somewhere? I apologise, as initial tests showed it was only an issue when using the SMVI client.

1 ACCEPTED SOLUTION

keitha

You mentioned that you made the change to not disable the NFS locks. Did you also make the change to the /etc/vmware/config file by adding

prefvmx.consolidateDeleteNFSLocks = "TRUE"

It sounds like what is happening...
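
For anyone who wants to script the change rather than edit the file by hand, here is a minimal sketch in Python. It only works on the text of the file, so you can try it on a copy first. The helper name `ensure_setting` is made up for illustration, and I am assuming the file uses plain `key = "value"` lines like the one above. It also replaces a line pasted with curly quotes, which is easy to pick up when copying from a web page.

```python
def ensure_setting(config_text: str, key: str, value: str) -> str:
    """Return config_text with exactly one `key = "value"` line (straight quotes)."""
    wanted = f'{key} = "{value}"'
    kept = []
    for line in config_text.splitlines():
        # Drop any existing assignment of the same key, whatever its value
        # or quoting, so the setting is not duplicated.
        name = line.split("=", 1)[0].strip()
        if name == key:
            continue
        kept.append(line)
    kept.append(wanted)
    return "\n".join(kept) + "\n"


# Example on an in-memory string; "some.other.setting" is a placeholder line.
sample = 'some.other.setting = "x"\n'
updated = ensure_setting(sample, "prefvmx.consolidateDeleteNFSLocks", "TRUE")
```

To apply it for real you would read /etc/vmware/config, pass the contents through the function, and write the result back after taking a backup. Whether anything needs restarting afterwards is not covered in this thread, so check the best practices guide.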

marcconeley

It seems me and my consultants missed that bit.

Thanks for the quick reply!

keithluken

I am seeing the exact same problem, but with a 3140 and FC LUNs. I believe the problem is more related to VMware not being able to snap so many VMs at once. I can take 5 of the same VMs that fail every time, make an SMVI job for just those 5, and they run every time. I suspect there is some load issue with ESX. I am working a case with NetApp and VMware (useless support) to try to figure this out. I'm not sure if SMVI can be made to throttle the snaps to ESX, but spawning 20+ snaps in ESX at one time causes several of them to time out almost instantaneously. I can't believe NetApp has not seen this more frequently. I do have ESX 3.5 U4 (new infrastructure) and the same VMTools, and the Windows guests have the latest MS VSS update 3 rollup.
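
As a rough illustration of the "smaller jobs" workaround described above, here is a sketch in Python that splits a list of VM names into groups of 5, one group per backup job. It does not call SMVI or VMware at all, the VM names are placeholders, and the batch size of 5 is just the number that happened to work in the test above, not a recommended limit.

```python
from typing import Iterator, List, Sequence


def batches(vms: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` VM names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(vms), size):
        yield list(vms[start:start + size])


# Placeholder inventory of 22 VMs, split into jobs of at most 5 VMs each.
all_vms = [f"vm{n:02d}" for n in range(1, 23)]
jobs = list(batches(all_vms, 5))
```

Each resulting group would then be set up as its own backup job and scheduled so the jobs do not overlap, assuming the time-outs really are caused by too many simultaneous snapshots.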

eric_barlier

Hi Keith,

We see this occasionally too, but the VmW guys just re-run the backups and then they work. They too reckon it's a problem with VC or VmW snaps. It does not bother them much for some reason.

Not much help in my post other than confirmation that others also think it's a VmW issue.

Cheers,

Eric

PS: Keep us up to date with your case with NTAP tech support if you can.