I attempted a file snap restore this morning to restore a virtual server.
After running the command I went to view the folder contents using the vSphere GUI. It displayed "Searching datastore ....." and never returned a listing.
I logged in to a vSphere host through PuTTY and was able to see the file there, so I assumed the restore was still in progress. Eventually other volumes (vSphere and Oracle) started to become inaccessible on the filer; some, but not all, displayed as inactive in vCenter. A co-worker who is the principal administrator failed the filer over and rebooted it.
This fixed the problem until they failed it back. Apparently the restore started back up at this point. I deleted the file in the PuTTY session and the symptoms went away.
The volume the file resided on does have a SnapMirror relationship, and a scheduled replication was attempted during the restore. The transfer failed, which I guess is normal, and replication completed after I deleted the file.
Should performing a file snap restore cause this? I'm not clear from reading the Data Protection documentation, which I've pasted below.
Prerequisites for using SnapRestore
You must meet certain prerequisites before using SnapRestore.
• SnapRestore must be licensed on your storage system.
• There must be at least one Snapshot copy on the system that you can select to revert.
• The volume to be reverted must be online.
• The volume to be reverted is not being used for data replication.
General cautions for using SnapRestore
Before using SnapRestore, ensure that you understand the following facts.
• SnapRestore overwrites all data in the file or volume. After you use SnapRestore to revert to a
selected Snapshot copy, you cannot undo the reversion.
• If you revert to a Snapshot copy created before a SnapMirror Snapshot copy, Data ONTAP can no
longer perform an incremental update of the data using the snapmirror update command.
However, if there is any common Snapshot copy (SnapMirror Snapshot copy or other Snapshot
copy) between the SnapMirror source and SnapMirror destination, then you should use the
snapmirror resync command to resynchronize the SnapMirror relationship.
If there is no common Snapshot copy between the SnapMirror source and SnapMirror destination,
then you should reinitialize the SnapMirror relationship.
• Between the time you enter the snap restore command and the time when reversion is completed,
Data ONTAP stops deleting and creating Snapshot copies.
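The SnapMirror caution in the excerpt above boils down to one check: is there still a Snapshot copy that both sides have? A minimal sketch of that decision in Python follows. The function name and the snapshot names in the usage note are hypothetical, and the returned strings are only labels for the two options the documentation mentions, not command lines to run.

```python
def next_snapmirror_step(source_snaps, dest_snaps):
    """Choose the follow-up after a revert has broken incremental updates.

    Both arguments are iterables of Snapshot copy names, as you would
    read them off the source and destination volumes.
    """
    common = sorted(set(source_snaps) & set(dest_snaps))
    if common:
        # At least one shared Snapshot copy: the excerpt says to resync.
        return "resync", common
    # Nothing in common: the excerpt says to reinitialize the relationship.
    return "reinitialize", common
```

For example, `next_snapmirror_step(["hourly.0", "sm_base"], ["sm_base"])` picks "resync" because `sm_base` exists on both sides.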
Should anyone ever stumble upon this article in a frantic panic, trying to get their filer to respond to NFS/CIFS/iSCSI requests while a single file snap restore is in progress (this, by the way, is what VSC uses for VM restores by default on NFS, which blows my mind): you can cancel the single file snap restore by deleting the destination file/directory you are restoring to. If you are doing an in-place restore, delete the folder/file you are trying to restore and manually copy that file/folder out of your ".snapshot" folder.
We had this same issue. We kicked off a VM restore using VSC and the filer ground to a halt; VM datastores went offline, as did iSCSI etc., which took most major applications offline.
Unfortunately we didn't find this whilst it was happening, so we didn't know to delete the destination folder, and whilst faffing around trying to find a way to recover we essentially ended up waiting it out: a 4.5 hour outage.
The upshot is, NetApp provide this full VM recovery option in their tool but, in short, don't use it. Do it the long way: mount a snapshot and drag the VM out manually. There's no 'fix' aside from going to cluster mode or using a better backup and recovery product.
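For anyone who wants the "delete the destination, then copy it back out of the snapshot folder" workaround as something scriptable, here is a rough Python sketch. It assumes the volume is mounted over NFS on the machine running it and that the snapshot directory is visible at the top of the mount as `.snapshot` (it is a parameter in case yours is named differently). The function name, the snapshot name and the paths are made up for illustration, and it only handles a single file, not a whole directory.

```python
import shutil
import tempfile
from pathlib import Path


def recover_from_snapshot(mount, snapshot, rel_path, snap_dir=".snapshot"):
    """Delete the restore target, then copy the file back from a snapshot."""
    mount = Path(mount)
    target = mount / rel_path
    source = mount / snap_dir / snapshot / rel_path
    # Check the snapshot copy first so nothing is deleted without a fallback.
    if not source.is_file():
        raise FileNotFoundError(f"no snapshot copy at {source}")
    # Deleting the destination is the step reported above to cancel the restore.
    target.unlink(missing_ok=True)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Plain file copy over the mount; this moves real data, unlike SnapRestore.
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Dry run against a throwaway directory laid out like a mounted volume.
    with tempfile.TemporaryDirectory() as tmp:
        mount = Path(tmp)
        snap_copy = mount / ".snapshot" / "nightly.0" / "vm1" / "vm1.vmdk"
        snap_copy.parent.mkdir(parents=True)
        snap_copy.write_text("good copy")
        live = mount / "vm1" / "vm1.vmdk"
        live.parent.mkdir(parents=True)
        live.write_text("half-restored")
        recover_from_snapshot(mount, "nightly.0", "vm1/vm1.vmdk")
        print(live.read_text())
```

Bear in mind that the copy goes over the network like any other file copy, so a large VMDK will take a while. Try it on a test volume before relying on it in an outage.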
I was hoping someone would have replied to your posting. I am experiencing the same issue. I am on Data ONTAP 7.3.2 and use VSC 2.1.1 to do my backups. I am currently only testing this, as it has not worked properly for me yet. Basically I have 3 NFS volumes, one for config files and two others for VMDKs. I am able to do a backup perfectly fine, but when I try to restore that snapshot it saturates the whole interface that connects to the SAN and causes the volumes to become disconnected in vSphere. This was for a 25 GB VMDK and it wasn't even finished after 45 minutes. I had to reboot my SAN (FAS2020) in order to recover the volumes and get the restore to stop.
I had to reboot because I couldn't find an actual way to cancel the restore. Does anyone know how to do that? Also why would this crash my whole connection? There are no requirements indicating it needs its own NIC to do the transfers.
The restore shouldn't have any impact on an interface as a snapshot restore is all done internally. Basically the inode pointers are changed to point to the old file blocks at the time of the snapshot, rather than the current file blocks.
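To make the "pointers, not data" idea concrete, here is a toy model in Python. It is only an illustration of the claim above and not how WAFL is actually implemented; the class and method names are invented. A file is a list of block ids, a snapshot is a frozen copy of those lists, and a restore swaps the list back without copying any block.

```python
class ToyVolume:
    """Toy copy-on-write volume: files are lists of block ids."""

    def __init__(self):
        self.blocks = {}      # block id -> data
        self.files = {}       # file name -> list of block ids ("inode pointers")
        self.snapshots = {}   # snapshot name -> frozen copy of the pointer lists
        self._next_id = 0

    def write(self, name, chunks):
        # New data always lands in new blocks; old blocks are never overwritten.
        ids = []
        for chunk in chunks:
            self.blocks[self._next_id] = chunk
            ids.append(self._next_id)
            self._next_id += 1
        self.files[name] = ids

    def snapshot(self, snap):
        # A snapshot records which blocks each file pointed to at this moment.
        self.snapshots[snap] = {n: list(ids) for n, ids in self.files.items()}

    def restore_file(self, snap, name):
        # Only the pointer list changes; no block data is read or copied.
        self.files[name] = list(self.snapshots[snap][name])

    def read(self, name):
        return b"".join(self.blocks[i] for i in self.files[name])
```

In this model a restore never touches the network, which is the point being made. The reports elsewhere in this thread suggest the real operation still does enough work on the controller to hurt other protocols, so the toy should not be read as a performance claim.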
Any update on this strange behaviour? I had the same issue with ONTAP 8.1.4P1 (7-mode) on a FAS6240 (not a little one, as you can see).
It started after an SMO restore operation on a cloned volume for which single file restore was needed/selected. The database has a lot of data files (a very big DB)! People started complaining, especially about CIFS access. When I looked at the stats later, there were latency issues for CIFS and iSCSI (FCP and NFS didn't suffer from it). Strange, because the Oracle environment runs over NFS. Killing the SMO process and later halting the host didn't solve anything. Only after I took the volume offline did the controller behave normally again. Unfortunately, when I bring it online again, the single-file snaprestore starts again :(.
Why is there such an impact? How can I stop this without destroying my volume?
--[ INFO] SMO-07200: Beginning restore of database "NGDB"
-[ INFO] SD-00010: Beginning single file restore of file(s) [/ngdbhome/ngdb/DATA/CTX/Ctx07.dbf, /ngdbhome/ngdb/DATA/CciLob/22/CciLobData22.dbf5, /ngdbhome/ngdb/DATA/CciLob/22/CciLobData22.dbf6, /ngdbhome/ngdb/DATA/CciLob/22/CciLobData22.dbf3,
Thu Dec 4 16:38:07 CET [NETAPPXX:wafl.sfsr.done:notice]: Single-file snaprestore of inode 26253 (snapid 19, volume Test_CCIv36_NGDB_Data_clone) to inode 9681 has completed.
Thu Dec 4 16:38:07 CET [NETAPPXX:wafl.scan.start:info]: Starting redirect on volume Test_CCIv36_NGDB_Data_clone.
Thu Dec 4 16:42:05 CET [NETAPPXX:cifs.oplock.break.timeout:warning]: CIFS: An oplock break request to station 10.230.128.31() for filer NETAPPXX, share rmgmailarchive01indexes$, file \Indexes02\166741376BDAE5A4F9BFB6D82329C4E7B_5316\live\log.sqlt has timed out.
Thu Dec 4 16:53:08 CET [NETAPPXX:wafl.sfsr.done:notice]: Single-file snaprestore of inode 13360 (snapid 19, volume Test_CCIv36_NGDB_Data_clone) to inode 2037 has completed.
Thu Dec 4 16:53:16 CET [NETAPPXX:wafl.sfsr.done:notice]: Single-file snaprestore of inode 23958 (snapid 19, volume Test_CCIv36_NGDB_Data_clone) to inode 24675 has completed.
There was not much disk activity but the filer was doing lots of WAFL_Ex(Kahu).
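With a long list of data files it can help to tally which single-file restores the filer has already reported as finished. The sketch below pulls the `wafl.sfsr.done` messages out of console or syslog text in the format shown in the log excerpt above. The regular expression is written against those sample lines only, so other ONTAP releases may word the message differently, and the function name is made up.

```python
import re

# Pattern derived from the wafl.sfsr.done lines quoted in this thread.
SFSR_DONE = re.compile(
    r"^(?P<when>.*?) \[(?P<host>[^:\]]+):wafl\.sfsr\.done:notice\]: "
    r"Single-file snaprestore of inode (?P<src>\d+) "
    r"\(snapid (?P<snapid>\d+), volume (?P<volume>[^)\s]+)\) "
    r"to inode (?P<dst>\d+) has completed\."
)


def completed_restores(lines):
    """Return one dict per completed single-file restore found in the lines."""
    done = []
    for line in lines:
        match = SFSR_DONE.match(line.strip())
        if match is None:
            continue  # not a wafl.sfsr.done message
        done.append({
            "when": match["when"],
            "host": match["host"],
            "volume": match["volume"],
            "snapid": int(match["snapid"]),
            "src_inode": int(match["src"]),
            "dst_inode": int(match["dst"]),
        })
    return done
```

Feeding it the log excerpt above would give three entries, all for volume `Test_CCIv36_NGDB_Data_clone` and snapid 19. Comparing that count with the number of files in the SMO restore list gives a rough idea of how far along the restore is.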