2011-07-26 11:33 AM
Hi everyone, I have the following scenario:
I have a volume that was being snapshotted by what I think was SnapDrive. It created a FlexClone volume, and now I cannot offline that clone.
This is causing me some difficulty: I cannot use FilerView to do basic administration, and even more frustratingly, over SSH I get locked out after issuing the vol offline command.
FilerView keeps giving me these errors when I try to do anything in it:
Error: Volume(s) Operation Failed. Volume busy. Please retry the operation.
Error obtaining volume size for volume sdw_cl_hqvmrim05_datastore_0: Volume is in use by other operations
When I tell the filer to run: vol offline sdw_cl_hqvmrim05_datastore_0
the session freezes. I can still see the debug information in the background, like SnapDrive, Exchange, and SQL doing their operations, and everything is still online.
I opened a support case with NetApp, but have heard nothing from them yet.
Running ONTAP 7.3.1P3.
What I have tried already: on the Windows host that runs SnapDrive, I restarted the SnapDrive service. (I cannot reboot the machine at this time, as it is a production RIM server, but I may schedule some downtime since this is a PITA.)
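For anyone trying the same workaround, the SnapDrive service restart on the Windows host can be done from an elevated prompt. The service name "SWSvc" is an assumption on my part (I believe it is what SnapDrive for Windows registers as), so confirm it first:

```
C:\> sc query type= service state= all | findstr /i snapdrive   :: confirm the real service name
C:\> net stop SWSvc     :: "SWSvc" assumed; substitute the name found above
C:\> net start SWSvc
```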
2011-07-26 12:46 PM
Just to bump this: the volume contains a LUN; the LUN is offline and is not mapped to any igroups.
When trying to delete the LUN, I get: "The LUN is busy, stop IO before attempting to destroy the LUN."
I also tried to map the LUN back via SnapDrive; it failed with a SnapDrive error: cannot add RDM to VM.
2011-08-01 03:21 AM
If you do a vol status on the containing volume and a lun show on the LUN, what do they show you? Is there still a clone of the LUN that is offline? What does snap list show?
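Those diagnostics would look something like this on the 7-mode console. This is just a sketch: the clone volume name comes from the thread, and lun show -v with no path simply lists all LUNs, since the exact LUN path hasn't been posted:

```
filer> vol status sdw_cl_hqvmrim05_datastore_0
filer> lun show -v
filer> snap list sdw_cl_hqvmrim05_datastore_0
```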
2011-08-02 06:38 AM
The volume did contain a LUN. The LUN was offline but couldn't be deleted; the error was busy I/O, and the same went for the volume.
In fact, when I tried to delete the volume, my SSH session would freeze.
I had tried everything, but the solution was a cluster failover followed by a cluster giveback.
During the failover the offending node rebooted, which cleared what a NetApp level 3 engineer called a bad memory condition.
After the cluster giveback I was easily able to delete the offending FlexClone and the snapshot that was backing it. All problems cleared up after that.
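For reference, the recovery sequence described above can be sketched on the 7-mode console roughly as follows. The takeover is run from the partner node; the parent volume and backing snapshot names are placeholders, since the thread does not give them:

```
partner> cf takeover       # partner takes over; the stuck node reboots
partner> cf giveback       # return services once the node is healthy again
filer> vol offline sdw_cl_hqvmrim05_datastore_0
filer> vol destroy sdw_cl_hqvmrim05_datastore_0
filer> snap delete <parent_volume> <backing_snapshot>   # no longer "busy" once the clone is gone
```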