ONTAP Discussions

vFiler command cannot be run.

GTHOMAS13
4,246 Views

Hi storage gurus,

 

A couple of weeks ago I issued the command "vfiler dr resync", and it appears to have been hanging ever since.

When I run any vfiler command I receive the following error: vfiler command cannot be run while 'vfiler dr' command is running; try again later.

I initially thought it just needed some time to complete; however, weeks later it still shows the same message.

 

My environment:

FAS3220

ONTAP 8.1.4P10 7-Mode

 

Please assist...

 

 

1 ACCEPTED SOLUTION

GTHOMAS13
4,009 Views

++ Update

 

Performed a failover/giveback of the node.

Issue has been resolved.


5 REPLIES

JGPSHNTAP
4,237 Views

I've never seen it take that long.  vfiler dr resync works through the SnapMirror relationships for the vfiler's volumes.


Check your SnapMirror relationships to see if any of them are hung:

 

snapmirror status
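
If there are many relationships, the output can be scanned with a short script. The following is only a rough sketch in Python: it assumes the usual five-column 7-Mode layout (Source, Destination, State, Lag, Status), and the filer names, volume names and the 24-hour lag threshold are made up for illustration.

```python
import re

# Hypothetical sample of `snapmirror status` output; all names are invented.
SAMPLE = """\
Snapmirror is on.
Source                 Destination            State          Lag        Status
src:vf1_root           dst:vf1_root           Snapmirrored   00:12:40   Idle
src:vf1_data           dst:vf1_data           Snapmirrored   412:03:11  Idle
src:vf1_new            dst:vf1_new            Uninitialized  -          Transferring (12 MB done)
"""


def lag_hours(lag):
    """Convert an hh:mm:ss lag string to hours; return None if no lag is shown."""
    match = re.fullmatch(r"(\d+):(\d{2}):(\d{2})", lag)
    if not match:
        return None
    hours, minutes, seconds = map(int, match.groups())
    return hours + minutes / 60 + seconds / 3600


def suspicious(text, max_lag_hours=24.0):
    """Return (destination, reason) pairs for relationships worth a closer look."""
    flagged = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        # Skip the banner, the header row and anything that is not a relationship.
        if len(parts) < 5 or ":" not in parts[0]:
            continue
        _source, dest, state, lag, _status = parts
        hours = lag_hours(lag)
        if state != "Snapmirrored":
            flagged.append((dest, f"state is {state}"))
        elif hours is not None and hours > max_lag_hours:
            flagged.append((dest, f"lag is {lag}"))
    return flagged


if __name__ == "__main__":
    for dest, reason in suspicious(SAMPLE):
        print(f"{dest}: {reason}")
```

A large lag or a state other than Snapmirrored is only a hint that a relationship deserves a look, not proof that it is hung.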

 

 

GTHOMAS13
4,233 Views

Thanks for the quick response.

 

I can confirm that there are no hanging snapmirrors; every job is transferring as per schedule.

 

On that note: The newly created volume was manually initialized using the snapmirror commands as the vfiler dr resync would just not work. 

 

I'm now considering a takeover and giveback to resolve this; however, what implications would that have?

JGPSHNTAP
4,221 Views

^^

That's a weird one, I've never seen a process hang like that.  If you are 100% sure all vols are replicated, then the only thing you can do is a failover and giveback.

 

 

 

 

GTHOMAS13
4,097 Views

I'm still investigating this issue. NetApp are reluctant to assist as the software version is EOS.

 

While digging, I noticed the source vfiler contained one more volume than the destination. 

BUG 543416 lists the symptoms (Misconfiguration of SnapMirror, Busy source/destination, Unavailable source volume) and solution (switch "snapmirror off") 

 

We then created the destination volume and manually initialized the replication for the missing volume.
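
For anyone who wants to check for the same kind of mismatch, the volume lists of the two vfilers can be compared with a few lines of Python. This is a sketch only: it assumes the storage paths appear as "Path: /vol/<name>" lines (as in "vfiler status -r" output), and the vfiler and volume names below are invented.

```python
# Hypothetical excerpts of `vfiler status -r` from each side; names are invented.
SOURCE = """\
vf1                              running
   ipspace: default-ipspace
   Path: /vol/vf1_root [/etc]
   Path: /vol/vf1_data
   Path: /vol/vf1_new
"""

DESTINATION = """\
vf1                              stopped, DR backup
   ipspace: default-ipspace
   Path: /vol/vf1_root [/etc]
   Path: /vol/vf1_data
"""


def vfiler_volumes(status_text):
    """Collect volume names from lines of the form 'Path: /vol/<name> ...'."""
    volumes = set()
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "Path:" and fields[1].startswith("/vol/"):
            # '/vol/vf1_data' splits into ['', 'vol', 'vf1_data'].
            volumes.add(fields[1].split("/")[2])
    return volumes


def missing_on_destination(source_text, destination_text):
    """Return volumes that the source vfiler owns but the destination does not."""
    return sorted(vfiler_volumes(source_text) - vfiler_volumes(destination_text))


if __name__ == "__main__":
    print(missing_on_destination(SOURCE, DESTINATION))
```

Any volume it reports would need a destination volume and a SnapMirror relationship, as was done here for the missing one.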

 

This did not solve the issue either, so I dug a bit deeper and found this article:

https://kb.netapp.com/app/answers/answer_view/a_id/1071014/loc/en_US 

 

Checking the processes, nothing stands out as unusual.

filer% ps -eaf
PID   TT  STAT TIME     COMMAND
1444  con Is+  0:00.08  login /dev/cuacons.auth (ontaplogin)
1445  sp. Ss+  17:59.38 login /dev/cuasp.auth (ontaplogin)    <- the only PID whose TIME is increasing
1446  rlm Is+  0:00.01  login /dev/console (ontaplogin)
76746 p0  Ss   0:00.01  login [pam] (login)
76747 p0  S    0:00.01  USER=diag LOGNAME=diag HOME=/var/home/diag SHELL=/bin
76752 p0  R+   0:00.00  USER=diag LOGNAME=diag HOME=/var/home/diag SHELL=/bin
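
Spotting which PID is accumulating time can be done by taking two ps listings a few minutes apart and comparing them. A minimal Python sketch follows; the first sample is taken from the output above, the second sample is hypothetical, and it assumes TIME is shown as minutes:seconds.hundredths.

```python
# First listing: a subset of the output above. Second listing: hypothetical values.
FIRST = """\
PID TT STAT TIME COMMAND
1444 con Is+ 0:00.08 login /dev/cuacons.auth (ontaplogin)
1445 sp. Ss+ 17:59.38 login /dev/cuasp.auth (ontaplogin)
1446 rlm Is+ 0:00.01 login /dev/console (ontaplogin)
"""

SECOND = """\
PID TT STAT TIME COMMAND
1444 con Is+ 0:00.08 login /dev/cuacons.auth (ontaplogin)
1445 sp. Ss+ 18:04.12 login /dev/cuasp.auth (ontaplogin)
1446 rlm Is+ 0:00.01 login /dev/console (ontaplogin)
"""


def cpu_seconds(time_field):
    """Convert 'M:SS.hh' (assumed format) to seconds."""
    minutes, seconds = time_field.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_ps(text):
    """Map PID -> accumulated CPU seconds, skipping the header row."""
    times = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            times[int(fields[0])] = cpu_seconds(fields[3])
    return times


def growing(before, after):
    """Return PIDs present in both listings whose CPU time increased."""
    old, new = parse_ps(before), parse_ps(after)
    return sorted(pid for pid in old if pid in new and new[pid] > old[pid])


if __name__ == "__main__":
    print(growing(FIRST, SECOND))
```

A process that accumulates CPU time is not necessarily the one holding things up; this only narrows down where to look.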

 

