SMO backup fails - snapdrive cannot disconnect device

We have a problem during an SMO 3.1P1 backup of an 11.2 database using ASM on AIX 6.1.

As I can see in the log file, the snapshot is taken without problems. It is then mounted and registered in ASM and RMAN, and that also works fine.

But after that, there is a problem when SMO runs the "snapdrive disconnect" command. It fails with this error:

     0514-062 Cannot perform the requested function because the specified device is busy.

Here is an interesting part of the log, where you can see that there are processes holding hdisk11 (one of the 2 hdisks in the snapshot). As you can see, SMO waits 22 seconds for ASM to close everything, but I guess ASM does not do that in time. Is there a way to increase this timeout? Or is there something else I can do to force ASM to close the descriptors?

2011-05-16 14:17:24,566 [Execution Monitor Thread [fuser /dev/rhdisk10 /dev/rhdisk11]] [DEBUG]: EXE-00001: Shell result [0:00:00.108] (Exit Value: 0):

/dev/rhdisk10:

/dev/rhdisk11:

4456656 4849826 6815972 7471358 4456656 6488084 6815972 7471358 7995434

2011-05-16 14:17:24,571 [...]: CON-10003: Waiting 22.0 seconds for ASM instance to close open file descriptors on ASM Disk files [/dev/rhdisk10, /dev/rhdisk11].

2011-05-16 14:17:46,714 [Execution Monitor Thread [fuser /dev/rhdisk10 /dev/rhdisk11]] [DEBUG]: EXE-00001: Shell result [0:00:00.108] (Exit Value: 0):

/dev/rhdisk10:

/dev/rhdisk11:

4456656 4849826 6815972 7471358 4456656 6488084 6815972 7471358 7995434
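To make the fuser result above easier to read, here is a minimal Python sketch that groups the PIDs by device. The function name `parse_fuser` is hypothetical, and the sketch assumes the output looks like the log excerpt: a device name ending in a colon, followed by plain numeric PIDs.

```python
def parse_fuser(output: str) -> dict[str, list[int]]:
    """Map each device in fuser-style output to the unique PIDs holding it."""
    holders: dict[str, list[int]] = {}
    current = None
    for token in output.split():
        if token.endswith(":"):
            # A token such as "/dev/rhdisk11:" starts a new device section.
            current = token[:-1]
            holders.setdefault(current, [])
        elif current is not None and token.isdigit():
            pid = int(token)
            if pid not in holders[current]:  # drop repeated PIDs
                holders[current].append(pid)
    return holders


sample = """/dev/rhdisk10:
/dev/rhdisk11:
4456656 4849826 6815972 7471358 4456656 6488084 6815972 7471358 7995434
"""

for device, pids in parse_fuser(sample).items():
    print(device, "is free" if not pids else f"is held by {pids}")
```

With the output from the log, this reports no holders for /dev/rhdisk10 and six distinct PIDs for /dev/rhdisk11.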

Thank you

Re: SMO backup fails - snapdrive cannot disconnect device

What is the complete version of Oracle database and ASM that you are using?

Yes, the timeout that SMO uses is configurable.

Add the config parameter below to the smo.config file and restart the SMO server.

asm.dismountWaitTime=22000

The default is 22 seconds (the value is in milliseconds), and ASM must close all the descriptors within this time.
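Since the value is in milliseconds, it is easy to get the units wrong. A small Python sketch for producing the config line from a wait time in seconds (the helper name `dismount_wait_line` is made up for illustration and is not part of SMO):

```python
def dismount_wait_line(seconds: float) -> str:
    """Return the smo.config line for a wait time given in seconds."""
    if seconds <= 0:
        raise ValueError("wait time must be positive")
    millis = round(seconds * 1000)  # the parameter is expressed in milliseconds
    return f"asm.dismountWaitTime={millis}"


print(dismount_wait_line(22))   # the default quoted above
print(dismount_wait_line(120))  # two minutes
```

The first call reproduces the default line shown above; the second shows what a two-minute wait would look like.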

Let us know the Oracle database and ASM versions.

- Kanthan

Re: SMO backup fails - snapdrive cannot disconnect device

Thank you for the answer, Kanthan. I will try with an increased timeout.

Our versions are 11.2.0.2 for both (ASM and DB).

I'm wondering if this could be the Oracle ASM bug where ASM doesn't close file descriptors after "drop diskgroup".

Re: SMO backup fails - snapdrive cannot disconnect device

I increased the parameter to 2 minutes and it still didn't work. ASM keeps locks on the hdisks.

Re: SMO backup fails - snapdrive cannot disconnect device

Hi, it looks like Oracle bug 11666137, "ASM dismounted disks are still held by background processes for long time", on Metalink. Go figure. To be fixed in Oracle 12.1. LOL.

Disk file descriptors may be held open by some processes for a while after an ASM diskgroup has been dismounted.
Rediscovery Notes: Disk descriptors still open after dismount. asmcmd lsod (or lsof) shows open descriptors after dismount.
Workaround: None

Try to reproduce it using only SQL*Plus/srvctl. Measure approximately how long the shutdown procedure takes end-to-end, add 20%-30% to that, and use the result as the new timeout value.
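The arithmetic for that margin can be sketched in Python as follows. The function name `padded_wait_ms` and the 90-second figure are only illustrative assumptions, not measurements from this system; the result is in milliseconds to match asm.dismountWaitTime.

```python
def padded_wait_ms(measured_seconds: float, margin: float = 0.3) -> int:
    """Measured shutdown time plus a safety margin, in milliseconds."""
    if measured_seconds <= 0 or margin < 0:
        raise ValueError("need a positive time and a non-negative margin")
    return round(measured_seconds * (1 + margin) * 1000)


# Hypothetical measurement: the manual dismount took about 90 seconds.
for margin in (0.2, 0.3):
    print(f"margin {margin:.0%}: asm.dismountWaitTime={padded_wait_ms(90, margin)}")
```

Rounding to whole milliseconds avoids floating-point noise in the result.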

-Jakub.