Oracle DB crashed after storage failover

raovolvoadmin · ‎2012-12-11

Hi All,

We have an Oracle DB running on a Unix server. All mounts are NFS mounts (from the NAS storage box).

We have a MetroCluster setup for the storage; the models are FAS6280, running ONTAP 8.1.2.

The issue: we did a migration on the NetApp storage box, and as part of it we performed a cluster failover during the weekend. The next day the database team complained about a crash. During the failover we had an 80-second NFS timeout.

Can anyone please provide the NetApp recommended settings for Oracle DB so that it can survive an NFS timeout of a few seconds during failover or failback?

Have a nice day...

Regards

Rao.

nkarthik · ‎2012-12-11

Hi Rao,

Failover (fast) and giveback (slow) can take anywhere from 4 to 120 seconds, depending on the platform model, ONTAP version, number of volumes, and I/O load. To keep the VMs alive during a CF failover, we need to tune two parameters, O2CB_HEARTBEAT_THRESHOLD and O2CB_IDLE_TIMEOUT_MS, which reside in the /etc/sysconfig/o2cb file. By default in OVM 3.1.1, O2CB_HEARTBEAT_THRESHOLD is 31 (which corresponds to 60 seconds) and O2CB_IDLE_TIMEOUT_MS is 60000 milliseconds.

NetApp recommends a timeout of 160 to 190 seconds, taking both the storage controller and operating system timeouts into account.

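O2CB_HEARTBEAT_THRESHOLD = ((timeout in seconds) / 2) + 1, so 160 seconds gives (160/2)+1 = 81. Note that 190 seconds gives (190/2)+1 = 96; the value 91 corresponds to 180 seconds.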

Sl no.	Parameter	Min Value	Max value
1.	O2CB_HEARTBEAT_THRESHOLD	81	91
2.	O2CB_IDLE_TIMEOUT_MS	160000	190000
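
As a quick check of the arithmetic, here is a minimal sketch that derives both values from a chosen timeout. The function name heartbeat_threshold is hypothetical, and multiplying seconds by 1000 for O2CB_IDLE_TIMEOUT_MS is an assumption that simply mirrors the table above; the script only prints values and does not touch /etc/sysconfig/o2cb.

```shell
#!/bin/sh
# Hypothetical helper: convert a timeout in seconds into an
# O2CB_HEARTBEAT_THRESHOLD value using (timeout / 2) + 1.
heartbeat_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Print candidate settings for a few timeouts (values in seconds).
for t in 160 180 190; do
    echo "timeout=${t}s threshold=$(heartbeat_threshold "$t") idle_timeout_ms=$(( t * 1000 ))"
done
```

For the three timeouts shown, the thresholds come out as 81, 91, and 96.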

The OVM best practice guide at http://media.netapp.com/documents/tr-3712.pdf (page 41) describes these parameters for OVM 2.x, but the guidance also applies to OVM 3.x.

At the same time, raise the SCSI timeout of the LUNs presented to the VMs to 190 seconds, like this:

for i in /sys/class/scsi_generic/*/device/timeout
do
    echo 190 > "$i"
done
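
To confirm the change took effect, the values can be read back. This is only a sketch: the function name show_scsi_timeouts and its optional base-directory argument are assumptions for illustration, defaulting to the same sysfs path as the loop above. Values written under /sys do not persist across a reboot, so the loop has to be reapplied at boot.

```shell
#!/bin/sh
# Hypothetical helper: print the current timeout of each SCSI generic device.
# The optional first argument overrides the base directory (assumed default
# is the sysfs path used in the loop above).
show_scsi_timeouts() {
    base="${1:-/sys/class/scsi_generic}"
    for f in "$base"/*/device/timeout; do
        [ -r "$f" ] || continue   # skip when the glob matched nothing
        printf '%s %s\n' "$f" "$(cat "$f")"
    done
}

show_scsi_timeouts
```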

Regards,

karthikeyan.N

raovolvoadmin · ‎2013-03-29

Thanks :-). This was useful for me.
