Re: Oracle RAC node restart on Controller takeover/giveback

INFORHUNTER · 2012-06-26

Oracle RAC 11gR2 (ASM)

RHEL 5.8

FAS 3210

A two-node Oracle RAC cluster on RHEL 5.8. Both nodes restart when one controller is halted.

Oracle ASM loses the diskgroup at that point and does not recover it within 200 s (the default Oracle RAC disktimeout).

Meanwhile, all my VMware ESXi hosts carry on without problems.

Here is my multipath configuration:

defaults {
    flush_on_last_del yes
    max_fds max
    pg_prio_calc avg
    queue_without_daemon no
    user_friendly_names no
}

multipaths {
    multipath {
        wwid 360a9800064724935574a6b6a4c507176
        alias mpath1
        path_grouping_policy multibus
    }
    multipath {
        wwid 360a98000647344414b5a6b6a4c366667
        alias mpath2
        path_grouping_policy multibus
    }
    multipath {
        wwid 360a9800064724935574a6b6a4c516c6f
        alias mpath3
        path_grouping_policy multibus
    }
    multipath {
        wwid 360a9800064724935574a6a4d46784166
        alias mpath4
        path_grouping_policy multibus
    }
    multipath {
        wwid 360a98000647344414b5a6b6a424b7148
        alias mpath5
        path_grouping_policy multibus
    }
}

devices {
    device {
        vendor "NETAPP"
        product "LUN"
        path_grouping_policy group_by_prio
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        path_checker directio
        path_selector "round-robin 0"
        hardware_handler "0"
        failback immediate
        features "1 queue_if_no_path"
        prio_callout "/sbin/mpath_prio_ontap /dev/%n"
        rr_min_io 128
        rr_weight uniform
        no_path_retry fail
    }
}
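
The `no_path_retry` and `features` lines in the device section decide how long I/O is queued once every path to a LUN is down. A rough Python sketch of that arithmetic follows. It assumes the commonly documented multipath behaviour: a numeric `no_path_retry` means that many path-checker intervals, `fail` means no queueing, `queue` means queue forever, and `polling_interval` defaults to 5 s. None of that is stated in this thread, and whether `no_path_retry fail` takes precedence over `queue_if_no_path` on RHEL 5.8 is also an assumption.

```python
import math


def queue_window_seconds(no_path_retry, polling_interval=5):
    """Approximate time dm-multipath keeps queueing I/O after all paths fail.

    Assumption: a numeric no_path_retry is a count of checker intervals.
    """
    value = str(no_path_retry).strip().lower()
    if value == "fail":
        return 0.0          # I/O errors are returned immediately
    if value == "queue":
        return math.inf     # I/O is held until a path comes back
    return float(int(value) * polling_interval)


def survives(outage_seconds, no_path_retry, polling_interval=5):
    """True if queued I/O would outlast an all-paths-down outage."""
    return queue_window_seconds(no_path_retry, polling_interval) >= outage_seconds


if __name__ == "__main__":
    # 45 s is the interruption reported later in this thread;
    # the value 12 is only an illustration, not a recommendation.
    for setting in ("fail", 12, "queue"):
        print(setting, queue_window_seconds(setting), survives(45, setting))
```

Under these assumptions the posted `no_path_retry fail` gives a zero-second window, so an interruption of any length would be passed up to ASM as I/O errors, while a finite value would have to cover the outage and still stay well under the 200 s disktimeout.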

INFORHUNTER · 2012-07-19

I opened a case with Oracle; here is their reply:

During a storage controller takeover/giveback there is a short I/O interruption (about 45 s in my case). At the Oracle RAC level, the diskgroups change to a dismounted state. Although I/O recovers after a short time, these Oracle diskgroups have to be mounted manually.
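
Mounting by hand comes down to one `ALTER DISKGROUP ... MOUNT` statement per dismounted diskgroup, run on the ASM instance. Here is a minimal Python sketch that builds those statements. The diskgroup names are hypothetical, and the states are assumed to have been read beforehand from the `STATE` column of `V$ASM_DISKGROUP`; none of this comes from Oracle's reply.

```python
def mount_statements(diskgroup_states):
    """Return one ALTER DISKGROUP ... MOUNT statement per dismounted diskgroup.

    diskgroup_states maps diskgroup name -> state string, e.g. as read
    from V$ASM_DISKGROUP (NAME, STATE) on the ASM instance.
    """
    return [
        f"ALTER DISKGROUP {name} MOUNT;"
        for name, state in sorted(diskgroup_states.items())
        if state.strip().upper() == "DISMOUNTED"
    ]


if __name__ == "__main__":
    # Hypothetical diskgroup names and states, for illustration only.
    states = {"DATA": "DISMOUNTED", "FRA": "MOUNTED", "OCRVOTE": "DISMOUNTED"}
    for statement in mount_statements(states):
        print(statement)
```

The sketch only generates the SQL text. Running it against ASM (for example through sqlplus on each node) is left out on purpose.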