Oracle RAC node restart on Controller takeover/giveback

Oracle RAC 11gR2 (ASM)

RHEL 5.8

FAS 3210

I have a two-node Oracle RAC cluster on RHEL 5.8. Both nodes restart when one storage controller is halted.

Oracle ASM loses its disk groups at that point and does not recover them within 200 s (the default Oracle RAC disktimeout).

At the same time, all my VMware ESXi hosts keep running without problems.

Here is my multipath configuration:

defaults {
        flush_on_last_del       yes
        max_fds                 max
        pg_prio_calc            avg
        queue_without_daemon    no
        user_friendly_names     no
}

multipaths {
        multipath {
                wwid                    360a9800064724935574a6b6a4c507176
                alias                   mpath1
                path_grouping_policy    multibus
        }
        multipath {
                wwid                    360a98000647344414b5a6b6a4c366667
                alias                   mpath2
                path_grouping_policy    multibus
        }
        multipath {
                wwid                    360a9800064724935574a6b6a4c516c6f
                alias                   mpath3
                path_grouping_policy    multibus
        }
        multipath {
                wwid                    360a9800064724935574a6a4d46784166
                alias                   mpath4
                path_grouping_policy    multibus
        }
        multipath {
                wwid                    360a98000647344414b5a6b6a424b7148
                alias                   mpath5
                path_grouping_policy    multibus
        }
}

devices {
        device {
                vendor                  "NETAPP"
                product                 "LUN"
                path_grouping_policy    group_by_prio
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                path_checker            directio
                path_selector           "round-robin 0"
                hardware_handler        "0"
                failback                immediate
                features                "1 queue_if_no_path"
                prio_callout            "/sbin/mpath_prio_ontap /dev/%n"
                rr_min_io               128
                rr_weight               uniform
                no_path_retry           fail
        }
}
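
One thing that may be worth checking in the device section above: it sets both features "1 queue_if_no_path" (queue I/O when all paths are down) and no_path_retry fail (fail I/O immediately when all paths are down). These ask for different behaviour, and which one takes effect may depend on the multipath-tools version. The Python sketch below only illustrates how to spot that combination in a multipath.conf text. The function names read_settings and queueing_conflict are made up for this example and are not part of any multipath tool.

```python
import shlex

def read_settings(conf_text, keys=("features", "no_path_retry")):
    """Return the last value seen for each key of interest in multipath.conf text."""
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.endswith("{") or line == "}":
            continue  # skip blank lines and section braces
        parts = shlex.split(line)  # handles quoted values such as "1 queue_if_no_path"
        if len(parts) >= 2 and parts[0] in keys:
            found[parts[0]] = " ".join(parts[1:])
    return found

def queueing_conflict(settings):
    """True when the settings ask for both queueing and immediate failure."""
    queues = "queue_if_no_path" in settings.get("features", "")
    fails = settings.get("no_path_retry") == "fail"
    return queues and fails

# Excerpt of the device section posted above.
SAMPLE = '''
devices {
        device {
                vendor                  "NETAPP"
                features                "1 queue_if_no_path"
                no_path_retry           fail
        }
}
'''

settings = read_settings(SAMPLE)
print(settings)
print("conflicting queueing settings:", queueing_conflict(settings))
```

If the check reports a conflict, the running map (for example the output of multipath -ll) shows which behaviour is actually in effect on the host.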

Re: Oracle RAC node restart on Controller takeover/giveback

I opened a case with Oracle. Here is their reply:

During a storage controller takeover/giveback there is a short I/O interruption (about 45 s in my case). At the Oracle RAC level, the disk groups change to an unmounted state. Although I/O recovers within a short time, these Oracle disk groups have to be mounted manually.
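
As a rough illustration of the manual remount Oracle describes, the Python sketch below builds the ALTER DISKGROUP ... MOUNT statements to run on the ASM instance (for example from sqlplus connected as SYSASM). The disk group names DATA and FRA are placeholders, since the actual names are not given in this thread, and mount_statements is a hypothetical helper.

```python
import re

def mount_statements(diskgroups):
    """Build one ALTER DISKGROUP ... MOUNT statement per disk group name."""
    statements = []
    for name in diskgroups:
        # Accept only simple identifiers so arbitrary text never ends up in the SQL.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
            raise ValueError(f"unexpected disk group name: {name!r}")
        statements.append(f"ALTER DISKGROUP {name} MOUNT;")
    return statements

# Placeholder names; substitute the disk groups that ASM reports as dismounted.
for stmt in mount_statements(["DATA", "FRA"]):
    print(stmt)
```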