ONTAP Discussions

Multipathing Issue on Oracle Linux 7

salmank

I have the following enterprise hardware:

A. NetApp AFF A400
B. Two Brocade G610 FC switches
C. Two x86 rack-mount servers

1. The NetApp AFF A400 has a total of four 16 Gbps block ports, i.e., two ports on each controller. These physical ports are named A0e and A0f on Controller A and B0e and B0f on Controller B.


2. Each physical block port on the NetApp has one logical interface (LIF) configured on it, named as follows:
a. Logical Interface LIFFCP1 on interface A0e.
b. Logical Interface LIFFCP2 on interface A0f.
c. Logical Interface LIFFCP3 on interface B0e.
d. Logical Interface LIFFCP4 on interface B0f.


3. Each x86 rack-mount server has one HBA with two 16 Gbps ports, named as follows:
a. The two HBA ports on Server 1 are named Host15A and Host16A.
b. The two HBA ports on Server 2 are named Host15B and Host16B.


4. The two Brocade G610 FC switches are named G610-A and G610-B. Both NetApp AFF A400 controllers and both x86 servers are connected to both switches as follows:
a. LIFFCP1 and LIFFCP3 are connected to Port 0 and Port 1 on G610-A respectively.
b. LIFFCP2 and LIFFCP4 are connected to Port 0 and Port 1 on G610-B respectively.
c. Host15A and Host15B are connected to Port 2 and Port 3 on G610-A respectively.
d. Host16A and Host16B are connected to Port 2 and Port 3 on G610-B respectively.
e. Ports 0, 1, 2 and 3 of G610-A are in one zone named Prod-A on G610-A.
f. Ports 0, 1, 2 and 3 of G610-B are in one zone named Prod-B on G610-B.


5. Oracle Linux 7.6 UEK is installed on both Rack Mount Servers.


6. NetApp Host Utilities and Multipath RPMs are installed on both Rack Mount Servers.


7. Three 50 GB LUNs named LUN01, LUN02 and LUN03 are created on the NetApp and presented to both rack-mount servers via both Brocade G610 FC switches.


8. The output of the command "sanlun lun show" shows 4 disks for each LUN i.e.,
a. /dev/sdc, /dev/sdd, /dev/sde and /dev/sdf shown against LUN01 on Oracle Linux 7.6 on both Rack Mount Servers.
b. /dev/sdg, /dev/sdh, /dev/sdi and /dev/sdj shown against LUN02 on Oracle Linux 7.6 on both Rack Mount Servers.
c. /dev/sdk, /dev/sdl, /dev/sdm and /dev/sdn shown against LUN03 on Oracle Linux 7.6 on both Rack Mount Servers.


9. Oracle RAC 19c is configured on both rack-mount servers with the above-mentioned three 50 GB LUNs as the voting disks in the OCR disk group.


10. The WWIDs of the above-mentioned three 50 GB LUNs, as visible on Oracle Linux 7.6, are as follows:
a. WWID of Disk01 is 3600508b1001cbd89313f1d9cdd60d150 as shown from "multipath -ll" command.
b. WWID of Disk02 is 3600508b1001cbd89313f1d9cdd60d151 as shown from "multipath -ll" command.
c. WWID of Disk03 is 3600508b1001cbd89313f1d9cdd60d152 as shown from "multipath -ll" command.


11. The devices with the above-mentioned WWIDs are partitioned using fdisk. The oracleasm utility is configured and used to create ASM disks on these three WWIDs, named VOTE01, VOTE02 and VOTE03. The commands used for this purpose are as follows:
a. oracleasm createdisk VOTE01 /dev/mapper/3600508b1001cbd89313f1d9cdd60d150
b. oracleasm createdisk VOTE02 /dev/mapper/3600508b1001cbd89313f1d9cdd60d151
c. oracleasm createdisk VOTE03 /dev/mapper/3600508b1001cbd89313f1d9cdd60d152


12. These three oracleasm disks are the voting disks for Oracle RAC and are part of the OCR disk group. The name of the OCR disk group is OCR_DiskGp.


13. If LIFFCP1 and LIFFCP3 are disconnected while LIFFCP2 and LIFFCP4 stay connected on the NetApp AFF A400, nothing happens to Oracle RAC and it keeps working smoothly, i.e., VOTE01, VOTE02 and VOTE03 remain available to Oracle RAC and OCR_DiskGp remains mounted. However, if LIFFCP2 and LIFFCP4 are disconnected while LIFFCP1 and LIFFCP3 stay connected, Oracle RAC is disrupted, i.e., VOTE01, VOTE02 and VOTE03 become unavailable to both rack-mount servers and OCR_DiskGp is dismounted, which crashes Oracle RAC and forces both rack-mount servers to reboot.


14. It seems like the two paths i.e., LIFFCP2 and LIFFCP4 are active paths and LIFFCP1 and LIFFCP3 are passive paths for multipathing.


15. I want all four ports, i.e., LIFFCP1, LIFFCP2, LIFFCP3 and LIFFCP4, to be configured in such a way that if either pair (LIFFCP1 and LIFFCP3, or LIFFCP2 and LIFFCP4) is disconnected or disabled, Oracle RAC is not disrupted and the three voting disks remain available to both rack-mount servers.


16. What configuration changes should I make on Oracle Linux to achieve this result?
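
The cabling in items 1 to 4 can be checked on paper before touching the hosts. The following is only a rough Python model of that layout, not a diagnostic tool: the labels `Server1` and `Server2` are made up here, and it assumes every LIF presents every LUN to every host port in its zone.

```python
# Cabling and zoning as described in items 1-4 of the question.
FABRICS = {
    "G610-A": {"targets": ["LIFFCP1", "LIFFCP3"], "initiators": ["Host15A", "Host15B"]},
    "G610-B": {"targets": ["LIFFCP2", "LIFFCP4"], "initiators": ["Host16A", "Host16B"]},
}
# "Server1"/"Server2" are illustrative labels for the two rack-mount servers.
SERVER_PORTS = {"Server1": ["Host15A", "Host16A"], "Server2": ["Host15B", "Host16B"]}


def paths_for(server, down=frozenset()):
    """Return (HBA port, LIF) pairs the server can still use when the LIFs in `down` are offline."""
    paths = []
    for fabric in FABRICS.values():
        for port in SERVER_PORTS[server]:
            if port not in fabric["initiators"]:
                continue  # this HBA port is cabled to the other switch
            for lif in fabric["targets"]:
                if lif not in down:
                    paths.append((port, lif))
    return paths


if __name__ == "__main__":
    print(paths_for("Server1"))
    print(paths_for("Server1", down={"LIFFCP2", "LIFFCP4"}))
    print(paths_for("Server1", down={"LIFFCP1", "LIFFCP3"}))
```

In this model each server has four paths per LUN, which matches the four `/dev/sd*` devices per LUN in item 8, and either failure case in item 13 still leaves two paths through one HBA port. If that holds, the fabric layout alone would not explain why only one of the two cases takes the voting disks offline.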

1 ACCEPTED SOLUTION

TMACMD

I have set up a few Brocade FC switches recently.

What I have been doing is something called peer-zoning.

You basically do this:

https://techdocs.broadcom.com/us/en/fibre-channel-networking/fabric-os/fabric-os-administration/9-1-x/Administering-Advanced-Zoning-AG/v26773885/v2677...

 

Connect all your stuff to the switch.

Create an ALIAS for each device

Create a ZONE for EACH initiator

Each Zone will contain one principal and other non-principal members.

Each NetApp FC LIF will/could be the principal member. Then add the WWPN aliases of the hosts on the same fabric to that zone. Be sure to select the NetApp FC LIF as the only principal member.

Repeat. In your example,

Fabric-A will have

ZONE P1  (principal member) LIFFCP1 with other members Host15A and Host15B

ZONE P3  (principal member) LIFFCP3 with other members Host15A and Host15B

Fabric-B will have

ZONE P2  (principal member) LIFFCP2 with other members Host16A and Host16B

ZONE P4  (principal member) LIFFCP4 with other members Host16A and Host16B

Then create the zoneset (the A fabric contains P1 & P3, the B fabric contains P2 & P4).

Activate it.

That should give you the proper zoning, and all paths should be seen.
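
To make the peer-zone layout concrete, here is a small Python sketch of the four zones. It is only a model of the idea (one principal LIF per zone, hosts as non-principal members that talk to the principal only), not Brocade CLI. The class and function names are made up for illustration, and the host placement follows the cabling in the question (Host15A and Host15B on G610-A, Host16A and Host16B on G610-B).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerZone:
    name: str
    principal: str   # the NetApp FC LIF
    peers: tuple     # host HBA ports (non-principal members)

    def allowed_pairs(self):
        """Each non-principal member may talk to the principal only, never to another peer."""
        return {(peer, self.principal) for peer in self.peers}


def build_fabric(lifs, hosts):
    """Create one peer zone per target LIF, named P<n> after the LIF number."""
    return [PeerZone(f"P{lif[-1]}", lif, tuple(hosts)) for lif in lifs]


fabric_a = build_fabric(["LIFFCP1", "LIFFCP3"], ["Host15A", "Host15B"])
fabric_b = build_fabric(["LIFFCP2", "LIFFCP4"], ["Host16A", "Host16B"])

if __name__ == "__main__":
    for zone in fabric_a + fabric_b:
        print(zone.name, sorted(zone.allowed_pairs()))
```

Every host port ends up paired with both LIFs on its fabric, and no pair ever contains two host ports.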

 

I think there is an issue with your config.

Refer to this Tech Report:

https://www.netapp.com/media/8744-tr3633.pdf

Look at page 28: 

An FC zone should never contain more than one initiator. Such an arrangement might appear to work initially, but crosstalk between initiators eventually interferes with performance and stability.
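
As a quick illustration of that rule, the sketch below counts initiators per zone for the Prod-A and Prod-B zones from the question. It is a hedged Python model, not a switch command: the function name is made up, and the port-based zones are written out by the device attached to each port.

```python
# Host HBA ports (initiators) from the question.
INITIATORS = {"Host15A", "Host15B", "Host16A", "Host16B"}

# Prod-A / Prod-B port zones, listed by the device cabled to each switch port.
ZONES = {
    "Prod-A": ["LIFFCP1", "LIFFCP3", "Host15A", "Host15B"],
    "Prod-B": ["LIFFCP2", "LIFFCP4", "Host16A", "Host16B"],
}


def multi_initiator_zones(zones, initiators):
    """Return {zone name: sorted initiators} for zones holding more than one initiator."""
    offenders = {}
    for name, members in zones.items():
        found = sorted(m for m in members if m in initiators)
        if len(found) > 1:
            offenders[name] = found
    return offenders


if __name__ == "__main__":
    print(multi_initiator_zones(ZONES, INITIATORS))
```

Both existing zones are flagged, because each one contains the HBA ports of both servers.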

 

Please review that TR and report back after you have made changes to your environment.

 


3 REPLIES 3

cedric_renauld

Hello,

Port zoning can work, but it is recommended to zone with the WWPNs from the SVM, because it uses NPIV technology.

How do you "do" the mapping in ONTAP?

TMACMD

I will not disagree, however, you have more than ONE initiator in the zone. That is not advised. See the Tech Report I mentioned earlier.

 

Not understanding your question. Unless you mean:

create FC LIFs

Create CORRECT switch zoning.

You should be able to verify host WWPN with:

FCP INITIATOR SHOW

Create an "igroup" for EACH HOST (for allowance of mapping a LUN to a SINGLE HOST as needed)

Create an igroup with "child igroups" (basically, one igroup that contains the igroups for each host)

 

netapp-01::*> lun igroup create -vserver esx-01 -igroup esxi-c1-27 -protocol fcp -ostype vmware -initiator 20:00:00:25:B5:03:8A:08,20:00:00:25:B5:03:8B:08

netapp-01::*> lun igroup create -vserver esx-01 -igroup esxi-c1-28 -protocol fcp -ostype vmware -initiator 20:00:00:25:B5:03:8A:09,20:00:00:25:B5:03:8B:09

netapp-01::*> lun igroup create -vserver esx-01 -igroup esxi-all -protocol fcp -ostype vmware -initiator - child-igroups esxi-c1-27,esxi-c1-28

 

Then map luns to igroups and be done.
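
For the two Linux hosts in this thread, the same per-host igroup pattern could be scripted. The Python sketch below only builds command strings and runs nothing. The SVM name, volume path, igroup names and WWPN placeholders are all hypothetical and must be replaced with real values, and it assumes `-ostype linux` and `lun mapping create` fit your ONTAP release.

```python
def igroup_commands(vserver, hosts, lun_paths):
    """Build one igroup per host, then map every LUN path to that igroup."""
    cmds = []
    for igroup, wwpns in hosts.items():
        cmds.append(
            f"lun igroup create -vserver {vserver} -igroup {igroup} "
            f"-protocol fcp -ostype linux -initiator {','.join(wwpns)}"
        )
        for path in lun_paths:
            cmds.append(f"lun mapping create -vserver {vserver} -path {path} -igroup {igroup}")
    return cmds


# Hypothetical names and placeholders: substitute your own SVM, volume and WWPNs.
HOSTS = {
    "server1": ["<WWPN-Host15A>", "<WWPN-Host16A>"],
    "server2": ["<WWPN-Host15B>", "<WWPN-Host16B>"],
}
LUNS = ["/vol/<volume>/LUN01", "/vol/<volume>/LUN02", "/vol/<volume>/LUN03"]

if __name__ == "__main__":
    for cmd in igroup_commands("<svm>", HOSTS, LUNS):
        print(cmd)
```

Printing the commands first makes it easy to review them before pasting anything into the cluster shell.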

There is certainly a way to use the GUI. I typically do not use the GUI. I do nearly all my work on the CLI.
