
FC connection lost to partner luns during/in takeover mode

Alfs29

Hello,

 

QUESTION:

Why the hell does this noisy pile of boxes lose FC LUNs when in takeover mode?

 

Given:

1) FAS2040 8.1.4P9 7-Mode dual controller (named FAS2040HIGH and FAS2040LOW)

2) DS14MK4 shelf with 2 controllers

3) ESX 5.5 host with dual HBA

4) SAN fabric DS-5000B

5) 3 FC luns located on shelf. Owner of those luns is FAS2040HIGH controller.

 

Setup:

1) Each FAS controller 0a is connected directly to one MK4 IN port.

2) Each FAS controller 0b is connected to SAN fabric.

3) The host has a dual HBA, with both ports connected to the same SAN switch. All 4 ports are in one zone.

 

When both controllers on the FAS are running, I see that both cards see both paths.

 

Problem:

When I do a cf takeover on the FAS2040LOW controller (let's imagine FAS2040HIGH has died), ESXi loses all paths to the LUNs.

When I just unplug the FAS2040HIGH FC connector to the SAN switch, everything is fine: half of the paths are lost, which is what I would expect.
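The path counts mentioned in this post follow from simple multiplication: with everything in one zone, each initiator port paired with each reachable target port gives one path per LUN. Below is a minimal sketch of that arithmetic, assuming the single-switch, single-zone setup described above. The HBA labels (`hba0`, `hba1`) and the function name are hypothetical and only serve the illustration.

```python
from itertools import product

def count_paths(initiators, targets):
    """Return every (initiator, target) pair a host could use to reach a LUN."""
    return list(product(initiators, targets))

# Hypothetical labels for the two HBA ports on the ESXi host.
initiators = ["hba0", "hba1"]

# Normal operation: both controllers' 0b ports are reachable in the zone.
normal = count_paths(initiators, ["FAS2040HIGH:0b", "FAS2040LOW:0b"])

# FAS2040HIGH's 0b cable unplugged: only FAS2040LOW:0b remains reachable.
unplugged = count_paths(initiators, ["FAS2040LOW:0b"])

print(len(normal), len(unplugged))  # 4 2
```

This matches the 4 paths seen through the SAN and the loss of half of them when one target port is unplugged. Losing all paths during takeover is therefore not explained by zoning alone.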

 

 

Additional info:

------------------------

ALUA is ON on igroup.

ESXi uses the Round Robin (RR) PSP for those LUNs, and both paths to each LUN are active during normal operation. (When going through the SAN instead of attaching the FAS directly to the host, I get 4 paths to each LUN, which is normal as well.)

I have igroups on both FAS controllers containing WWPNs of both HBA ports.

Output of igroup show -v

 

FAS2040HIGH*> igroup show -v

initiator_group_FAS2040HIGH (FCP):

OS Type: vmware

Host Multipathing Software: Required

Member: 21:01:00:1b:32:b0:2b:07 (logged in on: 0b, vtic)

Member: 21:00:00:1b:32:90:2b:07 (logged in on: 0b, vtic)

UUID: 7d63d5cf-6d31-11e5-8cc3-123478563412

Pset: myportset1

ALUA: Yes

Report SCSI Name in Inquiry Descriptor: Yes

 

FAS2040LOW*> igroup show -v
initiator_group_FAS2040LOW (FCP):
OS Type: vmware
Member: 21:00:00:1b:32:90:2b:07 (logged in on: 0b)
Member: 21:01:00:1b:32:b0:2b:07 (logged in on: 0b)
Member: 21:fd:00:05:1e:90:7e:02 (not logged in)
UUID: 391940b8-71e7-11e5-bbb5-123478563412
Pset: myportset2
ALUA: Yes
Report SCSI Name in Inquiry Descriptor: Yes
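
The two igroup listings above can be compared mechanically to see whether both controllers hold the same initiator WWPNs. This is a rough sketch that assumes the `Member:` line layout shown in this post; the function name and regular expression are my own, and the check only compares membership. It does not by itself explain the takeover behavior.

```python
import re

# Matches lines such as: "Member: 21:00:00:1b:32:90:2b:07 (logged in on: 0b)"
MEMBER_RE = re.compile(
    r"^\s*Member:\s*([0-9a-f]{2}(?::[0-9a-f]{2}){7})\s*\((.*)\)\s*$", re.I
)

def parse_members(text):
    """Map each initiator WWPN in pasted `igroup show -v` output to its status."""
    members = {}
    for line in text.splitlines():
        match = MEMBER_RE.match(line)
        if match:
            members[match.group(1).lower()] = match.group(2)
    return members

high = parse_members("""
Member: 21:01:00:1b:32:b0:2b:07 (logged in on: 0b, vtic)
Member: 21:00:00:1b:32:90:2b:07 (logged in on: 0b, vtic)
""")
low = parse_members("""
Member: 21:00:00:1b:32:90:2b:07 (logged in on: 0b)
Member: 21:01:00:1b:32:b0:2b:07 (logged in on: 0b)
Member: 21:fd:00:05:1e:90:7e:02 (not logged in)
""")

only_low = sorted(set(low) - set(high))    # WWPNs present only on FAS2040LOW
only_high = sorted(set(high) - set(low))   # WWPNs present only on FAS2040HIGH
not_logged_in = sorted(w for w, s in low.items() if s == "not logged in")
print(only_low, only_high, not_logged_in)
```

With the listings above, the only difference is the third member on FAS2040LOW, which is also the one that is not logged in.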

 

FAS2040LOW*> portset show
myportset2 (FCP):
ports:
FAS2040LOW 0b
FAS2040HIGH 0b
igroups:
initiator_group_FAS2040LOW

 

 

FAS2040LOW*> lun config_check -v

Checking for down fcp interfaces

======================================================

No Problems Found

......

bla bla bla

......

Checking for duplicate WWPNs

======================================================

No WWPN Conflicts Found

But SSI Relationship Conflict is found which can cause duplicate WWPNs in the future

 

WTF is this? I don't have duplicate WWPNs.

 

1 ACCEPTED SOLUTION

Alfs29

OK, it seems that my problem was caused by wrong values in the lun config single_image keys.

Did this:

 

Stop the FCP service on both the nodes.

Run the following commands:

> priv set diag

*> lun config set local.single_image.key ""

*> lun config set partner.single_image.key ""

Repeat the process on the other node.

Rebooted each node.

 

... and after 48 hours of struggling, partner FC LUNs are no longer lost during takeover!
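
A practical note for anyone copying the commands: forum pages often turn ASCII quotes into typographic ones, which a console is unlikely to read as an empty string. Here is a small sketch that normalizes pasted commands before use, assuming the intended argument is an empty ASCII-quoted string (`""`). The helper name is hypothetical.

```python
# Typographic quotes that web pages commonly substitute for ASCII quotes.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}

def normalize_quotes(command):
    """Replace typographic quotes in a pasted command with ASCII quotes."""
    return command.translate(str.maketrans(SMART_QUOTES))

pasted = [
    "lun config set local.single_image.key \u201c\u201d",
    "lun config set partner.single_image.key \u201c\u201d",
]
for cmd in pasted:
    print(normalize_quotes(cmd))
```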

