ONTAP Hardware

How to determine which host accesses the storage through a non-primary and non-optimal path

gwrcnbigman

The following event occurs in the log:

scsitarget.partnerPath.misconfigured:error]: FCP Partner Path Misconfigured - Host I/O access through a non-primary and non-optimal path was detected

We did not implement multipathing. How can we determine which host is accessing the storage through a non-primary and non-optimal path, and how can we fix it without impact?

Thanks in advance, and sorry for my poor English.


andrc

Run `igroup show` on each controller's CLI.

The output should show each igroup, the initiators within the igroup and which ports they are logged in on. Ensure that each initiator is logged in on a port such as 0a, 0b, 0c, 0d as well as vtic.

If any initiator shows as only being logged in on vtic then it will be accessing the LUN via vtic and therefore not by the optimum path. Check your cabling and zoning to find out why the initiator is only logged in on vtic and therefore only has a path to one controller.
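The check described above can be scripted when there are many igroups. What follows is only a minimal sketch in Python: it assumes the 7-Mode `igroup show` layout pasted in this thread, and the names `parse_igroup_show`, `classify` and `SAMPLE` are made up for illustration, not part of any NetApp tool.

```python
import re

# Patterns for the `igroup show` layout seen in this thread (an assumption;
# other Data ONTAP versions may format the output differently).
IGROUP_RE = re.compile(r"^\s*(\S+)\s+\((\w+)\)\s+\(ostype:\s*(\w+)\):\s*$")
INITIATOR_RE = re.compile(
    r"^\s*((?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})\s+"
    r"\((?:not logged in|logged in on:\s*([^)]*))\)\s*$"
)


def parse_igroup_show(text):
    """Return a list of (igroup, ostype, wwpn, ports) tuples."""
    rows = []
    igroup = ostype = None
    for line in text.splitlines():
        header = IGROUP_RE.match(line)
        if header:
            igroup, ostype = header.group(1), header.group(3)
            continue
        init = INITIATOR_RE.match(line)
        if init and igroup is not None:
            # group(2) is None for "not logged in"
            ports = [p.strip() for p in (init.group(2) or "").split(",") if p.strip()]
            rows.append((igroup, ostype, init.group(1), ports))
    return rows


def classify(ports):
    """Label a login list; a login on 'vtic' is the non-optimal path."""
    if not ports:
        return "not logged in"
    if set(ports) == {"vtic"}:
        return "partner path only"
    if "vtic" in ports:
        return "direct and partner paths"
    return "direct path only"


# Lines copied from the fas2020a output in this thread.
SAMPLE = """\
fas2020a> igroup show
    realcastDB (FCP) (ostype: linux):
        21:01:00:e0:8b:bc:54:64 (not logged in)
        21:00:00:e0:8b:9c:54:64 (logged in on: vtic, 0a)
    Backuse (FCP) (ostype: windows):
        21:01:00:e0:8b:bc:56:64 (not logged in)
        21:00:00:e0:8b:9c:56:64 (not logged in)
"""

if __name__ == "__main__":
    for igroup, ostype, wwpn, ports in parse_igroup_show(SAMPLE):
        print(f"{igroup:12} {ostype:8} {wwpn}  {classify(ports)}")
```

An initiator labelled "partner path only" points to a cabling or zoning problem. One labelled "direct and partner paths" can still pick the wrong path, so the host side has to be checked as well.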

gwrcnbigman

Thank you, but I still have no idea how to fix the problem. Any ideas would be appreciated.

Here is the output of the command you suggested:

fas2020a> igroup show

    realcastDB (FCP) (ostype: linux):

        21:01:00:e0:8b:bc:54:64 (not logged in)

        21:00:00:e0:8b:9c:54:64 (logged in on: vtic, 0a)

    dynamic (FCP) (ostype: linux):

        21:01:00:e0:8b:bc:52:64 (not logged in)

        21:00:00:e0:8b:9c:52:64 (logged in on: vtic, 0a)

    BlogDB (FCP) (ostype: windows):

        21:00:00:e0:8b:9c:51:64 (logged in on: vtic, 0a)

        21:01:00:e0:8b:bc:51:64 (not logged in)

    Backuse (FCP) (ostype: windows):

        21:01:00:e0:8b:bc:56:64 (not logged in)

        21:00:00:e0:8b:9c:56:64 (not logged in)

fas2020b> igroup show

    videoware (FCP) (ostype: windows):

        21:00:00:e0:8b:9c:50:64 (logged in on: 0a, vtic)

        21:01:00:e0:8b:bc:50:64 (not logged in)

    juyuanDB (FCP) (ostype: windows):

        21:01:00:e0:8b:bc:4f:64 (not logged in)

        21:00:00:e0:8b:9c:4f:64 (logged in on: 0a, vtic)

fas2020a: ip 192.168.10.34

fas2020b: ip 192.168.10.134

The following event in the log comes from fas2020b:

scsitarget.partnerPath.misconfigured:error]: FCP Partner Path Misconfigured - Host I/O access through a non-primary and non-optimal path was detected

We did not implement multipathing, owing to ignorance on the part of the system integrator and myself.

aborzenkov
andrc wrote: "If any initiator shows as only being logged in on vtic then it will be accessing the LUN via vtic and therefore not by the optimum path."

It is not that simple. A host can see both optimal and non-optimal paths but, due to misconfiguration, choose a non-optimal one.

When you have many hosts, it is indeed not easy to find which one is misbehaving. I wish NetApp would include the WWPN of the misbehaving host in this message (or otherwise provide some means to find it).

andrc

It can be this simple on occasion and it's a quick check so it's always good to look for it as a first option.

Because the initiators are logged in on both vtic and an onboard port, MPIO should be configured on the host; otherwise confusion occurs, as in this case. The host sees two paths and does not know which one to choose, so it has chosen the wrong one.

gwrcnbigman

Isn't there a method to solve it?

andrc

Not that I'm aware of. According to your `igroup show` output you have 5 hosts connected; log in to each one and confirm which path it is using.

gwrcnbigman

Thanks for the prompt response, but how do I do that? The machines run Linux or Windows 2003. Can you give me detailed, step-by-step instructions? Sorry for my poor English.

Thank you in advance.

columbus_admin

`lun stats -o 'lun_path'` will show you whether that particular LUN is being accessed improperly, and you may be able to track it down that way. However, if all of your hosts with the same OS have access to all of your LUNs, or all of your LUNs show partner access, this won't help at all.

You will get output like this; the part you want to look at is the last two fields.

        Read (kbytes)   Write (kbytes)  Read Ops  Write Ops  Other Ops  QFulls  Partner Ops Partner KBytes

        0               24              0         6          6          0       0           0
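When there are many LUNs, reading the last two fields by eye gets tedious. The sketch below is one possible way to parse saved `lun stats -o` output in Python and list the LUNs whose partner counters are non-zero. It assumes the eight-column layout shown above; the function names and the path `/vol/example/lun0` are hypothetical, and the second row uses the figures reported for `/vol/vol1/lun1` further down this thread.

```python
import re

FIELDS = ("read_kb", "write_kb", "read_ops", "write_ops",
          "other_ops", "qfulls", "partner_ops", "partner_kb")

# A LUN line looks like "/vol/vol1/lun1  (606 days, ...)".
LUN_RE = re.compile(r"^\s*(/vol/\S+)\s+\(")
# A counter line is a row of exactly eight integers.
COUNTERS_RE = re.compile(r"^\s*\d+(?:\s+\d+){7}\s*$")


def parse_lun_stats(text):
    """Map each LUN path to a dict of its eight counters."""
    stats = {}
    current = None
    for line in text.splitlines():
        lun = LUN_RE.match(line)
        if lun:
            current = lun.group(1)
        elif current and COUNTERS_RE.match(line):
            stats[current] = dict(zip(FIELDS, map(int, line.split())))
            current = None
    return stats


def luns_with_partner_io(stats):
    """Return the LUN paths whose partner counters are non-zero."""
    return sorted(path for path, c in stats.items()
                  if c["partner_ops"] or c["partner_kb"])


# First LUN: hypothetical path with the all-clear row shown above.
# Second LUN: figures posted later in this thread.
SAMPLE = """\
    /vol/example/lun0  (1 days, 0 hours, 0 minutes, 0 seconds)
        Read (kbytes)   Write (kbytes)  Read Ops  Write Ops  Other Ops  QFulls  Partner Ops Partner KBytes
        0               24              0         6          6          0       0           0
    /vol/vol1/lun1  (606 days, 3 hours, 45 minutes, 20 seconds)
        Read (kbytes)   Write (kbytes)  Read Ops  Write Ops  Other Ops  QFulls  Partner Ops Partner KBytes
        6931445         2480642         99655     58594      590        0       158368      9411947
"""

if __name__ == "__main__":
    for path in luns_with_partner_io(parse_lun_stats(SAMPLE)):
        print("partner access seen on", path)
```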

- Scott

gwrcnbigman

fas2020b> lun stats -o '/vol/vol1/lun1'

    /vol/vol1/lun1  (606 days, 3 hours, 45 minutes, 20 seconds)

        Read (kbytes)   Write (kbytes)  Read Ops  Write Ops  Other Ops  QFulls  Partner Ops Partner KBytes

        6931445         2480642         99655     58594      590        0       158368      9411947  

fas2020b> lun stats -o '/vol/vol2/lun2'

    /vol/vol2/lun2  (606 days, 3 hours, 45 minutes, 26 seconds)

        Read (kbytes)   Write (kbytes)  Read Ops  Write Ops  Other Ops  QFulls  Partner Ops Partner KBytes

        1251784782      591894828       91956296  11177182   661        0       103133521   1843678725

Can you tell me where the clue is?

aborzenkov

It appears that the hosts are accessing the LUNs as if the array were active/active, and are doing so in round-robin fashion. You need to reconfigure the hosts to avoid using the non-optimal path. Exactly how to do that depends on the host OS and the multipathing software in use.

columbus_admin

Unfortunately it didn't narrow anything down, as both LUNs posted are seeing I/O through the non-optimal path. The telling part is the last two columns: in an optimal situation, Partner Ops and Partner KBytes should be zero unless there has been an issue.

fas2020b> lun stats -o '/vol/vol1/lun1'

Partner Ops    Partner KBytes

158368         9411947

fas2020b> lun stats -o '/vol/vol2/lun2'

Partner Ops    Partner KBytes

103133521      1843678725
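To put these two counters in proportion, they can be compared with the total operation counts from the same output. This is only a rough sketch in Python, and it rests on an assumption that is not confirmed anywhere in this thread: that Partner Ops is a subset of Read Ops + Write Ops + Other Ops. The helper name `partner_share` is made up for illustration.

```python
def partner_share(read_ops, write_ops, other_ops, partner_ops):
    """Fraction of all ops counted as partner ops (0.0 for an idle LUN).

    Assumes partner ops are included in the read/write/other totals.
    """
    total = read_ops + write_ops + other_ops
    return partner_ops / total if total else 0.0


# Counters copied from the `lun stats -o` output posted for fas2020b.
LUNS = {
    "/vol/vol1/lun1": dict(read_ops=99655, write_ops=58594,
                           other_ops=590, partner_ops=158368),
    "/vol/vol2/lun2": dict(read_ops=91956296, write_ops=11177182,
                           other_ops=661, partner_ops=103133521),
}

if __name__ == "__main__":
    for path, counters in LUNS.items():
        share = partner_share(**counters)
        print(f"{path}: {share:.1%} of ops via the partner path")
```

With the posted figures both ratios come out above 99%. If the assumption holds, that would mean nearly all I/O to these two LUNs arrives through the partner controller rather than only a portion of it.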

If you run `lun stats -o` again and the partner counters have increased, the optimal configuration is not in use. You may also need to check your switch zoning. If a zone uses the storage system's WWNN and the host reaches a port on one controller while its LUN is on the other controller, the connection is still legal (though against the best practices of every vendor I am aware of), but it is not an optimal one. We zone everything by WWPN to ensure we are not in a position to cause ourselves problems in that department.

The filer WWNN spans both controllers, so zoning by WWNN is never a good idea in my opinion. If it is a zoning issue, you should be able to correct it without impact, but "should" is the key word.

I would strongly suggest putting in whatever amounts to a change request in your organization before you attempt anything other than information gathering. I am only familiar with Brocade, so I would create a new alias and a new zone, then add that zone to the configuration. Once I was certain that everything was correct, I would remove the offending zone from the configuration and make sure everything was still working. This can be done without an outage, but a simple mistake could take anything or everything out, so it should not be attempted without a plan and communication to the end users/customers.
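The earlier suggestion to run `lun stats -o` twice and see whether the partner counters are increasing can be sketched in Python as a simple comparison of two samples. The function name `partner_growth` is made up for illustration. The first sample uses the figures posted in this thread, while the second sample is hypothetical and exists only to show the comparison.

```python
def partner_growth(before, after):
    """Return {lun: (ops_delta, kbytes_delta)} for LUNs whose partner counters grew.

    Both arguments map a LUN path to a (partner_ops, partner_kbytes) tuple.
    """
    growth = {}
    for lun, (ops_after, kb_after) in after.items():
        ops_before, kb_before = before.get(lun, (0, 0))
        d_ops, d_kb = ops_after - ops_before, kb_after - kb_before
        if d_ops > 0 or d_kb > 0:
            growth[lun] = (d_ops, d_kb)
    return growth


# First sample: (Partner Ops, Partner KBytes) as posted in this thread.
first = {
    "/vol/vol1/lun1": (158368, 9411947),
    "/vol/vol2/lun2": (103133521, 1843678725),
}
# Second sample: HYPOTHETICAL values, invented for illustration only.
second = {
    "/vol/vol1/lun1": (158368, 9411947),
    "/vol/vol2/lun2": (103140000, 1843800000),
}

if __name__ == "__main__":
    for lun, (d_ops, d_kb) in partner_growth(first, second).items():
        print(f"{lun}: +{d_ops} partner ops, +{d_kb} partner KBytes")
```

A LUN that appears in the result is still being reached through the partner path between the two samples. A LUN that does not appear may simply have been idle, so it is worth sampling while the hosts are busy.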

- Scott
