AIX MPIO settings

dolly
12,732 Views

All,

I have a customer who is running some performance tests from an AIX system. The sticking point is that they want to use every available FC port on the AIX server and every target port on the filer. The filer is running in single image mode and we're using the AIX native MPIO software. The MPIO setting is "round-robin" and, according to the AIX documentation, this setting gives every port presented to the server priority 1. What we're seeing on the filer is that even though primary and secondary paths are clearly defined (verified with 'sanlun lun show -p'), there's still a considerable amount of traffic going over the cluster interconnect. Is there a way to prevent this?

Thanks in advance,

Dolly

adityav

Hey Dolly,

You need to have ALUA enabled on the igroup in order for path optimization to work. You can use SDU to provision LUNs from your host and it will set it up for you.

Adithya Vishwanath

MTS - SDU development

SnapDrive for Unix

NetApp

41843332 Direct

adityav@netapp.com

www.netapp.com

dolly

I just checked and ALUA is enabled on the igroup.

===== INITIATOR GROUPS =====

emerald (FCP):

OS Type: aix

Host Multipathing Software: Required

Member: 10:00:00:00:c9:40:f9:1b (logged in on: vtic, 0g, 0e, 0c, 0a)

Member: 10:00:00:00:c9:59:df:0a (logged in on: vtic, 0g, 0e, 0c, 0a)

ALUA: Yes

Still getting FCP Partner Path Misconfigured errors.

Anything else I need to check?

Thanks!

Dolly

RodrigoNascimento

Hi Dolly!

Have you installed the FC Host Utilities for AIX?

And have you tried checking which host is causing the FCP Partner Path Misconfigured errors with the lun config_check -A command?

See you!

Rodrigo N

dolly

Yes, they have AIX HUK installed.

Dolly

RodrigoNascimento

Could you send the lsdev -Cc disk output from this system?

Did you turn ALUA on before or after running the cfgmgr command?

ivissupport

Hi Dolly, how are you?

My advice is to use WWPN-based zoning on the SAN switches (the WWPN is the Fibre Channel counterpart of a MAC address).

Dynamic least queue depth, or round robin with subset, is the preferred load balancing policy for better performance (in this mode only the active paths are used), so the vtic (cluster interconnect) only matters for failover.

Thanks

jakub_wartak

Hello,

You can actually have native round-robin on AIX6.1 both with and without ALUA. The trick is that without ALUA you need to use the dotpaths command, which sets the priorities for the paths so that secondary paths are not normally used. Next, you need proper attributes on the hdisk devices so that paths return to the Enabled state after being in the Failed state.
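For anyone who wants to see what setting path priorities by hand could look like, here is a minimal Python sketch that only builds `chpath` command lines for review and does not run anything. The disk name, adapters, connection strings and the primary/secondary split are hypothetical sample values. The priority numbers are placeholders too: how AIX weighs them depends on the hdisk's algorithm attribute, and this is not a statement of what dotpaths actually sets, so check the chpath man page and the Host Utilities documentation before using real values.

```python
import shlex
from typing import NamedTuple


class Path(NamedTuple):
    disk: str        # hdisk name, e.g. "hdisk2"
    parent: str      # parent adapter, e.g. "fscsi0"
    connection: str  # connection string as shown by lspath
    primary: bool    # True if the path goes to the controller owning the LUN


def chpath_commands(paths, primary_priority, secondary_priority):
    """Return one chpath command line per path, as strings for review."""
    commands = []
    for p in paths:
        priority = primary_priority if p.primary else secondary_priority
        args = ["chpath", "-l", p.disk, "-p", p.parent,
                "-w", p.connection, "-a", f"priority={priority}"]
        commands.append(shlex.join(args))
    return commands


# Hypothetical sample data, not taken from the system in this thread.
SAMPLE_PATHS = [
    Path("hdisk2", "fscsi0", "500a098187f93622,1000000000000", True),
    Path("hdisk2", "fscsi1", "500a098197f93622,1000000000000", False),
]

if __name__ == "__main__":
    # 255 and 1 are placeholder priorities (an assumption, see above).
    for line in chpath_commands(SAMPLE_PATHS, 255, 1):
        print(line)
```

Printing the commands first makes it easy to compare them against what `dotpaths -q` reports before touching a live system.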

Do not use "sanlun lun show -v -p" to display path priorities, or at least be sure to run cfgmgr first if you do. lspath is much better: the code path seems to be different for those 2 utils, and lspath always displays the correct state from the AIX ODM.

Additionally iostat has a syntax (-m/-M or something like that)  to display *current* path utilizations every second. Do a test, write some big 10G file and make observations...

By default lspath does not display priorities, but you can force it to do so. It needs a switch in the form of "lspath -F dev,conn, etc, etc", a more verbose way of displaying stuff. The man pages for lspath, chpath, etc. are your friends on this one.
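To make the lspath -F idea a bit more concrete, here is a small Python sketch that turns delimiter-separated lspath output into a per-disk summary of path states. It assumes the output was produced with a format string such as "name:parent:connection:status"; the sample text is hypothetical, and the sketch covers path status only, not priorities.

```python
from collections import defaultdict

# Hypothetical output of: lspath -F "name:parent:connection:status"
SAMPLE_LSPATH = """\
hdisk2:fscsi0:500a098187f93622,1000000000000:Enabled
hdisk2:fscsi0:500a098197f93622,1000000000000:Enabled
hdisk2:fscsi1:500a098287f93622,1000000000000:Failed
hdisk3:fscsi0:500a098187f93622,2000000000000:Enabled
"""


def summarize_paths(text):
    """Count paths per status for each disk."""
    summary = defaultdict(lambda: defaultdict(int))
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(":")
        if len(fields) != 4:
            raise ValueError(f"unexpected lspath line: {line!r}")
        name, _parent, _connection, status = fields
        summary[name][status] += 1
    return {disk: dict(counts) for disk, counts in summary.items()}


def disks_with_problems(summary):
    """Return the disks that have at least one path not in Enabled state."""
    return sorted(
        disk for disk, counts in summary.items()
        if any(status != "Enabled" for status in counts)
    )


if __name__ == "__main__":
    result = summarize_paths(SAMPLE_LSPATH)
    for disk, counts in sorted(result.items()):
        print(disk, counts)
    print("needs attention:", disks_with_problems(result))
```

Running this before and after a failover test gives a quick view of whether paths came back to Enabled on their own.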

AIX6.1 TL2 and TL3 here and it works (with some APARs), even for rootvg (in RR mode), but be sure to get a support note stating that you are supported (support matrix/PVR). I'm doing some extensive testing here (including SMSAP, HACMP/PowerHA, PM, VIOS) that includes failing VIOS under a stress test of Oracle/SAP plus failovers of LPARs using HACMP/PowerHA, hot backups using SMSAP under Oracle load, restores using PM/Vaulting, etc. Everything seems to be working so far with round-robin. Note: installing an AIX LPAR requires limiting the portset to only 1 path during the NIM/AIX install, mainly because the installation likes to put SCSI reservation locks on the LUNs. So after you install, it is good to clear the locks by using "lun offline" or "vol offline" and then start the box for the first time, or something like this. Be sure to read the host attachment kit for AIX/FC in detail for more information.

The problem with traffic on the cluster interconnect might be that:

1) ALUA failed to initialize the priorities. I'm not sure whether "dotpaths -q" would show you anything interesting in that case.

2) It is normal, because AIX MPIO has a heartbeat that probes every path according to the hdisk MPIO health-check interval. It needs to send a SCSI inquiry for the hdisk down every path, which may be why you are seeing this traffic, especially if you have many LPARs/AIX servers.

Helped? I need to start collecting points here.... 😜

-Jakub.
