Additional Virtualization Discussions
The storage team are on my case about our Linux boxes using both the primary and the partner path to send data to the NetApp filers. Our Solaris boxes don't do that, but I can't see a way to get multipath on the Linux boxes to behave the same way.
Unfortunately, some of our boxes are older RHEL 4.9 with multipath 4.5 on them, so they're not running the latest and greatest.
I've tried several configurations and still see activity on both the primary and partner paths.
Can anyone offer some insight into how to configure multipath so that data only goes down the primary path, with the secondary reserved for failover?
Thanks,
Nigel
Hi Nigel,
Yes, you can configure multipath to send IO through primary paths by setting the following parameter in /etc/multipath.conf file's device specific section:
path_grouping_policy group_by_prio
In this case, grouping happens based on path priority, and as long as at least one path in the primary group is active, IO will go through that group. There are other useful parameters which need to be set for better performance and to work with NetApp controllers. For a complete sample multipath.conf file for RHEL 4.9, please refer to the "DM-Multipath configuration" section of the "Linux Host Utilities 5.3 Installation and Setup Guide" available on NOW.
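As a rough sketch only (the values here are illustrative placeholders - confirm them against the sample file in the guide), the device-specific section looks something like this:

```
devices {
    device {
        vendor               "NETAPP"
        product              "LUN"
        # group paths by the priority reported by the callout below
        path_grouping_policy group_by_prio
        # ONTAP-aware callout that assigns higher priority to primary paths
        prio_callout         "/sbin/mpath_prio_ontap /dev/%n"
        # return to the primary group as soon as it recovers
        failback             immediate
    }
}
```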
Hope this helps.
Thanks,
Raj
Hi Raj,
I already have group_by_prio. I'm using the ontap callout to determine priority, and I can see it returning different values (4 for the primary and 1 for the partner path, if I recall), but we're still seeing data down both. We actually have two primary and two partner paths.
-sh-3.00# multipath -l
mpath1 (360a98000572d44345a34654571714976)
[size=10 MB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
\_ 0:0:0:1 sda 8:0 [active]
\_ 1:0:1:1 sdd 8:48 [active]
\_ round-robin 0 [enabled]
\_ 0:0:1:1 sdb 8:16 [active]
\_ 1:0:0:1 sdc 8:32 [active]
-sh-3.00# echo "show paths"|multipathd -k
multipathd>
0:0:0:1 sda 8:0 4 [active][ready] XXXXXXXXXXXX........ 12/20
0:0:1:1 sdb 8:16 1 [active][ready] XXXXXXXXXXXX........ 12/20
1:0:0:1 sdc 8:32 1 [active][ready] XXXXXXXXXXXX........ 12/20
1:0:1:1 sdd 8:48 4 [active][ready] XXXXXXXXXXXX........ 12/20
-sh-3.00# ./san_version
NetApp Linux Host Utilities version 5.3
-sh-3.00# pwd
/opt/netapp/santools
-sh-3.00# rpm -q --whatprovides /opt/netapp/santools/san_version
netapp_linux_host_utilities-5-3
-sh-3.00# rpm -lq netapp_linux_host_utilities
/opt/netapp
/opt/netapp/santools
/opt/netapp/santools/NOTICE.PDF
/opt/netapp/santools/san_version
/opt/netapp/santools/sanlun
/usr/sbin/sanlun
/usr/share/man/man1/sanlun.1
Seems like we might be missing something?
Ok, could you paste the output of "multipath -ll" and full multipath.conf (with all comments removed, if any) so that we can verify the settings? Also, when IO is happening to /dev/mapper/mpath1, capture the output of "iostat 2 /dev/sda /dev/sdb /dev/sdc /dev/sdd". IO should only be happening through /dev/sda and /dev/sdd in this case. IO on other paths should be close to 0.
Yes, of course. Here we go.
-sh-3.00# multipath -ll
mpath1 (360a98000572d44345a34654571714976)
[size=10 MB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=8][active]
\_ 0:0:0:1 sda 8:0 [active][ready]
\_ 1:0:1:1 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=2][enabled]
\_ 0:0:1:1 sdb 8:16 [active][ready]
\_ 1:0:0:1 sdc 8:32 [active][ready]
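If I'm reading the docs right, those [prio=8] and [prio=2] values are just the sums of the per-path priorities that mpath_prio_ontap returns (4 per primary path, 1 per partner path) - a trivial sketch:

```shell
# group_by_prio sums member path priorities to get each group's priority.
# mpath_prio_ontap returned 4 for each primary path, 1 for each partner path.
primary_group=$((4 + 4))   # sda + sdd -> the [prio=8] group above
partner_group=$((1 + 1))   # sdb + sdc -> the [prio=2] group above
echo "primary=$primary_group partner=$partner_group"
```

This prints "primary=8 partner=2", matching the two groups in the multipath -ll output.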
-sh-3.00# cat /etc/multipath.conf|grep -v "^#"|grep -v "^$"
defaults {
user_friendly_names yes
}
devnode_blacklist {
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z]"
devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
devices
{
device
{
vendor "NETAPP"
product "LUN"
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
prio_callout "/sbin/mpath_prio_ontap /dev/%n"
features "1 queue_if_no_path"
hardware_handler "0"
path_grouping_policy group_by_prio
failback immediate
rr_weight priorities
rr_min_io 128
path_checker readsector0
}
}
I had changed rr_weight from uniform to see if that'd fix it.
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 2.00 0.00 160.50 0.00 192584.00 0.00 96292.00 1199.90 48.21 168.34 2.79 44.85
avg-cpu: %user %nice %sys %iowait %idle
0.62 0.00 27.00 30.25 42.12
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 10.50 21.50 119.50 172.00 102472.00 86.00 51236.00 727.97 56.05 397.49 4.14 58.35
sdb 0.00 0.00 4.00 0.00 32.00 0.00 16.00 0.00 8.00 0.01 2.88 2.88 1.15
sdc 0.00 0.00 4.00 0.00 32.00 0.00 16.00 0.00 8.00 0.02 4.62 4.62 1.85
sdd 0.00 0.00 4.00 78.50 32.00 21480.00 16.00 10740.00 260.75 22.90 534.46 3.67 30.25
avg-cpu: %user %nice %sys %iowait %idle
0.25 0.00 25.00 1.25 73.50
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 6.00 1.50 48.00 12.00 24.00 6.00 8.00 0.00 0.13 0.13 0.10
sdb 0.00 0.00 0.50 0.00 4.00 0.00 2.00 0.00 8.00 0.01 11.00 11.00 0.55
sdc 0.00 0.00 0.50 0.00 4.00 0.00 2.00 0.00 8.00 0.01 30.00 30.00 1.50
sdd 0.00 0.00 0.50 0.00 4.00 0.00 2.00 0.00 8.00 0.00 0.00 0.00 0.00
Not sure how close to 0 it should be, but it's not 0. The storage guys only gave me 10 MB to play with at first; when I asked for 5 GB they gave me 4, so I have been able to do some sustained writes, and this is what it eventually threw up.
Please let me know if there's anything else I can provide.
Okay, uniform was the right setting. But you will certainly see some negligible IO (close to 0, but not 0) on the secondary paths, since multipath periodically sends a small amount of IO (path checkers) to verify the health of the paths. If you are seeing heavy IO on the secondary paths, the controllers will warn you with something like "FCP partner paths misconfigured". If you are seeing very minimal IO compared to the other two paths, you should be fine. To continuously pump IO through mpath1, run dd in an infinite loop - "while true; do dd if=/dev/mapper/mpath1 of=/dev/null; done" - and then check whether the IO on sdb and sdc stays negligible.
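In case it helps, here is that loop as a self-contained sketch, bounded and pointed at a scratch file so it can run anywhere; on the host you would point it at /dev/mapper/mpath1 instead and let it run indefinitely:

```shell
# Bounded stand-in for the infinite dd read loop from the post above.
# SRC is a scratch file here; substitute /dev/mapper/mpath1 on the real host.
SRC=$(mktemp)
dd if=/dev/zero of="$SRC" bs=1024 count=4 2>/dev/null
passes=0
while [ "$passes" -lt 3 ]; do    # the real loop would be "while true"
    dd if="$SRC" of=/dev/null 2>/dev/null
    passes=$((passes + 1))
done
echo "completed $passes passes"
rm -f "$SRC"
```

While it runs against the real device, watch iostat in another terminal to confirm the reads stay on the primary paths.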
Well, we reset the counters and I restarted multipathd, and I guess the amount of data is negligible. I'll try it on another server tomorrow and report back.
Thanks.