Our NetApp E2724 is direct-attached to two Cisco C240 M4 servers running ESXi 6.0.0. The volume configuration is one volume group with two volumes. For volume “A” the preferred controller is A, and for volume “B” the preferred controller is B.
Everything works fine, including failover, but I always get the critical error message “Volume not on preferred path due to AVT/RDAC failover”.
Configuration (see attachment):
2 x VMware ESXi 6.0.0
iSCSI (2 vSwitches) direct attached
Path Selection: Round Robin (also tried “Fixed”)
Storage Array Type: VMW_SATP_ALUA
Does anyone know what could be done to solve this problem, or is there a workaround?
Does it make sense to disable ALUA? (And how can I disable ALUA?)
That's not a big deal. You just need to go into SANtricity, highlight the volume and right-click. One of the options will be to change ownership / preferred path. Change it to A or B, whichever the preferred path should be.
If it keeps happening, then something is causing it. I would make sure the failover drivers are set up correctly on all systems connecting to those ports. It only takes one incorrectly configured system to cause issues with all LUNs presented down that path.
My system setup is as follows: three volumes were created. Two of them are accessed via Controller A and work great.
But the one accessed via Controller B, called "Netapp-E2724_ESXi6-2", keeps raising the "Volume not on preferred path due to AVT/RDAC failover" message and switches to Controller A every day.
Until the E2724 changes the path of volume "Netapp-E2724_ESXi6-2", the volume works normally.
What can I do to find out the root cause?
Some system information ...
At ESXi side ...
Result of command "esxcli storage nmp device list"
MAPPINGS (SANshare Storage Partitioning - Enabled (1 of 128 used))-------------------
Volume Name           LUN  Controller  Accessible by      Volume status  DA Enabled  Volume Capacity  Type
Access Volume         7    A,B         Default Group      Optimal        NA                           Access
Access Volume         7    A,B         Host Group VMware  Optimal        NA                           Access
Netapp-E2724_ESXi6-1  0    A           Host Group VMware  Optimal        Yes         3,072.000 GB     Standard
Netapp-E2724_ESXi6-2  1    A           Host Group VMware  Optimal        Yes         3,072.000 GB     Standard
Netapp-E2724_ESXi6-3  2    A           Host Group VMware  Optimal        Yes         3,072.000 GB     Standard
STORAGE ARRAY
Default type: Factory Default

Host Group: VMware
Data Assurance (DA) capable: Yes

Host: BE15
Host type: Windows
Interface type: Fibre Channel
Host port identifier: 50:01:43:80:05:67:49:62
Alias: BE15_p1
Host port identifier: 50:01:43:80:05:67:48:2a
Alias: BE15_p2
Data Assurance (DA) capable: Yes
Large sector size supported: No

Host: ESXi6-3
Host type: VMWare
Interface type: Fibre Channel
Host port identifier: 50:01:43:80:00:c5:2f:dc
Alias: ESXi6-3_p1
Host port identifier: 50:01:43:80:00:c5:30:38
Alias: ESXi6-3_p2
Data Assurance (DA) capable: Yes
Large sector size supported: No

Host: ESXi6-1
Host type: VMWare
Interface type: Fibre Channel
Host port identifier: 50:01:43:80:00:c4:6d:cc
Alias: ESXi6-1_p1
Host port identifier: 50:01:43:80:00:c4:6d:68
Alias: ESXi6-1_p2
Data Assurance (DA) capable: Yes
Large sector size supported: No

Host: ESXi6-2
Host type: VMWare
Interface type: Fibre Channel
Host port identifier: 50:01:43:80:03:ae:a4:9a
Alias: p1
Host port identifier: 50:01:43:80:03:ae:a4:2c
Alias: p2
Data Assurance (DA) capable: Yes
Large sector size supported: No
NVSRAM HOST TYPE DEFINITIONS
NOTE: The following indexes are not used: 3 - 4, 11 - 14, 16, 19 - 21, 28 - 31
HOST TYPE                            ALUA/AVT STATUS  INDEX
AIX MPIO                             Disabled         9
AVT_4M                               Enabled          5
Data ONTAP (ALUA)                    Enabled          26
Factory Default                      Disabled         0 (Default)
HP-UX                                Enabled          15
Linux (ATTO)                         Enabled          24
Linux (DM-MP)                        Enabled          7
Linux (MPP/RDAC)                     Disabled         6
Linux (Pathmanager)                  Enabled          25
Linux (Symantec Storage Foundation)  Enabled          27
Mac OS                               Enabled          22
SVC                                  Enabled          18
Solaris (v11 or later)               Enabled          17
Solaris (version 10 or earlier)      Disabled         2
VMWare                               Enabled          10
Windows                              Enabled          1
Windows (ATTO)                       Enabled          23
Windows Clustered                    Enabled          8
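As a quick cross-check of the profile above, the current owner from the MAPPINGS table can be compared with the intended owner of each volume. This is only a minimal sketch: the current owners are copied from the table, while the preferred owners are an assumption taken from the description in this post (two volumes on A, "Netapp-E2724_ESXi6-2" on B), not from the profile itself. The function name is made up for illustration.

```python
# Current owner per volume, copied from the MAPPINGS table in the profile.
current_owner = {
    "Netapp-E2724_ESXi6-1": "A",
    "Netapp-E2724_ESXi6-2": "A",
    "Netapp-E2724_ESXi6-3": "A",
}

# Preferred owner per volume. ASSUMPTION: taken from the post's description
# (two volumes on A, "Netapp-E2724_ESXi6-2" on B), not from the profile.
preferred_owner = {
    "Netapp-E2724_ESXi6-1": "A",
    "Netapp-E2724_ESXi6-2": "B",
    "Netapp-E2724_ESXi6-3": "A",
}


def volumes_off_preferred_path(current, preferred):
    """Return (volume, preferred, current) tuples where the owners differ."""
    return [
        (name, preferred[name], owner)
        for name, owner in sorted(current.items())
        if name in preferred and preferred[name] != owner
    ]


for name, want, have in volumes_off_preferred_path(current_owner, preferred_owner):
    print(f"{name}: preferred controller {want}, currently on {have}")
```

With the values above, only "Netapp-E2724_ESXi6-2" is reported, which matches the volume that keeps raising the alert.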
You should check the file major-event-log.txt extracted from auto-support-file.7z.
In my case, it was caused by I/O shipping. We found that the backup server had only a single path, connected to Controller A. The backup traffic caused I/O shipping, which made the volumes owned by Controller B move to Controller A. Eventually, we set the preferred path to Controller A on all volumes.
You can find more information in KB Doc ID 1015723. It describes how this happens and how to resolve it.
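To look for the same kind of culprit, it may help to list every host that lacks a path to one of the two controllers. The sketch below is only an illustration: the host names and the path inventory are hypothetical and would have to be filled in by hand from each host's multipath view; nothing here queries the array or the hosts.

```python
def hosts_missing_a_controller(host_paths, controllers=("A", "B")):
    """Map each host to the sorted list of controllers it has no path to."""
    missing = {}
    for host, reachable in host_paths.items():
        gaps = sorted(set(controllers) - set(reachable))
        if gaps:
            missing[host] = gaps
    return missing


# HYPOTHETICAL inventory: host name -> controllers it has a live path to.
host_paths = {
    "esxi-1": ["A", "B"],
    "esxi-2": ["A", "B"],
    "backup-server": ["A"],  # single path, like the backup server in my case
}

print(hosts_missing_a_controller(host_paths))
```

Any host that shows up in the result can only reach volumes owned by the other controller through I/O shipping, so it is a candidate for the ownership changes.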
======= my major-event-log.txt =======
Date/Time: 2016/8/20 11:41:58 PM
Sequence number: 2429
Event type: 4011
Event category: Error
Priority: Critical
Event needs attention: true
Event send alert: true
Event visibility: true
Description: Volume not on preferred path due to AVT/RDAC failover
Event specific codes: 0/0/0
Component type: Controller
Component location: Tray 99, Slot A
Logged by: Controller in slot A
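If the log is long, counting how often event type 4011 occurs per day can show whether the ownership change lines up with a scheduled job such as a backup. The following is a rough sketch, assuming the file holds one "Field: value" pair per line and that every record starts with a "Date/Time:" line; the function names are hypothetical, and the sample text is a shortened copy of the entry above.

```python
from collections import Counter

# Shortened copy of the log entry above, one field per line (assumed layout).
SAMPLE = """\
Date/Time: 2016/8/20 11:41:58 PM
Sequence number: 2429
Event type: 4011
Priority: Critical
Description: Volume not on preferred path due to AVT/RDAC failover
Component location: Tray 99, Slot A
"""


def parse_events(text):
    """Split the log into dicts; a new record starts at each 'Date/Time:' line."""
    events, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and separator lines
        key, value = key.strip(), value.strip()
        if key == "Date/Time":
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value
    return events


def count_by_type(events, event_type="4011"):
    """Count events of one type per calendar date (first token of Date/Time)."""
    return Counter(
        e["Date/Time"].split()[0]
        for e in events
        if e.get("Event type") == event_type
    )


print(count_by_type(parse_events(SAMPLE)))
```

To run it against a real log, read major-event-log.txt into a string and pass that to parse_events instead of SAMPLE.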