ONTAP Discussions

VMware and NetApp igroup compatibility. Not sure who is at fault.

DBWannaBe
9,091 Views

Hello,

 

I've got a vCenter running 7.0.3 and a brand-new ESXi host running 7.0.3.  I've configured the fabric switch with the proper zoning and am having an issue mounting storage.  ONTAP is 9.9.1P10.  Storage is FAS.

 

Here's the issue:

If I create a new LUN & Volume and then add the appropriate WWPN to the initiator group, the ESXi host does not detect the storage.   When I run "igroup show -instance" using the cli on the NetApp, it shows that the WWPN is in the correct zone but reports "not logged in".

 

I believe that this shows that the fabric is configured correctly so the issue is with my endpoints, the ESXi host and the NetApp.

 

When I take that same ESXi host and move it into an existing vSphere cluster and move the WWPN in to an existing igroup on the NetApp, the storage comes up normally and I can view all of the LUNs that the existing vSphere cluster sees.

 

I experience the same problem whether or not I use ONTAP to create the volume and initiator, or if I use ONTAP Tools for vSphere.

 

Thanks for any insights or thoughts.

paul_stejskal
9,083 Views

Did you rescan the storage from the ESXi side through vSphere? Also did you map the LUN to the igroup?

 

Without having some command outputs like igroup show, lun show, etc., it's hard to say. If you want, you can join Discord (https://discord.gg/netapp) and get some live help, or provide your serial number to look at AutoSupports. Also a case is a valid option.

TMACMD
9,081 Views

ONTAP is not logging into the switch (per comment). I suspect the zoning is using the HW port name instead of the SVM port name.

TMACMD
9,083 Views

When zoning... if doing soft zoning... BE SURE TO USE THE WWPN of the SVM, not the physical card. I see this happen all too frequently.

The switch must support NPIV, which allows multiple WWPNs on a physical port.

WWPNs from ONTAP usually start with 50: for hardware (do not use) and 20: for the SVM (use these!).
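
As a rough illustration of that rule of thumb, here is a minimal Python sketch that sorts a list of ONTAP-side WWPNs by their first byte. The function names and the sample WWPNs are made up for this example, and the 50:/20: prefixes are only the "usually" heuristic from the post, not a guarantee, so treat the result as a hint and confirm against what ONTAP reports for the SVM's FC LIFs.

```python
def classify_ontap_wwpn(wwpn: str) -> str:
    """Classify a colon-separated WWPN using the first-byte rule of thumb.

    Heuristic only: '20' is assumed to be an SVM LIF, '50' a physical port.
    """
    first_byte = wwpn.strip().lower().split(":")[0]
    if first_byte == "20":
        return "svm-lif"
    if first_byte == "50":
        return "physical-port"
    return "unknown"


def zone_candidates(wwpns):
    """Return only the WWPNs that look like SVM LIFs (the ones to zone)."""
    return [w for w in wwpns if classify_ontap_wwpn(w) == "svm-lif"]


# Hypothetical sample values, not taken from any real system.
sample = ["50:0a:09:82:00:00:00:01", "20:0b:00:a0:98:00:00:02"]
print(zone_candidates(sample))
```

Anything classified as "unknown" deserves a manual look rather than a guess.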

RobS63
6,879 Views

I came across this also. The SVM did not show up until I enabled NPIV on my MDS, and once I zoned the SVM PWWN I was able to see all the hosts I expected.

DBWannaBe
9,024 Views

Thank you both for the suggestions and the comments. 

 

I should have noted that while the ESXi host is new, the switch, the vCenter, and the NetApp are not. 

 

The new ESXi host's switch ports were added to the existing vsan/vlan.  90% of our storage routes through this switch currently.   As far as I can tell, the switch configuration is correct because I can add this new ESXi host to an existing cluster (and add its WWPN to that igroup) and it can see and interact with the storage.  I'm simply adding a new host into the zoning.

 

If there's a log that I can provide that might show an error, which one is it? 

I spent most of today updating vSphere to the current patch level.  I just created a new LUN and volume as well as a new igroup to check after the update, but no luck.  The new ESXi host does not see the storage, though the NetApp shows that the ESXi host's WWPNs are logged in.

 

 

Our current guess is that it's vSphere that is at fault but I've found nothing conclusive to support that.  Like there's some flag that isn't getting set when new storage is trying to be added.

TMACMD
9,019 Views

If ONTAP is reporting “not logged in” that usually indicates a problem with the zoning. 

Is this a new SVM as well? Or an existing SVM?

 

 You may want to just open a case and get it resolved quickly

DBWannaBe
8,965 Views

Not a new SVM.  Just adding a new host to an existing system.  New cluster in vSphere and attempting to connect to a new igroup and LUN.  Connecting to a previously configured igroup and LUN works just fine.

cruxrealm
8,964 Views

Reading through your post,  here is what I can gather:
1. new esxi host,  old switch, old storage

2. adding the new esxi host to old cluster works.

3. new cluster using new esxi does not work.

 

It looks like the issue points to two things,   zoning and/or vsphere esxi host storage (vmhba) configuration.

 

=> You might want to check your zoning. See @TMACMD's post above (NPIV, WWPN, a new zone if you are separating it from the old cluster).
=> You might also want to check that the vSphere storage adapters on the new ESXi host are enabled/online. Identify the correct vmhba and zone its initiator (the 2nd number in the vmhba identifier) to the virtual port ID of the storage.
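
To make the "2nd number" point concrete: ESXi lists an FC adapter's identifier in the form fc.<WWNN>:<WWPN>, and the second half is the initiator WWPN that belongs in the zone and the igroup. Below is a minimal Python sketch that pulls that half out and reformats it in the colon-separated style used in zoning and igroups. The function name and the sample identifier are made up for illustration.

```python
import re

# fc.<16 hex digits WWNN>:<16 hex digits WWPN>
_FC_UID = re.compile(r"^fc\.([0-9a-fA-F]{16}):([0-9a-fA-F]{16})$")


def wwpn_from_esxi_uid(uid: str) -> str:
    """Return the WWPN (2nd number) of an ESXi FC adapter UID, colon-separated."""
    match = _FC_UID.match(uid.strip())
    if match is None:
        raise ValueError(f"not an FC adapter UID: {uid!r}")
    wwpn = match.group(2).lower()
    # Split the 16 hex digits into 8 two-digit bytes.
    return ":".join(wwpn[i:i + 2] for i in range(0, 16, 2))


# Hypothetical adapter identifier, not taken from a real host.
print(wwpn_from_esxi_uid("fc.20000024ff000001:21000024ff000001"))
```

Comparing the result character by character with the initiator shown in the igroup is a quick way to rule out a mistyped or swapped WWNN/WWPN.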

DBWannaBe
8,949 Views

I'll review my zoning but I'm not sure what to look for?  This zoning works when I connect to old storage and not the new LUN.  Also, the ESXi host network adapter works when I connect to old storage but not when I connect to new LUNs. 

 

I built a new vCenter and added this new host to it.  I redid my WWPNs from the host to the switch zonesets and then added them to the igroups for the new LUNs.  No change in behavior.  Still can't access the storage.

 

The only difference I see on my new igroup is that the protocol comes up as mixed.  As in, NetApp will accept both iSCSI and FC.  All of the old igroups are just listed as FC.

 

DBWannaBe
8,867 Views

Ok, I manually created the new LUN and igroup for that LUN.  The igroup has been mapped to the LUN.

I created a new vCenter and added a freshly reformatted server to the vCenter as a new host.

I double-checked that my HBAs are on the compatibility matrix and that they are supported.

I edited the fabric to include the new WWPNs and edited the zoning.  When I attempted to reach the storage, it failed.  vSphere does not detect any new datastores.

When I run "igroup show -instance" I can see the new WWPNs are connected to the correct LUN and they show a status of "Logged In".

This has to be a VMware issue but I'm not sure where it is?

I've opened cases with both NetApp and VMware.  We'll see how this goes.  Either way, I'll update this conversation so that we all know the result.

TMACMD
8,866 Views

Great! After you told the host to scan storage, it logged in and ONTAP is seeing it.

On ESXi, you need to right-click the host, click Storage, and create a new datastore. It should pick up the LUN, and you can format it with VMFS 6.

If the LUN already has data, then disregard.

DBWannaBe
8,865 Views

That's the problem.

I rescan the adapters and it doesn't detect any new datastores. 

If I attempt to create a new datastore, it only sees the local hard drive on the server as a possible location.

This issue has to be a VMware problem.  Everything looks correct.  I have a work order opened with them and am currently awaiting their response.

TMACMD
8,839 Views

One more thing came to mind. You could have resolved this in a couple of different ways:

1. added all the aggregates to the SVM, and

2. added all your ONTAP hosts' FC WWPNs to the zone

TMACMD
8,867 Views

You did create the LUN as type VMware, and it's being shared over FC?

This is ONTAP (not E-Series)? I saw an issue last week that we tested with a 4K size, and VMware would scan it properly.

What does the output of "lun show -path /xxx -instance" look like, and also "lun mapping show -path /xxx -instance"?

DBWannaBe
8,855 Views

Thank you for your suggestion to run the command "lun mapping show -instance"! 

The issue turned out to be that the LUN was automatically created on our AFF and not on our FAS even though the Storage VM was listed as the FAS.

I performed a "vol move" and got it over to where the volumes are supposed to live, and everything showed up in VMware!

 

Holy smokes, I can get on with my life!  Thanks for hanging in there with me and for your suggestions!

 

 

paul_stejskal
8,853 Views
Can you take screenshots of what the confusion is? If so, we can create a KB and hopefully push for product improvements in the future. Obviously obfuscate PII from the screenshots (use the select then crop tools in MS Paint for example).

Thank you.

TMACMD
8,852 Views

One of my peeves with the GUI is the placement of things: volumes, LUNs, etc.

I know how to do it, but it is not obvious by any means.

I've had so many customers create a "SAS" volume and ONTAP places it on SATA. I get a call for performance issues, and that's what it usually is!

I'd be willing to bet something similar happened here.

DBWannaBe
8,836 Views

For me, the issue with creating a new LUN is that the GUI allows you to select which SVM to use and I figured that this would create the LUN on the specified SVM but that's not what happened.  ONTAP created the LUN on our all flash and not on our SATA which is what I thought I was specifying.  Looking at the GUI I could see that the SVM was our FAS system but running the "lun mapping show -instance" command clued me in that I was actually attempting to connect to our AFF.

 

Attaching to the storage should have been allowed whichever way since these paths are redundant but for whatever reason that pathing isn't working.  This points to another issue that I have to figure out but that is outside of the scope of this problem.

TMACMD
8,826 Views

You are experiencing what NetApp calls Selective LUN Mapping (SLM).

By default, a LUN is only visible through the node that houses it and its HA partner.

You can certainly enable more reporting nodes, but you are best served using SLM.
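
The default SLM behavior described here can be modeled in a few lines. This is a minimal Python sketch, not ONTAP code: the node names, the HA pairing, and the assumption that the host is only zoned to LIFs on the FAS nodes are all hypothetical, chosen as one plausible reading of this thread.

```python
def reporting_nodes(owning_node, ha_partner_of):
    """Default SLM rule: the LUN's owning node plus its HA partner."""
    nodes = {owning_node}
    partner = ha_partner_of.get(owning_node)
    if partner is not None:
        nodes.add(partner)
    return nodes


def host_can_see_lun(owning_node, ha_partner_of, zoned_lif_nodes):
    """True if the host is zoned to a LIF on at least one reporting node."""
    return bool(reporting_nodes(owning_node, ha_partner_of) & set(zoned_lif_nodes))


# Hypothetical four-node cluster: one FAS HA pair and one AFF HA pair.
ha_partner_of = {
    "fas-01": "fas-02", "fas-02": "fas-01",
    "aff-01": "aff-02", "aff-02": "aff-01",
}
zoned_lif_nodes = ["fas-01", "fas-02"]  # assumption: host zoned to FAS LIFs only

print(host_can_see_lun("aff-01", ha_partner_of, zoned_lif_nodes))  # LUN landed on AFF
print(host_can_see_lun("fas-01", ha_partner_of, zoned_lif_nodes))  # after the vol move
```

Under those assumptions the first call returns False and the second returns True, which matches the symptom: the mapping looked right, but no path the host could reach was reporting the LUN until the volume moved.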

TMACMD
8,860 Views

Also, have you verified your FC switch make/model/version against the hardware matrix?
