Netapp OnTap NDU - problem with Hyper-V cluster

Hi, we recently upgraded from 7.3.2 to 8.0.3 on our FAS3140 cluster. As usual, all ESX/VMware hosts connected to NFS storage were absolutely fine during the 'cf takeover' and 'giveback' processes, as were all standalone Windows hosts connected over iSCSI.

We have a couple of 2008 R2 Hyper-V clusters now and use cluster shared volumes in both of them. These didn't do so well during the takeovers/givebacks. They showed all available storage as offline (CSV, quorum disk, etc.) and several, if not all, of the VMs on these clusters rebooted by themselves. Has anyone else experienced similar symptoms from Hyper-V clusters during (what is supposed to be) a non-disruptive upgrade? Does anyone have a take on what makes the clusters so much more sensitive to that moment when all sessions are in limbo until the takeover or giveback is complete?

Thanks in advance for any and all replies.

Re: Netapp OnTap NDU - problem with Hyper-V cluster

The first suspect is the disk timeout values, which are usually reset when the cluster is configured. Since the usual sequence is to install HU and then configure the cluster, the timeout ends up changed from the required value. Try reapplying the HU settings.
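To see whether the timeout was changed, you can read it on each Hyper-V node. Below is a minimal sketch in Python, assuming the setting in question is the standard Windows disk class timeout (the `TimeOutValue` registry value under `HKLM\SYSTEM\CurrentControlSet\Services\Disk`). The function names are made up for illustration, and the expected number of seconds is deliberately left as a parameter: take it from the HU documentation for your version rather than from this example.

```python
"""Read the Windows disk I/O timeout and compare it with an expected value."""
try:
    import winreg  # standard library, but only present on Windows
except ImportError:
    winreg = None

# Assumption: the timeout discussed here is the disk class driver timeout.
DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def read_disk_timeout():
    """Return the current TimeOutValue in seconds, or None if it is not set."""
    if winreg is None:
        raise RuntimeError("this check only works on a Windows host")
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISK_KEY) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
        except FileNotFoundError:
            return None
    return int(value)


def timeout_report(current, expected):
    """Describe how the current timeout compares with the expected one."""
    if current is None:
        return "TimeOutValue is not set"
    if current < expected:
        return f"TimeOutValue is {current}s, below the expected {expected}s"
    return f"TimeOutValue is {current}s, OK"
```

On a cluster node you would call `timeout_report(read_disk_timeout(), expected)` with `expected` set to whatever your HU version requires, and repeat the check on every node, since the value is per host.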

BTW, are you using native MS DSM or NTAP DSM?

Re: Netapp OnTap NDU - problem with Hyper-V cluster

'No DSM'. Unfortunately these are blade servers, and since Microsoft recommends about a dozen NICs in a Hyper-V server (dedicating one to each type of traffic, etc.), we just don't have enough to multipath to the storage system.

I'll assume you are referring to Hyper-V when you say 'try reapplying HU settings'?  You mean HV, right?  Just want to be clear on what you're recommending.  Thanks for your reply.

Re: Netapp OnTap NDU - problem with Hyper-V cluster

Yes, in this case I mean HV itself, as it is the clustered instance. But re-checking the timeouts in the guests does no harm either.

Re: Netapp OnTap NDU - problem with Hyper-V cluster

There is a KB article about this issue: 2013348 (Disks show as offline in Windows 2008 after Data ONTAP upgrade). But I still have some doubts. Is it correct that after an NDU the LUNs stay online until the next server reboot? Or do the LUNs go offline right after the upgrade (after the revision number changes)? This is not 100% clear to me from the KB.

I think that the LUNs stay online and go offline after the reboot of the server...

Any experience?

Re: Netapp OnTap NDU - problem with Hyper-V cluster

hi there,

The thing is, going from ONTAP 7 to ONTAP 8 changes the LUN identifier string from 732 to 803 or something like that, and the Hyper-V hosts think it's a different LUN. There was a BURT for this and it was corrected in later versions; you can now enable or disable the LUN identifier change.

Kind regards,

Thomas

Re: Netapp OnTap NDU - problem with Hyper-V cluster

You mean that I can disable the revision number change on the NetApp? As I understand it, the revision number of the LUN reflects the ONTAP version, so it changes with every upgrade.

So you're saying that I can disable the change and fix the revision number of the LUN, for example to 802, forever?

How please?

Thanx

Jan

Re: Netapp OnTap NDU - problem with Hyper-V cluster

hi jan,

My bad, it's an igroup option:

igroup set <igroup> report_scsi_name yes/no

Kind regards,

Thomas

Re: Netapp OnTap NDU - problem with Hyper-V cluster

Do you have any links to what “SCSI name” actually is? Data ONTAP manuals explain how to enable/disable it but do not really explain what it is …

BTW documentation says it is disabled by default for Windows.

Re: Netapp OnTap NDU - problem with Hyper-V cluster

I checked the manuals and also a live system, and I didn't find the report_scsi_name option for igroups. Did you mean that this option enables/disables the revision number change? Is it really an igroup option? I would have expected it to be LUN-related...