Netapp OnTap NDU - problem with Hyper-V cluster

gpnetsupport

Hi, we just recently upgraded from 7.3.2 to 8.0.3 on our FAS3140 cluster.  As usual, all ESX/VMware hosts connected to NFS storage were absolutely fine during the 'cf takeover' and 'giveback' processes....as were all Windows standalone iSCSI-connected hosts.

We have a couple of 2008R2 Hyper-V clusters now and utilize cluster shared volumes in both of them.  These guys didn't do so well during the takeover/givebacks.  They showed all available storage as offline (CSV, quorum disk, etc) and several if not all of the VMs on these clusters rebooted by themselves.  Anyone else experience similar symptoms from HV clusters during (what is supposed to be) a non-disruptive upgrade?  Anyone have any take on what makes the clusters sooo much more sensitive to that moment in time when all sessions are in limbo until takeover or giveback is complete?

Thanks in advance for any and all replies.

aborzenkov

The first suspect is the disk timeout values, which are usually reset when the cluster is configured. Since the usual sequence is to install HU and then configure the cluster, the timeout gets changed from the required value. Try reapplying the HU settings.

BTW, are you using native MS DSM or NTAP DSM?
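
A minimal sketch for re-checking the disk timeout on a Windows host, assuming the value in question is the standard `TimeOutValue` under the `Disk` service key in the registry. The required figure is not given in this thread, so `REQUIRED_SECONDS` below is a placeholder to replace with whatever your HU documentation specifies, and treating "lower than required" as the failure condition is also an assumption.

```python
import sys

DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeOutValue"


def read_disk_timeout():
    """Return the disk TimeOutValue in seconds, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISK_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return int(value)
    except OSError:
        return None  # key or value missing, or access denied


def timeout_ok(current, required):
    """True only if a value was read and it is at least the required one."""
    return current is not None and current >= required


if __name__ == "__main__":
    REQUIRED_SECONDS = 120  # placeholder assumption, not taken from this thread
    current = read_disk_timeout()
    print("TimeOutValue:", current, "ok:", timeout_ok(current, REQUIRED_SECONDS))
```

Running this on every cluster node before an upgrade gives a quick way to spot a host whose timeout was reset when the cluster was configured.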

gpnetsupport

'No DSM'.....unfortunately these are blade servers, and since Microsoft recommends about a dozen NICs in a hyper-v server...dedicating one for each type of traffic, etc - we just don't have enough to multi-path to the storage system.

I'll assume you are referring to Hyper-V when you say 'try reapplying HU settings'?  You mean HV, right?  Just want to be clear on what you're recommending.  Thanks for your reply.

aborzenkov

Yes, in this case I mean HV itself, as it is the clustered instance. But re-checking the timeouts in the guests does no harm either.

maskajan09

There is a KB article about this issue: 2013348 (Disks show as offline in Windows 2008 after Data ONTAP upgrade). But I still have some doubts. Is it correct that after an NDU the LUNs stay online until the next server reboot? Or do the LUNs go offline right after the upgrade (after the revision number change)? This is not 100% clear to me from the KB.

I think that the LUNs stay online and go offline after the reboot of the server...

Any experience?

thomas_glodde

hi there,

thing is, going from ontap 7 to ontap 8 changes the lun identifier string from 732 to 803 or something like that, and the hyper-v hosts think it's a different lun. there has been a burt for this and it was corrected in later versions, you can now enable or disable the lun identifier change.

Kind regards,

Thomas

maskajan09

You mean that I can disable the revision number change on NetApp? As I understand it, the revision number of the LUN reflects the ONTAP version, so it changes with every upgrade.

So you are saying that I can disable the change and fix the revision number of the LUN, for example to 802, forever?

How please?

Thanx

Jan

thomas_glodde

hi jan,

my bad, it's an igroup option

igroup set <igroup> report_scsi_name yes/no

Kind regards,

Thomas

aborzenkov

Do you have any links to what “SCSI name” actually is? Data ONTAP manuals explain how to enable/disable it but do not really explain what it is …

BTW documentation says it is disabled by default for Windows.

maskajan09

I checked the manuals and also a live system, and I didn't find the option report_scsi_name for igroup. Do you mean that this option should enable/disable the revision number change? Is it really an igroup option? I would say it should be LUN related...

thomas_glodde

hi there,

priv set advanced

igroup set

then you see a few options, i think it came with 7.3.6+ and 8.0.1+.

regards,

thomas

aborzenkov

I understand that this is a new option. My question was what this option means (or does).

thomas_glodde

hi,

when looking at luns on a server there is an id string like NETAPP 00737 which changes to something like NETAPP 00803 when updating. With the new option this string can be kept, which is needed for 2008r2 mscs.

Kind regards,

Thomas

maskajan09

Well, priv set diag (or advanced) was the first thing I did after igroup set in normal mode didn't show the option. But the result in diag mode is the same as in normal mode, at least when I try "igroup set".

Isn't it a hidden option?

But the more important thing is: why should the revision number be affected by the igroup? I would say it is LUN related...

Could you please point me to some documentation? I tried to search the ONTAP 8.0.2 manuals for report_scsi_name and found nothing...

Thank you

thomas_glodde

first:

filer02> version
NetApp Release 8.1.2 7-Mode: Tue Oct 30 19:56:51 PDT 2012
filer02> priv set diag
Warning: These diagnostic commands are for use by NetApp
         personnel only.
filer02*> igroup set
usage:
igroup set [ -f ] <initiator_group> <attribute> <value>
  - sets an attribute on an initiator group

    The current attributes and values supported are:

    ostype: solaris, windows, hpux, aix, linux, netware, vmware,
       openvms, xen and hyper_v. The ostype attribute sets the
       operating system type of the initiators in the initiator group.

    throttle_reserve: value of 0->99
       The throttle_reserve attribute reserves the given
       percentage of SCSI cmdblks for this initiator

    throttle_borrow: yes, or no.
       If yes, then an igroup can borrow cmdblks if it
       exceeds its reserve and cmdblks are available.

    alua: yes, or no.
       If yes, then the initiators in the igroup can
       support Asymmetric Logical Unit Access.

    report_scsi_name: yes, or no.
       If yes, then SCSI Name String (8h) descriptor
       will be reported as part of INQUIRY VPD 0x83 page.

    The -f flag will override all warnings.

For more information, try 'man na_igroup'
filer02*>

second, i will try to get the kb/bug id

maskajan09

Well, I'm on 8.0.2, and after priv set diag and igroup set the option is not there.

The KB id should be 2013348, but it is not clear to me, so that's why I asked here.

thomas_glodde

hrm, then the option comes with 8.1+ maybe, don't have an older ontap to check atm.

kb is the right one, it shows you REV_736 vs REV_0.2 as an example of the string change.
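
To see which disks are affected by a string change like this, one option is to save the device ID strings from the Windows host before and after the upgrade and compare the `REV_` part. This is only a sketch: the sample strings are made up, modelled on the REV_736 / REV_0.2 pair from the KB, and the helper names `revision_of` and `changed_revisions` are hypothetical. How the strings get captured on the host is left out.

```python
import re

# Matches the text after "REV_" up to the next "&" or backslash.
REV_PATTERN = re.compile(r"REV_([^\\&]+)", re.IGNORECASE)


def revision_of(device_id):
    """Return the text after REV_ in a device ID string, or None if absent."""
    match = REV_PATTERN.search(device_id)
    return match.group(1) if match else None


def changed_revisions(before, after):
    """Map disk name -> (old_rev, new_rev) for disks whose REV_ token changed."""
    changes = {}
    for name, old_id in before.items():
        if name not in after:
            continue  # disk not present in the second snapshot
        old_rev = revision_of(old_id)
        new_rev = revision_of(after[name])
        if old_rev != new_rev:
            changes[name] = (old_rev, new_rev)
    return changes


# Made-up sample data, not captured from a real host.
before = {"csv1": r"SCSI\DISK&VEN_NETAPP&PROD_LUN&REV_736"}
after = {"csv1": r"SCSI\DISK&VEN_NETAPP&PROD_LUN&REV_0.2"}
print(changed_revisions(before, after))
```

Any disk that shows up in the result is one the cluster may treat as a different device after the upgrade.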

maskajan09

So how to prevent the revision number change on ontap8.0.2? Is it possible w/o the igroup option?

If it is not possible ... when do the LUNs go offline? Right after the ONTAP upgrade (to 8.0.4) or after the server reboot? I need to be 100% sure about this.

thomas_glodde

afaik it's not possible. you need to do a disruptive update, e.g. get the datastores online after the update. i think (not 100% sure) the version change happens on giveback, as the newly updated filer will then continue to serve the luns.
