Microsoft Virtualization Discussions
Hi, we recently upgraded from 7.3.2 to 8.0.3 on our FAS3140 cluster. As usual, all ESX/VMware hosts connected to NFS storage were absolutely fine during the 'cf takeover' and 'giveback' processes, as were all Windows standalone iSCSI-connected hosts.
We have a couple of 2008R2 Hyper-V clusters now and utilize cluster shared volumes in both of them. These guys didn't do so well during the takeover/givebacks. They showed all available storage as offline (CSV, quorum disk, etc) and several if not all of the VMs on these clusters rebooted by themselves. Anyone else experience similar symptoms from HV clusters during (what is supposed to be) a non-disruptive upgrade? Anyone have any take on what makes the clusters sooo much more sensitive to that moment in time when all sessions are in limbo until takeover or giveback is complete?
Thanks in advance for any and all replies.
The first suspect is the disk timeout values, which are usually reset when the cluster is configured. Since the usual sequence is to install HU first and then configure the cluster, the timeout ends up changed from the required value. Try reapplying the HU settings.
BTW, are you using native MS DSM or NTAP DSM?
'No DSM'.....unfortunately these are blade servers, and since Microsoft recommends about a dozen NICs in a hyper-v server...dedicating one for each type of traffic, etc - we just don't have enough to multi-path to the storage system.
I'll assume you are referring to Hyper-V when you say 'try reapplying HU settings'? You mean HV, right? Just want to be clear on what you're recommending. Thanks for your reply.
Yes, in this case I mean HV itself, as it is the clustered instance. But re-checking the timeouts in the guests does no harm either.
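For anyone who wants to re-check the timeout on each node, here is a minimal sketch. It assumes the value in question is the Windows disk class `TimeOutValue` under `HKLM\SYSTEM\CurrentControlSet\Services\Disk`, and that you paste in the text printed by `reg query` for that value. The helper names and the `required_seconds` parameter are hypothetical; take the required number from the documentation for your HU version rather than from this example.

```python
import re

# Hypothetical helper: parses the text printed by
#   reg query HKLM\SYSTEM\CurrentControlSet\Services\Disk /v TimeOutValue
# and returns the timeout in seconds, or None if the value is not present.
def parse_disk_timeout(reg_output: str):
    match = re.search(r"TimeOutValue\s+REG_DWORD\s+0x([0-9a-fA-F]+)", reg_output)
    return int(match.group(1), 16) if match else None

# Hypothetical check: required_seconds is an assumption supplied by the caller.
def timeout_matches(reg_output: str, required_seconds: int) -> bool:
    return parse_disk_timeout(reg_output) == required_seconds

# Sample output as reg.exe prints it; 0x3c is 60 seconds.
sample = r"""
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
    TimeOutValue    REG_DWORD    0x3c
"""

print(parse_disk_timeout(sample))
print(timeout_matches(sample, 120))
```

Running the same check before and after configuring the cluster would show whether the value was reset along the way.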
There is a KB article about this issue: 2013348 (Disks show as offline in Windows 2008 after Data ONTAP upgrade). But I still have some doubts. Is it correct that after an NDU the LUNs stay online until the next server reboot? Or do the LUNs go offline right after the upgrade (after the revision number change)? This is not 100% clear to me from the KB.
I think that the LUNs stay online and go offline after the reboot of the server...
The thing is, going from ONTAP 7 to ONTAP 8 changes the LUN identifier string from 732 to 803 or something like that, and the Hyper-V hosts think it's a different LUN. There was a BURT for this and it was corrected in later versions; you can now enable or disable the LUN identifier change.
You mean that I can disable the revision number change on the NetApp? As I understand it, the revision number of the LUN reflects the ONTAP version, so it changes with every upgrade.
So you say that I can disable the change and fix the revision number of the LUN for example to 802 forever?
My bad, it's an igroup option:
igroup set <igroup> report_scsi_name yes/no
Do you have any links to what “SCSI name” actually is? Data ONTAP manuals explain how to enable/disable it but do not really explain what it is …
BTW documentation says it is disabled by default for Windows.
I checked manuals and also live system and I didn't find the option report_scsi_name for igroup. You meant that this option should enable/disable the revision number change? Really igroup option? I would say it should be LUN related...
priv set advanced
Then you see a few more options; I think it came with 7.3.6+ and 8.0.1+.
I understand that this is a new option. My question was: what does this option mean (or do)?
When you look at the LUNs on a server, there is an ID string like NETAPP 00737, which changes to something like NETAPP 00803 when updating. With the new option this string can be kept, which is needed for 2008 R2 MSCS.
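To make the "string changes" point concrete, here is a small sketch. It assumes the string being described corresponds to the product revision level field of the standard SCSI INQUIRY data (vendor ID in bytes 8-15, product ID in bytes 16-31, revision in bytes 32-35). The `make_inquiry` helper and the "732"/"803" values are illustrative only, not captured from a real filer.

```python
# Split the identification fields out of standard INQUIRY data.
def parse_standard_inquiry(data: bytes) -> dict:
    if len(data) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")
    return {
        "vendor": data[8:16].decode("ascii").strip(),
        "product": data[16:32].decode("ascii").strip(),
        "revision": data[32:36].decode("ascii").strip(),
    }

# Hypothetical helper that builds a fake 36-byte response for the demo.
def make_inquiry(vendor: str, product: str, revision: str) -> bytes:
    header = bytes(8)  # bytes 0-7 are not relevant to this comparison
    return (
        header
        + vendor.ljust(8).encode("ascii")
        + product.ljust(16).encode("ascii")
        + revision.ljust(4).encode("ascii")
    )

# Illustrative values only: the same LUN before and after the upgrade.
before = parse_standard_inquiry(make_inquiry("NETAPP", "LUN", "732"))
after = parse_standard_inquiry(make_inquiry("NETAPP", "LUN", "803"))

# List the fields whose value differs between the two responses.
changed = [field for field in before if before[field] != after[field]]
print(changed)
```

A host that keys disk identity on the full string would treat the two responses as different devices even though vendor and product are unchanged.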
Well, priv set diag (or advanced) was the first thing I tried after igroup set in normal mode didn't show the option. But the result in diag mode is the same as in normal mode, at least when I try "igroup set".
Isn't it a hidden option?
But the more important question is why the revision number should be affected by the igroup. I would say it is LUN related...
Could you please point me to some documentation? I searched the ONTAP 8.0.2 manuals for report_scsi_name and found nothing...
NetApp Release 8.1.2 7-Mode: Tue Oct 30 19:56:51 PDT 2012
filer02> priv set diag
Warning: These diagnostic commands are for use by NetApp
filer02*> igroup set
igroup set [ -f ] <initiator_group> <attribute> <value>
- sets an attribute on an initiator group
The current attributes and values supported are:
ostype: solaris, windows, hpux, aix, linux, netware, vmware,
openvms, xen and hyper_v. The ostype attribute sets the
operating system type of the initiators in the initiator group.
throttle_reserve: value of 0->99
The throttle_reserve attribute reserves the given
percentage of SCSI cmdblks for this initiator
throttle_borrow: yes, or no.
If yes, then an igroup can borrow cmdblks if it
exceeds its reserve and cmdblks are available.
alua: yes, or no.
If yes, then the initiators in the igroup can
support Asymmetric Logical Unit Access.
report_scsi_name: yes, or no.
If yes, then SCSI Name String (8h) descriptor
will be reported as part of INQUIRY VPD 0x83 page.
The -f flag will override all warnings.
For more information, try 'man na_igroup'
second, i will try to get the kb/bug id
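Since the help text says the option adds a SCSI Name String (8h) descriptor to the INQUIRY VPD 0x83 page, here is a rough sketch of how one could check for that descriptor in a raw page dump. It assumes the SPC layout of the Device Identification page (4-byte page header, then descriptors with code set in byte 0, designator type in the low nibble of byte 1, and length in byte 3). The `build_page` helper and the sample designator values are made up for illustration.

```python
import struct

SCSI_NAME_STRING = 0x8  # designator type 8h

# Return (designator_type, code_set, designator) tuples from a VPD 0x83 page.
def parse_vpd83(page: bytes):
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a Device Identification VPD page")
    (page_length,) = struct.unpack(">H", page[2:4])
    end = min(4 + page_length, len(page))
    descriptors = []
    offset = 4
    while offset + 4 <= end:
        code_set = page[offset] & 0x0F
        designator_type = page[offset + 1] & 0x0F
        length = page[offset + 3]
        designator = page[offset + 4 : offset + 4 + length]
        descriptors.append((designator_type, code_set, designator))
        offset += 4 + length
    return descriptors

def has_scsi_name(page: bytes) -> bool:
    return any(t == SCSI_NAME_STRING for t, _, _ in parse_vpd83(page))

# Hypothetical helper that assembles a page from descriptors for the demo.
def build_page(descriptors) -> bytes:
    body = b""
    for designator_type, code_set, designator in descriptors:
        body += bytes([code_set, designator_type, 0, len(designator)]) + designator
    return bytes([0x00, 0x83]) + struct.pack(">H", len(body)) + body

# Made-up designators: a binary NAA one (type 3h) and a UTF-8 name string (type 8h).
naa = (0x3, 0x1, bytes.fromhex("60a98000" + "00" * 12))
name = (0x8, 0x3, b"naa.60a98000" + b"0" * 24 + b"\x00" * 4)

print(has_scsi_name(build_page([naa])))
print(has_scsi_name(build_page([naa, name])))
```

Comparing a page captured with report_scsi_name set to no against one captured with it set to yes would show what the igroup setting actually changes for the host.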
Well, I'm on 8.0.2, and after priv set diag and igroup set the option is not there.
The KB ID should be 2013348, but it is not clear to me, which is why I asked here.
Hmm, then maybe the option only comes with 8.1+; I don't have an older ONTAP to check at the moment.
kb is the right one, it shows you REV_736 vs REV_0.2 as an example of the string change.
So how do I prevent the revision number change on ONTAP 8.0.2? Is it possible without the igroup option?
If it is not possible, when do the LUNs go offline? Right after the ONTAP upgrade (to 8.0.4) or after the server reboot? I need to be 100% sure about this.
AFAIK it's not possible. You need to do a disruptive update, i.e. bring the datastores back online after the update. I think (not 100% sure) the version change happens on giveback, as the newly updated filer will then continue to serve the LUNs.