ONTAP Discussions

How to tell if there would be any issues after upgrade?


We have completed an ONTAP upgrade from 9.6P9 to 9.7P8 following Upgrade Advisor. Everything looks good, with no errors in any logs. We also checked vSphere (NFS datastores) and found no errors or issues there either. Neither OCUM nor the event log indicated any issues.

After a week or so, they experienced some performance issues with a few particular VMs among thousands of VMs. Now they suspect these were caused by the upgrade and have asked the storage team to look into it. ONTAP covers almost everything, so people can relate any issue to it without presenting any evidence or indications; it feels like a "presumption of guilt" to me. Of course, I don't want to say that to them.


But what should I say? What should my professional response be? I hope some experts here can help me.


Re: How to tell if there would be any issues after upgrade?


"After a week or so" ...ummmhuh      


As far as verifying goes, there are a few tools you can use.


The IMT - https://mysupport.netapp.com/matrix/#welcome - checks interoperability between NetApp software, hardware, etc. and third-party products. For example, between an ONTAP version and the VSC, or between ONTAP and VMware using iSCSI. There's a lot in there.


aiq.netapp.com -> check for health issues; you can also run Upgrade Advisor from here.


Unified Manager and System Manager should alert you if there are any errors.

You can also manually check the event log inside ONTAP.


It doesn't hurt to open a support case either.




Re: How to tell if there would be any issues after upgrade?


Hi Heightsnj,


Something else you can do is compare the volumes reported as having performance issues against the ones reported as fine, using the performance statistics from the following command:


::> qos statistics volume latency show

Reference document: Display latency breakdown data per volume 


Keep in mind that the volumes you are comparing might have different client-side workloads, so in the above output mainly look for major outliers. You can also perform a takeover of the node that owns the disks for the problematic volumes, or an aggregate relocation, and compare the statistics from before the takeover or relocation with those taken during it. This would essentially rule out anything specific to that node's hardware.
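If you want something less subjective than eyeballing the output, the outlier check can be sketched in a few lines of Python. This is only a rough illustration: the volume names and latency figures are made up, the numbers are assumed to be copied by hand from the latency column into a dict (no parsing of the CLI output is attempted), and "more than 3x the median" is an arbitrary threshold, not a NetApp recommendation.

```python
from statistics import median


def latency_outliers(latency_ms, factor=3.0):
    """Return the volumes whose latency exceeds `factor` times the median.

    latency_ms maps a volume name to its observed latency in milliseconds.
    """
    if not latency_ms:
        return []
    threshold = median(latency_ms.values()) * factor
    return sorted(vol for vol, ms in latency_ms.items() if ms > threshold)


# Hypothetical values, typed in by hand for illustration only.
sample = {"vol_vm01": 0.9, "vol_vm02": 1.1, "vol_vm03": 1.0, "vol_vm04": 7.5}
print(latency_outliers(sample))
```

Using the median rather than the mean keeps one very slow volume from dragging the baseline up and hiding itself.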


Another check along the same lines would be to compare the interface statistics for errors while the LIF is on its home node and after you migrate the LIF to another node.


::> node run -node <NODENAME> -command "ifstat -a"

Reference document:  Viewing or clearing network interface statistics 
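To make the before/after comparison concrete, here is a minimal Python sketch that reports which counters increased between two snapshots. The counter names below are hypothetical placeholders, not the actual field names printed by ifstat, and the values are assumed to be entered by hand from the two runs.

```python
def counter_deltas(before, after):
    """Return the counters that increased between two snapshots.

    Both arguments map a counter name to its integer value. Counters
    missing from `before` are treated as starting at zero.
    """
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }


# Hypothetical counter names and values, for illustration only.
on_home_node = {"rx_errors": 120, "tx_errors": 0, "rx_discards": 4}
after_migrate = {"rx_errors": 120, "tx_errors": 0, "rx_discards": 9}
print(counter_deltas(on_home_node, after_migrate))
```

If the errors keep climbing only while the LIF sits on one particular node or port, that points at the network path rather than at the upgrade.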


As @SpindleNinja said, you can also open a NetApp Support case for a more thorough analysis of the cluster.




