ONTAP Discussions
We completed an ONTAP upgrade from 9.6p9 to 9.7p8, following Upgrade Advisor. Everything looked good, with no errors in any logs. We then checked vSphere (NFS datastores) and found no errors or issues there either. Neither OCUM nor the event log indicated any problems.
About a week later, the vSphere side experienced performance issues with a few particular VMs out of thousands. They now suspect the upgrade caused these issues and have asked the storage team to look into it. Because ONTAP touches almost everything, people can attribute any issue to it without presenting any evidence or indication. It feels like a "presumption of guilt" to me. Of course, I don't want to say that to them.
But what should I say? What should my professional response be? I hope some experts here can help me.
"After a week or so" ...ummmhuh
As far as verification goes, there are a few tools you can use.
The IMT (https://mysupport.netapp.com/matrix/#welcome) checks interoperability between NetApp software, hardware, etc. and third-party products, for example between the ONTAP version and the VSC, or between ONTAP and VMware over iSCSI. There's a lot in there.
aiq.netapp.com -> check for health issues. You can also run Upgrade Advisor from here.
Unified Manager and System Manager should alert if there are any errors.
You can also manually check the event log inside ONTAP.
It doesn't hurt to open a support case either.
Hi Heightsnj,
Something else you can do is compare the volumes reported as having performance issues against those reported as fine, using the performance statistics from the following command:
::> qos statistics volume latency show
Reference document: Display latency breakdown data per volume
Keep in mind that the volumes you are comparing might have different client-side workloads, so in the output of the command above, look mainly for major outliers. You can also perform a takeover of the node that owns the disks for the problematic volumes, or an aggregate relocation, and compare the statistics from before and during the takeover or relocation. This would essentially rule out anything specific to that node's hardware.
A similar check is to compare the interface statistics for errors when the LIF is on its home node and after you migrate the LIF to another node.
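If it helps, the outlier comparison can be sketched in a few lines of Python. This is only a rough illustration, not a NetApp tool: the volume names and latency values are made up, the function name is hypothetical, and the "3x the median" threshold is an arbitrary assumption you would tune yourself. You would fill the dictionary by hand from the per-volume latency figures you collected.

```python
from statistics import median


def find_latency_outliers(latency_ms, factor=3.0):
    """Return the volumes whose latency exceeds `factor` times the median.

    latency_ms: dict mapping volume name -> average latency in milliseconds.
    factor:     assumed threshold multiplier (arbitrary, tune as needed).
    """
    if not latency_ms:
        return {}
    baseline = median(latency_ms.values())
    if baseline <= 0:
        return {}
    return {vol: lat for vol, lat in latency_ms.items() if lat > factor * baseline}


# Hypothetical numbers, typed in by hand for illustration only.
sample = {"vol_vm01": 0.9, "vol_vm02": 1.1, "vol_vm03": 1.0, "vol_vm04": 7.5}
print(find_latency_outliers(sample))
```

The median is used as the baseline rather than the mean so that one very slow volume does not drag the baseline up and hide itself.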
::> node run -node <NODENAME> -command "ifstat -a"
Reference document: Viewing or clearing network interface statistics
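For the before/after comparison of interface statistics, a small Python sketch like the one below can show which error counters grew between two snapshots. It is an assumption-laden illustration: the counter names are hypothetical placeholders rather than the exact field names printed by ifstat, the function name is made up, and the values would be copied by hand from the two outputs you captured (LIF on its home node, then LIF migrated).

```python
def counter_deltas(before, after):
    """Return the counters that increased between two snapshots.

    before, after: dicts mapping counter name -> cumulative count.
    A counter missing from `before` is treated as starting at 0.
    """
    deltas = {}
    for name, value in after.items():
        change = value - before.get(name, 0)
        if change > 0:
            deltas[name] = change
    return deltas


# Hypothetical counter names and values, for illustration only.
on_home_node = {"rx_errors": 12, "tx_errors": 0, "rx_discards": 40}
after_migrate = {"rx_errors": 12, "tx_errors": 0, "rx_discards": 45}
print(counter_deltas(on_home_node, after_migrate))
```

Because the counters are cumulative, it is the growth over the test window that matters, not the absolute values.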
As @SpindleNinja suggested, you can open a NetApp Support case for a more thorough analysis of the cluster.
Regards,
Team NetApp