One of the methods available for upgrading from 8.3.x to 9.1 is the Automated Nondisruptive Upgrade (ANDU) method. This method simplifies the upgrade and removes much of the manual burden. There still remain a few items that need to be reviewed and performed before using the ANDU method. Following is an outline of the manual checks that ANDU requires. For the complete documentation on the process, use the Upgrade Express Guide.
tl;dr: how do I upgrade ONTAP when clusters are both a source and a destination for SnapMirror? The instructions say destination first. Whaaa?
We are planning an upgrade from 8.3.1 to 9.1P5. The AutoSupport Upgrade Advisor plan generated for this task gives the following advice for suspending SnapMirror replication and for the order in which to upgrade the peers of the SnapMirror relationship:
Suspending SnapMirror operations
Cluster mycluster2 is running SnapMirror.
To prevent SnapMirror transfers from failing, you must suspend SnapMirror operations and upgrade destination nodes before upgrading source nodes.
(i) Suspend SnapMirror transfers for a destination volume
(ii) Upgrade the node that contains the destination volume
(iii) Upgrade the node that contains the source volume
(iv) Resume the SnapMirror transfers for the destination volume.
Note: SnapMirror transfers for all other destination volumes can continue while the nodes that contain the original destination and source volumes are upgraded.
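For anyone who wants to script a checklist from that advice, here is a minimal Python sketch of steps (i) to (iv) for a single relationship. The `Relationship` class, the `upgrade_plan` function, and the node and volume names are all made up for illustration; nothing here talks to ONTAP, it only produces the ordered steps as text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One SnapMirror relationship (hypothetical model, not an ONTAP API)."""
    source_node: str         # node that contains the source volume
    destination_node: str    # node that contains the destination volume
    destination_volume: str  # the destination volume of the relationship


def upgrade_plan(rel: Relationship) -> list[str]:
    """Return steps (i)-(iv) from the Upgrade Advisor text, in order."""
    return [
        f"suspend SnapMirror transfers for destination volume {rel.destination_volume}",
        f"upgrade node {rel.destination_node} (contains the destination volume)",
        f"upgrade node {rel.source_node} (contains the source volume)",
        f"resume SnapMirror transfers for destination volume {rel.destination_volume}",
    ]


# Example with invented node and volume names.
rel = Relationship("mycluster2-01", "mycluster1-01", "vol_dr")
for number, step in enumerate(upgrade_plan(rel), start=1):
    print(number, step)
```

The point of the ordering is simply that the destination node is upgraded before the source node, with transfers suspended for the whole window.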
Our problem is that mycluster2 and the cluster it is SnapMirroring to, mycluster1, have volumes where SnapMirror replication is going in both directions:
That certainly makes for a tough decision. As you have it, you would need to suspend the SnapMirror relationships originating from the cluster you upgrade first. The alternative is to configure version-flexible SnapMirror relationships. That is described starting on page 99 here:
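To make the first option concrete (suspending the relationships that originate from the cluster upgraded first), here is a rough Python sketch. It assumes you can list each relationship as a source cluster, destination cluster, and volume; the `Mirror` type, the `mirrors_to_suspend` function, and the volume names are invented for the example, and it does not query ONTAP.

```python
from typing import NamedTuple


class Mirror(NamedTuple):
    """One SnapMirror relationship between two clusters (hypothetical model)."""
    source_cluster: str
    destination_cluster: str
    volume: str


def mirrors_to_suspend(mirrors: list[Mirror], first_upgraded: str) -> list[Mirror]:
    """Relationships whose source cluster gets upgraded before its destination.

    These break the "destination first" rule, so they are the ones to
    suspend while the two clusters are on different versions.
    """
    return [
        m for m in mirrors
        if m.source_cluster == first_upgraded
        and m.destination_cluster != first_upgraded
    ]


# Replication in both directions between the two clusters (volume names invented).
mirrors = [
    Mirror("mycluster2", "mycluster1", "vol_a"),
    Mirror("mycluster1", "mycluster2", "vol_b"),
]

# Upgrading mycluster1 first: only the mirror that originates there is affected.
print(mirrors_to_suspend(mirrors, "mycluster1"))
```

Whichever cluster goes first, the relationships flowing the other way already satisfy the destination-first rule for the cluster pair, so only one direction needs suspending at that level.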
Thanks for the tip, Amandeep. I'm not running any SnapMirror on this filer at the moment, and to be honest, disabling deswizzle is not something I have done before an OS upgrade in the past, unless things have changed with the latest CDOT versions.
Upgraded from 8.3.1 to 9.1P5 using this method, just before 9.1P6 came out. It all appeared to be fine, but performance actually tanked on one of our NL-SAS aggregates with Flash Pool within an hour of the upgrade. Support have been unable to identify why, and we have had to move high-workload volumes off the aggregate to return to acceptable performance. Recommend you collect perfstat data using GUIPerfstat on your systems both before and after the upgrade, covering periods of low load and high load (e.g. full backups of VMs).
We have seen performance drop, not increase. A case has been open with support since 9th August 2017. Be smart, and be ready to back up any claim that an upgrade has reduced performance, and to give information so that NetApp support can find the cause and solve it. In the graph below of aggregate latency, the point where the line touches the x-axis is the upgrade. The line of best fit jumps up 5 ms in latency within 20 minutes of the upgrade completing, permanently. This caused dropouts on VMs during peak load, and we were unable to complete full backups in a timely fashion.
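If you have latency samples from both sides of an upgrade, a quick way to put a number on a shift like this is to compare the averages before and after. Below is a minimal Python sketch; the `latency_shift` function is hypothetical and the sample values are invented purely for illustration (they are not our measurements). It assumes you have already exported (time, latency) pairs from whatever monitoring you use.

```python
from statistics import mean


def latency_shift(samples: list[tuple[float, float]], upgrade_time: float) -> float:
    """Mean latency after the upgrade minus mean latency before it.

    samples: (time, latency_ms) pairs; time can be in any consistent unit.
    """
    before = [latency for t, latency in samples if t < upgrade_time]
    after = [latency for t, latency in samples if t >= upgrade_time]
    if not before or not after:
        raise ValueError("need samples on both sides of the upgrade")
    return mean(after) - mean(before)


# Invented samples: minutes relative to the upgrade, latency in ms.
samples = [(-60, 4.0), (-30, 5.0), (-10, 4.5), (20, 9.5), (40, 10.0), (60, 9.0)]
shift = latency_shift(samples, upgrade_time=0)
print(f"latency shift: {shift:.1f} ms")
```

A plain difference of means is cruder than a line of best fit, but it only works at all if you captured the "before" data, which is the whole point.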
NetApp Support are doing what they can, but without perfstat data from just before the upgrade, they are having a tough time finding the source of the loss in performance. Get perfstat data, people. GET IT.