2017-12-12 09:14 AM
This is simply to ask you to review your configuration before upgrading to ONTAP 9.2. Still upgrade, just check first.
I discovered an issue just over a month ago, after an upgrade to ONTAP 9.2. It has now been confirmed by NetApp, but it's not a bug, just a change in the network stack that changes the way it routes particular protocols.
Had the change to ONTAP been known, I would have made infrastructure changes before the upgrade, rather than having to raise support tickets and now make changes to restore resilience to our infrastructure.
I will state that I have been very impressed with the performance of the new all-flash NetApps and, with the exception of one major bug, the systems have been bulletproof in general operation in our environment for the last year.
NetApp have removed a feature called “Fastpath” from ONTAP 9.2. This feature stored the interface of each incoming network packet and ensured the response went back out of the same interface. It was originally implemented for performance, as it saved the time of checking the routing table. It also enabled servers to send storage traffic to any NetApp virtual interface, and the response would egress from the same interface irrespective of the network infrastructure.
During the upgrade to 9.2 we had several outages and lost monitoring from OnCommand Unified Manager (OCUM). We restored monitoring by moving OCUM to another subnet, but we had to live with some loss of NFS resilience until we had confirmation of the cause.
Though each virtual interface still has a profile (data, management, intercluster, etc.) for incoming traffic, the response packet now uses the routing table and can go out through any interface within the same SVM.
The loss of monitoring from OCUM was caused by the HTTPS responses to polling via the cluster management interface leaving the NetApp via an intercluster interface, which happened to be on the OCUM server’s subnet.
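To illustrate the difference, here is a rough Python sketch of the two behaviours. It is only a simplified model and not how ONTAP is implemented: the LIF names, roles and addresses are made up, and "routing table" is reduced to "a directly connected subnet wins, otherwise use the ingress interface as a stand-in for the default route".

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Lif:
    """Hypothetical logical interface: name, role and address in CIDR form."""
    name: str
    role: str
    cidr: str

    @property
    def network(self):
        # Subnet this interface is directly connected to.
        return ipaddress.ip_interface(self.cidr).network


def egress_with_fastpath(ingress, client_ip, lifs):
    # Fastpath model: the reply leaves through the interface it arrived on.
    return ingress


def egress_with_routing_table(ingress, client_ip, lifs):
    # Routing-table model (simplified): a LIF directly connected to the
    # client's subnet is preferred; otherwise fall back to the ingress LIF,
    # which stands in for a default route here.
    client = ipaddress.ip_address(client_ip)
    for lif in lifs:
        if client in lif.network:
            return lif
    return ingress


# Made-up addressing that mirrors the OCUM case described above.
lifs = [
    Lif("cluster_mgmt", "cluster-mgmt", "10.0.1.10/24"),
    Lif("intercluster1", "intercluster", "10.0.2.10/24"),
]
ocum_ip = "10.0.2.50"  # monitoring server on the intercluster subnet

before = egress_with_fastpath(lifs[0], ocum_ip, lifs)
after = egress_with_routing_table(lifs[0], ocum_ip, lifs)
print(before.name)  # cluster_mgmt
print(after.name)   # intercluster1
```

In this model the request arrives on cluster_mgmt in both cases, but without Fastpath the reply leaves via intercluster1, so the server sees a response from an address it never contacted.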
The NFS issues were caused by using a NAS interface on the same node as the SVM admin interface. Once I realised this, we moved all servers' NFS traffic to the node without the admin interface.
A new feature of ONTAP 9.2, a tcpdump-like command, allowed us to view the traffic in Wireshark and confirm the asymmetric routing from the NetApp side.
We are now in the process of moving intercluster and admin interfaces to their own subnets. Hopefully upon completion 9.3 will be GA and we can gain some more space back.
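If you want to check your own layout before upgrading, something like the following Python sketch may help. It assumes you have exported your interface addresses and know which servers talk to which interface; the inventory, names and addresses below are all hypothetical. It is deliberately conservative: it flags any interface other than the intended one that sits on the client's subnet, and whether a flagged pair actually breaks will depend on your network.

```python
import ipaddress


def find_asymmetric_risks(lifs, flows):
    """Return (client_ip, target_lif, other_lif) tuples worth reviewing.

    lifs:  dict of LIF name -> address in CIDR form (hypothetical inventory)
    flows: list of (client_ip, target_lif_name) pairs
    """
    networks = {name: ipaddress.ip_interface(cidr).network
                for name, cidr in lifs.items()}
    risks = []
    for client_ip, target in flows:
        client = ipaddress.ip_address(client_ip)
        for name, net in networks.items():
            # Any other LIF directly connected to the client's subnet is a
            # candidate egress once replies follow the routing table.
            if name != target and client in net:
                risks.append((client_ip, target, name))
    return risks


# Made-up example inventory and traffic flows.
lifs = {
    "cluster_mgmt": "10.0.1.10/24",
    "intercluster1": "10.0.2.10/24",
    "nfs_data1": "10.0.3.10/24",
}
flows = [
    ("10.0.2.50", "cluster_mgmt"),  # monitoring server polling cluster mgmt
    ("10.0.3.20", "nfs_data1"),     # NFS client on the data subnet
]

for risk in find_asymmetric_risks(lifs, flows):
    print(risk)  # ('10.0.2.50', 'cluster_mgmt', 'intercluster1')
```

Only the monitoring flow is flagged here, because the intercluster interface shares a subnet with the polling server while the NFS client has no other interface on its subnet.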