Thank you in advance for your help with my first post.
FAULT - Intermittent interface connection dropouts affecting NFSv4 clients and ping requests.
Overview - We have approx. 200 HPC servers connecting to a FAS8200 HA pair running clustered ONTAP 9.5P8. The clients are load balanced across both nodes from the client side, via the IP addresses they connect to. Each node has one HPC-only interface consisting of 2 physical ports using multimode.
The node 1 interface loses connectivity to all clients connected to it. It is very intermittent and shows no pattern or logic, such as a link to load. We have no errors on interfaces, ports or nodes, nothing at all, and the same goes for the network (clients and nodes are on the same physical switch and the same VLAN). There are no errors on the client side other than the loss of connectivity, which lasts from a few seconds to a couple of minutes. We have even done packet analysis on the interface, and this shows nothing other than a complete stop in communication. I have also moved the interface to node 2 and the fault migrates with it (indicating no filer hardware issue).
Has anyone experienced anything like this before? The HPC and network engineers are adamant there is no fault on their side either.
Yes, I have captured packets via Wireshark, but unfortunately, apart from a few retransmissions throughout the whole trace, nothing unusual happens before the communication just stops. I have raised a call with NetApp and sent them the packet trace; they confirm it shows nothing that would cause a loss of communication, and they see nothing wrong with the filer or its configuration.
I am just trying to pick more brains here, as this keeps happening for no apparent reason and is causing headaches for users. Again, the network team and the HPC clients (all connected to the same switch) see no errors apart from the loss of connection.
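Since the dropouts show no obvious pattern, one thing that might help is timestamping them from several clients so the outage windows can be lined up against switch and filer logs. Below is a minimal sketch of that idea, not something we run in production. It assumes the interface answers TCP connections on the standard NFS port 2049, and the function names and default intervals are made up for illustration.

```python
import socket
import time


def tcp_probe(host, port=2049, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


def collect_samples(host, interval=1.0, duration=60.0, probe=tcp_probe):
    """Probe the host every `interval` seconds for `duration` seconds.

    Returns a list of (unix_timestamp, reachable) tuples.
    """
    samples = []
    deadline = time.time() + duration
    while time.time() < deadline:
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return samples


def find_outages(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    `end` is the timestamp of the first successful probe after the outage,
    or None if the host was still unreachable at the last sample.
    """
    outages = []
    start = None
    for timestamp, reachable in samples:
        if not reachable and start is None:
            start = timestamp
        elif reachable and start is not None:
            outages.append((start, timestamp))
            start = None
    if start is not None:
        outages.append((start, None))
    return outages
```

To use it, call `collect_samples("192.0.2.10", duration=3600)` on a few clients (the address is a placeholder for the node's interface IP) and pass the result to `find_outages`. If every client reports the same windows, that points at the interface or the path to it rather than at individual clients.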