ONTAP Discussions

cluster-switch create fails

lowest

Hi folks. We have a 6-node 8200 cluster running 9.7P11 with CN1610 cluster switches. All is well except that cluster-switch show only shows one of the cluster interconnect switches. If I try to add the missing switch with cluster-switch create, I get an "IP address not reachable" error, yet I can ping the switch, and from the switch I can ping the cluster. network device-discovery show lists both switches. The cluster and node management ports on this cluster are over an ifgrp, and the only difference I can see on the switches is the MAC address connecting to the switch management port. Any advice would be great.


lowest

Many thanks, but neither of those matches the symptoms I'm seeing.

Ontapforrum

Apart from the fact that you don't see one of the switches in system cluster-switch show, are all the cluster LIFs OK? Do the event logs show any errors on the NetApp side related to the cluster switch?

How does the output of these look?
::> system health subsystem show
::> system health alert show -monitor cluster-switch -instance
::> system cluster-switch show -snmp-config
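
If you want to capture all three outputs in one go (handy for attaching to a support case), here is a minimal Python sketch. It assumes SSH access to the cluster management LIF is enabled; the host name, user name, and the helper names `build_ssh_argv` and `collect_outputs` are made up for illustration and are not part of any NetApp tooling.

```python
import subprocess

# The three ONTAP commands quoted above, verbatim.
DIAGNOSTIC_COMMANDS = [
    "system health subsystem show",
    "system health alert show -monitor cluster-switch -instance",
    "system cluster-switch show -snmp-config",
]


def build_ssh_argv(host, user, command):
    """Return the argv list that runs one ONTAP command over SSH."""
    return ["ssh", f"{user}@{host}", command]


def collect_outputs(host, user, run=subprocess.run):
    """Run each diagnostic command and return {command: captured stdout}.

    `run` is injectable so the function can be exercised without a cluster.
    """
    outputs = {}
    for command in DIAGNOSTIC_COMMANDS:
        result = run(
            build_ssh_argv(host, user, command),
            capture_output=True,
            text=True,
            timeout=60,
        )
        outputs[command] = result.stdout
    return outputs
```

Calling `collect_outputs("cluster-mgmt.example.com", "admin")` (both values are placeholders) would return a dictionary keyed by command, which you can then write to a file.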

 

Could the port be faulty? Is there a free port to test with? Have you tried replacing the cable? If there are no errors reported on the NetApp and cluster side, then it's worth raising a ticket with NetApp to see if there is anything that needs clearing out.

 

Worth collecting this output for NetApp support:
Health check commands for NetApp CN1610 and CN1601 switches
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Systems/FAS_Systems/Health_check_commands_for_NetApp_CN1610_and_CN1601_switches

 


HOW TO: check port error counters on a NetApp CN1610 cluster switch
https://mysupport.netapp.com/site/article?lang=en&page=%2FAdvice_and_Troubleshooting%2FData_Storage_Systems%2FFabric,_Interconnect_and_Management_Swit...

 

https://kb.netapp.com/Advice_and_Troubleshooting%2FFlash_Storage%2FAFF_Series%2FNetwork_port_is_down_on_CN1610

 

TMACMD

Can you SSH to the switch?

Is the RCF file correct and current (should be v1.2)?

Did you try to reboot the switch and try again?

Is the FastPath code up to date (should be 1.3.0.3)?

 

lowest

RCF is indeed 1.2

Yes, I rebooted both switches. For one switch that brought it into compliance, but only for a few hours; then connectivity was lost again.

FastPath is one behind the last release at 1.3.0.2 but 1.3.0.2 is shown as supported by 9.7.
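
For anyone scripting this kind of check across several switches, a small sketch of comparing dotted version strings numerically rather than as text. The two version values are the ones quoted in this thread; the function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '1.3.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_behind(installed, latest):
    """True if the installed version is numerically older than the latest."""
    return parse_version(installed) < parse_version(latest)


INSTALLED_FASTPATH = "1.3.0.2"  # version reported on these switches
LATEST_FASTPATH = "1.3.0.3"     # version suggested earlier in the thread

needs_update = is_behind(INSTALLED_FASTPATH, LATEST_FASTPATH)
```

Comparing tuples of integers avoids the string-comparison trap where "1.3.0.10" would sort before "1.3.0.9".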

TMACMD

So you have rebooted. Why not apply the latest FastPath update and see if that fixes the problem?

It's trivial to update.

Best case, it fixes the problem.

Worst case, the original issue is still there (then open a support case, as it may be a hardware issue and the logs may be useful).

Heck, I do not recall the syntax offhand, but you may be able to look at the switch logs and even the ONTAP event logs. Maybe something shows up that gives a hint.
