
False HA interconnect Errors - 4 Node Cluster

Storage_Giv

Hello,

We have a 4-node cluster that is working fine, but we keep receiving "callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down." alerts every 2-3 hours.

All links are healthy, but the alerts won't stop coming.

Cluster1::*> ha inter status show
(system ha interconnect status show)

Node: node-n1
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up

Node: node-n2
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up

Node: node-n3
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up

Node: node-n4
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up
4 entries were displayed.

Cluster1::*>
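
For anyone who wants to check pasted output like the above without reading every line, here is a minimal Python sketch. It is not NetApp tooling: the field names are simply copied from the output in this post, the function names `parse_ha_status` and `unhealthy_fields` are made up for illustration, and it assumes the text format stays as shown here.

```python
import re

# Field names are taken from the CLI output pasted in this thread (assumption:
# the format is stable). Prompts, blank lines and summary lines are ignored.
FIELD_RE = re.compile(
    r"^(Node|Link \d+ Status|Is Link \d+ Active|IC RDMA Connection):\s*(\S+)$"
)


def parse_ha_status(text):
    """Return {node_name: {field: value}} from pasted status output."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        match = FIELD_RE.match(raw.strip())
        if not match:
            continue
        key, value = match.groups()
        if key == "Node":
            current = nodes.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return nodes


def unhealthy_fields(nodes):
    """List (node, field, value) for every field that is not 'up' or 'true'."""
    return [
        (node, key, value)
        for node, fields in nodes.items()
        for key, value in fields.items()
        if value not in ("up", "true")
    ]


# Two of the four node blocks from the post, used as sample input.
SAMPLE = """\
Cluster1::*> ha inter status show
(system ha interconnect status show)

Node: node-n1
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up

Node: node-n2
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: up
"""

if __name__ == "__main__":
    print(unhealthy_fields(parse_ha_status(SAMPLE)))  # prints []
```

An empty list means every parsed link and RDMA field reports up or true, which matches what the output above shows for all four nodes.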

 

Any suggestions?

 

Thanks


Ontapforrum

Hi,

Could you give us this info:

Filer model?
ONTAP version?

Also, please post the output of these commands:
::> system health status show
::> system health subsystem show
::> event log show -message-name callhome* [only the most recent entries are needed]

::> set adv
::*> storage failover interconnect show-link -node *

Thanks!

 

Storage_Giv

ONTAP version 9.6P3

AFF A320, 4-node switched cluster

 

Cluster1::*> system health status show
Status
---------------
ok

Cluster1::*> system health subsystem show
Subsystem         Health
----------------- ------------------
SAS-connect       ok
Environment       ok
Memory            ok
Service-Processor ok
Switch-Health     ok
CIFS-NDO          ok
Motherboard       ok
IO                ok
MetroCluster      ok
MetroCluster_Node ok
FHM-Switch        ok
FHM-Bridge        ok
SAS-connect_Cluster
                  ok
13 entries were displayed.

Cluster1::*> event log show -message-name callhome*
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
4/14/2020 03:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/14/2020 03:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/13/2020 14:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/13/2020 14:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/13/2020 01:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/13/2020 01:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/12/2020 12:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/12/2020 12:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 23:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 23:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 10:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 10:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/10/2020 21:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
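
To see how regularly these alerts fire, the timestamps can be pulled out of the pasted log and the gap between consecutive alerts computed per node. This is a rough Python sketch, not an ONTAP feature: it assumes the two-line row layout shown above (timestamp and node on one line, the ALERT text on the next) and a month/day/year date format, and the name `alert_intervals` is made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches rows such as "4/14/2020 03:00:00 node-n1" (layout assumed from the paste).
ROW_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4} \d{2}:\d{2}:\d{2})\s+(\S+)\s*$")


def alert_intervals(log_text):
    """Return {node: [hours between consecutive alerts, oldest first]}."""
    times = defaultdict(list)
    for raw in log_text.splitlines():
        match = ROW_RE.match(raw.strip())
        if match:  # the wrapped "ALERT ..." lines do not match and are skipped
            stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
            times[match.group(2)].append(stamp)
    intervals = {}
    for node, stamps in times.items():
        stamps.sort()
        intervals[node] = [
            (later - earlier).total_seconds() / 3600
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return intervals


# The oldest seven rows from the log above, copied as posted.
SAMPLE_LOG = """\
4/12/2020 12:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/12/2020 12:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 23:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 23:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 10:00:00 node-n1
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/11/2020 10:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
4/10/2020 21:00:00 node-n2
ALERT callhome.hainterconnect.down: Call home for HA INTERCONNECT DOWN due to all links are down.
"""

if __name__ == "__main__":
    for node, hours in sorted(alert_intervals(SAMPLE_LOG).items()):
        print(node, hours)
```

For the rows in this excerpt, every gap comes out to 13 hours on both nodes. A fixed spacing like that may be worth mentioning when opening a support case.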

 

Ontapforrum

Could you answer these questions:

1) AFF A320: You mentioned it's a 4-node switched cluster. Has anything changed recently? Did you add a new HA pair to an existing 2-node cluster?

2) Cisco Nexus cluster network switches: Which firmware versions are they running?

Ontapforrum

Check whether you are hitting this issue, and raise a ticket with NetApp:

https://kb.netapp.com/app/answers/answer_view/a_id/1091153

 

Storage_Giv

Yes, it's a bug. I raised the issue with support, and they confirmed it's a bug.

 

http://burtview.netapp.com/burts/1233806

 

andris

Well, OK. But it's important to verify that you're using the correct reference configuration file (RCF) for the cluster switches. You need the v1.3 or v1.4 variants to support the shared cluster and HA configuration that the AFF A320 requires.
