
Old Netapp is Cluster Interconnect Problem

I have an older FAS270C in our dev area. What a workhorse; it is still going after 7 years. We have had a problem come up. The first volume failed and caused us to force a failover. We had spares, but I guess there is something special about the very first disk. A new drive is in there and it has rebuilt. The device is up and serving data fine. The partner controller will not come back with: cf giveback.

We also swapped the controller with the partner controller, just to check whether the partner controller had failed. We checked the IPs, and both sides' IPs are up and fine. We have iSCSI running over both and data is flowing.

This looks like our main issue: [cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed

Got any ideas how to diagnose this? 

storm2(takeover)> ifconfig -a

e0a: flags=48e8043<UP,BROADCAST,RUNNING,MULTICAST,MULTIHOST,PARTNER_UP,NOWINS> mtu 1500

          inet 10.100.1.51 netmask 0xffffff00 broadcast 10.100.1.255

          partner inet 10.100.1.50 (e0a)

          ether 00:a0:98:03:9d:d5 (auto-1000t-fd-up) flowcontrol full

e0b: flags=48e8043<UP,BROADCAST,RUNNING,MULTICAST,MULTIHOST,PARTNER_UP,NOWINS> mtu 1500

          inet 10.100.0.251 netmask 0xffffff00 broadcast 10.100.0.255

          partner inet 10.100.1.10 (e0b)

          ether 00:a0:98:03:9d:d6 (auto-1000t-fd-up) flowcontrol full

lo: flags=19e8049<UP,LOOPBACK,RUNNING,MULTICAST,MULTIHOST,PARTNER_UP,TCPCKSUM> mtu 8160

          inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1

          ether f0:7b:af:37:04:00 (VIA Provider)

storm2(takeover)> Sat Feb  8 09:44:03 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed

storm2(takeover)> Sat Feb  8 09:46:06 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed

Sat Feb  8 09:47:08 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed

---------------------

cf monitor

  current time: 08Feb2014 09:40:46

  TAKEOVER 10:44:00, partner 'storm1', cluster monitor enabled

storm2(takeover)> cf giveback

Partner not waiting for giveback, giveback cancelled.

To do a giveback without checking for partner readiness, please either set option "cf.giveback.check.partner" to "off" before doing "cf giveback" again, or do "cf giveback -f".

The first choice disables checking for all future "cf giveback", until it's turned back to "on". The second choice is good for this giveback only.

storm2(takeover)> Sat Feb  8 09:41:59 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed
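To see how often the interconnect reset fires, the console output can be scanned for the cf.nm.nicReset warning and the gaps between occurrences measured. This is only a rough sketch in Python, not a NetApp tool: the function names are made up for illustration, and the year is an assumption (the console lines carry no year, so 2014 is taken from the cf monitor output above).

```python
import re
from datetime import datetime

# Matches console lines such as:
# Sat Feb  8 09:44:03 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset ...
RESET_RE = re.compile(
    r"(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"\[[^\]]*cf\.nm\.nicReset:warning\]"
)

def reset_times(console_text, year=2014):
    """Return the timestamp of every cf.nm.nicReset warning, in order of appearance.

    The year is an assumption: the console lines do not include one.
    """
    times = []
    for m in RESET_RE.finditer(console_text):
        stamp = f"{year} {m['mon']} {m['day']} {m['time']}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times

def reset_gaps_seconds(times):
    """Seconds between consecutive resets, after sorting by time."""
    ordered = sorted(times)
    return [int((b - a).total_seconds()) for a, b in zip(ordered, ordered[1:])]

# Three of the warnings pasted above, used as sample input.
sample = """\
storm2(takeover)> Sat Feb  8 09:44:03 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed
storm2(takeover)> Sat Feb  8 09:46:06 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed
Sat Feb  8 09:47:08 EST [storm2 (takeover): cf.nm.nicReset:warning]: Initiating soft reset on Cluster Interconnect card 0 due to rendezvous jammed
"""

times = reset_times(sample)
gaps = reset_gaps_seconds(times)
print(len(times), gaps)
```

On the three sample lines this reports resets roughly one to two minutes apart, which suggests the interconnect is being reset repeatedly rather than glitching once.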

Re: Old Netapp is Cluster Interconnect Problem

I have run into the same problem. Have you resolved it?

Re: Old Netapp is Cluster Interconnect Problem

I'd check BURT 412390, fixed in 7.3.4P2 and 7.3.3P5.

Re: Old Netapp is Cluster Interconnect Problem

Many thanks for your timely help. This problem has troubled me for a long time. If I do not apply the patch upgrade, will it happen frequently?