
FAS8020 went into non-HA mode

RAKESHMOLE2

I have a new FAS8020 with two controllers, but for some reason it is in non-HA mode.

How can I correct this?



JGPSHNTAP

We need a little more information. I assume we are talking about 7-Mode.

Is it cabled up properly for HA?

Type cf status

Also, run Config Advisor against the nodes to make sure things are OK.

RAKESHMOLE2

Hi,

I think you may be right about the cabling.

For HA, do e0a and e0b have to be connected to network switches with IP?

cf status says the controller is in non-HA mode.

JGPSHNTAP

Is the cluster in production yet? e0a and e0b are generally two network adapters that make up your ifgrp configuration, but your cluster should be clusterable without them. When an 8020 is cabled up properly, the cluster is connected through the interconnect.

Take my advice, run Config Advisor.

RAKESHMOLE2

Hi,

I thought so too. When I installed all my previous NetApps, I never had to cable the cluster interconnect, as it was always internal.

I ran Config Advisor and it didn't come up with anything. This is not in production yet.

aborzenkov

Boot into maintenance mode on each controller and paste the output of "ha-config show".

RAKESHMOLE2

here it is

aborzenkov

And for the second controller? And please, paste output inline as text.

RAKESHMOLE2

second controller

*> Jul 30 02:11:43 [acp-vfiler@localhost:acp.locked.wrench.port.up:info]: The on-board locked wrench port is up.

ha-config show

   Chassis HA configuration: ha

Controller HA configuration: ha
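For anyone comparing this output across both controllers, here is a minimal Python sketch that pulls the two values out of pasted "ha-config show" text. The function name parse_ha_config is made up for illustration, and the line format is assumed to be exactly what is shown above; this is not a NetApp tool.

```python
import re

# Matches lines such as "   Chassis HA configuration: ha" (format assumed
# from the output pasted in this thread).
HA_LINE = re.compile(r"^\s*(Chassis|Controller) HA configuration:\s*(\S+)\s*$")


def parse_ha_config(text):
    """Return a dict like {'chassis': 'ha', 'controller': 'ha'}."""
    result = {}
    for line in text.splitlines():
        match = HA_LINE.match(line)
        if match:
            result[match.group(1).lower()] = match.group(2).lower()
    return result


sample = """
ha-config show
   Chassis HA configuration: ha
Controller HA configuration: ha
"""

settings = parse_ha_config(sample)
print(settings)
```

Running the same function on the paste from each controller makes it easy to confirm that both report the same chassis and controller values.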

Hi aborzenkov,

Can you please explain the cabling of e0a and e0b for the cluster interconnect? What does it mean to connect them to data switches for HA mode, and what is the difference between a data switch and a data network switch?

aborzenkov

What is the value of "options cf.mode" on both controllers (when booted normally, not in maintenance mode)?

The poster you show is for C-Mode; it does not apply to 7-Mode.
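To show why the cf.mode value matters alongside "ha-config show", here is a rough Python sketch that compares the three settings the same way the boot alert in this thread words it (chassis, controller, and HA mode). The function ha_state is hypothetical, and the cf.mode values "ha" and "non_ha" are an assumption on my part; this is only an illustration, not how Data ONTAP itself decides.

```python
def normalize(value):
    """Lower-case a setting and treat 'non_ha' and 'non-ha' as the same."""
    return value.strip().lower().replace("_", "-")


def ha_state(chassis, controller, cf_mode):
    """Classify the three settings as 'ha', 'non-ha', or 'inconsistent'.

    chassis and controller come from 'ha-config show' in maintenance mode;
    cf_mode comes from 'options cf.mode' on a normally booted controller
    (its possible values are assumed here).
    """
    values = {normalize(chassis), normalize(controller), normalize(cf_mode)}
    if values == {"ha"}:
        return "ha"
    if values == {"non-ha"}:
        return "non-ha"
    # Any mix, or any value this sketch does not know, is flagged.
    return "inconsistent"


# Combination matching the alert text: chassis ha, controller ha, HA mode off.
print(ha_state("ha", "ha", "non_ha"))
```

A mixed result only says the settings disagree. It does not say which one is wrong, so the cabling still needs checking as well.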

RAKESHMOLE2

Hi everybody,

Thank you for your help. Incorrect disk-shelf-to-controller cabling caused the issue.

After the cabling was corrected, HA came up fine.

RAKESHMOLE2

Also, here are some noticeable HA errors I see when I reboot:

Jun 29 10:04:59 [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system

add host 127.0.10.1: gateway 127.0.20.1

Jun 29 10:04:59 [localhost:config.noPartnerDisks:CRITICAL]: No disks were detected for the partner; this node will be unable to takeover correctly

WAFL CPLEDGER is enabled. Checklist = 0x7ff841ff

Jun 29 10:04:59 [localhost:callhome.dsk.config:warning]: Call home for DISK CONFIGURATION ERROR

Jun 29 10:05:01 [localhost:wafl.memory.status:info]: 15254MB of memory is currently available for the WAFL file system.

Jun 29 10:05:01 [localhost:dcs.framework.enabled:info]: The DCS framework is enabled on this node.

Jun 29 10:05:01 [localhost:snmp.link.up:info]: Interface 6 is up

Jun 29 10:05:01 [localhost:netif.linkUp:info]: Ethernet e0P: Link up.

Jun 29 10:05:02 [localhost:fmmb.current.lock.disk:info]: Disk 0a.01.1 is a local HA mailbox disk.

Jun 29 10:05:02 [localhost:fmmb.current.lock.disk:info]: Disk 0a.01.2 is a local HA mailbox disk.

Jun 29 10:05:02 [localhost:fmmb.instStat.change:info]: normal mailbox instance on local side.

Jun 29 10:05:02 [localhost:fmmb.instStat.change:info]: no mailbox instance on partner side.

Jun 29 10:05:02 [localhost:raid.cksum.replay.summary:info]: Replayed 0 checksum blocks.

Jun 29 10:05:02 [localhost:raid.stripe.replay.summary:info]: Replayed 0 stripes.

Jun 29 10:05:02 [localhost:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '71fc60fc-0892-43ed-943c-cc1e327dc760' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting.

Sun Jun 29 15:05:03 GMT [localhost:rc:notice]: The system was down for 662 seconds

Jun 29 10:05:02 [localhost:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '71fc60fc-0892-43ed-943c-cc1e327dc760' was built in 42 msec, after scanning 12 inodes and restarting 19 times with a final result of success.

hostname: hostname must be in /etc/hosts

Jun 29 10:05:03 [localhost:cf.fm.launch:info]: Launching failover monitor

-e0e: bad value

Jun 29 10:05:03 [localhost:cf.fm.notkoverClusterDisable:warning]: Failover monitor: takeover disabled (restart)

Cannot determine whether configuration should be HA or non-HA. Chassis is in ha configuration, controller is in ha configuration, and the HA mode is disabled. Boot into Maintenance mode and run the 'ha-config modify' command to set the controller and chassis configuration to or HA or non-HA, as appropriate. Setting the wrong configuration might lead to data loss. If you need assistance, contact support.

ifconfig: ifconfig: -e0e: bad address

Jun 29 10:05:03 [localhost:kern.syslog.msg:notice]: The system was down for 662 seconds

Jun 29 10:05:03 [localhost:mgr.opsmgr.autoreg.norec:warning]: Data ONTAP could not perform automatic registration for OnCommand Unified Manager because Data ONTAP could not find SRV records for the server or because the server is not located on this subnet.

Jun 29 10:05:04 [localhost:haosc.config.unknown.ls:ALERT]: Cannot determine whether the configuration should be stand-alone or HA. Chassis is in ha configuration, controller is in ha configuration, and the HA mode is disabled.

Jun 29 10:05:04 [localhost:snmp.link.up:info]: Interface 3 is up

Jun 29 10:05:04 [localhost:netif.linkUp:info]: Ethernet e0e: Link up.

Jun 29 10:05:04 [localhost:mgr.boot.disk_done:info]: NetApp Release 8.2.1 7-Mode boot complete. Last disk update written at Sun Jun 29 14:54:01 GMT 2014

Jun 29 10:05:04 [localhost:perf.archive.start:info]: Performance archiver started. Sampling 33 objects and 518 counters.

Jun 29 10:05:04 [localhost:snmp.link.up:info]: Interface 7 is up

Jun 29 10:05:04 [localhost:snmp.link.up:info]: Interface 8 is up

Jun 29 10:05:04 [localhost:mgr.boot.reason_ok:notice]: System rebooted after a reboot command.

Jun 29 10:05:04 [localhost:callhome.reboot.reboot:info]: Call home for REBOOT (reboot command)

Jun 29 10:05:04 [localhost:replication.upgrade.complete:info]: Post-upgrade operations for SnapMirror and SnapVault are complete.

Jun 29 10:05:04 [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.

Jun 29 10:05:04 [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing.

Jun 29 10:05:04 [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.

Jun 29 10:05:04 [localhost:coredump.findcore.nocore:debug]: No unsaved cores could be found

Jun 29 10:05:05 [localhost:ip.drd.vfiler.info:info]: Although vFiler units are licensed, the routing daemon runs in the default IP space only.

Jun 29 10:05:06 [localhost:snmp.link.down:info]: Interface 5 is down.

Jun 29 10:05:06 [localhost:netif.linkDown:info]: Ethernet e0M: Link down, check cable.

Jun 29 10:05:07 [localhost:snmp.link.up:info]: Interface 6 is up

Jun 29 10:05:07 [localhost:netif.linkUp:info]: Ethernet e0P: Link up.

Jun 29 10:05:08 [localhost:snmp.link.down:info]: Interface 1 is down.

Jun 29 10:05:08 [localhost:netif.linkDown:info]: Ethernet e0a: Link down, check cable.

Jun 29 10:05:08 [localhost:snmp.link.down:info]: Interface 2 is down.

Jun 29 10:05:08 [localhost:netif.linkDown:info]: Ethernet e0b: Link down, check cable.

Jun 29 10:05:08 [localhost:snmp.link.down:info]: Interface 4 is down.

Jun 29 10:05:08 [localhost:netif.linkDown:info]: Ethernet e0f: Link down, check cable.

Sun Jun 29 14:56:51 GMT [localhost:netif.linkDown:info]: Ethernet Wrench Port: Link down, check cable. 

Sun Jun 29 14:56:54 GMT [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system 

Sun Jun 29 14:56:56 GMT [localhost:wafl.memory.status:info]: 15294MB of memory is currently available for the WAFL file system. 

Sun Jun 29 14:56:56 GMT [localhost:dcs.framework.enabled:info]: The DCS framework is enabled on this node. 

Sun Jun 29 14:56:56 GMT [localhost:snmp.link.up:info]: Interface 6 is up 

Sun Jun 29 14:56:56 GMT [localhost:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jun 29 14:56:57 GMT [localhost:snmp.link.up:info]: Interface 3 is up 

Sun Jun 29 14:56:57 GMT [localhost:netif.linkUp:info]: Ethernet e0e: Link up. 

Sun Jun 29 14:57:01 GMT [localhost:snmp.link.down:info]: Interface 5 is down. 

Sun Jun 29 14:57:01 GMT [localhost:netif.linkDown:info]: Ethernet e0M: Link down, check cable. 

Sun Jun 29 14:57:03 GMT [localhost:snmp.link.down:info]: Interface 1 is down. 

Sun Jun 29 14:57:03 GMT [localhost:netif.linkDown:info]: Ethernet e0a: Link down, check cable. 

Sun Jun 29 14:57:03 GMT [localhost:snmp.link.down:info]: Interface 2 is down. 

Sun Jun 29 14:57:03 GMT [localhost:netif.linkDown:info]: Ethernet e0b: Link down, check cable. 

Sun Jun 29 14:57:03 GMT [localhost:snmp.link.down:info]: Interface 4 is down. 

Sun Jun 29 14:57:03 GMT [localhost:netif.linkDown:info]: Ethernet e0f: Link down, check cable. 

Sun Jun 29 14:58:49 GMT [localhost:snmp.link.up:info]: Interface 7 is up 

Sun Jun 29 14:58:49 GMT [localhost:snmp.link.up:info]: Interface 8 is up 

Sun Jun 29 14:58:49 GMT [localhost:mgr.boot.reason_ok:notice]: System rebooted after a reboot command. 

Sun Jun 29 14:58:49 GMT [localhost:callhome.reboot.reboot:info]: Call home for REBOOT (reboot command) 

Sun Jun 29 14:58:50 GMT [localhost:snmp.link.down:info]: Interface 6 is down. 

Sun Jun 29 14:58:50 GMT [localhost:netif.linkInfo:info]: Ethernet e0P: Link configured down. 

Sun Jun 29 14:58:54 GMT [localhost:snmp.link.up:info]: Interface 6 is up 

Sun Jun 29 14:58:54 GMT [localhost:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jun 29 14:59:10 GMT [localhost:zapi.sf.up.ready:info]: ZAPI: system node stable after startup. 

Sun Jun 29 14:59:22 GMT [acp-vfiler@localhost:acp.locked.wrench.port.up:info]: The on-board locked wrench port is up. 

Sun Jun 29 15:04:57 GMT [localhost:netif.linkDown:info]: Ethernet Wrench Port: Link down, check cable. 

Sun Jun 29 15:04:59 GMT [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system 

Sun Jun 29 15:04:59 GMT [localhost:config.noPartnerDisks:CRITICAL]: No disks were detected for the partner; this node will be unable to takeover correctly 

Sun Jun 29 15:04:59 GMT [localhost:callhome.dsk.config:warning]: Call home for DISK CONFIGURATION ERROR 

Sun Jun 29 15:05:01 GMT [localhost:wafl.memory.status:info]: 15254MB of memory is currently available for the WAFL file system. 

Sun Jun 29 15:05:01 GMT [localhost:dcs.framework.enabled:info]: The DCS framework is enabled on this node. 

Sun Jun 29 15:05:01 GMT [localhost:snmp.link.up:info]: Interface 6 is up 

Sun Jun 29 15:05:01 GMT [localhost:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jun 29 15:05:02 GMT [localhost:fmmb.current.lock.disk:info]: Disk 0a.01.1 is a local HA mailbox disk. 

Sun Jun 29 15:05:02 GMT [localhost:fmmb.current.lock.disk:info]: Disk 0a.01.2 is a local HA mailbox disk. 

Sun Jun 29 15:05:02 GMT [localhost:fmmb.instStat.change:info]: normal mailbox instance on local side. 

Sun Jun 29 15:05:02 GMT [localhost:fmmb.instStat.change:info]: no mailbox instance on partner side. 

Sun Jun 29 15:05:02 GMT [localhost:raid.cksum.replay.summary:info]: Replayed 0 checksum blocks. 

Sun Jun 29 15:05:02 GMT [localhost:raid.stripe.replay.summary:info]: Replayed 0 stripes. 

Sun Jun 29 15:05:02 GMT [localhost:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '71fc60fc-0892-43ed-943c-cc1e327dc760' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting. 

Sun Jun 29 15:05:02 GMT [localhost:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '71fc60fc-0892-43ed-943c-cc1e327dc760' was built in 42 msec, after scanning 12 inodes and restarting 19 times with a final result of success. 

Sun Jun 29 15:05:03 GMT [localhost:cf.fm.launch:info]: Launching failover monitor 

Sun Jun 29 15:05:03 GMT [localhost:cf.fm.notkoverClusterDisable:warning]: Failover monitor: takeover disabled (restart) 

Sun Jun 29 15:05:03 GMT [localhost:tar.csum.match:info]: Stored checksum matches, not extracting local://mnt/prestage/mroot.tgz. 

Sun Jun 29 15:05:03 GMT [localhost:tar.csum.match:info]: Stored checksum matches, not extracting local://mnt/prestage/pmroot.tgz. 

Sun Jun 29 15:05:03 GMT [localhost:fcmon.status:info]: FCMON is running 

Sun Jun 29 15:05:03 GMT [localhost:zapi.sf.up.ready:info]: ZAPI: system node stable after startup. 

Sun Jun 29 15:05:03 GMT [localhost:dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disks. 

Sun Jun 29 15:05:03 GMT [localhost:cf.fsm.backupMailboxError:warning]: Failover monitor: partner mailbox error detected. 

Sun Jun 29 15:05:03 GMT [localhost:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of partner disabled (Controller Failover takeover disabled). 

Sun Jun 29 15:05:03 GMT [localhost:cf.fm.notkoverBadMbox:warning]: Failover monitor: uninitialized backup mailbox data detected 

Sun Jun 29 15:05:03 GMT [localhost:snmp.link.down:info]: Interface 6 is down. 

Sun Jun 29 15:05:03 GMT [localhost:netif.linkInfo:info]: Ethernet e0P: Link configured down. 

Sun Jun 29 15:05:03 GMT [localhost:reg.transaction.commitFail:warning]: registry: Cannot commit transaction in 'Periodic config update'. Error: Invalid argument (name=options.system.hostname) (value=TRN1NETAPPA)  

Sun Jun 29 15:05:03 GMT [localhost:mgr.opsmgr.autoreg.norec:warning]: Data ONTAP could not perform automatic registration for OnCommand Unified Manager because Data ONTAP could not find SRV records for the server or because the server is not located on this subnet. 

Sun Jun 29 15:05:04 GMT [localhost:haosc.config.unknown.ls:ALERT]: Cannot determine whether the configuration should be stand-alone or HA. Chassis is in ha configuration, controller is in ha configuration, and the HA mode is disabled. 

Sun Jun 29 15:05:04 GMT [localhost:snmp.link.up:info]: Interface 3 is up 

Sun Jun 29 15:05:04 GMT [localhost:netif.linkUp:info]: Ethernet e0e: Link up. 

Sun Jun 29 15:05:04 GMT [localhost:mgr.boot.disk_done:info]: NetApp Release 8.2.1 7-Mode boot complete. Last disk update written at Sun Jun 29 14:54:01 GMT 2014  

Sun Jun 29 15:05:04 GMT [localhost:perf.archive.start:info]: Performance archiver started. Sampling 33 objects and 518 counters. 

Sun Jun 29 15:05:04 GMT [localhost:snmp.link.up:info]: Interface 7 is up 

Sun Jun 29 15:05:04 GMT [localhost:snmp.link.up:info]: Interface 8 is up 

Sun Jun 29 15:05:04 GMT [localhost:mgr.boot.reason_ok:notice]: System rebooted after a reboot command. 

Sun Jun 29 15:05:04 GMT [localhost:callhome.reboot.reboot:info]: Call home for REBOOT (reboot command) 

Sun Jun 29 15:05:04 GMT [localhost:replication.upgrade.complete:info]: Post-upgrade operations for SnapMirror and SnapVault are complete. 

Sun Jun 29 15:05:04 GMT [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing. 

Sun Jun 29 15:05:04 GMT [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing. 

Sun Jun 29 15:05:04 GMT [localhost:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing. 

Sun Jun 29 15:05:05 GMT [localhost:ip.drd.vfiler.info:info]: Although vFiler units are licensed, the routing daemon runs in the default IP space only. 

Sun Jun 29 15:05:06 GMT [localhost:snmp.link.down:info]: Interface 5 is down. 

Sun Jun 29 15:05:06 GMT [localhost:netif.linkDown:info]: Ethernet e0M: Link down, check cable. 

Sun Jun 29 15:05:07 GMT [localhost:snmp.link.up:info]: Interface 6 is up 

Sun Jun 29 15:05:07 GMT [localhost:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jun 29 15:05:08 GMT [localhost:snmp.link.down:info]: Interface 1 is down. 

Sun Jun 29 15:05:08 GMT [localhost:netif.linkDown:info]: Ethernet e0a: Link down, check cable. 

Sun Jun 29 15:05:08 GMT [localhost:snmp.link.down:info]: Interface 2 is down. 

Sun Jun 29 15:05:08 GMT [localhost:netif.linkDown:info]: Ethernet e0b: Link down, check cable. 

Sun Jun 29 15:05:08 GMT [localhost:snmp.link.down:info]: Interface 4 is down. 

Sun Jun 29 15:05:08 GMT [localhost:netif.linkDown:info]: Ethernet e0f: Link down, check cable. 

Sun Jun 29 15:05:16 GMT [localhost:tar.csum.match:info]: Stored checksum matches, not extracting /mroot_late.tgz. 

Sun Jun 29 15:05:16 GMT [localhost:tar.csum.match:info]: Stored checksum matches, not extracting /platform/pmroot_late.tgz. 

Sun Jun 29 15:05:17 GMT [localhost:ha.takeoverImpNotDef:error]: Takeover of the partner node is impossible due to reason Controller Failover takeover disabled. 

Sun Jun 29 15:05:28 GMT [localhost:sp.notConfigured:warning]: The system's Service Processor (SP) is not configured. Use the 'sp setup' command to configure it. 

Sun Jun 29 15:05:28 GMT [localhost:sp.network.link.down:warning]: Service Processor (SP) network port link down due to cable or network errors. 

Sun Jun 29 15:05:30 GMT [localhost:splog.running.normally:info]: Process splogd is operating normally. 

Sun Jun 29 15:05:35 GMT [acp-vfiler@localhost:acp.locked.wrench.port.up:info]: The on-board locked wrench port is up. 

Sun Jun 29 15:05:37 GMT [localhost:snmp.warmstart.trap:info]: SNMP daemon was reinitialized with no configuration changes. 

Login incorrect

Password:

Login incorrect

Password:

> Sun Jun 29 10:07:10 EST [localhost:console_login_mgr:info]: root logged in from console

>

> cf

Controller is in Non-HA mode.
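A boot log this long buries the few lines that matter (config.noPartnerDisks, the cf.fsm.* warnings, haosc.config.unknown.ls). As a rough aid, here is a Python sketch that keeps only messages of severity warning or higher. It assumes the "[source:event.name:severity]: message" layout seen in the paste above; important_events and the severity set are my own choices for illustration, not part of any NetApp tool.

```python
import re

# Captures "[source:event.name:severity]: message" as seen in the paste above.
BRACKET = re.compile(r"\[([^\]]+)\]:\s*(.*)")

# Severities treated as worth reading first (taken from those seen in the log).
IMPORTANT = {"warning", "error", "critical", "alert"}


def important_events(log_text):
    """Return (severity, event, message) tuples for the noteworthy lines."""
    events = []
    for line in log_text.splitlines():
        match = BRACKET.search(line)
        if not match:
            continue  # console chatter without an event tag
        parts = match.group(1).rsplit(":", 2)
        if len(parts) != 3:
            continue
        _source, event, severity = parts
        if severity.lower() in IMPORTANT:
            events.append((severity.lower(), event, match.group(2).strip()))
    return events


sample = """\
Sun Jun 29 15:04:59 GMT [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system
Sun Jun 29 15:04:59 GMT [localhost:config.noPartnerDisks:CRITICAL]: No disks were detected for the partner; this node will be unable to takeover correctly
Sun Jun 29 15:05:03 GMT [localhost:cf.fsm.backupMailboxError:warning]: Failover monitor: partner mailbox error detected.
"""

for severity, event, message in important_events(sample):
    print(f"{severity:8} {event}: {message}")
```

On the log in this thread, the "No disks were detected for the partner" line would be among the first things shown, which fits the shelf cabling fault that turned out to be the cause.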
