ONTAP Discussions

Random FAS3210 Reboots.. Scripts?!

TCEDRYAN655

Hi Everyone,

Just started at a new gig with a clustered FAS3210 running ONTAP 8.1.4.

So, get this: the day before I started, both filers randomly rebooted 1 minute apart. I'll copy and paste the logs pulled from the time of boot. I've never seen this before, and I can't find any trace of a user-initiated reboot or of a script scheduled to do one.

Check it out:

vFiler1 (First to reboot)

[Paris1:iscsi.notice:notice]: ISCSI: New session from initiator iqn.1991-05.com.microsoft:wynn.mydomainname.com at IP addr 10.250.175.41 

Sun Jul 27 13:13:34 EDT [Paris1:iscsi.notice:notice]: ISCSI: New session from initiator iqn.1991-05.com.microsoft:cosmo.mydomainname.com at IP addr 10.250.175.60 

Sun Jul 27 13:13:49 EDT [Paris1:iscsi.notice:notice]: ISCSI: New session from initiator iqn.1991-05.com.microsoft:montecarlo.mydomainname.com at IP addr 10.250.175.26 

Sun Jul 27 17:28:33 GMT [Paris1: rc:info]: relog syslog Sun Jul 27 13:26:33 EDT [Paris1:sas.port.down:debug]: SAS port "0a" went down. 

Sun Jul 27 17:28:33 GMT [Paris1: rc:info]: relog syslog Sun Jul 27 13:26:33 EDT [Paris1:sas.port.down:debug]: SAS port "0b" went down. 

Sun Jul 27 17:28:40 GMT [Paris1: ddns_loop:info]: Lookup of Paris1.mydomainname.com failed with DNS server 192.168.7.20: Connection timed out.

Sun Jul 27 13:28:22 EDT [Paris1:diskown.isEnabled:info]: software ownership has been enabled for this system 

Sun Jul 27 13:28:22 EDT [Paris1:dcs.framework.enabled:info]: The DCS framework is enabled on this node. 

Sun Jul 27 13:28:23 EDT [Paris1:wafl.memory.status:info]: 2215MB of memory is currently available for the WAFL file system. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.current.lock.disk:info]: Disk 0a.20.0 is a local HA mailbox disk. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.current.lock.disk:info]: Disk 0a.20.1 is a local HA mailbox disk. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.instStat.change:info]: normal mailbox instance on local side. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.current.lock.disk:info]: Disk 0b.00.1 is a partner HA mailbox disk. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.current.lock.disk:info]: Disk 0b.00.11 is a partner HA mailbox disk. 

Sun Jul 27 13:28:23 EDT [Paris1:fmmb.instStat.change:info]: normal mailbox instance on partner side. 

Sun Jul 27 13:28:23 EDT [Paris1:cf.fm.partner:info]: Failover monitor: partner 'Paris2' 

Sun Jul 27 13:28:23 EDT [Paris1:coredump.host.spare.none:info]: No sparecore disk was found for host 0. 

Sun Jul 27 13:28:23 EDT [Paris1:raid.vol.replay.nvram:info]: Performing raid replay on volume(s) 

Sun Jul 27 13:28:23 EDT [Paris1:raid.cksum.replay.summary:info]: Replayed 0 checksum blocks. 

Sun Jul 27 13:28:24 EDT [Paris1:raid.stripe.replay.summary:info]: Replayed 0 stripes. 

Sun Jul 27 13:28:24 EDT [Paris1:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr1' UUID '83f25832-4a8c-11e0-abdd-00a098144052' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting. 

Sun Jul 27 13:28:24 EDT [Paris1:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID 'ed346ef0-0037-11dc-a9de-00a098144052' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting. 

Sun Jul 27 13:28:24 EDT [Paris1:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID 'ed346ef0-0037-11dc-a9de-00a098144052' was built in 57 msec, after scanning 10 inodes and restarting 22 times with a final result of success. 

Sun Jul 27 13:28:24 EDT [Paris1:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr1' UUID '83f25832-4a8c-11e0-abdd-00a098144052' was built in 467 msec, after scanning 49 inodes and restarting 24 times with a final result of success. 

Sun Jul 27 13:28:24 EDT [Paris1:netif.linkUp:info]: Ethernet c0b: Link up. 

Sun Jul 27 13:28:25 EDT [Paris1:shelf.config.mpha:info]: All attached storage on the system is multi-pathed HA. 

Sun Jul 27 13:28:25 EDT [Paris1:cf.fm.launch:info]: Launching failover monitor 

Sun Jul 27 13:28:25 EDT [Paris1:cf.fm.partner:info]: Failover monitor: partner 'Paris2' 

Sun Jul 27 13:28:25 EDT [Paris1:netif.linkUp:info]: Ethernet e0M: Link up. 

Sun Jul 27 13:28:25 EDT [Paris1:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jul 27 13:28:25 EDT [Paris1:cf.fsm.takeoverOfPartnerDisabled:notice]: Failover monitor: takeover of Paris2 disabled (partner booting). 

Sun Jul 27 13:28:26 EDT [Paris1:netif.linkUp:info]: Ethernet e1b: Link up. 

Sun Jul 27 13:28:26 EDT [Paris1:netif.linkUp:info]: Ethernet e1c: Link up. 

Sun Jul 27 13:28:26 EDT [Paris1:netif.linkUp:info]: Ethernet e0b: Link up. 

Sun Jul 27 13:28:26 EDT [Paris1:netif.linkUp:info]: Ethernet e1d: Link up. 

Sun Jul 27 13:28:27 EDT [Paris1:netif.linkUp:info]: Ethernet e1a: Link up. 

Sun Jul 27 13:28:27 EDT [Paris1:netif.linkUp:info]: Ethernet e0a: Link up. 

Sun Jul 27 13:28:27 EDT [iwarp-vfiler@Paris1:ctrl.rdma.heartBeat:info]: High-availability interconnect status: Starting heartbeat to 192.168.2.69 

Sun Jul 27 13:28:29 EDT [Paris1:vdisk.onlineComplete:info]: Local LUN(s) online completed. 

Sun Jul 27 13:28:29 EDT [Paris1:tar.csum.match:info]: Stored checksum matches, not extracting local://tmp/prestage/mroot.tgz. 

Sun Jul 27 13:28:29 EDT [Paris1:tar.csum.match:info]: Stored checksum matches, not extracting local://tmp/prestage/pmroot.tgz. 

Sun Jul 27 13:28:29 EDT [Paris1:fcmon.status:info]: FCMON is running 

Sun Jul 27 13:28:29 EDT [Paris1:dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives 

Sun Jul 27 13:28:30 EDT [Paris1:netif.linkInfo:info]: Ethernet e0P: Link configured down. 

Sun Jul 27 13:28:30 EDT [Paris1:netif.linkInfo:info]: Ethernet e1a: Link being reconfigured. 

Sun Jul 27 13:28:30 EDT [Paris1:netif.linkInfo:info]: Ethernet e1b: Link being reconfigured. 

Sun Jul 27 13:28:30 EDT [Paris1:perf.archive.start:info]: Performance archiver started. Sampling 29 objects and 421 counters. 

Sun Jul 27 13:28:31 EDT [Paris1:pvif.switchLink:warning]: LAN-Trunk: switching to e1c 

Sun Jul 27 13:28:33 EDT [Paris1:iscsi.service.startup:info]: iSCSI service startup 

Sun Jul 27 13:28:33 EDT [Paris1:scsitarget.vtic.up:notice]: The VTIC is up. 

Sun Jul 27 13:28:33 EDT [Paris1:cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of Paris1 by Paris2 enabled 

Sun Jul 27 13:28:34 EDT [Paris1:cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of Paris2 enabled 

Sun Jul 27 13:28:34 EDT [Paris1:netif.linkUp:info]: Ethernet e1a: Link up. 

Sun Jul 27 13:28:35 EDT [Paris1:netif.linkUp:info]: Ethernet e1b: Link up. 

Sun Jul 27 13:28:46 EDT [Paris1:iscsi.notice:notice]: ISCSI: New session from initiator iqn.1991-05.com.microsoft:wynn.mydomainname.com at IP addr 10.250.175.41 

Sun Jul 27 13:28:48 EDT [Paris1:mgr.opsmgr.autoreg.norec:warning]: Data ONTAP could not perform automatic registration for OnCommand Unified Manager because Data ONTAP could not find SRV records for the server or because the server is not located on this subnet. 

Sun Jul 27 13:28:48 EDT [Paris1:mgr.boot.disk_done:info]: NetApp Release 8.1.4 7-Mode boot complete. Last disk update written at Sun Jul 27 17:26:27 GMT 2014  

Sun Jul 27 13:28:48 EDT [Paris1:cf.hwassist.notifyEnableOn:info]: HA hw_assist: hw_assist functionality on the partner node has been enabled by the user. 

Sun Jul 27 13:28:49 EDT [Paris1:mgr.boot.reason_ok:notice]: System rebooted after power-on.  

Sun Jul 27 13:28:49 EDT [Paris1:callhome.reboot.poweron:info]: Call home for REBOOT (power on) 

Sun Jul 27 13:28:49 EDT [Paris1:cifs.startup.local.succeeded:info]: CIFS: CIFS local server is running. 

Sun Jul 27 13:28:49 EDT [Paris1:ip.drd.vfiler.info:info]: Although vFiler units are licensed, the routing daemon runs in the default IP space only. 

Sun Jul 27 13:28:49 EDT [Paris1:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing. 

Sun Jul 27 13:28:51 EDT [Paris1:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jul 27 13:29:02 EDT [Paris1:tar.csum.match:info]: Stored checksum matches, not extracting /mroot_late.tgz. 

Sun Jul 27 13:29:02 EDT [Paris1:tar.csum.match:info]: Stored checksum matches, not extracting /platform/pmroot_late.tgz. 

Sun Jul 27 13:29:02 EDT [Paris1:cf.hwassist.DefaultPrtnrAddr:notice]: The system automatically chose 192.168.4.49 as the hardware-assist partner address. If you want to use a different IP address, change it using the command 'options cf.hw_assist.partner.address'. 

Sun Jul 27 13:29:12 EDT [Paris1:nbt.nbns.registrationComplete:info]: NBT: All CIFS name registrations have completed for the local server. 

Sun Jul 27 13:29:19 EDT [acp-vfiler@Paris1:acp.locked.wrench.port.up:info]: The on-board locked wrench port is up. 

Sun Jul 27 13:29:29 EDT [Paris1:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954293G16R1) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status. 

Sun Jul 27 13:29:29 EDT [Paris1:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954293G16R1) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status. 

Sun Jul 27 13:29:30 EDT [Paris1:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954292G1C0T) shelf 20 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status. 

Sun Jul 27 13:29:30 EDT [Paris1:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954292G1C0T) shelf 20 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status. 

Sun Jul 27 13:29:49 EDT [Paris1:callhome.performance.snap:info]: Call home for PERFORMANCE SNAPSHOT 

Sun Jul 27 13:31:56 EDT [Paris1:iscsi.notice:notice]: ISCSI: New session from initiator iqn.1991-05.com.microsoft:montecarlo.mydomainname.com at IP addr 10.250.175.26 

Sun Jul 27 13:32:03 EDT [Paris1:cf.hwassist.socBindFailed:error]: hw_assist: bind failed to port 4444 on IP address 192.168.4.45. Error 49 

Filer2 from time of reboot:

Sun Jul 27 13:00:00 EDT [Paris2:kern.uptime.filer:info]:   1:00pm up 46 days, 12:54 0 NFS ops, 83163294 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 13:00:35 EDT [Paris2:wafl.quota.qtree.exceeded:notice]: tid 6: tree quota exceeded on volume SharedFiles3. Additional warnings will be suppressed for approximately 60 minutes or until a 'quota resize' is performed. 

Sun Jul 27 13:05:24 EDT [Paris2: Gb_Enet/e1d:error]: arp: 00:15:17:9f:af:28 attempts to modify permanent entry for 192.168.4.47 ( 00:a0:98:14:4c:1c ) on LAN-Trunk

Sun Jul 27 13:19:17 EDT [Paris2: Gb_Enet/e1d:error]: arp: 00:15:17:9f:af:28 attempts to modify permanent entry for 192.168.4.47 ( 00:a0:98:14:4c:1c ) on LAN-Trunk

Sun Jul 27 13:23:55 EDT [Paris2: Gb_Enet/e1d:error]: arp: 00:15:17:9f:af:28 attempts to modify permanent entry for 192.168.4.47 ( 00:a0:98:14:4c:1c ) on LAN-Trunk

Sun Jul 27 17:29:00 GMT [localhost: rc:notice]: The system was down for 156 seconds

Sun Jul 27 17:29:05 GMT [Paris2: rc:info]: relog syslog Sun Jul 27 13:26:12 EDT [Paris2:sas.port.down:debug]: SAS port "0a" went down. 

Sun Jul 27 17:29:05 GMT [Paris2: rc:info]: relog syslog Sun Jul 27 13:26:12 EDT [Paris2:sas.port.down:debug]: SAS port "0b" went down. 

Sun Jul 27 17:29:12 GMT [Paris2: ddns_loop:info]: Lookup of Paris2.mydomainname.com failed with DNS server 192.168.7.20: Connection timed out.

Sun Jul 27 17:28:48 GMT [Paris2:diskown.isEnabled:info]: software ownership has been enabled for this system 

Sun Jul 27 17:28:48 GMT [Paris2:dcs.framework.enabled:info]: The DCS framework is enabled on this node. 

Sun Jul 27 17:28:49 GMT [Paris2:wafl.memory.status:info]: 2214MB of memory is currently available for the WAFL file system. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.current.lock.disk:info]: Disk 0b.00.1 is a local HA mailbox disk. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.current.lock.disk:info]: Disk 0b.00.11 is a local HA mailbox disk. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.instStat.change:info]: normal mailbox instance on local side. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.current.lock.disk:info]: Disk 0a.20.0 is a partner HA mailbox disk. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.current.lock.disk:info]: Disk 0a.20.1 is a partner HA mailbox disk. 

Sun Jul 27 17:28:50 GMT [Paris2:fmmb.instStat.change:info]: normal mailbox instance on partner side. 

Sun Jul 27 17:28:50 GMT [Paris2:cf.fm.partner:info]: Failover monitor: partner 'Paris1' 

Sun Jul 27 17:28:50 GMT [Paris2:coredump.host.spare.none:info]: No sparecore disk was found for host 0. 

Sun Jul 27 17:28:50 GMT [Paris2:raid.vol.replay.nvram:info]: Performing raid replay on volume(s) 

Sun Jul 27 17:28:50 GMT [Paris2:raid.cksum.replay.summary:info]: Replayed 0 checksum blocks. 

Sun Jul 27 17:28:51 GMT [Paris2:raid.stripe.replay.summary:info]: Replayed 0 stripes. 

Sun Jul 27 17:28:51 GMT [Paris2:netif.linkUp:info]: Ethernet c0b: Link up. 

Sun Jul 27 17:28:51 GMT [Paris2:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr1' UUID '23ac397e-4a8d-11e0-ab3f-00a098144c1a' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting. 

Sun Jul 27 17:28:51 GMT [Paris2:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '4aceaefc-4a73-11e0-96e6-00a098144c1a' was built in 0 msec, after scanning 0 inodes and restarting -1 times with a final result of starting. 

Sun Jul 27 17:28:51 GMT [Paris2:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr0' UUID '4aceaefc-4a73-11e0-96e6-00a098144c1a' was built in 35 msec, after scanning 12 inodes and restarting 17 times with a final result of success. 

Sun Jul 27 17:28:51 GMT [Paris2:netif.linkUp:info]: Ethernet e0M: Link up. 

Sun Jul 27 17:28:51 GMT [Paris2:shelf.config.mpha:info]: All attached storage on the system is multi-pathed HA. 

Sun Jul 27 17:28:52 GMT [Paris2:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jul 27 17:28:52 GMT [Paris2:netif.linkUp:info]: Ethernet e0b: Link up. 

Sun Jul 27 17:28:52 GMT [Paris2:netif.linkUp:info]: Ethernet e1b: Link up. 

Sun Jul 27 17:28:52 GMT [Paris2:netif.linkUp:info]: Ethernet e1a: Link up. 

Sun Jul 27 17:28:53 GMT [Paris2:wafl.aggr.btiddb.build:info]: Buftreeid database for aggregate 'aggr1' UUID '23ac397e-4a8d-11e0-ab3f-00a098144c1a' was built in 1927 msec, after scanning 41 inodes and restarting 20 times with a final result of success. 

Sun Jul 27 17:28:53 GMT [Paris2:netif.linkUp:info]: Ethernet e0a: Link up. 

Sun Jul 27 17:28:53 GMT [Paris2:wafl.root.overwritesUnsafe:warning]: Root volume vol0 does not protect overwrites in space reserved files. 

Sun Jul 27 17:28:53 GMT [Paris2:netif.linkUp:info]: Ethernet e1d: Link up. 

Sun Jul 27 17:28:53 GMT [Paris2:netif.linkUp:info]: Ethernet e1c: Link up. 

Sun Jul 27 17:28:55 GMT [Paris2:cf.fm.launch:info]: Launching failover monitor 

Sun Jul 27 17:28:55 GMT [Paris2:cf.fm.partner:info]: Failover monitor: partner 'Paris1' 

Sun Jul 27 17:28:55 GMT [iwarp-vfiler@Paris2:ctrl.rdma.heartBeat:info]: High-availability interconnect status: Starting heartbeat to 192.168.2.155 

Sun Jul 27 17:29:00 GMT [Paris2:tar.csum.match:info]: Stored checksum matches, not extracting local://tmp/prestage/mroot.tgz. 

Sun Jul 27 17:29:00 GMT [Paris2:tar.csum.match:info]: Stored checksum matches, not extracting local://tmp/prestage/pmroot.tgz. 

Sun Jul 27 17:29:01 GMT [Paris2:fcmon.status:info]: FCMON is running 

Sun Jul 27 17:29:01 GMT [Paris2:fci.config.missing:warning]: Fibre Channel adapter 0c is configured in the boot environment but the on-disk configuration information is missing. 

Sun Jul 27 17:29:01 GMT [Paris2:fci.config.missing:warning]: Fibre Channel adapter 0d is configured in the boot environment but the on-disk configuration information is missing. 

Sun Jul 27 17:29:01 GMT [Paris2:dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives 

Sun Jul 27 17:29:01 GMT [Paris2:netif.linkInfo:info]: Ethernet e0P: Link configured down. 

Sun Jul 27 17:29:01 GMT [Paris2:cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of Paris1 enabled 

Sun Jul 27 17:29:02 GMT [Paris2:scsitarget.vtic.up:notice]: The VTIC is up. 

Sun Jul 27 17:29:02 GMT [Paris2:netif.linkInfo:info]: Ethernet e1a: Link being reconfigured. 

Sun Jul 27 17:29:02 GMT [Paris2:netif.linkInfo:info]: Ethernet e1b: Link being reconfigured. 

Sun Jul 27 17:29:02 GMT [Paris2:cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of Paris2 by Paris1 enabled 

Sun Jul 27 17:29:03 GMT [Paris2:perf.archive.start:info]: Performance archiver started. Sampling 29 objects and 421 counters. 

Sun Jul 27 17:29:04 GMT [Paris2:pvif.switchLink:warning]: LAN-Trunk: switching to e1c 

Sun Jul 27 17:29:04 GMT [Paris2:netif.linkUp:info]: Ethernet e0M: Link up. 

Sun Jul 27 17:29:05 GMT [Paris2:iscsi.service.startup:info]: iSCSI service startup 

Sun Jul 27 17:29:07 GMT [Paris2:netif.linkUp:info]: Ethernet e1b: Link up. 

Sun Jul 27 17:29:07 GMT [Paris2:netif.linkUp:info]: Ethernet e1a: Link up. 

Sun Jul 27 17:29:21 GMT [Paris2:mgr.opsmgr.autoreg.norec:warning]: Data ONTAP could not perform automatic registration for OnCommand Unified Manager because Data ONTAP could not find SRV records for the server or because the server is not located on this subnet. 

Sun Jul 27 17:29:21 GMT [Paris2:cmds.sysconf.logErr:error]: sysconfig: Unless directed by NetApp Global Services volume vol0 should have the volume option create_ucode set to On. . 

Sun Jul 27 17:29:21 GMT [Paris2:callhome.sys.config:error]: Call home for SYSTEM CONFIGURATION WARNING 

Sun Jul 27 17:29:21 GMT [Paris2:mgr.boot.disk_done:info]: NetApp Release 8.1.4 7-Mode boot complete. Last disk update written at Sun Jul 27 17:26:11 GMT 2014  

Sun Jul 27 17:29:21 GMT [Paris2:cf.hwassist.notifyEnableOn:info]: HA hw_assist: hw_assist functionality on the partner node has been enabled by the user. 

Sun Jul 27 17:29:21 GMT [Paris2:mgr.boot.reason_ok:notice]: System rebooted after power-on.  

Sun Jul 27 17:29:21 GMT [Paris2:callhome.reboot.poweron:info]: Call home for REBOOT (power on) 

Sun Jul 27 17:29:21 GMT [Paris2:cifs.startup.local.succeeded:info]: CIFS: CIFS local server is running. 

Sun Jul 27 17:29:21 GMT [Paris2:cf.hwassist.hwasstActive:info]: hw_assist: hw_assist functionality is active on IP address: 192.168.4.49 port: 4444 

Sun Jul 27 17:29:21 GMT [Paris2:ip.drd.vfiler.info:info]: Although vFiler units are licensed, the routing daemon runs in the default IP space only. 

Sun Jul 27 17:29:21 GMT [Paris2:net.if.mgmt.sameSubnet:warning]: ifconfig: IP address '192.168.4.47' configured on dedicated management port 'e0M' is on the same subnet as IP address '192.168.4.49' configured on data port LAN-Trunk. Management IP addresses must be on dedicated management subnets. 

Sun Jul 27 17:29:22 GMT [Paris2:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing. 

Sun Jul 27 17:29:23 GMT [Paris2:netif.linkUp:info]: Ethernet e0P: Link up. 

Sun Jul 27 17:29:31 GMT [Paris2:nbt.nbns.socketError:error]: NBT: Cannot send on NBNS socket to WINS server. Error 0x41: No route to host. 

Sun Jul 27 17:29:31 GMT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 17:29:31 GMT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 17:29:31 GMT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 17:29:35 GMT [Paris2:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954293G16R1) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status. 

Sun Jul 27 17:29:35 GMT [Paris2:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954293G16R1) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status. 

Sun Jul 27 17:29:35 GMT [Paris2:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954292G1C0T) shelf 20 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status. 

Sun Jul 27 17:29:35 GMT [Paris2:ses.status.ACPInfo:info]: DS4243 (S/N SHU0954292G1C0T) shelf 20 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status. 

Sun Jul 27 17:29:40 GMT [Paris2:cf.hwassist.DefaultPrtnrAddr:notice]: The system automatically chose 192.168.4.45 as the hardware-assist partner address. If you want to use a different IP address, change it using the command 'options cf.hw_assist.partner.address'. 

Sun Jul 27 17:29:47 GMT [Paris2:tar.csum.match:info]: Stored checksum matches, not extracting /mroot_late.tgz. 

Sun Jul 27 17:29:47 GMT [Paris2:tar.csum.match:info]: Stored checksum matches, not extracting /platform/pmroot_late.tgz. 

Sun Jul 27 17:29:50 GMT [Paris2:net.if.mgmt.sameSubnet:warning]: ifconfig: IP address '192.168.4.47' configured on dedicated management port 'e0M' is on the same subnet as IP address '192.168.4.49' configured on data port LAN-Trunk. Management IP addresses must be on dedicated management subnets. 

Sun Jul 27 17:29:52 GMT [acp-vfiler@Paris2:acp.locked.wrench.port.up:info]: The on-board locked wrench port is up. 

Sun Jul 27 13:30:14 EDT [Paris2:net.if.mgmt.sameSubnet:warning]: ifconfig: IP address '192.168.4.47' configured on dedicated management port 'e0M' is on the same subnet as IP address '192.168.4.49' configured on data port LAN-Trunk. Management IP addresses must be on dedicated management subnets. 

Sun Jul 27 13:30:21 EDT [Paris2:callhome.performance.snap:info]: Call home for PERFORMANCE SNAPSHOT 

Sun Jul 27 13:30:23 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 139. Error 0x41: No route to host. 

Sun Jul 27 13:30:23 EDT [Paris2:auth.dc.NoDCConnection:error]: AUTH: Unable to connect to any Domain Controller for the HORIZON domain. Use 'cifs domaininfo' for a listing of DCs tried. 

Sun Jul 27 13:31:06 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 139. Error 0x23: Resource temporarily unavailable. 

Sun Jul 27 13:31:07 EDT [Paris2:wafl.quota.qtree.exceeded:notice]: tid 6: tree quota exceeded on volume SharedFiles3. Additional warnings will be suppressed for approximately 60 minutes or until a 'quota resize' is performed. 

Sun Jul 27 13:31:26 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 139. Error 0x23: Resource temporarily unavailable. 

Sun Jul 27 13:34:21 EDT [Paris2:net.if.mgmt.sameSubnet:warning]: ifconfig: IP address '192.168.4.47' configured on dedicated management port 'e0M' is on the same subnet as IP address '192.168.4.49' configured on data port LAN-Trunk. Management IP addresses must be on dedicated management subnets. 

Sun Jul 27 13:34:26 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 139. Error 0x23: Resource temporarily unavailable. 

Sun Jul 27 13:37:22 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 445. Error 0x3d: Connection refused. 

Sun Jul 27 13:38:22 EDT [Paris2:nbt.nbss.socketError:error]: NBT: Cannot connect to server 192.168.7.21 over NBSS socket for port 139. Session setup error. Error 0x82: Called name not present. 

Sun Jul 27 13:41:19 EDT [Paris2:cifs.ldap.address.invalid:info]: Could not read the saved AD Lightweight Directory Access Protocol(LDAP) server information. The system will get the AD information by doing AD discovery. 

Sun Jul 27 13:41:19 EDT [Paris2:auth.ldap.trace.LDAPConnection.statusMsg:info]: AUTH: TraceLDAPServer- Starting AD LDAP server address discovery for mydomainname.COM. 

Sun Jul 27 13:41:19 EDT [Paris2:auth.ldap.trace.LDAPConnection.statusMsg:info]: AUTH: TraceLDAPServer- Found 2 AD LDAP server addresses using DNS site query (horizon). 

Sun Jul 27 13:41:19 EDT [Paris2:auth.ldap.trace.LDAPConnection.statusMsg:info]: AUTH: TraceLDAPServer- Found 3 AD LDAP server addresses using generic DNS query. 

Sun Jul 27 13:41:19 EDT [Paris2:auth.ldap.trace.LDAPConnection.statusMsg:info]: AUTH: TraceLDAPServer- AD LDAP server address discovery for mydomainname.COM complete. 3 unique addresses found. 

Sun Jul 27 14:00:00 EDT [Paris2:kern.uptime.filer:info]:   2:00pm up 31 mins, 0 NFS ops, 680 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 14:29:33 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 14:29:37 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 14:29:37 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 14:31:23 EDT [Paris2:wafl.quota.qtree.exceeded:notice]: tid 6: tree quota exceeded on volume SharedFiles3. Additional warnings will be suppressed for approximately 60 minutes or until a 'quota resize' is performed. 

Sun Jul 27 15:00:00 EDT [Paris2:kern.uptime.filer:info]:   3:00pm up  1:31 0 NFS ops, 6703 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 15:29:43 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 15:31:11 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 15:31:15 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 15:31:25 EDT [Paris2:wafl.quota.qtree.exceeded:notice]: tid 6: tree quota exceeded on volume SharedFiles3. Additional warnings will be suppressed for approximately 60 minutes or until a 'quota resize' is performed. 

Sun Jul 27 16:00:00 EDT [Paris2:kern.uptime.filer:info]:   4:00pm up  2:31 0 NFS ops, 9938 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 16:29:49 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 16:31:28 EDT [Paris2:wafl.quota.qtree.exceeded:notice]: tid 6: tree quota exceeded on volume SharedFiles3. Additional warnings will be suppressed for approximately 60 minutes or until a 'quota resize' is performed. 

Sun Jul 27 16:31:29 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 16:31:33 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 17:00:00 EDT [Paris2:kern.uptime.filer:info]:   5:00pm up  3:31 0 NFS ops, 11554 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 17:30:06 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 17:33:07 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 17:33:11 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 18:00:00 EDT [Paris2:kern.uptime.filer:info]:   6:00pm up  4:31 0 NFS ops, 13685 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 18:30:12 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 18:33:28 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 18:33:33 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 19:00:00 EDT [Paris2:kern.uptime.filer:info]:   7:00pm up  5:31 0 NFS ops, 14335 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 19:30:33 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 19:35:06 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 19:35:11 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 20:00:00 EDT [Paris2:kern.uptime.filer:info]:   8:00pm up  6:31 0 NFS ops, 16881 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 20:30:39 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 20:35:31 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 20:35:36 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 21:00:00 EDT [Paris2:kern.uptime.filer:info]:   9:00pm up  7:31 0 NFS ops, 17335 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 21:31:03 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 21:37:08 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 21:37:12 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 22:00:00 EDT [Paris2:kern.uptime.filer:info]:  10:00pm up  8:31 0 NFS ops, 18412 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 22:31:08 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 22:37:13 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 22:37:18 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 23:00:00 EDT [Paris2:kern.uptime.filer:info]:  11:00pm up  9:31 0 NFS ops, 19182 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops  

Sun Jul 27 23:31:14 EDT [Paris2:nbt.WINS.registrationTimeout:info]: NBT: No WINS server are responding. The filer will continue to try to register with WINS. 

Sun Jul 27 23:37:31 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.20 did not respond when the filer attempted to register 10.250.175.30. 

Sun Jul 27 23:37:36 EDT [Paris2:nbt.WINS.registrationFailed:error]: NBT: WINS server 192.168.7.21 did not respond when the filer attempted to register 10.250.175.30. 

Obviously there is more going on with Filer2 than the reboot: I have a NIC on the same subnet as the dedicated management port, and an issue with disk quotas.

I am mostly focused on the reboot though, hoping someone else here has had this happen. For starters, it first lost network connectivity, then the SAS ports, and lastly just rebooted, like something initiated a clean reboot.

In OnCommand, the failover pairing is Healthy and everything else is golden. The same thing appears to have happened about a month earlier as well. Any tips? Thanks!
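
Since the pasted logs mix EDT and GMT timestamps, a small sketch like the one below can help put the events on one clock. It is only an illustration: `to_utc` and `TZ_OFFSETS` are made-up names, not part of any NetApp tooling, and the year 2014 is an assumption taken from the "Last disk update written at" lines, since the syslog stamps themselves carry no year.

```python
from datetime import datetime, timedelta, timezone

# Only the two zone abbreviations that appear in the pasted logs.
TZ_OFFSETS = {"GMT": timedelta(0), "EDT": timedelta(hours=-4)}


def to_utc(stamp, year=2014):
    """Convert a stamp like 'Sun Jul 27 13:26:12 EDT' to an aware UTC datetime.

    The year is an assumption; it is not present in the syslog stamp.
    """
    *date_parts, tz = stamp.split()
    local = datetime.strptime(
        " ".join(date_parts) + f" {year}", "%a %b %d %H:%M:%S %Y"
    )
    # Subtracting the zone offset turns local wall-clock time into UTC.
    return (local - TZ_OFFSETS[tz]).replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    # Values copied from the Paris2 log above.
    last_disk_update = to_utc("Sun Jul 27 17:26:11 GMT")
    sas_ports_down = to_utc("Sun Jul 27 13:26:12 EDT")
    first_boot_message = to_utc("Sun Jul 27 17:28:48 GMT")

    print(sas_ports_down.isoformat())  # 2014-07-27T17:26:12+00:00
    gap = (first_boot_message - last_disk_update).total_seconds()
    print(gap)  # 157.0, close to the 156 seconds Paris2 reported being down
```

On one clock, the SAS "port down" messages on Paris2 land one second after its last disk update, and the gap to the first boot message roughly matches the "system was down for 156 seconds" line.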

1 ACCEPTED SOLUTION

billshaffer

Both nodes report:

Sun Jul 27 13:28:49 EDT [Paris1:mgr.boot.reason_ok:notice]: System rebooted after power-on.  

Sun Jul 27 13:28:49 EDT [Paris1:callhome.reboot.poweron:info]: Call home for REBOOT (power on) 

And Paris1 reports:

Sun Jul 27 13:28:25 EDT [Paris1:cf.fsm.takeoverOfPartnerDisabled:notice]: Failover monitor: takeover of Paris2 disabled (partner booting).

I'd guess the DC lost power - that would explain the network connectivity (switch is down) and the SAS connectivity (disk shelves go down).
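
To check the same thing across a longer messages file, a minimal Python sketch along these lines could pull out just the boot-reason events. It assumes the lines follow the format pasted in this thread. `LINE_RE`, `BOOT_EVENT_PREFIXES`, and `boot_reasons` are hypothetical names, and matching on the `mgr.boot.reason` and `callhome.reboot.` prefixes is an assumption based only on the two event names visible above.

```python
import re

# Matches lines such as:
#   Sun Jul 27 13:28:49 EDT [Paris1:mgr.boot.reason_ok:notice]: System rebooted after power-on.
# and also the "[Paris2: Gb_Enet/e1d:error]" and "[acp-vfiler@Paris1:...]" variants.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \w+) "
    r"\[(?:[\w-]+@)?(?P<host>\w+): ?(?P<event>[\w./]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)

# Assumption: boot-reason events share these prefixes (only
# mgr.boot.reason_ok and callhome.reboot.poweron appear in the thread).
BOOT_EVENT_PREFIXES = ("mgr.boot.reason", "callhome.reboot.")


def boot_reasons(lines):
    """Return (host, timestamp, message) for every boot-reason event found."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("event").startswith(BOOT_EVENT_PREFIXES):
            found.append((m.group("host"), m.group("ts"), m.group("msg")))
    return found


if __name__ == "__main__":
    sample = [
        "Sun Jul 27 13:28:49 EDT [Paris1:mgr.boot.reason_ok:notice]: System rebooted after power-on.  ",
        "Sun Jul 27 13:28:25 EDT [Paris1:netif.linkUp:info]: Ethernet e0M: Link up. ",
        "Sun Jul 27 17:29:21 GMT [Paris2:callhome.reboot.poweron:info]: Call home for REBOOT (power on) ",
    ]
    for host, ts, msg in boot_reasons(sample):
        print(host, ts, msg)
```

Lines that do not match the pattern, such as the first iSCSI line above with its timestamp cut off, are simply skipped.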

Can you confirm?

Bill

3 REPLIES

TCEDRYAN655

That is what I am debating. However, the APC unit they are plugged into doesn't indicate a power loss, and neither do any of the other devices in the same rack, which are attached to a split PDU (2 power sources). The only possible fault would be the APC itself (which isn't monitored, goody). Good lead, Bill!

TCEDRYAN655

Ran through the logs with NetApp Support. The relevant event was: HA Group Notification (REBOOT (power on)) INFO

This means, per NetApp, that the cluster rebooted due to power loss. So Bill, you win!
