Talk and ask questions about NetApp FAS and AFF series unified storage systems. Talk with other members about how to optimize these powerful data storage systems.
Hello NA Community, I had to power off our old FAS2520 for maintenance, and I think all the SSDs of our Flash Pool hit bug 1335350, where they fail after a power cycle once they exceed 70,000 power-on hours; I'm pretty sure ours did. The bug description says to contact support, but this is now an outdated system running an outdated ONTAP version (otherwise I wouldn't have hit the issue), so I can no longer open a support case for it. Has anyone here encountered this, and was there a way to make SSDs that exceed 70k power-on hours work again? I assume updating the FAS2520 now wouldn't help, since the system will be unable to flash new firmware onto SSDs that are already marked as failed/broken. Thank you in advance.
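For anyone trying to judge whether their own drives are near the threshold, a rough upper bound on power-on hours can be estimated from the installation date, assuming the shelf ran 24x7. A back-of-the-envelope sketch (the dates below are made up for illustration):

```python
from datetime import datetime

POWER_ON_LIMIT_HOURS = 70_000  # threshold mentioned for bug 1335350

def power_on_hours(commissioned: datetime, now: datetime) -> float:
    """Upper bound on power-on hours, assuming the system ran continuously."""
    return (now - commissioned).total_seconds() / 3600

# Hypothetical commissioning date -- substitute your own records.
hours = power_on_hours(datetime(2015, 6, 1), datetime(2024, 1, 1))
print(round(hours), hours > POWER_ON_LIMIT_HOURS)
```

This only bounds the real value from above; drives that spent time powered off will be lower, so check actual SMART/ONTAP counters where you still can.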
Hi, in my lab I'm using a single-controller FAS2750 and I'm trying to connect e0a and e0b to an Ethernet switch, but I don't get a link. With the same cable looped from e0a to e0b I do get a link, and likewise the switch shows a link when I connect two of its own ports together, but not when I connect the FAS to the switch. Both the switch and the NetApp show all the SFP information for the connected modules. I have already tried different DAC cables and SFP modules with optical cables. The switch is a Mellanox MSN2010. Did I miss something? Kind regards, Stefan
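For comparing notes, a hedged sketch of the checks I'd run from both ends (assuming ONTAP 9 on the FAS2750 and Onyx on the MSN2010; exact field and port names may differ on your release):

```
::> network port show -node local -port e0a,e0b
::> network device-discovery show                     # CDP/LLDP view of the far end, if the link ever comes up

switch > enable
switch # show interfaces ethernet 1/1 transceiver     # confirm what the switch reads from the module
```

If both sides read the SFP EEPROM but never establish a link, speed/autoneg and FEC mismatches between the port defaults on each side are common suspects.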
Our A300 has 4x DS224C shelves (shelf IDs 0, 1, 2, 3) in a single stack, and the data aggregates are spread across all disks in that stack. During a maintenance window I plan to move shelf ID 3 to a new stack on the same A300, set its new shelf ID to 10, and power on the shelf and controller. Will this impact the data? Is there a document or KB article for this?
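Before the move it's worth recording exactly which disks and aggregates live on the shelf being relocated. A hedged sketch of the pre-checks (ONTAP 9 CLI from memory; verify parameter names against your release's command reference):

```
::> storage shelf show
::> storage disk show -shelf 3 -fields aggregate,container-type,owner
::> storage aggregate show-status
```

Comparing the same output after the move makes it easy to confirm every disk came back into its aggregate under the new shelf ID.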
Hello Team, FAS3220, ONTAP 8.2P3 7-Mode, 2 controllers, both in UP status; hosts see their storage resources. We can log in to the 2nd controller through SSH and OnCommand System Manager. But we are unable to log in to the 1st controller: 1) through SSH, when we enter the correct login and password, the SSH session closes immediately; 2) through OCSM, we get a 500 error. We have tried the solutions from the https://community.netapp.com/t5/Active-IQ-Unified-Manager-Discussions/OnCommand-System-Manager-recieves-error-500/td-p/93621 thread; nothing helped. We also tried different versions of Java (7.25 and 6.x) and OCSM 3.0 and 3.1, with no success. However, we can log in to the 1st controller through the console port. We created a new user with all permissions and still cannot log in via SSH (session closes immediately) or OCSM (500 error). We need SSH and OCSM access to the 1st controller. Can you please help us locate the problem? Thank you!
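Since the console port still works, one avenue is to inspect and rebuild the SSH service from there. A hedged sketch for 7-Mode (commands from memory; confirm against the 8.2 7-Mode documentation before running, as `setup -f` regenerates host keys):

```
toaster> secureadmin status
toaster> secureadmin disable ssh
toaster> secureadmin setup -f ssh
toaster> secureadmin enable ssh2
```

If SSH sessions still close immediately after a clean re-setup, the problem may sit in /etc/rc or the root volume rather than the SSH service itself.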
Hi all, after an NVRAM module ECC error/failure on one controller of a 2552 in 7-Mode (HA), and its replacement with a controller from another system (supposedly cleanly shut down), the replacement controller now boots (the other node is in takeover) but won't go into "waiting for giveback" and complains about the ownership of a couple of disks that have been reserved by the HA partner:

Mar 26 17:11:06 [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system
Reservation conflict found on this node's disks!
Local System ID: 537074349
Mar 26 17:11:06 [localhost:cf.fmns.skipped.disk:notice]: While releasing the reservations in "Waiting For Giveback" state Failover Monitor Node State(fmns) module skipped the disk 0a.00.22 that is owned by xxxxxxxxx and reserved by yyyyyyyyy.
WAFL CPLEDGER is enabled. Checklist = 0x7ff841ff
Press Ctrl-C for Maintenance menu to release disks.
add host 127.0.10.1: gateway 127.0.20.1
Mar 26 17:11:09 [localhost:wafl.memory.status:info]: 2004MB of memory is currently available for the WAFL file system.
NOTE: You have chosen to boot the diagnostics kernel. Use the 'sldiag' command in order to run diagnostics on the system.
Mar 26 17:11:09 [localhost:dcs.framework.enabled:info]: The DCS framework is enabled on this node.
Mar 26 17:11:09 [localhost:cf.fmns.skipped.disk:notice]: While releasing the reservations in "Waiting For Giveback" state Failover Monitor Node State(fmns) module skipped the disk 0a.00.12 that is owned by xxxxxxxxx and reserved by yyyyyyyyy.
Mar 26 17:11:09 [localhost:snmp.link.up:info]: Interface 8 is up
Mar 26 17:11:09 [localhost:netif.linkUp:info]: Ethernet e0P: Link up.
Mar 26 17:11:09 [localhost:snmp.link.up:info]: Interface 1 is up
Mar 26 17:11:09 [localhost:netif.linkUp:info]: Ethernet e0a: Link up.
Mar 26 17:11:09 [localhost:snmp.link.up:info]: Interface 2 is up
Mar 26 17:11:09 [localhost:netif.linkUp:info]: Ethernet e0b: Link up.

The system has booted in maintenance mode allowing the following operations to be performed:

? acorn acpadmin aggr cna_flash disk disk_list disk_mung disk_qual disk_shelf diskcopy disktest dumpblock environment fcadmin fcstat fctest fru_led ha-config halt help ifconfig key_manager led_off led_on nv8 raid_config sasadmin sasstat scsi sesdiag sldiag storage stsb sysconfig systemshell ucadmin version vmservices vol vol_db vsa xortest

Type "help <command>" for more details.

In a High Availablity configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode.
FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED
NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up

Mar 26 17:11:14 [localhost:snmp.link.down:info]: Interface 7 is down.
Mar 26 17:11:14 [localhost:netif.linkDown:info]: Ethernet e0M: Link down, check cable.
Continue with boot?

After continuing the boot it enters maintenance mode, and checking the various statuses I get this:

*> aggr status
Aggr State Status Options
aggr0 online raid_dp, aggr root degraded 64-bit
*> aggr show
aggr: No such command "show".
The following commands are available; for more information type "aggr help <command>"
clear_rpbits options rename snaprestore_cancel
Mar 26 17:59:18 [localhost:fmmb.current.lock.disk:info]: Disk ?.? is a local HA mailbox disk.
Mar 26 17:59:18 [localhost:fmmb.current.lock.disk:info]: Disk 0a.00.2 is a local HA mailbox disk.
Mar 26 17:59:18 [localhost:fmmb.instStat.change:info]: missing lock disks, possibly stale mailbox instance on local side.
Mar 26 17:59:18 [localhost:raid.mirror.vote.versionZero:debug]: raid: mirror info empty
Mar 26 17:59:18 [localhost:coredump.host.spare.none:info]: No sparecore disk was found for host 0.

Halting and rebooting with diags I get this:

LOADER-A> boot_diags
Loading X86_64/freebsd/image2/kernel:0x100000/9578776 0xa22918/4044416 Entry at 0x8016e880
Loading X86_64/freebsd/image2/platform.ko:0xdfe000/786856 0xf9bea0/724152 0xebe1c0/45064 0x104cb58/49752 0xec91c8/110791 0xee428f/80654 0xef7da0/172160 0x1058db0/195312 0xf21e20/16 0xf21e30/2448 0x10888a0/7344 0xf22800/0 0xf22800/344 0x108a550/1032 0xf22958/1952 0x108a958/5856 0xf230f8/1648 0x108c038/4944 0xf23768/240 0x108d388/720 0xf23860/448 0xf5e860/14942 0xf9bda2/253 0xf622c0/136824 0xf83938/99434
Starting program at 0x8016e880
NetApp Data ONTAP 8.2.4P6 7-Mode
Root mount waiting for: usbus0
Root mount waiting for: usbus0
Root mount waiting for: usbus0
Root mount waiting for: usbus0
Copyright (C) 1992-2017 NetApp. All rights reserved.
md1.uzip: 39168 x 16384 blocks
md2.uzip: 16640 x 16384 blocks
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *
*                             *
*******************************
Error burning Mellanox sinai chipset firmware.
^C
qla_init_hw: CRBinit running ok: 8c633f
NIC FW version in flash: 5.4.9
Mar 26 17:53:56 [localhost:sasmon.disable.module:info]: SAS domain is not monitoring transient errors.
Mar 26 17:53:58 [localhost:cf.nm.nicTransitionUp:info]: HA interconnect: Link up on NIC 0.
qla_init_hw: CRBinit running ok: 8c633f
NIC FW version bundled: 5.4.56
qla_init_hw: CRBinit running ok: 8c633f
NIC FW version in flash: 5.4.9
Mar 26 17:54:03 [localhost:cf.rv.flush.handleExchange:info]: HA interconnect: Flushing is active.
qla_init_hw: CRBinit running ok: 8c633f
NIC FW version bundled: 5.4.56
Mar 26 17:54:05 [localhost:netif.linkDown:info]: Ethernet Wrench Port: Link down, check cable.
Mar 26 17:54:07 [localhost:snmp.link.down:info]: Interface 3 is down.
Mar 26 17:54:07 [localhost:netif.linkDown:info]: Ethernet e0c: Link down, check cable.
Mar 26 17:54:07 [localhost:snmp.link.down:info]: Interface 4 is down.
Mar 26 17:54:07 [localhost:netif.linkDown:info]: Ethernet e0d: Link down, check cable.
Mar 26 17:54:08 [localhost:snmp.link.down:info]: Interface 5 is down.
Mar 26 17:54:08 [localhost:netif.linkDown:info]: Ethernet e0e: Link down, check cable.
Mar 26 17:54:08 [localhost:snmp.link.down:info]: Interface 6 is down.
Mar 26 17:54:08 [localhost:netif.linkDown:info]: Ethernet e0f: Link down, check cable.
Mar 26 17:54:08 [localhost:diskown.isEnabled:info]: software ownership has been enabled for this system
Reservation conflict found on this node's disks!
Local System ID: xxxxxxxxx
Mar 26 17:54:08 [localhost:cf.fmns.skipped.disk:notice]: While releasing the reservations in "Waiting For Giveback" state Failover Monitor Node State(fmns) module skipped the disk 0a.00.0 that is owned by xxxxxxxxx and reserved by yyyyyyyyy.
WAFL CPLEDGER is enabled. Checklist = 0x7ff841ff
Press Ctrl-C for Maintenance menu to release disks.
add host 127.0.10.1: gateway 127.0.20.1
Mar 26 17:54:11 [localhost:wafl.memory.status:info]: 2004MB of memory is currently available for the WAFL file system.
NOTE: You have chosen to boot the diagnostics kernel. Use the 'sldiag' command in order to run diagnostics on the system.
Mar 26 17:54:11 [localhost:dcs.framework.enabled:info]: The DCS framework is enabled on this node.
Mar 26 17:54:12 [localhost:snmp.link.up:info]: Interface 8 is up
Mar 26 17:54:12 [localhost:netif.linkUp:info]: Ethernet e0P: Link up.
Mar 26 17:54:12 [localhost:snmp.link.up:info]: Interface 1 is up
Mar 26 17:54:12 [localhost:netif.linkUp:info]: Ethernet e0a: Link up.
Mar 26 17:54:12 [localhost:snmp.link.up:info]: Interface 2 is up
Mar 26 17:54:12 [localhost:netif.linkUp:info]: Ethernet e0b: Link up.

[same maintenance-mode banner and command list as on the first boot]

Mar 26 17:54:17 [localhost:snmp.link.down:info]: Interface 7 is down.
Mar 26 17:54:17 [localhost:netif.linkDown:info]: Ethernet e0M: Link down, check cable.
Continue with boot?

So this time it complains about a different reserved disk, 0a.00.0, and no longer about 0a.00.22 and 0a.00.12. It also seems there is no defined mailbox disk, which I suppose is not good. The partner is in takeover and working fine at the moment. Issuing cf giveback doesn't work, since this node never reaches "waiting for giveback" (still need to get there, hopefully). Any advice is appreciated - thanks!
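Not a fix, but when comparing boots it helps to tabulate exactly which disks the fmns module reported as still reserved by the partner. A minimal sketch that pulls them out of a console capture (the message format is taken from the log above; the sample lines are abbreviated copies of it):

```python
import re

# Match the cf.fmns.skipped.disk notice: "...skipped the disk X that is
# owned by A and reserved by B."
SKIP_RE = re.compile(
    r"skipped the disk (?P<disk>\S+) that is owned by (?P<owner>\S+) "
    r"and reserved by (?P<holder>\S+)\."
)

def reserved_disks(console_log: str):
    """Return (disk, owner, reservation holder) tuples found in the log."""
    return [m.group("disk", "owner", "holder") for m in SKIP_RE.finditer(console_log)]

log = """\
Mar 26 17:11:06 [localhost:cf.fmns.skipped.disk:notice]: While releasing the reservations in "Waiting For Giveback" state Failover Monitor Node State(fmns) module skipped the disk 0a.00.22 that is owned by xxxxxxxxx and reserved by yyyyyyyyy.
Mar 26 17:54:08 [localhost:cf.fmns.skipped.disk:notice]: While releasing the reservations in "Waiting For Giveback" state Failover Monitor Node State(fmns) module skipped the disk 0a.00.0 that is owned by xxxxxxxxx and reserved by yyyyyyyyy.
"""

print(reserved_disks(log))
```

Running it over a full capture of each boot makes it obvious whether the set of conflicting disks is shrinking, stable, or just shifting between attempts.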