Solved: Re: Netapp rebuild fas 2040 single controller

thaz · ‎2016-03-22

Hi all, I am totally new to NetApp. We have a NetApp FAS2040 with 12 hard disks. For some reason, no spare disks were configured when it was set up, so we have a NetApp with 12 disks and no spare. Can anyone suggest a rebuild option so that I can configure at least one spare disk?

sysconfig
NetApp Release 8.1 7-Mode: Thu Mar 29 13:54:17 PDT 2012
System ID: 0142269140 (STORAGE)
System Serial Number: xxxxxxxxxxx (STORAGE)
System Rev: C1
System Storage Configuration: Single Path
System ACP Connectivity: NA
slot 0: System Board
Processors: 2
Processor type: Intel(R) Xeon(R) CPU @ 1.66 GHz
Memory Size: 4096 MB

DISK         OWNER                 POOL    SERIAL NUMBER  HOME
------------ --------------------- ------- -------------- ---------------------
0c.00.2      STORAGE (142269140)   Pool0   Z1N1GYCX       STORAGE (142269140)
0c.00.3      STORAGE (142269140)   Pool0   Z1N1H0MV       STORAGE (142269140)
0c.00.1      STORAGE (142269140)   FAILED  Z1N1G5TL       STORAGE (142269140)
0c.00.4      STORAGE (142269140)   Pool0   Z1N1GZDY       STORAGE (142269140)
0c.00.6      STORAGE (142269140)   Pool0   Z1N1GZVV       STORAGE (142269140)
0c.00.8      STORAGE (142269140)   Pool0   Z1N1GX1Z       STORAGE (142269140)
0c.00.7      STORAGE (142269140)   Pool0   Z1N1GZZ6       STORAGE (142269140)
0c.00.5      STORAGE (142269140)   Pool0   Z1N1GWBG       STORAGE (142269140)
0c.00.11     STORAGE (142269140)   Pool0   Z1N1H0CG       STORAGE (142269140)
0c.00.10     STORAGE (142269140)   Pool0   Z1N1GXDR       STORAGE (142269140)
0c.00.0      STORAGE (142269140)   Pool0   MS1VZ5ZF       STORAGE (142269140)

aggr status
           Aggr State           Status            Options
          aggr1 online          raid_dp, aggr
                                64-bit
          aggr0 online          raid_dp, aggr     root
                                reconstruct
                                degraded
                                64-bit

We had two hard disk failures, and I have just added one replacement disk, so it has started to rebuild. I was wondering whether we can just rebuild the whole NetApp. Please advise.

ghislanzoni · ‎2016-03-22

Hi,

I'm sorry, but it is not possible to remove a disk from an aggregate to use it as a spare.

Can you post the output of the command "aggr status -r"?

Thanks

Roberto

thaz · ‎2016-03-22

Please find below

Aggregate aggr1 (online, raid_dp) (block checksums)
Plex /aggr1/plex0 (online, normal, active)
RAID group /aggr1/plex0/rg0 (normal, block checksums)

RAID Disk Device   HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- -------- -- ----- --- ---- ---- ---- ---- ----------------- -----------------
dparity   0c.00.3  0c 0     3   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
parity    0c.00.4  0c 0     4   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.5  0c 0     5   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.6  0c 0     6   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.7  0c 0     7   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.8  0c 0     8   SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.11 0c 0     11  SA:B -    SATA 7200 847555/1735794176 847884/1736466816
data      0c.00.10 0c 0     10  SA:B -    SATA 7200 847555/1735794176 847884/1736466816

Aggregate aggr0 (online, raid_dp, reconstruct, degraded) (block checksums)
Plex /aggr0/plex0 (online, normal, active)
RAID group /aggr0/plex0/rg0 (reconstruction 19% completed, block checksums)

RAID Disk Device   HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- -------- -- ----- --- ---- ---- ---- ---- ----------------- -----------------
dparity   FAILED   N/A                              847555/ -
parity    0c.00.0  0c 0     0   SA:B -    SATA 7200 847555/1735794176 847884/1736466816 (reconstruction 19% completed)
data      0c.00.2  0c 0     2   SA:B -    SATA 7200 847555/1735794176 847884/1736466816

Spare disks (empty)

Broken disks

RAID Disk Device   HA SHELF BAY CHAN Pool Type RPM  Used (MB/blks)    Phys (MB/blks)
--------- -------- -- ----- --- ---- ---- ---- ---- ----------------- -----------------
failed    0c.00.1  0c 0     1   SA:B -    SATA 7200 847555/1735794176 847884/1736466816

ghislanzoni · ‎2016-03-22

I see that you have two aggregates:

aggr0 - 3 disks (1 failed, 1 reconstructing, 1 data)

aggr1 - 8 disks

If aggr1 is free, you can destroy it and add N-1 of its disks to aggr0, keeping one as a spare (in 7-Mode a single aggregate is enough).

I don't see disk 0c.00.9. Can you post the output of the command "disk show -v"?
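
The gap in the disk list can also be confirmed mechanically. The following is only a minimal Python sketch, assuming the shelf has 12 bays numbered 0 to 11 and that the last field of each device name is the bay number; `missing_bays` is a hypothetical helper name, not a NetApp tool.

```python
# Device names copied from the disk listing posted earlier in this thread.
visible = ["0c.00.2", "0c.00.3", "0c.00.1", "0c.00.4", "0c.00.6", "0c.00.8",
           "0c.00.7", "0c.00.5", "0c.00.11", "0c.00.10", "0c.00.0"]


def missing_bays(devices, bay_count=12):
    """Return the bay numbers in 0..bay_count-1 that no listed device occupies.

    Assumption: the text after the last dot in a device name is the bay number.
    """
    seen = {int(name.rsplit(".", 1)[1]) for name in devices}
    return sorted(set(range(bay_count)) - seen)


print(missing_bays(visible))  # prints [9]
```

Eleven devices are listed and bay 9 is the only one absent, which matches the missing 0c.00.9.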

Thanks
Roberto

thaz · ‎2016-03-22

Thanks for your kind reply. I think aggr1 holds our file server data; I am not sure why we have aggr0. We connected the NetApp to a Windows 2008 server through iSCSI, and the NetApp is acting as a file server. Please find the screenshot attached. Also, I can only see 11 hard disks in the list. I am not sure what the 12th hard disk is assigned to.

DISK         OWNER                 POOL    SERIAL NUMBER  HOME
------------ --------------------- ------- -------------- ---------------------
0c.00.2      STORAGE (142269140)   Pool0   Z1N1GYCX       STORAGE (142269140)
0c.00.3      STORAGE (142269140)   Pool0   Z1N1H0MV       STORAGE (142269140)
0c.00.1      STORAGE (142269140)   FAILED  Z1N1G5TL       STORAGE (142269140)
0c.00.4      STORAGE (142269140)   Pool0   Z1N1GZDY       STORAGE (142269140)
0c.00.6      STORAGE (142269140)   Pool0   Z1N1GZVV       STORAGE (142269140)
0c.00.8      STORAGE (142269140)   Pool0   Z1N1GX1Z       STORAGE (142269140)
0c.00.7      STORAGE (142269140)   Pool0   Z1N1GZZ6       STORAGE (142269140)
0c.00.5      STORAGE (142269140)   Pool0   Z1N1GWBG       STORAGE (142269140)
0c.00.11     STORAGE (142269140)   Pool0   Z1N1H0CG       STORAGE (142269140)
0c.00.10     STORAGE (142269140)   Pool0   Z1N1GXDR       STORAGE (142269140)
0c.00.0      STORAGE (142269140)   Pool0   MS1VZ5ZF       STORAGE (142269140)

ghislanzoni · ‎2016-03-22

It's a strange situation.

For now, I suggest you wait until the disk reconstruction has finished.

After that, you can try to reseat the missing disk (remove it and reinsert it).

There is also the possibility (not so simple, and with full downtime) of moving the root volume from aggr0 into aggr1, so that you end up with a single aggregate and a spare disk. See these posts:

https://kb.netapp.com/index?page=content&id=1010097&actp=LIST_POPULAR

http://community.netapp.com/t5/FAS-and-V-Series-Storage-Systems-Discussions/Moving-Root-Volume/td-p/31741

Be careful: this is not a routine operation, and a mistake can destroy the entire configuration.

Regards

Roberto

thaz · ‎2016-03-22

Dear ghislanzoni, I really appreciate your help.

Can you please tell me how to find out what data is in aggr1? I was wondering whether deleting aggr1 would free up its hard disks.

Also, if I need to rebuild the whole system, what would the steps be? I have taken a complete backup of the files we have on the device.

Thanks

ghislanzoni · ‎2016-03-22

The command "aggr status -v" lists the volumes contained in each aggregate.

Yes, if you destroy aggr1 you can add its disks to aggr0 and keep one of them as a spare.

To completely reinstall the storage system, you can follow this guide:

https://kb.netapp.com/support/index?page=content&id=1011550&pmv=print&impressions=false

Basically, you need to reboot the filer, press Ctrl+C to reach the special boot menu, and choose option 4a for the full reinstallation (this option erases all data and configuration).

Regards

Roberto

thaz · ‎2016-03-22

Please find below.

aggr status -v
           Aggr State           Status            Options
          aggr1 online          raid_dp, aggr     nosnap=off, raidtype=raid_dp,
                                64-bit            raidsize=14,
                                                  ignore_inconsistent=off,
                                                  snapmirrored=off,
                                                  resyncsnaptime=60,
                                                  fs_size_fixed=off,
                                                  snapshot_autodelete=on,
                                                  lost_write_protect=on,
                                                  ha_policy=cfo,
                                                  hybrid_enabled=off,
                                                  percent_snapshot_space=0%,
                                                  free_space_realloc=off

                Volumes: Share

                Plex /aggr1/plex0: online, normal, active
                    RAID group /aggr1/plex0/rg0: normal, block checksums

          aggr0 online          raid_dp, aggr     root, diskroot, nosnap=off,
                                reconstruct       raidtype=raid_dp, raidsize=14,
                                degraded          ignore_inconsistent=off,
                                64-bit            snapmirrored=off,
                                                  resyncsnaptime=60,
                                                  fs_size_fixed=off,
                                                  snapshot_autodelete=on,
                                                  lost_write_protect=on,
                                                  ha_policy=cfo,
                                                  hybrid_enabled=off,
                                                  percent_snapshot_space=0%,
                                                  free_space_realloc=off

ghislanzoni · ‎2016-03-22

On aggr1 you have only the volume Share, but I don't see the end of the output for aggr0. Was it truncated?

thaz · ‎2016-03-22

sorry missed that bit,

Volumes: vol0

Plex /aggr0/plex0: online, normal, active
RAID group /aggr0/plex0/rg0: reconstruction 39% completed, block checksums

ghislanzoni · ‎2016-03-22

OK, aggr0 holds only vol0 (the root volume, where the configuration resides).

So, if you have backed up your share data, feel free to reinitialize the whole storage system.
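
The aggregate-to-volume mapping read off by hand above can also be pulled out of the "aggr status -v" text with a few lines of Python. This is only a sketch: it assumes that an aggregate's first line is its name followed by one of the states online, offline, or restricted, and that several volumes would be comma-separated on the "Volumes:" line. `volumes_by_aggregate` is a hypothetical helper, not part of Data ONTAP.

```python
import re

# Abbreviated from the "aggr status -v" output posted in this thread.
SAMPLE = """\
Aggr State Status Options
aggr1 online raid_dp, aggr nosnap=off, raidtype=raid_dp,
64-bit raidsize=14,
Volumes: Share
Plex /aggr1/plex0: online, normal, active
RAID group /aggr1/plex0/rg0: normal, block checksums
aggr0 online raid_dp, aggr root, diskroot, nosnap=off,
reconstruct raidtype=raid_dp, raidsize=14,
Volumes: vol0
Plex /aggr0/plex0: online, normal, active
"""

# Assumption: an aggregate header is "<name> <state> ..." at the start of a line.
AGGR_LINE = re.compile(r"\s*(\S+)\s+(?:online|offline|restricted)\b")
VOLUMES_LINE = re.compile(r"\s*Volumes:\s*(.*)")


def volumes_by_aggregate(output):
    """Map each aggregate name to the volume names listed beneath it."""
    result = {}
    current = None
    for line in output.splitlines():
        header = AGGR_LINE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        volumes = VOLUMES_LINE.match(line)
        if volumes and current is not None:
            names = [v.strip() for v in volumes.group(1).split(",")]
            result[current].extend(n for n in names if n)
    return result


print(volumes_by_aggregate(SAMPLE))  # prints {'aggr1': ['Share'], 'aggr0': ['vol0']}
```

The result agrees with the reading above: Share lives on aggr1, and aggr0 carries only the root volume vol0.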

Regards

Roberto

aborzenkov · ‎2016-03-22

It would be much easier to simply move vol0 to aggr1 and destroy the unused aggr0 afterwards. This can even be done non-disruptively if the controller is in an HA pair, or with minimal downtime otherwise.

thaz · ‎2016-03-22

Can you please advise on how to do this?

aborzenkov · ‎2016-03-22

https://kb.netapp.com/support/index?page=content&id=1010097
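
As a rough orientation only, the sequence usually used in 7-Mode to move a root volume can be written down as an ordered checklist. The Python sketch below just builds that checklist as strings; it does not talk to the filer. It is not a transcript of the KB article, so verify every command against the KB and your Data ONTAP release before running anything. The names `root_move_plan` and `newroot` and the `<size>` placeholder are hypothetical; the required root volume size depends on the platform and is deliberately left unfilled.

```python
def root_move_plan(new_root, target_aggr, size, old_root="vol0", old_aggr="aggr0"):
    """Return 7-Mode console commands, in order, as plain strings.

    Assumption: this mirrors the commonly described procedure; it is a
    checklist generator only and executes nothing.
    """
    return [
        f"vol create {new_root} {target_aggr} {size}",  # new flexible volume on the aggregate being kept
        "ndmpd on",                                      # ndmpcopy needs the NDMP service running
        f"ndmpcopy /vol/{old_root} /vol/{new_root}",     # copy the root volume contents
        f"vol options {new_root} root",                  # becomes the root volume at the next boot
        "reboot",                                        # the downtime on a single controller
        f"vol offline {old_root}",                       # old root is no longer in use after the reboot
        f"vol destroy {old_root}",
        f"aggr offline {old_aggr}",                      # aggregate must be empty and offline first
        f"aggr destroy {old_aggr}",                      # its disks return to the spare pool
        "aggr status -s",                                # list spares to confirm the result
    ]


for command in root_move_plan("newroot", "aggr1", "<size>"):
    print(command)
```

With the layout shown in this thread, destroying aggr0 at the end would release its two healthy disks (0c.00.0 and 0c.00.2) as spares, while aggr1 and the Share volume stay untouched.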
