ONTAP Discussions
Hi there. My name is Chris. Long story short, I came on board at a company whose two network engineers left at about the same time (not on good terms, I believe). The company has a NetApp FAS3020 running Data ONTAP 7.3.5.1P1. Of course there is no service contract on it, and I'm pretty sure it has already reached EOL. The IT manager knows basically nothing about it other than the login credentials and the fact that all our VMware VMs run on it.
I've never used a NetApp SAN. All I have are the login credentials, and I can already tell there are a few things wrong with it. See the attached image below.
In addition, it looks like a complete RAID group has failed. Beyond that, we have a spare fourth shelf, complete with disks, that is sitting there untouched. I don't know why we are not using it, but I would like to install it on the rack and connect it to expand our storage space and provide more spare disks for the RAID groups. From what I can tell, there seems to be only one spare disk left.
I'd appreciate it if anyone could point me in the right direction on fixing the error message about not enough spare disks, and to a guide on how to install and integrate the fourth shelf.
Any help would be appreciated.
Thanks.
Here is the output of sysconfig -r
Aggregate aggr0 (failed, raid_dp, foreign, partial) (block checksums)
Plex /aggr0/plex0 (offline, failed, inactive)
RAID group /aggr0/plex0/rg1 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 272000/557056000
parity FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data 0d.41 0d 2 9 FC:B - FCAL 10000 272000/557056000 280104/573653840 (prefail)
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
data FAILED N/A 272000/557056000
Raid group is missing 11 disks.
Plex is missing 2 RAID groups.
Aggregate aggr2 (online, raid_dp) (block checksums)
Plex /aggr2/plex0 (online, normal, active)
RAID group /aggr2/plex0/rg0 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0b.16 0b 1 0 FC:B - ATA 7200 847555/1735794176 847827/1736350304
parity 0b.17 0b 1 1 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0b.18 0b 1 2 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0b.19 0b 1 3 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0c.20 0c 1 4 FC:A - ATA 7200 847555/1735794176 847827/1736350304
data 0c.21 0c 1 5 FC:A - ATA 7200 847555/1735794176 847827/1736350304
data 0c.22 0c 1 6 FC:A - ATA 7200 847555/1735794176 847827/1736350304
data 0c.23 0c 1 7 FC:A - ATA 7200 847555/1735794176 847827/1736350304
data 0b.24 0b 1 8 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0b.25 0b 1 9 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0c.26 0c 1 10 FC:A - ATA 7200 847555/1735794176 847827/1736350304
data 0b.27 0b 1 11 FC:B - ATA 7200 847555/1735794176 847827/1736350304
data 0c.28 0c 1 12 FC:A - ATA 7200 847555/1735794176 847827/1736350304
Aggregate aggr1 (online, raid_dp) (block checksums)
Plex /aggr1/plex0 (online, normal, active)
RAID group /aggr1/plex0/rg0 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0d.16 0d 1 0 FC:B - FCAL 10000 272000/557056000 280104/573653840
parity 0d.17 0d 1 1 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0a.18 0a 1 2 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0a.22 0a 1 6 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0d.39 0d 2 7 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0a.19 0a 1 3 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0a.20 0a 1 4 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0a.23 0a 1 7 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0d.25 0d 1 9 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.21 0d 1 5 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0a.26 0a 1 10 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0a.32 0a 2 0 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0a.33 0a 2 1 FC:A - FCAL 10000 272000/557056000 280104/573653840
data 0d.34 0d 2 2 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.35 0d 2 3 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.36 0d 2 4 FC:B - FCAL 10000 272000/557056000 280104/573653840
RAID group /aggr1/plex0/rg1 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0d.37 0d 2 5 FC:B - FCAL 10000 272000/557056000 280104/573653840
parity 0d.27 0d 1 11 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.38 0d 2 6 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.28 0d 1 12 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0d.29 0d 1 13 FC:B - FCAL 10000 272000/557056000 280104/573653840
data 0a.44 0a 2 12 FC:A - FCAL 10000 272000/557056000 274845/562884296
data 0d.42 0d 2 10 FC:B - FCAL 10000 272000/557056000 274845/562884296
data 0d.45 0d 2 13 FC:B - FCAL 10000 272000/557056000 274845/562884296
data 0d.43 0d 2 11 FC:B - FCAL 10000 272000/557056000 274845/562884296
data 0a.40 0a 2 8 FC:A - FCAL 10000 272000/557056000 280104/573653840
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 0b.29 0b 1 13 FC:B - ATA 7200 847555/1735794176 847827/1736350304
Hi,
I agree with the last comment: you can technically destroy this foreign aggr and reuse its one remaining disk (0d.41) as a spare. But since that disk is already marked "prefail", it might not stay a spare candidate for long (it will likely fail on its own).
Note that the one actual spare you currently have (0b.29) is an ATA disk, so it can only replace a failed disk in aggr2; it cannot replace a failed disk in aggr1, which is built from FCAL disks. You also didn't mention what shelf and I/O module type the other shelf is. It could be a third type, or incompatible.
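To make the spare-type point concrete, here is a rough Python sketch that reads `sysconfig -r`-style text and counts, for each aggregate, how many spares share a disk type with it. The function name `spare_coverage`, the regular expression, and the trimmed `SAMPLE` text are illustrative assumptions, not a NetApp tool, and the check is deliberately simplified: it compares only the Type column, while real spare selection also depends on things like disk size, speed, and checksum type.

```python
import re
from collections import Counter

# One present-disk row as pasted in this thread:
# role device HA shelf bay chan pool type rpm used(MB/blks) phys(MB/blks)
# Rows for missing disks ("data FAILED N/A ...") do not match and are skipped.
ROW = re.compile(
    r"^(dparity|parity|data|spare)\s+(\S+)\s+\S+\s+\d+\s+\d+\s+\S+\s+\S+\s+"
    r"(\S+)\s+(\d+)\s+(\d+)/\d+\s+(\d+)/\d+"
)

def spare_coverage(text):
    """Return {aggregate: {disk_type: matching_spare_count}}.

    Simplification (assumption): a spare "covers" an aggregate if the
    Type column matches; size and RPM are ignored here.
    """
    aggr_types = {}     # aggregate name -> set of disk types seen in it
    spares = Counter()  # disk type -> number of spare disks
    current = None      # aggregate whose rows we are currently reading
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Aggregate "):
            current = line.split()[1]
            aggr_types[current] = set()
            continue
        if line.startswith("Spare disks"):
            current = None
            continue
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, or FAILED row
        role, disk_type = match.group(1), match.group(3)
        if role == "spare":
            spares[disk_type] += 1
        elif current is not None:
            aggr_types[current].add(disk_type)
    return {
        aggr: {t: spares[t] for t in sorted(types)}
        for aggr, types in aggr_types.items()
    }

# Trimmed sample built from rows in the output above.
SAMPLE = """\
Aggregate aggr2 (online, raid_dp) (block checksums)
dparity 0b.16 0b 1 0 FC:B - ATA 7200 847555/1735794176 847827/1736350304
Aggregate aggr1 (online, raid_dp) (block checksums)
dparity 0d.16 0d 1 0 FC:B - FCAL 10000 272000/557056000 280104/573653840
Spare disks
spare 0b.29 0b 1 13 FC:B - ATA 7200 847555/1735794176 847827/1736350304
"""

if __name__ == "__main__":
    for aggr, by_type in spare_coverage(SAMPLE).items():
        print(aggr, by_type)
```

On the trimmed sample, aggr2 ends up with one matching ATA spare and aggr1 with zero FCAL spares, which is the situation described above.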
Anyhow, this hardware is 10 to 15 years old. There have been three generations of disk shelves and heads since, and a fourth is coming. It's about time for this system to be decommissioned. I think touching it and trying to expand it adds more risk to its stability, and getting parts for it will not be easy.
Gidi
I see. Yes, it looks like it has prefailed, so I wouldn't even bother using it as a spare.
Oh, you're right about that spare. I did not see that before. But now that I think of it (and correct me if I'm wrong), isn't the whole purpose of putting aggregates on RAID to allow for disks going bad? I mean, that is what the parity is for. So even if I don't have a spare, I shouldn't sweat it, because if a disk does go bad we can just order a new one and swap it in to rebuild the RAID.
Now beyond that, even though RAID is designed for fault tolerance, I would feel much safer knowing that there are at least two appropriately sized spares for each disk size (847827 MB and 280104 MB). That is why I want to add that shelf. The other shelf that we have is a DS14MK2 (pic below). It looks very similar to the first three shelves. We installed it on the rack and powered it on not too long ago. It's simply not connected via the fiber optic cable.
I know this system is very old, but they do not want to spend any money on upgrading to a new system. Therefore I just want to make sure this system lasts as long as it can.
What do you think the best course of action would be?
Or, in lieu of spares, I could just join the new disks to the aggregate, thus increasing the storage capacity.
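As a rough illustration of the parity question above: with raid_dp, each RAID group carries two parity disks (dparity and parity), so a group can keep serving data with up to two disks missing, and a third concurrent loss takes it down. The Python sketch below just encodes that arithmetic. The function names and the state labels are made up for this example and are not ONTAP's own status strings.

```python
# Assumption: raid_dp tolerates two concurrently missing disks per
# RAID group; this constant encodes only that rule of thumb.
RAID_DP_TOLERANCE = 2

def raid_dp_state(missing_disks):
    """Classify a raid_dp RAID group by how many of its disks are missing."""
    if missing_disks < 0:
        raise ValueError("missing_disks cannot be negative")
    if missing_disks == 0:
        return "normal"
    if missing_disks <= RAID_DP_TOLERANCE:
        return "degraded"  # still readable, running on parity
    return "failed"        # more losses than the parity can cover

def failures_left(missing_disks):
    """How many more disks the group could lose and stay readable."""
    return max(RAID_DP_TOLERANCE - missing_disks, 0)

if __name__ == "__main__":
    # aggr0/plex0/rg1 in the output above reports 11 missing disks.
    print(raid_dp_state(11), failures_left(11))
    # A healthy group that loses one disk with no matching spare on hand:
    print(raid_dp_state(1), failures_left(1))
```

The second case is the one that matters for the "just order a new disk" plan: without a matching spare, the group stays on reduced protection for as long as it takes the replacement to arrive and the rebuild to finish.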