I could have taken this up with NetApp support directly, but I thought I would post this question on the community first. Also, we bought the shelf from a reseller for testing.
We plugged in the shelf and booted it up, and found that there were existing aggregates on the shelf, named aggr0 and aggr0(1), so we deleted them. We then assigned all the drives to controller 1 so that we could create a new aggregate (the commands we ran are sketched below). Once the drives were assigned, we noticed that a few drives started failing, and the zeroing of a few spares did not start as it should have.
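For reference, the cleanup and assignment were done roughly like this on the 7-Mode CLI (the owner name "controller1" is a placeholder for our actual node name, and the exact aggregate names are from memory):

aggr offline aggr0
aggr destroy aggr0
(and the same offline/destroy for the second aggregate)
disk assign all -o controller1
disk zero spares

As far as I know, "disk zero spares" kicks off background zeroing of all owned spares that are not yet zeroed.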
When we manually started zeroing the spares, they reach about 30% and then the drive moves into the failed state. We let the zeroing pass run to completion to see how many drives ended up in the broken disks list. Then we "unfailed" those drives and started zeroing the remaining spares again (commands below).
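The unfail and re-zero cycle was essentially the following ("disk unfail" needs advanced privilege, and the disk ID shown is only an example, not one of our actual drives):

priv set advanced
disk unfail -s 0a.22.5
priv set admin
disk zero spares
aggr status -s
aggr status -f

We used "aggr status -s" to watch the zeroing progress on the spares and "aggr status -f" to see which drives had dropped into the failed list.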
My understanding is that a genuinely faulty drive should not come back online when we try to unfail it (I am not 100% sure this is correct). Below is the output showing how things look at my end.
========================================== Output from "environment status shelf": ==========================================
Channel: 0a
 Shelf: 22
  SES device path: local access: 0b.22.99
  Module type: IOM3; monitoring is active
  Shelf status: normal condition
  SES Configuration, shelf 22:
    logical identifier=0x50050cc10200646f
    vendor identification=NETAPP
    product identification=DS4243
    product revision level=0172
  Vendor-specific information:
    Product Serial Number: xxxxxxxxxxxxxxxx
  Status reads attempted: 118258; failed: 0
  Control writes attempted: 553; failed: 0
  Shelf bays with disk devices installed:
    23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
  with error: none
I would be thankful if you could share some thoughts or suggestions regarding this issue. I suspect the drives are not actually faulty, and that something happening in the background is causing this behavior.