ONTAP Discussions
I recently ran the storage aggregate show-scrub-status command on all of our clusters. On all but one cluster, the last scrub time was within the past week, with a single exception that was within two weeks. However, on one cluster (a 4-node all-AFF cluster), two aggregates showed "-" for Last Scrub Time and were in a suspended state. One was 37% complete and the other was 71% complete. Should I be concerned and run the scrub manually?
Hi,
For AFF systems, the default schedule is weekly at 1 a.m. on Sunday for a duration of 6 hours, so if a scrub runs beyond 6 hours, ONTAP suspends it. For non-AFF systems the window is 4 hours on weekdays, and any scrub that runs beyond that is likewise suspended. That much is understood.
However, should the next scheduled run pick the scrub up from where it was suspended, or start a new run? I don't know!
It could be that once a RAID scrub is in the 'suspended' state, it stays there and is never picked up again. That would be very risky.
Could you do one thing:
1) Resume the suspended scrub:
::> storage aggregate scrub -aggregate <aggr> -node <node> -action resume
Hopefully, that should restart the scrub from where it left off. By default, the scrub performance impact is set to low, so it should not have a noticeable effect on storage performance. Also, since it is Friday, there is no harm in resuming it.
2) Have a look later on and see if it completes.
Maybe once it has completed, it will pick up a new run on schedule (weekly for AFF).
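If there are several suspended aggregates, the two steps above could be scripted roughly as follows. This is only a sketch: the `ScrubStatus` record and the `resume_commands` helper are hypothetical names made up for illustration, and the aggregate and node names are placeholders. It does not parse real CLI output; it only builds the resume command quoted above for each aggregate you have already identified as suspended.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScrubStatus:
    """Hypothetical record for one row of scrub status, filled in by hand."""
    aggregate: str
    node: str
    suspended: bool
    percent_complete: int
    last_scrub_time: Optional[str]  # None stands for the "-" shown in the CLI


def resume_commands(statuses: List[ScrubStatus]) -> List[str]:
    """Build one resume command per aggregate whose scrub is suspended."""
    return [
        f"storage aggregate scrub -aggregate {s.aggregate} "
        f"-node {s.node} -action resume"
        for s in statuses
        if s.suspended
    ]


if __name__ == "__main__":
    # Placeholder names; the percentages mirror the two aggregates in the question.
    statuses = [
        ScrubStatus("aggr1", "node-01", True, 37, None),
        ScrubStatus("aggr2", "node-02", True, 71, None),
        ScrubStatus("aggr3", "node-03", False, 0, "recent"),
    ]
    for command in resume_commands(statuses):
        print(command)
```

You would still paste the printed commands into the cluster shell yourself and check back later, as in step 2.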
https://docs.netapp.com/ontap-9/index.jsp?topic=%2Fcom.netapp.doc.dot-cm-cmpr-960%2FTOC__storage__raid-options.html [raid.scrub.schedule]
Thanks!
Thanks @Ontapforrum. I have kicked off the jobs and will review the results on Monday. Hopefully they catch up!
Good stuff. Do let us know how it goes; yes, they should catch up. Thanks!
@Ontapforrum the resume command got one of the affected aggregates caught up. The other one caught up partly and then went back to suspended. I'll keep manually resuming that aggregate until it is fully caught up. All the others have a fairly recent date now. Thanks for the help!
Thanks for the update. Great stuff. Take care!