2012-09-17 08:16 AM
OK, we upgraded from 8.0.2P3 to 8.1.1 on a Friday, and the following day, Saturday, I got alerted that one of our filers was at 99% CPU usage, which is unusual. After a NetApp ticket and hours of digging, I found out that rlw_upgrading was causing this and that a full scrub of the aggregates is required.
The CPU usage went down on Monday morning, I guess because the filer suspended the scrub once it determined that it was busy. I went into priv set diag, ran aggr scrub status -v, and confirmed that the scrub was indeed suspended at about 5 percent complete.
The question now is: should I run the scrub manually, or just let the filer decide when is a good time to run it? What did you guys do?
2012-09-17 09:56 AM
That's interesting. I've not personally run into that, but I suspect I soon will since I will be upgrading several controllers from 8.0.2P3 to 8.1.1.
Have you checked the option raid.scrub.perf_impact?
Mine defaulted to low, but if yours is set higher, you may want to adjust it (either temporarily or permanently).
Also, you can control when the RAID scrub runs and for how long. What you may want to do is make sure the scheduled window falls outside production hours and that the scrub runs aggressively until the upgrade is complete. See the Physical Storage Management Guide, page 111 onward, for details.
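For what it's worth, here is roughly what I would type on the console. This is a sketch from memory of the 7-Mode syntax, not something I have run against 8.1.1, so check it against the guide for your release. The aggregate name aggr1 and the Saturday 10 PM, 6-hour window are made-up examples, and the lines starting with # are my notes, not something to type.

```
# Show the current setting (low, medium, or high)
options raid.scrub.perf_impact

# Keep the scrub in the background during production hours
options raid.scrub.perf_impact low

# Example window: scrub for 6 hours starting Saturday at 22:00
# (format is duration@weekday@start_hour; the values are only an example)
options raid.scrub.schedule 6h@sat@22

# Check progress, then resume a suspended scrub by hand if you want to
# (aggr1 is a hypothetical aggregate name)
aggr scrub status -v
aggr scrub resume aggr1
```

If you would rather let the filer pick its own time, just leave the schedule alone and keep an eye on aggr scrub status -v until it reports the scrub as complete.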
Hope that helps!