ONTAP Discussions

Disk leveling and RAID theory



   FYI, this took place 7 years ago. I'm writing a summary of a client's experience and I'm looking for information on disk leveling in ONTAP 7 & 7-mode, specifically which back-end process levels data out across the disks after more shelves are added. They added 1 shelf with 24 disks to an existing 2 shelves with 48 disks, and the system was never leveled out, which caused a performance imbalance.

   Also I'm looking for a document or white paper on basic RAID theory about how an array (any array) works. I need to be able to cite it for another paper I'm writing. I attended a talk on this subject at Insight around 10 years ago, back when it was still at the MGM Grand. I believe it was the year the customer appreciation night was at Cirque du Soleil.




Re: Disk leveling and RAID theory


 Reallocation must be done to even out the RAID stripe layout across the aggregate; otherwise, new writes will go disproportionately to the new disks.
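
To put a rough number on that imbalance, here is a back-of-the-envelope sketch in Python. It assumes writes are spread in proportion to each disk's free space, which is a simplification and not ONTAP's actual write allocator. The 80% fill level on the old disks and the function name new_disk_write_share are hypothetical; only the 48 + 24 disk counts come from the original question.

```python
def new_disk_write_share(old_disks, new_disks, old_fill):
    """Fraction of new writes landing on the new disks.

    Assumption (not ONTAP's real allocator): writes are distributed in
    proportion to the free space on each disk, and all disks are the same size.
    """
    old_free = old_disks * (1.0 - old_fill)  # free space left on the existing disks
    new_free = new_disks * 1.0               # newly added disks start empty
    return new_free / (old_free + new_free)


# 48 existing disks at a hypothetical 80% full, plus one new shelf of 24 disks.
share = new_disk_write_share(old_disks=48, new_disks=24, old_fill=0.8)
spindles = 24 / (48 + 24)
print(f"new disks are {spindles:.0%} of spindles but take {share:.0%} of new writes")
```

Under those assumptions the new shelf is a third of the spindles but absorbs roughly 71% of new writes, which is the kind of hot spot reallocation is meant to remove.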


What are the best practices for adding disks to an existing aggregate?

Re: Disk leveling and RAID theory


So generally, WAFL will level out over time as overwrites happen. WAFL always writes to free RAID stripes, and as old blocks are freed, the stripes they occupied become free and eventually get written again. As ONTAP fills those freed stripes, the file system levels out across the aggregate, but this can take years.
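
The "levels out over time" behaviour can be illustrated with a toy model. This is only a sketch under stated assumptions: each round, a fixed fraction of the data on every disk is overwritten (freeing its old blocks), and the rewritten data is placed in proportion to free space. The function level_out, the 5% overwrite rate, and the 80% starting fill are all hypothetical, and the model ignores RAID groups, dedupe, and partial stripes.

```python
def level_out(used, overwrite_rate, rounds):
    """Toy model of passive leveling through overwrites.

    used           -- per-disk fill fraction (0.0 to 1.0), equal-size disks
    overwrite_rate -- fraction of each disk's data overwritten per round
    rounds         -- number of rounds to simulate
    Returns the per-disk fill fractions after the last round.
    """
    used = list(used)
    for _ in range(rounds):
        # Overwrites free the old copies of the blocks on every disk.
        freed = [u * overwrite_rate for u in used]
        used = [u - f for u, f in zip(used, freed)]
        # The new copies are written to free space, proportionally (assumption).
        free = [1.0 - u for u in used]
        total_free = sum(free)
        to_write = sum(freed)
        used = [u + to_write * fr / total_free for u, fr in zip(used, free)]
    return used


start = [0.8] * 48 + [0.0] * 24          # 48 old disks, 24 new empty disks
for rounds in (0, 10, 50):
    disks = level_out(start, overwrite_rate=0.05, rounds=rounds)
    print(rounds, "rounds: old disk", round(disks[0], 3), "new disk", round(disks[-1], 3))
```

In this model the gap between old and new disks shrinks by a constant factor each round, so how long leveling takes depends entirely on how much of the data is actually overwritten. Cold data that is never rewritten stays where it is, which is why an explicit reallocate is recommended instead of waiting.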


Two things get in the way: 1) deduplication punches holes in the RAID stripes, which causes further problems, and 2) other file system processes can cause issues if they produce a lot of partial stripes.


It's always a good idea to reallocate after adding disks, for performance reasons. The article Mo posted above really sums it up. Over time, unless you use no storage efficiency features at all, you may need to reallocate again too.


One thing that makes it more confusing is that there are two WAFL file system layers, the aggregate and the FlexVol, and you can have fragmentation at both levels. Generally, FlexVol fragmentation shows up as read latency: the volume read size can be around 64k while statit shows chains of only 1-2 (if nothing else is running).


Aggregate fragmentation shows up in statit as more partial stripes or higher cpread/write ratios.


If you have SSDs, don't reallocate those.

