2009-10-04 01:12 PM
I have a question for Data ONTAP experts.
I can create a RAID-4 configuration on 2 disks (aggr create -t raid4 -r 2).
What do you think: is it RAID-1 in fact?
I know the difference between RAID-1 (mirror level) and RAID-4 (parity level), but I don't know how RAID-4 works on 2 disks.
2009-10-05 06:28 AM
I wouldn't consider myself an ONTAP expert, but I think I can still answer your question.
RAID 4 on 2 disks is still RAID 4: the information is stored on one disk (the data disk) and the parity is stored on the other (the parity disk). In RAID 1, the information is mirrored from one disk to the other (as you said in your PS), so even with two disks a RAID 4 array is different, because it provides fault tolerance through parity rather than 1-to-1 information duplication. In the event of a data disk failure, the array switches to degraded mode and begins reading and writing based on the PARITY information on the parity disk (i.e. calculating the missing information), rather than just switching pointers the way a RAID 1/mirror set does on failure.
I see your argument for why it might seem like RAID 1 in this scenario, but the way the information is laid down on the disks is still different, so it is still considered RAID 4 with 2 disks. Furthermore, in a RAID 1 set you are committed to a 1-to-1 disk ratio, but if you wanted to expand the RAID 4 array, all you would need to do is add disks to the aggregate.
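The degraded-mode reconstruction described above can be sketched in a few lines of Python (an illustration of the general RAID-4 idea, not ONTAP code): parity is the byte-wise XOR of the data blocks in a stripe, and a failed disk's block is the XOR of the surviving blocks plus parity.

```python
from functools import reduce

def xor_parity(blocks):
    """RAID-4 parity: byte-wise XOR across the data blocks of one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild the failed disk's block: XOR the survivors with parity."""
    return xor_parity(surviving_blocks + [parity_block])

stripe = [b"disk0....", b"disk1....", b"disk2...."]  # 3 data disks
parity = xor_parity(stripe)
# Simulate losing disk 1: its contents come back from the rest + parity.
assert reconstruct([stripe[0], stripe[2]], parity) == stripe[1]
```

Note that with only one data disk in the group, the XOR collapses to the data itself, which is the 2-disk case in the question.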
I hope this answers your question, and if I have left anything out, I would appreciate it if anyone with a better explanation could step in and set me straight.
Technical Systems Engineer
2009-10-05 08:06 AM
Sounds like a great explanation Ethan.
Just wanted to mention that there are trade-offs if you are planning on a 2-disk raid group instead of going with the default raid group size. It may fit your need just fine, but I thought it better to point it out, to be safe.
Most importantly, make sure to understand the -r switch and the difference between raidsize (the maximum number of disks in the raid group) and ndisks (the number of drives to add to the aggregate).
This link may be useful:
2009-10-05 11:52 AM
Thanks for the reply, Ethan.
But what about performance?
Do you think it is similar in both configurations?
In my opinion RAID-1 is faster than RAID-4 on 2 disks, both for reads (it can read from 2 disks independently, vs. reads from the data disk only) and for writes (no overhead for calculating parity).
I'm interested in this because someone told me that RAID-4 on 2 disks is the functional equivalent of RAID-1.
2009-10-05 12:05 PM
That I'm not 100% sure about, but from my understanding the RAID 4 would be comparable in speed, or possibly a tiny bit slower due to the parity calculations. Keep in mind how RAID 1 works, though: it lays the information down on one disk and then mirrors it over; it's not a simultaneous write. Also, on reads in a RAID 1 scenario, the active disk is used but not the mirror. The mirror disk is there for redundancy only, not for reads or writes, so you'll still get single-disk speeds.
As for RAID 4 and RAID 1 being functionally equivalent, I suppose you could say that, as long as you understand that the way they write the information is different in each scenario, and that recovering from a failure is completely different.
Again, if anyone else would like to add to this, please feel free.
2009-10-05 07:17 PM
Hi Mr Karpeta,
It's a common misunderstanding, from some of NetApp's competition in particular, to state that RAID-DP (NetApp's implementation of RAID 6) is slower due to having to compute extra parity. The fact is that all of this is done in the NVRAM card BEFORE going to disk, so it does not hold up performance, given that NVRAM is much faster than most disks.
2009-10-06 12:30 AM
There are some legacy edge cases we saw where 2-drive RAID-4 raid groups were used: when disk utilization on an aggregate was over 50% before losing a disk, losing a drive would cause degradation. If there is a disk failure, disk I/O on the remaining disks increases (roughly doubling, since the remaining disks each take an extra read to reconstruct the missing data) until the rebuild is complete. So with 50%+ utilization before a failure, RAID-4 degrades the aggregate until the rebuild is done. Depending on aggregate size, adding disks to keep utilization down works, but if you have maxed the aggregate size, are still over 50%, and can't accept degradation... there is the edge case. This is independent of the rebuild itself. With 2-drive raid groups there is no doubling of I/O on the remaining drive, and I/O performance is stable even with the failure. Since SyncMirror is free now, it probably makes more sense... raid_dp+1.
A disadvantage of 2-drive RAID-4 raid groups compared to SyncMirror on raid_dp aggregates is that background disk firmware updates will not occur, since those rely on dual parity... unless you manually get around it by adding a dual-parity drive, waiting for the rebuild, waiting for the background firmware update to complete, then dropping the dual-parity drive and zeroing it. Painful and not very practical.
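The rough doubling estimate above can be made concrete with a toy model (my own simplification, not an ONTAP formula): servicing one block aimed at the failed data disk requires reading every surviving data disk plus parity, so each survivor picks up one extra read per lost read.

```python
def survivor_reads_per_lost_block(n_data_disks):
    """Reads needed to service ONE block aimed at the failed data disk:
    (n_data_disks - 1) surviving data disks + 1 parity disk."""
    return (n_data_disks - 1) + 1

# 2-disk raid group (1 data + 1 parity): no amplification, since the
# parity block IS the data block; one read on the survivor suffices.
assert survivor_reads_per_lost_block(1) == 1

# 8-data-disk group: one lost read fans out to 8 reads, one extra read
# per surviving disk, which is where the "roughly doubling the I/O on
# the remaining disks" estimate comes from.
assert survivor_reads_per_lost_block(8) == 8
```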
Unless there have been recent changes in WAFL, ONTAP stripes to disk from RAM when NVRAM is half full or every 10 seconds (whichever occurs first). Parity is kept in memory with the data prior to the write, and block maps are used to determine the write stripes. NVRAM isn't used to write to disk; it acts as insurance until RAM is flushed to disk.
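That flush policy (stripe to disk when NVRAM is half full OR every 10 seconds, whichever comes first) can be modeled as a simple trigger. This is a toy model of the rule as stated in the post, not ONTAP internals; the class name and byte counts are made up for illustration.

```python
import time

class ConsistencyPointTrigger:
    """Toy model: flush when the NVRAM log is half full OR the flush
    interval has elapsed, whichever comes first."""

    def __init__(self, nvram_bytes, interval_s=10.0):
        self.capacity = nvram_bytes
        self.interval = interval_s
        self.logged = 0
        self.last_flush = time.monotonic()

    def log_write(self, nbytes):
        self.logged += nbytes  # journal the incoming write into NVRAM

    def should_flush(self, now=None):
        now = time.monotonic() if now is None else now
        return (self.logged >= self.capacity / 2
                or now - self.last_flush >= self.interval)

    def flush(self, now=None):
        self.logged = 0  # RAM striped to disk; the NVRAM log is discarded
        self.last_flush = time.monotonic() if now is None else now
```

For example, with a 100-byte "NVRAM", logging 49 bytes does not trigger a flush, logging one more byte does (half full), and after a flush the 10-second timer takes over.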
2009-10-06 11:42 AM
Great question... someone on the WAFL team can maybe clarify (or correct me). From the block map, all disks in the aggregate are written to concurrently, across all raid groups. I remember something about 4k blocks grouped into chunks of 24 per drive, 96k total per disk, across all raid groups concurrently. When you look at the disk drives you see them all blink together on a write, so it is concurrent; although with a nearly full aggregate there are other considerations and incomplete stripes to fill.
This white paper is great for an overview: http://www.netapp.com/us/library/technical-reports/wp-3001.html
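For what it's worth, the numbers recalled above are at least self-consistent (the 24-block chunk size is as remembered in the post, not verified):

```python
WAFL_BLOCK_KB = 4    # WAFL block size in KB
CHUNK_BLOCKS = 24    # blocks per drive per write chunk, as recalled above
assert WAFL_BLOCK_KB * CHUNK_BLOCKS == 96  # 96 KB per disk, matching the post
```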
2009-10-07 10:18 AM
In addition to the previous answers, my 2 cents:
The parity in RAID-4 (and RAID-DP) is calculated as a sum across the blocks in the stripe, an XOR, to be exact.
In the case of a 2 disk RAID-4 group, the parity would therefore be the exact contents of the data drive.
In other words, if you didn't know better, it sure would look like RAID-1.
But it's an edge case... the internal workings are different, it's expandable, and the recovery is different.
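You can see the "parity equals the data" point directly: with a single data block in the stripe, the XOR sum is just the block itself (plain Python, purely illustrative).

```python
from functools import reduce

def xor_parity(blocks):
    """Byte-wise XOR across the data blocks of one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data_disk_block = bytes([0x01, 0x5A, 0xFF, 0x00])
# One data disk in the raid group: parity == data, so on disk a 2-disk
# RAID-4 group carries two identical copies, mirror-like.
assert xor_parity([data_disk_block]) == data_disk_block
```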
Like Scott mentioned, the data is written concurrently to all available disks (in the way he mentioned, chunks and all...).
In the edge case of a 2-disk aggregate, though, it would obviously be fairly sequential.
One thing to keep in mind, if you're playing with the idea of building an aggregate out of 2-disk RAID-4 RGs:
RAID-DP gives you roughly 1000 times better protection (depending on disk type and raid group size...), more performance, and uses just a bit over half as many disks!
(Fewer disks, less risk: ANY 2 disks can fail in a RAID-DP aggregate, whereas in the 2-disk-RG aggregate, if a pair fails, you've got data loss.)
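A quick way to quantify that parenthetical (my own back-of-the-envelope, with a hypothetical disk count): count which two-disk failure combinations lose data in an aggregate built from 2-disk RAID-4 groups.

```python
from itertools import combinations

def fatal_pairs_in_paired_layout(n_groups):
    """Aggregate built from 2-disk RAID-4 groups: data is lost only when
    BOTH disks of the SAME group fail together."""
    disks = [(g, d) for g in range(n_groups) for d in range(2)]
    pairs = list(combinations(disks, 2))
    fatal = [p for p in pairs if p[0][0] == p[1][0]]  # same-group pairs
    return len(fatal), len(pairs)

fatal, total = fatal_pairs_in_paired_layout(7)  # 14 disks in 7 pairs
# 7 of the 91 possible double failures lose data, whereas a RAID-DP
# raid group survives ANY double failure by design.
assert (fatal, total) == (7, 91)
```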
Hope that helped