2012-07-13 01:56 AM
I am trying to understand the performance of the NetApp disk subsystem (or any disk subsystem for that matter).
If my hosts are generating 32KB IO requests, does this translate to 8 IOs on the NetApp system? Since a 4KB block size is used on the NetApp (is that 4KB as in kilobytes or 4Kb as in kilobits?), does this translate to a 4KB strip size, i.e. will a 32KB IO request require 4KB to be read from each of 8 disks (assuming more than 8 data disks in the aggregate)?
If this is correct, then a 32-disk aggregate with a RAID group size of 16 would have 28 data disks and 4 parity disks. Using the general rule that a 15K rpm SAS drive can deliver an average of 175 IOPS, the aggregate could deliver 28 x 175 IOPS (for reads) = 4,900 IOPS at the storage device, or 612 IOPS at the host using a 32KB IO size.
Is this correct or have I misunderstood this somewhere?
Then, assuming that the infrastructure between the hosts and the NetApp can handle it and the hosts can process the data quickly enough, I could get a maximum throughput of about 612 x 32KB per second = 19.5MB/s.
So if I wanted an aggregate capable of coping with 900MB/s I would need an aggregate of :-
900MB/s / 4KB = 225,000 IOPS
225,000 IOPS / 175 IOPS per disk = 1286 data disks
By these calculations, the throughput does not depend on the IO size generated at the host.
This doesn't sound right.
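For anyone who wants to replay the arithmetic above, here is a minimal Python sketch of the same naive model. It assumes every host IO is split into independent 4KB disk IOs with no caching or coalescing, which the replies below dispute. The function names are made up for illustration, and MB is taken as 1,000KB to match the 900MB/s figure.

```python
import math

BLOCK_KB = 4      # block size on the NetApp, in kilobytes
DISK_IOPS = 175   # rule-of-thumb random IOPS for one 15K rpm SAS drive


def host_iops(data_disks: int, host_io_kb: int) -> float:
    """Host-visible IOPS if each host IO costs host_io_kb / 4 disk IOs."""
    disk_ios_per_host_io = host_io_kb / BLOCK_KB
    return data_disks * DISK_IOPS / disk_ios_per_host_io


def throughput_kb_s(data_disks: int, host_io_kb: int) -> float:
    """Throughput in KB/s implied by the same model."""
    return host_iops(data_disks, host_io_kb) * host_io_kb


def data_disks_for(target_mb_s: float) -> int:
    """Data disks needed for a target throughput (1MB = 1,000KB here)."""
    disk_ios_per_s = target_mb_s * 1000 / BLOCK_KB
    return math.ceil(disk_ios_per_s / DISK_IOPS)


# 28 data disks: the host IO size changes the IOPS but not the throughput.
for io_kb in (4, 32, 64):
    print(io_kb, host_iops(28, io_kb), throughput_kb_s(28, io_kb))

print(data_disks_for(900))
```

In this model the IO size cancels out of the throughput (data disks x 175 x 4KB whatever the host sends), which is exactly the result that looks suspicious.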
2012-07-13 09:24 AM
Say you have calculated that the application on the host generates 32KB IOs, taking into account switches, multipath, LUNs, physical or virtual, etc. If the storage system at the end of that path has an aggregate of 8 SAS 450GB disks spinning at 15K rpm, then your subsystem IOPS would be approximately 175 IOPS x 8 = 1,400 IOPS. http://www.wmarow.com/strcalc/goals.html
Therefore, 32KB divided by 4KB blocks equals 8 IOs. NetApp disk performance also depends on Plex0, aggr0, FlexVol, and 7-mode or c-mode. Also take into account NVRAM, Flash Cache and FlexCache.
2012-07-13 09:37 AM
If my hosts are generating 32KB IO requests does this translate to 8 IOs on the NetApp system?
The real answer is, as usual, "it depends"; but under optimal conditions a single 32KB host IO should result in a single 32KB NetApp disk IO.
2012-07-13 12:44 PM
Instead of performing these calculations, the best way to find the metrics for IOPS, reads and writes with varying block sizes is to use the SIONTAP tool: download it from the toolchest, run it for multiple samples and take the average.
2012-07-14 11:34 AM
The 175 IOPS you mention as a rule of thumb is for random IOs at 'acceptable' latency. If you have sequential IOs, or larger individual IOs, these will be more efficient at the disk layer. Also, Data ONTAP and WAFL are designed to get the most out of the disks using a variety of techniques. So while I can understand you might like to work out these calculations with a spreadsheet, unless you buy a basic RAID array (one that lacks features like cloning, dedupe, snapshots, replication, Flash-based acceleration, etc.) you will almost certainly get it wrong. There are simply many more factors at play than the disk mechanics.
NetApp has a performance sizing tool available to partners and employees, and I'd recommend you work with someone who has access to generate the answer for (or with) you.
2012-07-16 02:19 AM
But surely, ignoring any caching, if 32KB of data is requested then with a large aggregate this is likely to be read from 8 separate disks, i.e. 4KB from each disk and therefore require 8 IOs.
Now let's assume that the hosts can generate 1,000 read IOPS of 32KB. Again ignoring caching, this is going to require 8,000 4KB blocks to be read, and if each disk in the aggregate can handle an average of 175 IOPS then 46 data disks are going to be required to satisfy the 1,000 x 32KB read IOPS.
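The disk count above can be reproduced with a short Python sketch. The function name and the `disk_ios_per_host_io` parameter are hypothetical. Setting it to 1 models the "single 32KB disk IO" best case mentioned earlier in the thread, under the further assumption (not confirmed anywhere here) that a disk still manages 175 IOPS at that IO size.

```python
import math


def data_disks_needed(host_read_iops: int,
                      disk_ios_per_host_io: int,
                      disk_iops: int = 175) -> int:
    """Data disks needed when every host IO costs a fixed number of disk IOs."""
    total_disk_ios = host_read_iops * disk_ios_per_host_io
    return math.ceil(total_disk_ios / disk_iops)


# Worst case argued above: a 32KB read becomes 8 separate 4KB disk IOs.
print(data_disks_needed(1000, disk_ios_per_host_io=8))

# Best case: one disk IO per 32KB host IO (assumes 175 IOPS still holds).
print(data_disks_needed(1000, disk_ios_per_host_io=1))
```

The gap between the two results is the range the question is really about: how many disk IOs a host IO actually costs.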
2012-07-16 03:16 AM
Disk seeks are expensive, so Data ONTAP tries to minimize them. For example, when Data ONTAP issues a request to the disks it includes a chain length, that is, how many consecutive blocks it should access from the disk. I think the max chain length is 64, so 64 x 4KB or 256KB of data could be accessed in one disk IO request. It can also use 'skips' if we have smaller IOs, so for example if we needed blocks 1 and 6 on a specific disk we might issue a single IO with a chain length of 6 and a skip of blocks 2-5, servicing 2 x 4KB client requests in one disk IO. What chain length you will get depends on the layout of data on disk, which is why it is important that Data ONTAP optimizes writes in the first place, and can move them later if need be. The performance sizing tool I mentioned earlier uses data from autosupport and the QA labs to predict what is needed for a new workload based on observations from these running systems.
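To make the chain-length and skip idea concrete, here is a toy Python sketch. It is not Data ONTAP's actual algorithm, only a greedy illustration of the description above: `coalesce` is a made-up name, and the limit of 64 blocks is the "I think" figure from this post.

```python
def coalesce(blocks, max_chain=64):
    """Group the block numbers wanted on ONE disk into single disk IOs.

    Returns a list of (start_block, chain_length, skipped_blocks) tuples.
    Toy greedy model: each IO starts at the lowest outstanding block and
    absorbs every wanted block that fits within max_chain blocks of it.
    """
    pending = sorted(set(blocks))
    ios = []
    i = 0
    while i < len(pending):
        start = pending[i]
        wanted = [start]
        i += 1
        # Extend the chain while the next wanted block is still in range.
        while i < len(pending) and pending[i] - start + 1 <= max_chain:
            wanted.append(pending[i])
            i += 1
        chain_length = wanted[-1] - start + 1
        skipped = chain_length - len(wanted)  # blocks passed over, not needed
        ios.append((start, chain_length, skipped))
    return ios


# Blocks 1 and 6: one IO with chain length 6 that skips blocks 2-5.
print(coalesce([1, 6]))

# 64 consecutive blocks (256KB): still a single disk IO.
print(coalesce(range(64)))
```

Under this model the number of disk IOs depends on how the wanted blocks are laid out on each disk rather than on the host IO size divided by 4KB.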
If Data ONTAP used a physical mapping model to map client data blocks to disk data blocks, your question could possibly be answered using a formula, but this isn't how Data ONTAP (or any advanced array) works these days.