To truly isolate disk performance between the two types of disk, you must run the same tests on each type. That much is obvious, but also keep in mind that the test workload may be cacheable, so the test should be run once before any measurements are recorded. Run it once as a warm-up, then immediately run it again on the target aggregates and record that run. If the target aggregates are on different nodes, each node should get its own warm-up run.
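As a rough illustration of the warm-up idea, here is a minimal Python sketch. The `timed_runs` helper and the `workload` callable are hypothetical names made up for this example; they stand in for whatever test tool you actually use and are not part of any NetApp tooling.

```python
import time


def timed_runs(workload, warmup_runs=1, measured_runs=3):
    """Run `workload` a few times and return only the measured durations (seconds).

    `workload` is a hypothetical zero-argument callable that drives the test
    against one target aggregate.
    """
    # Warm-up: run the workload but deliberately discard the timing, so that
    # any caching effects are already in place before measuring.
    for _ in range(warmup_runs):
        workload()

    # Measured runs: start immediately after the warm-up.
    durations = []
    for _ in range(measured_runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations
```

You would call this once per target aggregate (and so once per node if the aggregates live on different nodes), so that every target gets its own warm-up before anything is recorded.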
The QoS statistics will help you isolate the latency that comes from the disks themselves. Assign a policy with no limit to the target volume, then go under "qos" -> "statistics" and look at the latency from disk. Follow Neto's and Chriz Ott's suggestions on looking at IOPS, and concentrate on the latency from disk in the QoS stats.
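Always remember that performance is a combination of throughput and latency. Measuring one without the other is not very useful in most cases: a high IOPS number means little if each operation takes too long to complete, and low latency means little if the system can only sustain a handful of operations.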
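To make the point about reporting both numbers concrete, here is a small Python sketch. It assumes you have already collected per-operation latencies in milliseconds and the total elapsed time from your own test tool. The `summarize` function and its field names are hypothetical, chosen only for this example.

```python
import statistics


def summarize(latencies_ms, elapsed_s):
    """Report throughput and latency together for one test run.

    latencies_ms: per-operation latencies in milliseconds (assumed input).
    elapsed_s:    wall-clock duration of the run in seconds (assumed input).
    """
    if not latencies_ms or elapsed_s <= 0:
        raise ValueError("need at least one sample and a positive elapsed time")
    return {
        # Throughput: completed operations per second.
        "iops": len(latencies_ms) / elapsed_s,
        # Latency: average and worst case per operation.
        "mean_latency_ms": statistics.mean(latencies_ms),
        "max_latency_ms": max(latencies_ms),
    }


# Two made-up runs with the same IOPS but very different latency.
run_a = summarize([1.0, 1.0, 1.0, 1.0], elapsed_s=2.0)
run_b = summarize([1.0, 2.0, 3.0, 6.0], elapsed_s=2.0)
print(run_a)
print(run_b)
```

Both runs complete four operations in two seconds, so looking at IOPS alone would call them equal. The latency figures show that they are not.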