
FCP Performance

I was wondering if anyone out there has any suggestions on how to improve FCP performance. Right now the most I'm able to get is around 10-20 MB/s over a 4 Gb pipe. To me that just seems ridiculous.

I did not get to set up these aggregates; they were already provisioned this way when I came in.

Specs

Back End

FAS3140

Data ONTAP 8.0.2

2 Filers

All RAID-DP

FilerA

      Aggr0 - 20 disks, 15K FCAL

      Aggr1 - 13 disks, 7200 ATA

FilerB

      Aggr1 - 20 disks, 15K FCAL

Front End

Dell M1000e blade chassis

2 Brocade FC switches, teamed

16 Dell M610 ESX servers running VMware ESX 4.1.

Performance

Under load, CPU utilization is around 40%, ops/s stay between 450 and 750, I/O writes are around 80 MB/s, and reads are 30-35 MB/s.

sysstat output shows FCP in and out never going over 10 MB/s.

I've searched and searched, but I can't find anything about normal FCP transfer speeds or what other people are averaging, so I have nothing to compare against to tell whether what I'm seeing is normal. But 10 MB/s over a 4 Gb pipe can't be right in my eyes. Any input would be appreciated!

Re: FCP Performance

Josh -

Why does Mr. Crocker always kick the tough ones back up to the top of the stack?

I can give you standard disk IO/s numbers for your drives -

15K FCAL - ~175 IO/s

7.2K SATA - ~75 IO/s
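As a rough sketch of what those per-disk numbers mean for the aggregates in the original post (assuming, hypothetically, that each RAID-DP aggregate is one RAID group losing two disks to parity, and that data disks serve random I/O at the quoted rates):

```python
# Back-of-the-envelope aggregate IOPS ceiling from the per-disk figures above.
# Assumption (hypothetical): one RAID-DP group per aggregate, 2 parity disks.

def aggr_iops(total_disks, per_disk_iops, parity_disks=2):
    """Rough ceiling on random IOPS for one aggregate."""
    data_disks = total_disks - parity_disks
    return data_disks * per_disk_iops

# FilerA Aggr0: 20 x 15K FCAL at ~175 IO/s each
print(aggr_iops(20, 175))      # 3150 IOPS
# FilerA Aggr1: 13 x 7.2K SATA at ~75 IO/s each
print(aggr_iops(13, 75))       # 825 IOPS

# At a 4 KB random I/O size, ~3150 IOPS is only about 12.9 MB/s,
# which is in the same ballpark as the 10 MB/s being observed.
print(3150 * 4 * 1024 / 1e6)   # MB/s
```

If the workload is small random I/O, that disk budget, not the 4 Gb FC pipe, is the likely ceiling.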

What else are these filers doing? 40% CPU seems a bit high for the throughput you aren't getting.

Shame you're not one of my clients - I'd log in and look at your ASUPs ...

: )

I hope this response has been helpful to you.

At your service,

Eugene E. Kashpureff

ekashp@kashpureff.org

Senior Systems Architect / NetApp Certified Instructor

http://www.linkedin.com/in/eugenekashpureff

(P.S. I appreciate points for helpful or correct answers.)

Re: FCP Performance

It could be a lot of things; you should be getting something around 400 MB/s. I would start by checking the FC switch configuration. Also, have you installed the FC Host Utilities on your ESX hosts? Do you have any other workload using your filer?

Cheers.

Re: FCP Performance

What kind of workload do you have? If it is purely random, you will never see high throughput.
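To illustrate the point with round, purely illustrative numbers: throughput is just IOPS times I/O size, so the same disk IOPS budget gives very different MB/s for random 4 KB I/O versus large sequential I/O.

```python
# Throughput = IOPS x I/O size. The IOPS figure below is illustrative,
# not a measurement from this system.

def throughput_mb_s(iops, io_size_kb):
    return iops * io_size_kb / 1024  # MB/s (using 1 MB = 1024 KB)

# Small random I/O: lots of ops, little bandwidth
print(throughput_mb_s(3000, 4))    # ~11.7 MB/s
# Large sequential I/O: same op rate, far more bandwidth
print(throughput_mb_s(3000, 64))   # 187.5 MB/s
```

So a low MB/s figure by itself doesn't mean anything is broken; it depends entirely on the I/O size mix.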

Re: FCP Performance

As far as the switch config goes, we don't really have anything set on it besides zoning. I'm not great with Brocade, but based on the performance statistics of the switch it doesn't look to be having any issues, or at least it doesn't appear to be the bottleneck.

We have not installed the FC Host Utilities on our ESX servers, where can I find that?

*edit: OK, I found the FC Host Utilities, but they seem to have similar functionality to the NetApp Virtual Storage Console plugin. Are they the same, or do you need both?

We are currently using the NVSC plugin.

There is no other workload on the filer; ESX/VMware is the only thing using it.

Re: FCP Performance

How are you generating the workload?  Have you tried multiple streams of data simultaneously?  Are you using SIO or IOMeter with multiple threads/workers? 

Is the filer doing anything else at the time?

A single-threaded read or write to a filer will be slow. Also, as mentioned by Pascal, small random I/O is going to be slower. I usually use SIO or IOMeter with large (64KB) sequential workloads just to check maximum throughput end to end. Try this from multiple VMs/ESX hosts at the same time. It's not realistic, and doesn't mean you will get this level in the real world, but it will show up any artificial bottlenecks along the data path.
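For anyone without SIO or IOMeter handy, here is a minimal sketch of the same idea in Python: several threads streaming 64 KB sequential writes, then computing MB/s. SIO/IOMeter are the proper tools; the file names and sizes here are hypothetical, and you'd point the paths at a datastore-backed directory to exercise the real data path.

```python
# Minimal multi-threaded sequential-write throughput check
# (a sketch of the SIO/IOMeter test described above, not a replacement).
import os
import time
import threading

BLOCK = 64 * 1024          # 64 KB sequential writes, as suggested
BLOCKS_PER_THREAD = 256    # 16 MB per thread; raise this for a real test
THREADS = 4

def writer(path):
    buf = os.urandom(BLOCK)
    with open(path, "wb") as f:
        for _ in range(BLOCKS_PER_THREAD):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())   # push the data to stable storage

start = time.time()
threads = [threading.Thread(target=writer, args=(f"stream{i}.bin",))
           for i in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

total_mb = THREADS * BLOCKS_PER_THREAD * BLOCK / (1024 * 1024)
print(f"{total_mb:.0f} MB in {elapsed:.2f}s -> {total_mb / elapsed:.1f} MB/s")

# clean up the test files
for i in range(THREADS):
    os.remove(f"stream{i}.bin")
```

Run it from several VMs/ESX hosts at once, as suggested above, to load the whole path rather than a single queue.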

Hope this helps!!

Re: FCP Performance

When we were testing speed, we had a bunch of different VM transfers going from different ESX hosts. We have not used SIO yet; we're going to set that up and see what kind of info it gives us. We did recently figure out that we have misaligned VMs, so we are fixing that now, but I find it hard to believe that's the cause of our problem.

Re: FCP Performance

Misaligned VMs can make a significant difference, particularly if you're pushing disk IOPS. Are these VMs on the SATA aggr by any chance? With a 100% random workload you're looking at 50-60 IOPS per disk (check statit output), and you'll be approaching 20 ms latency on SATA. With misalignment you could be seeing up to 3 times the number of disk IOPS as a result, which will affect latency significantly if your disks are already running at high IOPS levels.

I'd suggest you fix your misalignment, then try again with a load generator using many threads across several ESX hosts/VMs. It's worth googling Little's Law, and also take a look at Jason Ledbetter's doc here: https://forums.netapp.com/thread/25097
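As a quick sketch of the two ideas above, using the figures from this thread (the 3x misalignment factor is the worst case mentioned, and the IOPS numbers are illustrative):

```python
# Little's Law (L = X * R): outstanding I/Os = throughput x response time.

def required_outstanding_ios(iops, latency_s):
    """Concurrency needed to sustain a given IOPS at a given latency."""
    return iops * latency_s

# To push 3000 IOPS at 20 ms latency you need ~60 I/Os in flight --
# which is why a single stream at queue depth 1 can never get there,
# and why many threads across several hosts are needed.
print(required_outstanding_ios(3000, 0.020))  # 60.0

def effective_host_iops(disk_iops_budget, misalign_factor=3):
    """If each host I/O costs `misalign_factor` disk I/Os, host IOPS shrink."""
    return disk_iops_budget / misalign_factor

# 13-disk SATA aggr, ~11 data disks at ~55 IOPS each = ~605 disk IOPS,
# but only ~200 host IOPS if every I/O is misaligned (worst case).
print(effective_host_iops(11 * 55))
```

That's the double hit from misalignment: fewer effective host IOPS, and rising latency as each disk is pushed harder.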

Keep us posted on the results.