ONTAP Hardware

FAS6280. Disappointing performance?

webfarm_aruba
19,764 Views

We have just acquired a new shiny FAS6280 in metrocluster configuration with 2x512GB PAM per controller and a total of 336 FC 15k disks hoping for awesome performance, but now that we have set it up we are quite disappointed about NFS performance.

We already have a metrocluster FAS3160 (our first netapp), and I have to say that we were surprised by its performance reaching the 35-40k NFS v3 IOPS per controller (our application profile is 80% metadata, 12% reads, 8% writes on hundred of millions of files) saturating CPU (its bottlenek in our profile), with very low latency (disks and bandwidth were OK).

We use just NFS protocol, nothing more, nothing less but we use it heavily and we need very high IOPS capacity (we use storage for real not as many customers that use it without touching its limit), so we decided to move to the new FAS6280 hoping for a huge improvement in CPU performance (powerful exacore X5670 versus a tiny old dual core 2218 AMD Opteron). If you look at CPU benchmarks websites (like http://www.cpubenchmark.net) you can see something like 6x performance in pure CPU power so we hoped for at least 3x performance increment knowing that the bottleneck in our configuration was just the CPU.

A bad surprise.. using same application profile, a single controller seems barely reach 75-80k IOPS before 100% CPU busy is touched. So from our point of view just 2x performance more than a FAS3160. The bad thing is that obviously a FAS6280 doesn't cost 2x a FAS3160...

So we tried to investigate..

The real surprise is how the cpu is badly badly (let me say badly) employed on FAS6280. For a comparison here is the sysstat -m from our FAS3160 running near 100% cpu busy:

sysstat -m 1
ANY    AVG   CPU0 CPU1 CPU2 CPU3
100%   86%    70%  96%   90%   90%

How you can see the CPUs usage is quite well balanced and all core are employed. Good.

Now the sysstat -m on the FAS6280 with same application profile:

sysstat -m 1
ANY  AVG  CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11
99%  21%    6%   0%   0%   0%   0%   0%   0%   0%  48%  48%  58%  93%

How you can see the FAS6280 barely uses 4-5 cpu cores out of 12 available, expecially CPU11 reaching 100% cpu busy very soon. We tried to optimize everything, mount in V2, V3, V4, create more aggregates, more volumes, flexvols, traditional vols and so on, but nothing let us to increase performances..

From my personal point of view, it seems that the FAS6280 hardware is far far more advanced than the Ontap version (8.0.1) that probably can't take advantage of bigger number of cores of this new family of filers, so it finishes to use an advanced cpu like X5670 just as an older dual core or a little more.. Simply the x5670 core is faster than 2218 core so it obtain a better performance.. but far far away from what it could do..

I read that new ontap upgrades should unlock NVRAM (now it can use just 2Gigs out of 8Gigs installed) and cache (now it can use 48Gigs out of 96Gigs) and should give better multithreading. Will these upgrades unlock some power out of our FAS6280?

Another strange thing is that the filer reach 100% busy CPU with extra low latencies. I read that is not recommended to take the filer upper than 90% cpu busy limit, because latencies could increase very fast and in an unpredictable way. This sounds reasonable, but for us is not useful at all to have application with 200us latencies.. we just need to stay under the 12ms limit.. For instance, if we touch the 100% CPU bound is it reasonable to continue to increase the filer usage until we reach, for example, 8ms medium latency for the slowest op? Or the latency could really explode in a unpredictable way causing problems?

What do you think? I sincerely don't know what other to refine, I think we have tried almost everything. Can we better balance CPU utilization? What do you suggest?

Of course I can post more diag commands output if you need it.

Thanks in advance,

Lorenzo.

1 ACCEPTED SOLUTION

lovik_netapp
19,719 Views

Hi Lorenzo,

Welcome to community!

Let me first tell you that, "you should never judge a netapp system's load by CPU usage." It's always a bad idea to think that a system is busy just by looking at CPU usage, to best use your netapp system always use multithreaded operation with recommended settings and look at latency values, which in your case is very good 200us.

Ontap ties every type of operation in a stack called domain, every domain is coded in a way that it will fist use the explicit assigned cpu and once it satuarates that CPU then it shifts its load to next CPU. Every domain does have it's own priority and they are hard coded so instance a houkeeping job domain will always have lower priority over NFS/CIFS domain and Ontap always makes sure that user requests always take priority over system internal work.

One more thing please don't look at CPU stat's 'ANY' counter always see 'AVG' as 'ANY' conter always give me a mild heart attack.

At last I would say that the issuse you are looking at is cosmetic, however If you think you have any performance problem run a benchmarck with SIO_ontap and then you can see the system's capaciry or open a support case but I am sure you will hear the same thing from support also.

Cheers!

View solution in original post

43 REPLIES 43
Public