Solved: Re: NetApp FAS6240 & Linux performance

robinetapp · ‎2013-07-11

Hi everybody,

I would like to know if anyone has had performance problems with a FAS6240 and Linux.

We have two FAS systems in different locations. On two physical Linux servers we run a DB2 database. Linux is 64-bit RHEL 5.9 with kernel 2.6.18-348.1, and the latest MPIO driver is installed (device-mapper-multipath-0.4.7-54.el5-9.2). Replication is done via DB2 HADR, so we do not use aggregate mirroring. Multipath looks fine, and "lun stats -o" shows correct traffic. Multipath.conf is OK (checked with the NetApp guys), and iostat on Linux also looks good. But we are not too happy with the performance. Maybe something is wrong with the server, I don't know; the server check didn't show any problems.

So my question is whether anyone has had similar problems, and whether there is some configuration (MPIO, system, DB2...) that we should change.

Thanks in advance for any info,

Robi

radek_kubka · ‎2013-07-11

Hi Robi,

Can you quantify somehow your statement "not too happy with performance"? Is it measured latency? Or just general "feeling"?

The controllers you have are pretty fat, but how about back-end disks: are they fast spinning ones in a decent quantity?

Can you possibly post the output of "sysstat -x 1" captured when performance problems occur?

Regards,
Radek
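
One way to put a number on host-side latency, as asked above, is to sample /proc/diskstats twice and divide the time spent on I/O by the number of completed I/Os, which is roughly what iostat reports as await. The following is only a minimal sketch: the device name dm-0 and the sample counters are made up for illustration, and the field positions assume the usual 14-column layout of /proc/diskstats.

```python
# Illustrative samples only: two made-up /proc/diskstats lines for a
# hypothetical multipath device "dm-0", taken some seconds apart.
SAMPLE_BEFORE = " 253 0 dm-0 1000 0 80000 4000 500 0 40000 3000 0 5000 7000"
SAMPLE_AFTER = " 253 0 dm-0 1400 0 112000 6400 600 0 48000 3600 0 6500 10000"


def parse_diskstats(text):
    """Return {device: (reads, read_ms, writes, write_ms)} from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank lines and short (partition-style) lines
        # fields: major minor name reads merged sectors ms writes merged sectors ms ...
        stats[fields[2]] = (
            int(fields[3]),   # reads completed
            int(fields[6]),   # milliseconds spent reading
            int(fields[7]),   # writes completed
            int(fields[10]),  # milliseconds spent writing
        )
    return stats


def avg_latency_ms(before, after, device):
    """Average milliseconds per completed I/O between two samples."""
    r0, rms0, w0, wms0 = before[device]
    r1, rms1, w1, wms1 = after[device]
    ios = (r1 - r0) + (w1 - w0)
    if ios == 0:
        return 0.0  # no I/O completed in the interval
    return ((rms1 - rms0) + (wms1 - wms0)) / ios


latency = avg_latency_ms(
    parse_diskstats(SAMPLE_BEFORE), parse_diskstats(SAMPLE_AFTER), "dm-0"
)
print("average latency: %.1f ms per I/O" % latency)
```

On a live host the two samples would come from reading /proc/diskstats before and after the batch window; comparing the result with the latency the controller reports for the same LUN shows whether the time is spent on the array or somewhere between host and array.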

robinetapp · ‎2013-07-11

Hi Radek,

Let me explain: it is a never-ending story (the best word is war) between us, the system guys, and the programmers. We have been monitoring and analyzing the NetApp (with help from NetApp support) for a month or two. The analysis shows that there is no problem with performance, latency or anything else. The problem is that before the NetApp we used an IBM DS8100, and batch processing was slightly faster on it. It's normal to expect better processing times when you change storage. I also know that many other factors affect performance (though maybe there is someone who doesn't want to hear it): the database has grown; towards the end the DS had been offloaded and was serving only this server; and finally the DS used 300 GB 15K rpm DDMs, while the NetApp has 600 GB 15K rpm DDMs.

About the spinning disks: I don't know, but I don't suspect that they are the problem. I use an aggregate with 36 modules; I'm not sure, but I think on the DS I had a pool with 30 or 32 modules.

About FC: we use 2 x 4 FC connections from the NetApp (2 fabrics with 4 cables in each, 8 Gbps) and 2 FC connections from the server (4 Gbps in each fabric). I also monitored and analyzed the SAN (with special tools) and there is no bottleneck.
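
As a rough sanity check on those link counts, the nominal aggregate bandwidth on each side can be worked out as below. This is only a sketch: it assumes the two server connections mean one 4 Gbps link per fabric, and it ignores encoding overhead and real-world path utilisation.

```python
def aggregate_gbps(fabrics, links_per_fabric, gbps_per_link):
    """Nominal aggregate line rate in Gbps across all links."""
    return fabrics * links_per_fabric * gbps_per_link


# Storage side: 2 fabrics x 4 cables x 8 Gbps (figures from the post above).
storage_side = aggregate_gbps(fabrics=2, links_per_fabric=4, gbps_per_link=8)

# Server side: assumed to be 2 fabrics x 1 link x 4 Gbps.
server_side = aggregate_gbps(fabrics=2, links_per_fabric=1, gbps_per_link=4)

# End-to-end throughput for this host cannot exceed the narrower side.
ceiling = min(storage_side, server_side)

print("storage side:", storage_side, "Gbps")
print("server side: ", server_side, "Gbps")
print("host ceiling:", ceiling, "Gbps")
```

Under these assumptions the host links, not the controller ports, set the upper bound for a single server, so a large sequential batch read is worth comparing against that ceiling even when the SAN as a whole shows no bottleneck.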

So briefly: there is a "problem" only with this server (we have another 5 identical servers, with the same OS, for databases) and only with batch processing. And this is the programmers' feeling (OK, we can measure the job times). So my job is to convince them that the NetApp works fine.

Therefore I am investigating "fine tuning" between the server (RHEL) and the NetApp: whether there are some hidden settings on RHEL that would help improve performance.

Best regards,

Robi

radek_kubka · ‎2013-07-12

I'd cautiously comment that DS8100 is quite a beefy system, so it is not necessarily obvious that FAS6240 will perform better for *any* type of workload.

Does this batch job do a lot of sequential reading by any chance?

robinetapp · ‎2013-07-14

Hi,

I spoke with our programmers, who said that there is a lot of sequential reading. They read account after account in unsorted order.

aborzenkov · ‎2013-07-14

In this case I would check fragmentation and try reallocation to get better sequential performance.

robinetapp · ‎2013-07-15

OK, thanks.

We will try.

robinetapp · ‎2013-07-15

Hi,

Here are the results: we use 4 LUNs (each on its own volume), two for data and two for logs. On the data volumes the optimization is 2 and on the log volumes it is 4. Thresholds are set to 4. Our NetApp support thinks that this is OK.

radek_kubka · ‎2013-07-15

I would still give it a shot and run reallocation on volumes with threshold 4:

toaster> reallocate start -f -p /vol/volname/

robinetapp · ‎2014-02-17

Hi to all,

Sorry for the late reply... Maybe this will still come in handy for someone. We resolved the problem. After many hours of testing and system fine tuning, we found a problem in the block size definition for one of the database tablespaces. After changing it, the processing time of the batch job that works with this data dropped from 50-60 minutes to 9-10 minutes. Stunning results. The results are now even better than they were on the "clear" Shark. So when you migrate from one storage system to another, it may be best to find a book like "Best practice for database xxxx and netapp"... (... ...) before starting the migration.
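
To put the improvement in perspective, the job times quoted above translate into a speedup range as sketched below. Only the 50-60 and 9-10 minute figures come from the post; the function name is just for illustration.

```python
def speedup_range(before_minutes, after_minutes):
    """Return (worst-case, best-case) speedup from two (low, high) time ranges."""
    before_low, before_high = before_minutes
    after_low, after_high = after_minutes
    worst = before_low / after_high   # shortest old run vs longest new run
    best = before_high / after_low    # longest old run vs shortest new run
    return worst, best


worst, best = speedup_range((50, 60), (9, 10))
print("speedup between %.1fx and %.1fx" % (worst, best))
```

A gain of this size from a single tablespace setting is far larger than what link speed or disk count differences alone would typically explain, which fits the observation that the storage-side analysis found nothing wrong.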

Thanks to all who participated in this discussion.

Robi