Latency experienced when copying large files to volumes (NFS)

svaughanbb · ‎2011-08-28

Hi,

We're noticing that when we copy a large file (3 GB) to a volume housed on a SATA aggr, latency increases significantly on another SAS/SCSI aggr in the same filer. Is this considered normal behavior?

When I say latency, there are a couple of things which give this away:

1) We run Oracle on our SAS volumes/aggrs, and the Oracle logwriter begins alerting for latency when we copy large files to the SATA aggr.

2) Using 'dd' to write a small 1 KB test file every second, we see latency rise from less than 1 ms to 10 seconds during the file copy.

3) We see 'nfs not responding' messages on systems which are mounting SATA volumes over UDP:

Such as:

Aug 29 01:04:31 xxx kernel: nfs: server 10.0.255.24 not responding, still trying

Aug 29 01:04:31 xxx kernel: nfs: server 10.0.255.24 OK

Aug 29 01:05:53 xxx kernel: nfs: server 10.0.255.24 not responding, still trying

Aug 29 01:05:54 xxx kernel: nfs: server 10.0.255.24 OK
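
To quantify messages like these, here is a minimal Python sketch (not part of the original post) that pairs each "not responding" line with the next "OK" line for the same server and reports the gap between them. The names `LINE` and `outages` and the `year` parameter are assumptions for illustration; syslog timestamps carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# Matches kernel NFS client messages in the format shown above.
LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+nfs: server\s+"
    r"(?P<server>\S+)\s+(?P<state>not responding|OK)"
)


def outages(lines, year=2011):
    """Return (server, start, seconds) for each 'not responding' -> 'OK' pair.

    `year` is an assumption: classic syslog lines omit it.
    """
    pending = {}  # server -> time of the first unanswered 'not responding'
    result = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        server = m["server"]
        if m["state"] == "not responding":
            pending.setdefault(server, ts)
        elif server in pending:
            start = pending.pop(server)
            result.append((server, start, (ts - start).total_seconds()))
    return result
```

Because syslog timestamps have one-second resolution, a short stall can show up as 0 or 1 seconds; the count and clustering of events is likely more telling than the individual durations.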

We only stumbled across this because of issues experienced by some of our Oracle databases: systems on SATA aggrs were generating high I/O and thus causing latency on one of our SAS aggrs. We also find that the latency occurs towards the end of the file copy, which makes us wonder whether we're hitting some kind of I/O backlog on the filer.

Thanks in advance.
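
To see when during the copy the latency rises, the 1 KB write test from item 2 above can be scripted with timestamps. The following is a rough Python sketch rather than the exact 'dd' invocation used in the post; the function names `timed_write` and `probe` and the example mount path are hypothetical.

```python
import os
import time


def timed_write(path, size=1024):
    """Write `size` bytes to `path`, fsync, and return the elapsed seconds."""
    payload = b"\0" * size
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # push the data to the server instead of the client cache
    finally:
        os.close(fd)
    return time.monotonic() - start


def probe(path, count=5, interval=1.0, size=1024):
    """Run `count` timed writes `interval` seconds apart.

    Returns a list of (wall-clock timestamp, elapsed seconds) pairs.
    """
    results = []
    for _ in range(count):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        results.append((stamp, timed_write(path, size)))
        time.sleep(interval)
    return results


# Example usage (hypothetical path on the SAS-backed NFS mount):
# for stamp, seconds in probe("/mnt/sas_vol/latency_probe.tmp", count=600):
#     print(f"{stamp}  {seconds * 1000:.1f} ms")
```

Running this against the SAS volume while the large copy to the SATA volume is in progress gives a per-second latency trace that can be lined up against the copy's start and end times.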

aborzenkov · ‎2011-08-28

Well … all aggregates share the same NVRAM, and consistency points are global across all aggregates as well, which means slow disks can slow down consistency point processing for everything else.

Which is why best practice is to not mix SATA and FC/SAS on one system …

lmunro_hug · ‎2011-10-28

aborzenkov,

Do you know where it mentions that mixing SATA and FC/SAS on a single system is not best practice? We are experiencing latency issues on one of our controllers with SATA and SAS and I am trying to find documentation that states this configuration is not ideal.

Many Thanks

aborzenkov · ‎2011-10-28

I thought I had a reference, but I cannot find it. This has been mentioned several times in various discussions as well. I may be wrong in calling it “best practice”, because I cannot find any reference to it. I apologize for the confusion.

bikash · ‎2011-10-27

I would need some more information to find out what exactly is going on. Even though SATA disks are not the greatest disks for Oracle DB, the system should not be exhibiting the kind of behavior described above. You can contact me at bikash@netapp.com to go over other details pertaining to this issue.

bbjholcomb · ‎2014-03-05

What OS are you using? We are seeing logwriter errors on controllers with just SCSI drives, and we are working with NetApp on it now. We may have a solution if you are running Red Hat.