ONTAP Discussions

reallocate TR?

After fighting with our DBAs for months about IO performance, we finally narrowed down a growing performance issue to disk fragmentation. We accidentally discovered that a copy of a database was 2x faster to back up than the original, which led us to the conclusion that we did actually have an issue with IO.

We ran a reallocate measure which resulted in a threshold of 3.   I'm not sure exactly how this is calculated, but obviously the result is misleading. The results should have been 10 IMHO.

We ran reallocate start -f -p on the volume and it immediately cut the backup time in half. Disk utilization, disk caching, and latency were all significantly better after the reallocate completed.

It appears that the Sybase full reindex basically tries to optimize by rewriting data... This process somehow causes the disk fragmentation.

I've only been able to find limited information on the reallocate command. With a performance increase this significant, there should be a whitepaper on the subject that covers the effects of reallocate on cache, PAM, dedup, VMware, databases, Exchange, etc.

Is there a TR document in the works on reallocate?  If not, someone should start one.

Re: reallocate TR?

I'll second the need for further documentation. As part of my PM work (things are slow right now) I found that some of my volumes came back with 6s and 7s. However, the documentation does seem to be very light! I've been playing around but don't know if it is really helping, since we didn't do any benchmarking prior to running reallocate.
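For anyone in the same spot, a rough before/after baseline is easy to capture. Below is a minimal sketch, not a NetApp tool: the function name `sequential_read_mb_per_s` and the 1 MiB chunk size are my own choices for illustration. It reads a file front to back from the host and reports throughput. Host and filer caches will inflate repeat runs, so use a file larger than cache or a cold read.

```python
import sys
import time


def sequential_read_mb_per_s(path: str, chunk_size: int = 1024 * 1024) -> float:
    """Read `path` sequentially once and return the throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; OS and filer caches still apply.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes / (1024 * 1024) / elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python seqread.py /path/to/large/file
    print(f"{sequential_read_mb_per_s(sys.argv[1]):.1f} MB/s")
```

Run it against the same file before and after the reallocate and compare the two numbers; a single run on its own does not tell you much.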

Re: reallocate TR?

Hi

I am just posting as I would be keen to know more about reallocate. I have used the command a couple of times in the past and have had issues due to aggregates being created on 7.1 and the command being used on 7.3 filers.

How did you get to "we finally narrowed down a growing performance issue to disk fragmentation"? Are you using statit and looking at chain lengths and RAID stats?

Thanks

Bren

Re: reallocate TR?

I'd have to agree.  I'm looking into performance tuning now that our main business application will be moving to RAC and NetApp this Summer.  Highly transactional kind of stuff.

Re: reallocate TR?

Hey NetApp, please document this better - most of your customers are not willing to read these boards to find stuff like this and the docs in the 7.3 manual are very sparse. Would be nice to have a decent scheduling system set up too.

And the same thing was true for our system, reallocate made a huge difference in sequential read type stuff.

Re: reallocate TR?

In fact I'd be happy just to know if aggr reallocates take care of everything under them. I can handle one large flood of snapshots if I'm ready for them but I'd prefer not to get ready if it's not worth my time...

Re: reallocate TR?

Any updated documentation or thinking on this topic?

We're running SQL Server 2005 on a NetApp FAS3160. We do SQL index rebuilds on the weekends, and I wonder if it's the same as Sybase under the covers with regard to how it behaves at the storage level...

Re: reallocate TR?

"In fact I'd be happy just to know if aggr reallocates take care of everything under them. I can handle one large flood of snapshots if I'm ready for them but I'd prefer not to get ready if it's not worth my time..."

According to official NetApp manuals aggregate reallocation does not optimize file layout (which is logical when you think about it - aggregate does not know anything about files that are too far above). It compacts used blocks to create more contiguous free space.

So aggregate reallocation may help with disk writes, but it shouldn't have any effect on large sequential disk reads.

Re: reallocate TR?

We all know NetApp filers need massive help with writes.

I just have a problem with the resources that reallocate -A affects. 7.3.3 is supposed to make it better, and 8.0 to solve it completely.

Re: reallocate TR?

I'm not the greatest storage administrator out there but I did work at NetApp for a few years and still have contact there. Although the aggregate level reallocate does not explicitly give you better read performance it can if you've added new disks (which is what I said earlier) because it DOES move data more or less evenly across them. So instead of reading only from the old spindles it will now be able to pull data from the newly added ones as well.

I have not verified this first hand with testing but it makes sense. In addition it probably reduces seek times depending on how full your aggregates are simply because the heads don't have to travel as far to reach the next block, although that's purely speculation & I am not sure there would be a measurable difference there.

As far as MSSQL, are you running it on a VM or a LUN? Is it deduped or not? If you're running it on a non-ASIS LUN I'd do a reallocate measure and see; a volume level reallocate made a substantial difference on our Oracle LUNs.

Re: reallocate TR?

Not sure what you mean by that, what problems do you have with writes? Sized properly the NVRAM should be handling most of that load.

Re: reallocate TR?

Whenever large write workloads are kicked off: say, building up temp table space, table splits, or file copies. Once the write throughput reaches 200MB/sec we start to see some increased read latency; once it's at 250MB/sec the NVRAM cannot keep up, even on disks that are not heavily utilized (under 10% IO and space utilization). Filer-wide latency is increased. This is on a 6080 filer running 7.3.1.1. Average throughput 9-5 is 400MB/sec on each 6080 node in the cluster, and at times we push well over 500, with a 50-75MB/sec average write workload. Write workloads just kill things when pushed.

We've worked to limit writes to off hours and whatnot, so it's not a big deal.
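To put those figures in proportion, here is a small arithmetic sketch. It assumes the 50-75MB/sec average write load is part of the roughly 400MB/sec average throughput per node, which is my reading of the post rather than something stated outright; the helper names are made up for illustration.

```python
def write_share_percent(write_mb_s: float, total_mb_s: float) -> float:
    """Writes as a percentage of total throughput."""
    return 100 * write_mb_s / total_mb_s


def burst_multiple(burst_mb_s: float, average_write_mb_s: float) -> float:
    """How many times larger a write burst is than the average write load."""
    return burst_mb_s / average_write_mb_s


# Figures quoted in the post (MB/sec).
AVERAGE_TOTAL = 400
AVERAGE_WRITE_LOW, AVERAGE_WRITE_HIGH = 50, 75
BURST_LATENCY_STARTS, BURST_NVRAM_BEHIND = 200, 250

low_share = write_share_percent(AVERAGE_WRITE_LOW, AVERAGE_TOTAL)    # 12.5
high_share = write_share_percent(AVERAGE_WRITE_HIGH, AVERAGE_TOTAL)  # 18.75
print(f"writes are {low_share}% to {high_share}% of average throughput")
print(f"200MB/sec burst is {burst_multiple(BURST_LATENCY_STARTS, AVERAGE_WRITE_HIGH):.1f}x "
      f"to {burst_multiple(BURST_LATENCY_STARTS, AVERAGE_WRITE_LOW):.1f}x the average write load")
```

Under that reading, writes are a small slice of the steady-state load, and the trouble starts when a burst runs at several times the usual write rate.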

Re: reallocate TR?

I gotcha. I think OnTap is probably tuned to expect the NVRAM to keep up with the writes, and it sounds like you're going well beyond its ability. Maybe you can get some inside-out PAM cards.

We're super (read "filer used as RAM because the DBA has no clue") read intensive here so I don't see that problem. Our biggest single system is an Oracle 10g DB that can peak in the 200mBs range but usually is between 100 and 85 - but 98% of that is reads and 90% of those are being serviced by the cache. Sad part is the AIX host that Oracle is running on has at least 10 gig of free memory

Re: reallocate TR?

We migrated from HP + Oracle 9 to Linux + Oracle 10g + RAC + NFS + 10GbE, and were new to NetApp at the same time. After a year we started to explore some tuning. Just doubling some SGA setting or whatever dropped IO usage 50% on the filer side. The DBAs were new to RAC and used "monolithic tuning" on RAC at first, which was a safe call. Plus the new environment was 150% faster (before the memory changes) than the old one, so there wasn't any more call to tune.

We'll be doing some more tuning on the Linux side and the NetApp side this summer if we can find some time.

Re: reallocate TR?

Don't confuse aggregate reallocation with volume-level reallocation. This should help explain it better at a high level: http://www.theselights.com/2010/03/understanding-netapp-volume-and.html

As for the NetApp not being able to handle writes, it is actually basically a write-optimized SAN. You will probably get better write performance on a NetApp system than you will on any other SAN. One of the biggest problems they faced was sequential read after random write, but as of OnTap 7.3 they added the read_realloc option to volumes, which will sequentially reallocate data after it has been read once. The best way to check the performance issues with such a heavy write workload is to do a: sysstat -x -s -c 60 1

Look at the CP ty column. I am curious to see if you are experiencing any back-to-back CPs (B) or deferred back-to-back CPs (b).
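If eyeballing 60 lines of output gets tedious, the tally can be scripted. This is a minimal sketch that assumes you have already pulled the CP ty values out of the sysstat output into a list of strings; it does not parse the full sysstat -x layout, and the sample values below are invented for illustration. Only the first character of each value is checked, on the assumption that anything after it is extra detail.

```python
from collections import Counter


def count_back_to_back(cp_types):
    """Tally back-to-back (B) and deferred back-to-back (b) CP samples.

    `cp_types` is an iterable of CP ty column values, one per sysstat line.
    """
    counts = Counter()
    for token in cp_types:
        token = token.strip()
        if not token:
            continue  # skip blank entries
        counts["samples"] += 1
        if token[0] == "B":
            counts["back_to_back"] += 1
        elif token[0] == "b":
            counts["deferred_back_to_back"] += 1
    return counts


# Hypothetical CP ty values, not real filer output.
sample = ["Tf", "-", "Bn", "B", "bf", "Ff"]
result = count_back_to_back(sample)
print(result["back_to_back"], result["deferred_back_to_back"], result["samples"])
```

A handful of B entries under heavy load may be nothing to worry about; a steady run of B or any b entries is the pattern worth reporting back here.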

Re: reallocate TR?

Right now writes are not very back-to-back or deferred; we tuned our apps (and users). Since FlexShare is a piece of crap and fails, we have to manually handle things more than we figured we would.

I'll see where I can induce sio load on the 3160 cluster that's not in prod yet and get it close to what I've seen on the 6080.
