reallocate TR?

danpancamo
After fighting with our DBAs for months about I/O performance, we finally narrowed a growing performance issue down to disk fragmentation. We accidentally discovered that a copy of a database was 2x faster to back up than the original, which led us to the conclusion that we did actually have an I/O issue.

We ran a reallocate measure, which resulted in a threshold of 3. I'm not sure exactly how this is calculated, but obviously the result is misleading. The result should have been 10, IMHO.

We ran reallocate start -f -p on the volume and it immediately cut the backup time in half. Disk util, disk caching, and latency were all significantly better after the reallocate completed.

It appears that the Sybase full reindex basically tries to optimize by rewriting data... This process somehow causes the disk fragmentation.

I've only been able to find limited information on the reallocate command. However, with this significant performance increase, there should be a whitepaper on the subject that covers the effects of reallocate on Cache, PAM, Dedup, VMware, Databases, Exchange, etc...

Is there a TR document in the works on reallocate?  If not, someone should start one.

radek_kubka
Is anyone else having trouble getting to this doc?

It's posted on Field Portal, which is accessible to NetApp insiders & resellers only (don't shoot the messenger, though).

aborzenkov
I checked the external library before posting the link, but it was not available there. The TR is not marked as NDA, so I presume it is just a matter of time.

lwei
I just posted a blog on read_realloc, with some simple test results.

http://blogs.netapp.com/pseudo_benchmark/2011/06/read_realloc.html

Regards,

Wei

jakub_wartak
Wei,

it is much better than 6% over time. I started from something like 8MB/s in reads (worst-case scenario), and it self-optimized to nearly the state you get right after a fresh "reallocate start -f -p".

Take a look at http://jakub.wartak.pl/blog/?p=343 (WAFL performance VS sequential reads: part III, FC LUN performance from AIX vs read_realloc)

-Jakub.

lwei
Jakub,

Thanks for the post. I'm glad you ran your own tests and found it's much better than 6%. I just used very simple tests to illustrate the effectiveness of read_realloc for some workloads. The improvement will probably vary under different scenarios. I think the worst case is probably 0% improvement. On the other hand, it could be much better than 6%.

Thanks,

Wei

avbohemen
I have another question about read_realloc: if I enable read_realloc, will it also optimize the volume layout if I do NDMP backups of a volume with a single LUN? NDMP reads a single file (or object, in this case a LUN) sequentially, so I tend to think that read_realloc will help if I do frequent (daily) full backups... or not? NDMP creates a snapshot first, so basically it backs up the data in that snapshot, not the active filesystem. Will the volume get optimized if I turn on read_realloc and only do NDMP sequential I/O? The case I have in mind is a database LUN, which mostly does random I/O, but whose NDMP backups get slower and slower over time.

The second question is: does "read_realloc=space_optimized" work on SyncMirrored aggregates / MetroCluster? I know "reallocate -p" does not work on MetroCluster, and TR3929 tells me that the space_optimized setting for read_realloc is based on physical (-p) reallocation.

avbohemen
Just found the answer to my second question: space_optimized is not possible on MetroCluster. It produces this error:

filer1> vol options vol1 read_realloc space_optimized

vol options: UNKNOWN error reason: 226 (CR_VOLUME_IS_MIRRORED)

Can anyone shed some light on whether reallocate is going to be supported on MetroCluster sometime in the future?


lwei
Anton,

Is the NDMP backup reading from a data LUN directly, or a snapshot? If it's a LUN, then turning on read_realloc may help. Note that if you just read the data only once, then read_realloc is not going to help, since on the first pass, it tries to optimize the layout. Only on the 2nd or later pass does it help.
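
A toy model of the point above, in Python. It assumes, purely as a simplification and not as a description of how WAFL actually implements read_realloc, that one sequential pass leaves the range it read laid out contiguously. The helper names and block numbers are made up for illustration.

```python
def count_extents(layout):
    """Number of physically contiguous runs seen when reading in logical order."""
    if not layout:
        return 0
    return 1 + sum(1 for prev, cur in zip(layout, layout[1:]) if cur != prev + 1)


def read_pass(layout, next_free):
    """One sequential read under the (assumed) read_realloc behaviour.

    Returns the number of extents this pass had to read through, plus the
    layout left behind for the next pass.
    """
    extents_read = count_extents(layout)
    if extents_read > 1:
        # Toy assumption: the fragmented range is rewritten contiguously
        # only after it has been read, so this pass itself gets no benefit.
        layout = list(range(next_free, next_free + len(layout)))
    return extents_read, layout


fragmented = [1, 100, 3, 101, 5, 102]  # hypothetical physical block numbers
first_pass, layout = read_pass(fragmented, next_free=200)
second_pass, layout = read_pass(layout, next_free=300)
print(first_pass, second_pass)
```

In this model the first pass still reads through every fragment, and only the second pass sees a single contiguous run, which is why data that is read just once gains nothing.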

Thanks,

Wei

KINYUAWANJUGUNA
Hi Wei,

At least I'm coming across something better than NetApp support.

My environment is as below:

Sun M5000 server connected to FAS2020/FC Direct with Data ONTAP 7.3.2. The application is T24 running on jBASE DB.

Backups are taking too long, and backup times are deteriorating with time. The wafl scan measure layout returned 13.9 for the LUN.

We ran reallocate -f -p <lun> and later measure layout, which returned 1.3. Backup went fine, with backup time reduced from 4 hrs to 40 mins.

The next day backup time was 2 hrs, with measure layout at 6.9.

It seems that when there are writes during the day, the layout ratio deteriorates, hence the longer backup times. Basically, the backup does a tar to a different location. It tars the contents of the LUN to a different LUN or tape.

Reads are slowing by the day!

Can we run read_realloc on this LUN? What are the effects in terms of performance when read_realloc is on?

Should we run reallocate -f -p every day? The first one took 9 hrs off-peak, which we cannot afford. Can we run it in peak hours?

I have searched for jBASE + NetApp/Data ONTAP best practices, but they are nowhere to be found.

Regards

dan

jeremypage
Sounds like your volume/aggrs need more free space.

KINYUAWANJUGUNA
Hi Jeremy,

Thanks for your feedback.

I can drop a few LUNs to have more free space. Will I need to run the reallocate command for the free space to benefit the performance of my remaining original LUN?

Regards

dan

jeremypage
Yes, but the issue is not really that the LUN is full. It's that the volume/aggr is full enough that when you write, WAFL cannot find contiguous blocks, so your system is becoming fragmented very quickly.

You need free space at the aggr/volume level (I dunno how you are set up). An aggr show_space -h can give you a good view.

KINYUAWANJUGUNA
My system is very transactional. That might explain the fragmentation.

On volume space issue, below is the output.

netapp-prod1*> aggr show_space -h

Aggregate 'aggr0'

    Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS

         2187GB           218GB            98GB          1870GB             0KB             0KB

Space allocated to volumes in the aggregate

Volume                          Allocated            Used       Guarantee

vol0                                 12GB           765MB          volume

test_vol                           1220MB            61MB          volume

u2_vol                              285GB           247GB          volume

u1_vol                              342GB           279GB          volume

u3_vol                              285GB           214GB          volume

bu_vol                              171GB            77GB          volume

Aggregate                       Allocated            Used           Avail

Total space                        1096GB           819GB           773GB

Snap reserve                         98GB            11GB            86GB

WAFL reserve                        218GB            22GB           195GB

netapp-prod1*>

We have also disabled snapshots for now.

Regards

radek_kubka
If I am reading this correctly, you have tons of free space in your aggregate, so this is not the issue.
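
As a quick check of the arithmetic, here is a small Python sketch that uses only the figures from the aggr show_space -h output above. The per-volume ratio is simply Used divided by Allocated as reported; it is not meant as a statement of how ONTAP itself accounts for space.

```python
# Figures copied from the "aggr show_space -h" output above (GB).
usable_gb = 1870
avail_gb = 773
volumes = {  # name: (allocated_gb, used_gb)
    "u1_vol": (342, 279),
    "u2_vol": (285, 247),
    "u3_vol": (285, 214),
    "bu_vol": (171, 77),
}

# Share of the aggregate's usable space that is still available.
aggr_free_pct = 100 * avail_gb / usable_gb
print(f"aggregate free: {aggr_free_pct:.1f}% of usable space")

# How much of each volume's allocation is reported as used.
vol_used_pct = {name: 100 * used / alloc for name, (alloc, used) in volumes.items()}
for name, pct in vol_used_pct.items():
    print(f"{name}: {pct:.1f}% of allocated space used")
```

Roughly 41% of the usable aggregate space is still available, although the individual data volumes are each 75% or more used.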

Any chance you can ditch the traditional backup (which is basically a large, sequential read not liked by NetApp filers) in favour of snapshots, followed by an NDMP backup to a secondary target?

aborzenkov
That is a misunderstanding of where fragmentation comes from.

Let’s say you have a file with contiguous blocks 1,2,3,4,5,6. Now 2,4,6 are overwritten, maybe in the same CP timeframe. So you are left with

1,hole,3,hole,5,hole

contiguous 2,4,6

So the file is fragmented; no two of its blocks are located contiguously, even though there was enough space to write the new blocks sequentially.
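
The example above can be sketched in a few lines of Python. This is only a toy model of a write-anywhere layout: the function names and the physical block numbers (100 and up for the new writes) are made up for illustration and say nothing about how WAFL actually allocates blocks.

```python
def count_extents(layout):
    """Number of physically contiguous runs seen when reading in logical order."""
    if not layout:
        return 0
    return 1 + sum(1 for prev, cur in zip(layout, layout[1:]) if cur != prev + 1)


def overwrite(layout, logical_indexes, next_free):
    """Write-anywhere overwrite: new data lands in fresh blocks, old blocks become holes."""
    new_layout = list(layout)
    for idx in logical_indexes:
        new_layout[idx] = next_free  # the new copies are written back to back
        next_free += 1
    return new_layout, next_free


# Logical blocks 1..6 initially stored in physical blocks 1..6.
layout = [1, 2, 3, 4, 5, 6]
before = count_extents(layout)

# Overwrite logical blocks 2, 4 and 6 (list indexes 1, 3, 5).
layout, next_free = overwrite(layout, [1, 3, 5], next_free=100)
after = count_extents(layout)

print(layout)          # physical location of each logical block
print(before, after)   # extents before and after the overwrite
```

The three new blocks sit next to each other on disk, yet a sequential read of the file now has to jump between the old and new regions for every single block.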

It is hard to say whether read_realloc will help. It happens at the wrong time (i.e., data is first accessed and only then reallocated). So it depends on the workload; but it will create permanent additional disk load, which is again hard to quantify.

Try running reallocation more often, maybe every day before backup, and check whether it has a noticeable impact on your system. The official statement is that the reallocation scan runs in the background at low priority. I have seen multiple people say they used it during normal working hours without any performance impact.

Or consider dropping tape backup in favor of snapmirror/snapvault. Maybe consider offloading tape backup to the snapmirror/snapvault destination. This would allow you to reallocate the destination without impact on the source and be more flexible with the backup window.

KINYUAWANJUGUNA
Hi Radek/Aborzenkov,

We are currently running reallocate 2 hrs before backup, and it works fine for now. But it seems more of a workaround than a real solution.

We have not tried read_realloc; it's poorly documented, and we cannot predict the outcome for our scenario, where we must read all the data at backup time.

We are exploring the snapshot/NDMP backup option, since it seems to be the long-term solution. In the meantime, just to bring you up to speed, this is how the application vendor has recommended we do the backup (it used to work perfectly with the old Sun StorageTek & V890):

Steps:

1. Run Close Of Business Before Backup

2. Run Pre-Batch Backup (basically a tar of the location holding the files)

3. Run End Of Day (EoD)

4. Run Post-Batch Backup (same tar as in 2, to take care of the changes made by EoD in 3)

If I am to use snapshots, our approach will most likely look like this:

1. Run Close Of Business Before Backup

Take snapshot_pre-batch

2. Run Pre-Batch Backup - from snapshot_pre-batch

3. Run End Of Day (EoD)

Take snapshot_post-batch

4. Run Post-Batch Backup (use snapshot_post-batch to take care of the changes made by EoD in 3)

My question is, since snapshots do not copy data but only keep an inode reference to the original data, will the backups using snapshots be any faster?

Thanks for your responses.

aborzenkov
Creation of a snapshot is almost instantaneous. It takes very little time.

But if you are asking whether using a snapshot to back up to tape will be faster: no, it won’t. Actually, a tape backup using NDMP starts by creating a snapshot, which is then written to tape.

aborzenkov
As indicated by http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=585280, you can also reallocate a directory. Wow!

reallocate start -o -f -p /vol/<volname>/<dirname>

ASUNDSTROM
According to the Reallocate Best Practices Guide TR-3929 published June 2012:

Deduplication and Compression

Starting in Data ONTAP 8.1 deduplicated data can be reallocated using physical reallocation or read_realloc space_optimized. Although data may be shared by multiple files when deduplicated, reallocate uses an intelligent algorithm to only reallocate the data the first time a shared block is encountered. Prior versions of Data ONTAP do not support reallocation of deduplicated data and will skip any deduplicated data encountered. Compressed data will not be reallocated by reallocate or read reallocate, and it is not recommended to run reallocate on compressed volumes.

dimitrik
And as of 8.1.1P1 you can use the free space reallocate - a real-time option that, coupled with read_realloc, should keep systems humming:

aggr options aggr1 free_space_realloc on

I'd recommend you turn this on either for new systems or for systems where you've already run the normal reallocate.

I'd also turn both this and read_realloc on.

The only thing it doesn't make sense to turn read_realloc on for is DB redo logs.

D


dburkland
Would you be able to attach this TR to this thread? I'm really interested in reading more about this but can't find the aforementioned article on NOW...
