Reallocation: aggr vs volume.

jasonczerak · ‎2010-01-30

I'm a little confused about where it's appropriate to use this.

When running the reallocation on an aggr, I've noticed that it goes through each volume and does what I believe to be the volume version of the reallocation command. Is this true? If I set a schedule to kick off a weekly reallocation on an aggr, will it blanket the relayout of blocks, first at the free space at the aggr level and then block-optimize each volume?

Or should I be setting up a schedule at the per-volume level? There are a handful of volumes that I think could use some daily optimization. Some may even work well with read_realloc as well.

Where I officially got confused was with this note from the reallocate command. This "note" doesn't seem to be in any of the PDF/online docs and contradicts them.

reallocate start -A [-o] [-i interval] <aggr_name>
NOTE: -A is for aggregate (freespace) reallocation.
        Do NOT use -A after growing an aggregate if you wish to
        optimize the layout of existing data; instead use
            reallocate start -f /vol/<volname>
        for each volume in the aggregate.
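
One way to follow the note's advice is to generate the per-volume command for every volume in the aggregate. A minimal sketch in Python, assuming hypothetical volume names (vol_db1, vol_db2); it only builds the command strings shown in the note and does not run anything on the filer:

```python
def per_volume_reallocate_commands(volume_names):
    """Build one 'reallocate start -f /vol/<volname>' line per volume."""
    commands = []
    for name in volume_names:
        # Expect bare volume names; the /vol/ prefix is added here.
        if not name or "/" in name:
            raise ValueError(f"not a bare volume name: {name!r}")
        commands.append(f"reallocate start -f /vol/{name}")
    return commands


if __name__ == "__main__":
    # Hypothetical volume names, for illustration only.
    for line in per_volume_reallocate_commands(["vol_db1", "vol_db2"]):
        print(line)
```

The output is meant to be reviewed and pasted by hand, so nothing gets started on a production filer by accident.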

Looking at the volume statuses, I see "online,raid_dp,redirect,active_redirect" for each volume. This would lead me to believe that it's doing "everything".

Data ONTAP 7.3.1.1 (and 7.3.3RC1 on 2 arrays soon to be deployed, about when 7.3.3 is hopefully GA)

jasonczerak · ‎2010-01-30

Quick edit here. More observations that lead me to believe that doing the aggr-level reallocation hits up volumes. This is started after the block-level stuff is done.

agybatst1:
State: Redirect: 0 of 3 volume(s) processed.

jasonczerak · ‎2010-02-02

TTT?

anthonyfeigl · ‎2010-02-03

Hey Jason,

Keep in mind that reallocating data at the AGGR level will affect the volume data as well, in that it moves data blocks around.

You can see that in this example taken from the NOW site.

"Defining an aggregate reallocation scan

After reallocation is enabled on your storage system, you define a reallocation scan for the aggregate on which you want to perform a reallocation scan.

Considerations

Because blocks in an aggregate Snapshot copy will not be reallocated, consider deleting aggregate Snapshot copies before performing aggregate reallocation to allow the reallocation to perform better.

Volumes in an aggregate on which aggregate reallocation has started but has not successfully completed will have the active_redirect status. Read performance of such volumes may be degraded until aggregate reallocation has successfully completed. Volumes in an aggregate that has previously undergone aggregate reallocation have the redirect status. For more information, see the na_vol(1) man page."
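
The two status flags described in the quoted text can be read straight off a volume status string. A minimal sketch in Python, assuming the status arrives as a comma-separated string like the one shown earlier in this thread; the function name and the returned labels are made up for illustration:

```python
def aggr_reallocation_state(volume_status):
    """Classify a comma-separated volume status string.

    Per the quoted documentation: 'active_redirect' means aggregate
    reallocation has started but not successfully completed, and
    'redirect' means the aggregate has previously undergone it.
    """
    flags = {flag.strip() for flag in volume_status.split(",")}
    if "active_redirect" in flags:
        return "started, not yet successfully completed"
    if "redirect" in flags:
        return "previously reallocated"
    return "neither flag present"


if __name__ == "__main__":
    print(aggr_reallocation_state("online,raid_dp,redirect,active_redirect"))
```

Checking active_redirect first matters because a volume can carry both flags at once, as in the status string from the first post.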

You can get more details from this page.

http://now.netapp.com/NOW/knowledge/docs/ontap/rel727_vs/html/ontap/sysadmin/tuning/concept/c_oc_tun_reallocate-how-to-manage-scans.html#c_oc_tun_real...

Therefore, when you do an AGGR reallocate, volumes will be processed as well.

Keep in mind that a volume is a virtual slice of a physical AGGR, much like a qtree is to a volume.

Anthony

jasonczerak · ‎2010-02-03

It seems every doc I read on NOW, the information is a bit different.

From your link:

You can define only one reallocation scan per file, LUN, volume, or aggregate. You can, however, define reallocation scans for both the aggregate (to optimize free space layout) and the volumes in the same aggregate (to optimize data layout).

So, it looks like I need to do both here.

What about the read_realloc volume option? I understand it to be more of a "how the app uses the data" kind of optimization, with the penalty of a little CPU and I/O overhead on reads (then writes). Have you had much success measuring a gain from this option? Would running a volume-level reallocate "undo" changes made by read_realloc if both were executed on the same volume?

Also, one last question:

We use flex vols for a few Oracle databases. These flex vols are tied out a few levels from the main production DB. Should we be running the reallocate commands on flex vols? Sometimes we can go 6 months with plenty of data changes between refreshes of snapshotted DBs.

radek_kubka · ‎2010-02-03

Hi Jason,

I am trying to follow this thread, as reallocation seems to be quite an interesting subject.

It seems this is like a skeleton in the cupboard - experience from the field proves it is needed, yet it isn't very 'politically correct' to openly say that actually WAFL could suffer from fragmentation.

I always took a stance that when dealing with technology it's better to be honest & know the issues up front to mitigate possible negative impact.

A solid TR document gathering together all relevant info around fragmentation & reallocation is a must in my opinion (someone at least seems to share this view: http://communities.netapp.com/message/20969#20969)

Regards,
Radek

jasonczerak · ‎2010-02-03

Well, I'm going to kick it off on a few LUNs. I'm not sure why we need to do it on a lun if you just include the volume. Does it skip luns in a volume? These kinds of "why" questions are not answered.

I'm going to send some of this data to my SE today. I'll report back.

radek_kubka · ‎2010-02-03

I'm not sure why we need to do it on a lun if you just include the volume. Does it skip luns in a volume?

Nope, I don't think LUNs get skipped - you can either reallocate all blocks within a volume (including those in LUNs), or just blocks within a given LUN.

Also - physical reallocate is worth considering, as it shouldn't cause your existing snapshots to grow:

http://now.netapp.com/NOW/knowledge/docs/ontap/rel732/html/ontap/rnote/rel_notes/concept/c_oc_rn_feat73-admin-physical-reallocation.html

Regards,

Radek

jasonczerak · ‎2010-02-03

So are -f and -p the same function in regard to a volume/LUN/file?

http://now.netapp.com/NOW/knowledge/docs/ontap/rel732/html/ontap/sysadmin/GUID-33157B83-0C35-46B8-8B55-84EEEFE51EDB.html

argh!

The problem with testing here is time. I'd like to understand what I'm testing before I'm testing the real stuff. We are maybe 2 weeks out before we get our DBs on the new storage. The nice thing about Oracle 11.x is the replay feature, so I can get consistent DB runs on the data over and over and over and tune the filer.

radek_kubka · ‎2010-02-03

So are -f and -p the same function in regard to a volume/LUN/file?

Yep, it looks like this is exactly the case.

Actually this doc gives a fairly complete (& clear) description of options on page 427:

http://now.netapp.com/NOW/public/knowledge/docs/ontap/rel732/pdfs/ontap/210-04499.pdf

Regards,

Radek

__frostbyte_9045 · ‎2010-02-12

If only we hadn't moved our older disk shelves to the new 3140... The original aggrs were created with 7.0, so we can't do it at the aggr level. When I do it at the volume level, it kills our snapshot reserve as well as the WAN link, since they are also SnapMirror sources. Grrr....

anthonyfeigl · ‎2010-02-03

Jason,

After 10 years of working on NetApp gear, I have to agree on this statement.

>>It seems every doc I read on NOW, the information is a bit different.

That is a consistent problem I have.

I find the best information is ascertained by testing and personal experience.

My work with flex vols is very limited (two years) and I am currently testing reallocate on these types of volumes.

In my previous three years I performed WAFL reallocate on Traditional Volumes, which would be similar to current AGGRs, and I found some (20%) improvement in backup and read performance by doing this.

Keep in mind, WAFL reallocate is a defrag type of thing, and when you perform an operation on a volume with this type of command, you risk data loss (unlikely, but possible).

Anthony

jasonczerak · ‎2010-02-11

Well, after some more reading I've concluded that volume -p is necessary. I'm still confirming this, but if you do it on the parent of a flex clone, you should also do it on the clones.

Still no word on whether a volume reallocate will "undo" the read_realloc volume option. It is a lot of work to test this.

Does anyone have any DFM counters at the volume level that could be graphed to show an improvement in I/O or anything after these operations are performed?

__jeremypage_3897 · ‎2010-02-22

I think it would show up as a higher number for your chain reads. I'm guessing those are sequential reads. So higher = better.

Sometimes it's frustrating though, hard to know when there is no man page for statit!

BrendonHiggins · ‎2010-02-23

Hi

One thing to watch out for. We tried to do aggregate reallocation, but because the aggregate was created with DoT 7.2 and the filers now run 7.3, the process failed and would not start. So we had to make do with volume reallocation.

Hope it helps

Bren

__jeremypage_3897 · ‎2010-02-23

That's interesting. We started at 7.2.4 and upgraded to 7.3.1 but didn't have any issues. Were you using traditional volumes? We're 100% USDA Flex here.

BrendonHiggins · ‎2010-02-23

All flexvols

jasonczerak · ‎2010-02-23

We started at 7.2.6, I think, and moved up to 7.3.1.1, and the 4 original aggrs created with 7.2 run the process without error. It's just that when they get to the "volume" part, one core gets chewed up and causes latency issues filer-wide. I have an issue open with support for this.

BrendonHiggins · ‎2010-02-23

My bad. Just looked up the error and it was due to ours being a 7.1 aggregate. Error was:

Unable to start reallocation scan on 'aggrX': Aggregate created on older version of ONTAP

erick_moore · ‎2010-03-18

Also, it is important to note the differences between aggregate reallocation and volume reallocation. All aggregate reallocation does is make free space in the aggregate contiguous. This is different from volume-level reallocation, where data blocks are optimized across an aggregate. For example, when you add new disk to an aggregate, you generally should run volume-level reallocations against all the volumes in that aggregate. Doing that will distribute your existing data evenly across the new disk, sort of a leveling process so you don't end up with hot disks. Doing an aggregate reallocate after adding new disk would basically do nothing since there is already new contiguous free space in your aggregate on the newly added disks. Does that make sense?
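
To put rough numbers on the leveling idea, here is a toy model in Python. It assumes hypothetical figures (10 existing disks holding 700 blocks each, 4 newly added empty disks) and simply averages usage across disks; it is not how WAFL actually places blocks, only an illustration of why existing data ends up on more spindles after a volume-level reallocation.

```python
def level_across_disks(used_per_disk):
    """Return the per-disk usage if existing data were spread evenly."""
    if not used_per_disk:
        raise ValueError("need at least one disk")
    share = sum(used_per_disk) / len(used_per_disk)
    return [share] * len(used_per_disk)


def disks_holding_data(used_per_disk):
    """Count the disks that hold any data, i.e. can serve reads of it."""
    return sum(1 for used in used_per_disk if used > 0)


if __name__ == "__main__":
    # Hypothetical layout: 10 old disks with data, 4 new disks empty.
    before = [700] * 10 + [0] * 4
    after = level_across_disks(before)
    print(disks_holding_data(before), "disks serve existing data before")
    print(disks_holding_data(after), "disks serve existing data after")
    print("blocks per disk after leveling:", after[0])
```

In this model the same 7000 blocks go from 10 spindles to 14, at 500 blocks per disk, which is the "no hot disks" effect described above.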

__jeremypage_3897 · ‎2010-03-19

erick_moore wrote:

Doing an aggregate rescan after adding new disk would basically do nothing since there is already new contiguous free space in your aggregate on the newly added disks.

I don't think that's correct. It's important to reallocate after adding disks to make sure that your contiguous space is spread evenly across spindles. The last thing you want is 10 disks full and 4 disks empty. The data needs to be contiguous when looked at from the aggr level, not on the disk itself - you don't want to read 10 blocks off 4 disks when you could be reading 1 block off 40. Which of course brings up the question of statit results again - are chain reads according to spindle? (I suspect they are, since it's in the per-disk data output.)

Also, I was reading the 7.3x upgrade guide and it mentions that deduped blocks are not reallocated. Bummer. I can see why (can you imagine the overhead if you had to look at an extra 16 dimensions when doing a reallocate?), but it's still a shame. I wonder if it will be part of OT8.