Aggregate Almost Overcommitted

ALVSERVERSUPPORTTEAM
23,161 Views

We are getting this message in Operations Manager for one of our lone aggregates. It's not part of a resource pool. Thresholds are set to defaults for the aggregate and for the global settings (so the Aggregate Nearly Overcommitted Threshold is 95%). Snapshots are currently disabled. Double-Parity is checked. I'm not sure if Thin Provisioning is used or not. The condition says:

"Committed 7.32 TB (99.60%) out of 7.35 TB available; Using 5.84 TB (79.44%) out of 7.35 TB available"

The "Space Breakout" is thus:

Aggregate Size: 7.35 TB
Snapshot Reserve Size: 837 GB
Used Space: 5.84 TB
Available Space: 1.51 TB
Available Snapshot Reserve: 526 GB
Used Snapshot Space: 311 GB
Unused Guaranteed Space: 2.41 TB
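
The figures in the alert and the breakout can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming only that the GUI values are rounded to two decimals; the variable names are illustrative and do not come from Operations Manager.

```python
# Values copied from the alert text and the Space Breakout above (rounded by the GUI).
aggregate_size_tb = 7.35
used_tb = 5.84
available_tb = 1.51
committed_tb = 7.32
snap_reserve_gb = 837
snap_available_gb = 526
snap_used_gb = 311

# Default "Aggregate Nearly Overcommitted" threshold quoted in the post.
NEARLY_OVERCOMMITTED_THRESHOLD = 95.0

used_pct = used_tb / aggregate_size_tb * 100
committed_pct = committed_tb / aggregate_size_tb * 100

# Used + Available accounts for the whole aggregate size.
assert abs((used_tb + available_tb) - aggregate_size_tb) < 0.01
# The snapshot reserve splits into used and available snapshot space.
assert snap_used_gb + snap_available_gb == snap_reserve_gb

print(f"used: {used_pct:.1f}%  committed: {committed_pct:.1f}%")
print("alert fires:", committed_pct >= NEARLY_OVERCOMMITTED_THRESHOLD)
```

Used space and committed space are measured against the same aggregate size, which is why lowering the used percentage does not by itself change the committed percentage.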
 

We added two additional disks to the aggregate yesterday to bring the used capacity percentage down from 84% to 79.44%. I guess it didn't help the Committed percentage though, which must be unrelated.

So what exactly does 7.32TB committed mean? How is that calculated? How can we reduce that value to get it back under 95%? Let me know if there are some command-line commands that can be more enlightening than the GUI. How can we tell if thin provisioning is in use?

Sorry if this is in the wrong forum (please help move along if necessary), first time poster.

Thanks!

BrendonHiggins
21,065 Views

Hi

Welcome to the forums

First, performance will suffer once the aggregate gets above 85% full, and the higher this gets the worse it will become.  This is because the filer is hunting for free blocks to write to and is constantly reclaiming them.

I notice you have an aggregate snap reserve set to 5%.  Many people turn this to zero, as for them it is wasted space.  We would never do a snap restore at the aggregate level, only at the volume level.

snap reserve -A <aggr_name> 0

This would give you 837 GB back.

Yes you are thin provisioned.  Have a read of this for what is going on.  http://blogs.netapp.com/shadeofblue/2008/10/really-thin-pro.html

Also be careful adding in disks one at a time as you will create hot spots in your aggregates with all the IO going to the new disk until it is as full as the other disks.

Hope this helps

Bren

ALVSERVERSUPPORTTEAM
21,065 Views

Hi Brendon,

Thanks for the info on performance problems over 85% usage, snap reserve, thin provisioning, and adding disks (we actually added two more at the same time, not sure if that created a two disk hotspot or not). That does help a little.

I guess I'm still confused about aggregate overcommitment and how we reduce this value to get back under the threshold. I gave it more disks, but that did nothing for commitment. Removing the snap reserve also seems to be just creating more "free space" and helping us stay under the 85% usage. Unless I missed something, we haven't addressed overcommitment yet, have we?

Thanks!

smurphy
21,065 Views

On the CLI, use "aggr show_space" to show true space allocation/usage for the aggregate in question. That output will also show you if space guarantees are in use.

Sean.

adaikkap
21,810 Views

Hi Michael,

     Can you get the output of the following report for your aggr, to see why it is overcommitted?

dfm report view volumes-space-guarantees <aggr name/id>

Regards

adai

ALVSERVERSUPPORTTEAM
21,810 Views

> aggr show_space

Aggregate 'AGGR01F_VMDATA_H03'

    Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG
   9747745280KB     974774528KB     877297072KB    7895673680KB             0KB

Space allocated to volumes in the aggregate

Volume                          Allocated            Used       Guarantee
VM_CONFIG_FILES_H03            78955744KB      55719244KB          volume
VM_SNAPSHOT_H03               131141424KB         79452KB          volume
VM_SWAP_H03                   132449216KB      12936640KB          volume
VM_T1_PRI01_H03               504132756KB     329457160KB          volume
VM_T1_SEC01_H03               928648876KB     374106984KB          volume
VM_T1_OSSWAP01_H03            185736048KB     138311856KB          volume
VM_T3_OSSWAP01_H03            132669036KB      68442612KB          volume
VM_T3_PRI01_H03               822486052KB     720505312KB          volume
VM_T3_SEC01_H03              1034757948KB     898116724KB          volume
VM_TEMPLATE_H03               524875800KB     231609016KB          volume
VM_T3_SEC02_H03               918863184KB     340049036KB          volume
VM_T3_SEC03_H03               918802796KB     565447436KB          volume

Aggregate                       Allocated            Used           Avail
Total space                  6313518880KB    3734781472KB    1623106244KB
Snap reserve                  877297072KB     348884024KB     528413048KB
WAFL reserve                  974774528KB        593540KB     974180988KB

> dfm report view volumes-space-guarantees AGGR01F_VMDATA_H03
dfm not found.  Type '?' for a list of commands
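
The per-volume figures in the `aggr show_space` output above can be totalled to see how far plain allocation goes towards the committed value. This is a minimal sketch that parses the volume rows exactly as pasted; the regular expression, the function name, and the assumption that the tools' "TB" is binary (1 TB = 1024^3 KB) are mine, not part of any NetApp tool.

```python
import re

# Volume rows copied from the 'aggr show_space' output above.
SHOW_SPACE_ROWS = """\
VM_CONFIG_FILES_H03            78955744KB      55719244KB          volume
VM_SNAPSHOT_H03               131141424KB         79452KB          volume
VM_SWAP_H03                   132449216KB      12936640KB          volume
VM_T1_PRI01_H03               504132756KB     329457160KB          volume
VM_T1_SEC01_H03               928648876KB     374106984KB          volume
VM_T1_OSSWAP01_H03            185736048KB     138311856KB          volume
VM_T3_OSSWAP01_H03            132669036KB      68442612KB          volume
VM_T3_PRI01_H03               822486052KB     720505312KB          volume
VM_T3_SEC01_H03              1034757948KB     898116724KB          volume
VM_TEMPLATE_H03               524875800KB     231609016KB          volume
VM_T3_SEC02_H03               918863184KB     340049036KB          volume
VM_T3_SEC03_H03               918802796KB     565447436KB          volume
"""

# name, allocated KB, used KB, guarantee
ROW = re.compile(r"^(\S+)\s+(\d+)KB\s+(\d+)KB\s+(\S+)$")


def parse_rows(text):
    """Return a list of (name, allocated_kb, used_kb, guarantee) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, allocated, used, guarantee = match.groups()
            rows.append((name, int(allocated), int(used), guarantee))
    return rows


KB_PER_TB = 1024 ** 3  # assumption: "TB" in the tools is binary

volumes = parse_rows(SHOW_SPACE_ROWS)
total_allocated_kb = sum(v[1] for v in volumes)
total_used_kb = sum(v[2] for v in volumes)

print(f"allocated: {total_allocated_kb} KB = {total_allocated_kb / KB_PER_TB:.2f} TB")
print(f"used:      {total_used_kb} KB = {total_used_kb / KB_PER_TB:.2f} TB")
print("guarantees in use:", sorted({v[3] for v in volumes}))
```

The totals agree with the "Total space" row of the same output, and every volume carries a guarantee of "volume", so the allocated space on its own does not account for the 7.32 TB committed figure.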

adaikkap
21,811 Views

You must qualify the aggr name with the filer name, as filername:aggrname, or better, use the ID of the aggr, which can be obtained from

dfm aggr list

Can you also get the report for the following

dfm report view volumes-autosize-details <aggr name/id>

Volumes with autogrow turned on are also considered in the aggr overcommitment calculation.

ALVSERVERSUPPORTTEAM
21,812 Views

I don't think we have the dfm command. I'm puttied into the filer on the command line. Is dfm a (client) product that should be running on my windows pc or a linux box?

> ?

<snip>

cifs                igroup              ping6               stats
config              ipsec               pktt                storage
date                ipspace             portset             sysconfig
df                  iscsi               priority            sysstat
disk                keymgr              priv                timezone
disk_fw_update      license             qtree               traceroute
dns                 lock                quota               traceroute6
<snip>

Is there another command-line command to show the auto-grow feature per aggregate? I'm pretty sure we used this at one point. Not sure if this was at the volume level like dedupe, or the aggregate level though.

Thanks!

smurphy
21,065 Views

Yes, DFM is a separate product.

Sean.

adaikkap
21,815 Views

Yes, dfm can run on Windows or Linux.

Autogrow is an option at the volume level.

You will have to drill down to each volume of the aggr. It's not available at the aggr level.

Regards

adai

ALVSERVERSUPPORTTEAM
17,228 Views

OK, I added another roughly 200GB to one of the volumes (under this aggregate in question) which was giving warnings because of space (VM_T3_PRI01_H03). Now I see that this has indeed affected our commitment, and the warning condition now says:

"Committed 7.52 TB (102.26%) out of 7.35 TB available; Using 6.06 TB (82.43%) out of 7.35 TB available"

Shouldn't adding more disks bring this value down?

Does it make sense that other volumes' allocated space went up (not just VM_T3_PRI01_H03)? So allocated is directly, or almost, related to commitment, but how do we go from 6.10TB allocated to being 7.52TB committed? I'm still missing how Committed is calculated. And from just reading over Guarantees once, it sounds like selecting 'none' or 'file' would allow you to overcommit, but 'volume' should not. Yet I'm 102.26% overcommitted. How is that possible?

The new 'aggr show_space' is:

Aggregate 'AGGR01F_VMDATA_H03'

    Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG
   9747745280KB     974774528KB     877297072KB    7895673680KB             0KB

Space allocated to volumes in the aggregate

Volume                          Allocated            Used       Guarantee
VM_CONFIG_FILES_H03            79284604KB      56129904KB          volume
VM_SNAPSHOT_H03               131141424KB         79452KB          volume
VM_SWAP_H03                   132537952KB      13025376KB          volume
VM_T1_PRI01_H03               504097340KB     329572264KB          volume
VM_T1_SEC01_H03               928602684KB     374108236KB          volume
VM_T1_OSSWAP01_H03            185706664KB     138282472KB          volume
VM_T3_OSSWAP01_H03            132591672KB      68365248KB          volume
VM_T3_PRI01_H03              1059708024KB     721931136KB          volume
VM_T3_SEC01_H03              1034658988KB     898039576KB          volume
VM_TEMPLATE_H03               524896416KB     231630340KB          volume
VM_T3_SEC02_H03               919125088KB     340417312KB          volume
VM_T3_SEC03_H03               919325400KB     566311640KB          volume

Aggregate                       Allocated            Used           Avail
Total space                  6551676256KB    3737892956KB    1387168180KB
Snap reserve                  877297072KB     305224700KB     572072372KB
WAFL reserve                  974774528KB        593696KB     974180832KB
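
To see which volumes actually changed between the two `aggr show_space` runs, the allocated columns can be diffed. A minimal sketch, with the numbers copied from the two outputs in this thread; the dictionary layout and variable names are illustrative only.

```python
# Allocated space in KB per volume, from the first and second 'aggr show_space' outputs.
before = {
    "VM_CONFIG_FILES_H03": 78955744,
    "VM_SNAPSHOT_H03": 131141424,
    "VM_SWAP_H03": 132449216,
    "VM_T1_PRI01_H03": 504132756,
    "VM_T1_SEC01_H03": 928648876,
    "VM_T1_OSSWAP01_H03": 185736048,
    "VM_T3_OSSWAP01_H03": 132669036,
    "VM_T3_PRI01_H03": 822486052,
    "VM_T3_SEC01_H03": 1034757948,
    "VM_TEMPLATE_H03": 524875800,
    "VM_T3_SEC02_H03": 918863184,
    "VM_T3_SEC03_H03": 918802796,
}
after = {
    "VM_CONFIG_FILES_H03": 79284604,
    "VM_SNAPSHOT_H03": 131141424,
    "VM_SWAP_H03": 132537952,
    "VM_T1_PRI01_H03": 504097340,
    "VM_T1_SEC01_H03": 928602684,
    "VM_T1_OSSWAP01_H03": 185706664,
    "VM_T3_OSSWAP01_H03": 132591672,
    "VM_T3_PRI01_H03": 1059708024,
    "VM_T3_SEC01_H03": 1034658988,
    "VM_TEMPLATE_H03": 524896416,
    "VM_T3_SEC02_H03": 919125088,
    "VM_T3_SEC03_H03": 919325400,
}

KB_PER_GB = 1024 ** 2  # assumption: binary units

# Change in allocated space per volume (positive = grew).
deltas = {name: after[name] - before[name] for name in before}
biggest = max(deltas, key=lambda name: abs(deltas[name]))

# Print the volumes ordered by how much their allocation moved.
for name, delta in sorted(deltas.items(), key=lambda item: -abs(item[1])):
    print(f"{name:<22} {delta / KB_PER_GB:+9.2f} GB")
print("largest change:", biggest)
```

Almost all of the increase sits in the resized volume; the other volumes move by well under 1 GB in either direction.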

ALVSERVERSUPPORTTEAM
17,229 Views

I found dfm. It was installed on the windows box containing Operations Manager.

Since the commands produce output more than 80 characters wide, I'm attaching txt files of those two dfm report commands you wanted me to run.

Auto-grow is turned on for most of the volumes. Looks like several have grown to their limit (why don't they show this in the GUI?).

shailaja
18,549 Views

Hi,

>> I found dfm. It was installed on the windows box containing Operations Manager.

"dfm" is the main command in Operations Manager.

>> Auto-grow is turned on for most of the volumes. Looks like several have grown to their limit (why don't they show this in the GUI?).

This report is available in Operations Manager UI.

It's at Member Details -> File Systems; select the "Volume Autosize Details" report from the drop-down.

Overcommitment of an aggregate is calculated to give an idea of the amount of thin provisioning in aggregates.

So, it is the sum of the total space of all volumes in the aggregate, irrespective of whether the guarantee of the volume is "volume", "none" or "file".

When you set autogrow on a volume, the maximum size set for the volume is considered for overcommitment calculations.

So, for autogrowing volumes, max(volume_size, maximum_size_of_volume) is taken.

Thanks,

Shailaja
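
The rule described in this reply can be written down in a few lines. This is a sketch of the calculation as explained above, not Operations Manager's actual code; the `Volume` class, its field names, and the sample volumes are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Volume:
    name: str
    size_gb: float
    autosize_max_gb: Optional[float] = None  # None means autogrow is off
    guarantee: str = "volume"  # not used below: the rule ignores the guarantee


def committed_gb(volumes):
    """Sum volume sizes, using max(size, autosize maximum) for autogrowing volumes."""
    total = 0.0
    for vol in volumes:
        if vol.autosize_max_gb is None:
            total += vol.size_gb
        else:
            total += max(vol.size_gb, vol.autosize_max_gb)
    return total


def committed_pct(volumes, aggregate_size_gb):
    """Committed space as a percentage of the aggregate size."""
    return committed_gb(volumes) * 100 / aggregate_size_gb


# Hypothetical volumes in a hypothetical 1000 GB aggregate.
vols = [
    Volume("vol_a", 500, autosize_max_gb=800),   # counted at its autosize maximum
    Volume("vol_b", 300, guarantee="none"),      # counted at its size despite 'none'
    Volume("vol_c", 400, autosize_max_gb=350),   # already larger than its maximum
]
pct = committed_pct(vols, aggregate_size_gb=1000)
print(f"committed: {committed_gb(vols):.0f} GB ({pct:.1f}%)")
```

In this model a volume's autosize maximum counts against the aggregate from the moment autogrow is enabled, whether or not the volume has grown yet.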

ALVSERVERSUPPORTTEAM
17,229 Views

The total autogrow space allocated comes up to just about the committed space value. It's off by about 150GB, but close enough to tell that is where it is coming up with the value. If anyone wants to try to figure out the discrepancy we can dig deeper, but the main question of the post has been answered. So to address this warning I see four options:

1) Add more disks

2) Reduce the total space allocated to autogrow

3) Reduce (if possible) any volumes that are already larger than the autogrow allocated for that particular volume

4) Increase warning threshold
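
The first two options can be sized with simple arithmetic. A rough sketch using the figures from the latest alert (7.52 TB committed, 7.35 TB aggregate, 95% threshold); the function names are made up for illustration.

```python
def capacity_to_add(committed, size, threshold_pct):
    """Extra aggregate capacity needed so committed/size drops to the threshold."""
    needed_size = committed / (threshold_pct / 100)
    return max(0.0, needed_size - size)


def commitment_to_remove(committed, size, threshold_pct):
    """How much committed space must go so committed/size drops to the threshold."""
    allowed = size * threshold_pct / 100
    return max(0.0, committed - allowed)


committed_tb = 7.52   # from the alert text
aggregate_tb = 7.35   # from the alert text
threshold = 95.0      # default Aggregate Nearly Overcommitted threshold

print(f"option 1: add about {capacity_to_add(committed_tb, aggregate_tb, threshold):.2f} TB of capacity")
print(f"options 2/3: free about {commitment_to_remove(committed_tb, aggregate_tb, threshold):.2f} TB of commitment")
print(f"option 4: threshold must exceed {committed_tb / aggregate_tb * 100:.1f}%")
```

Because the percentage has the aggregate size in the denominator, adding capacity needs slightly more space than trimming the same alert away by reducing autosize maximums.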

Thanks!

tyrone_owen_1
17,229 Views

Hello

ONTAP: 7.3.2

Ops Mgrs: 4.01

Interesting post. I have an aggregate that contains volumes with guarantees set to 'volume' and SnapMirror destination volumes with guarantees set to 'none'. Is there an Ops Mgr report that can show me how committed the aggregate would be if the SnapMirror volumes guarantees were set to 'volume'? If not, what's the alternative?

Thanks

tyrone_owen_1
17,230 Views

....another thought, I'd like to alert via Ops Mgr when the aggregate is committed at 90% (excluding volume autosize calcs). How can I achieve this (without switching autosize off!)?

Thanks

adaikkap
14,837 Views

Write your own script plugin to calculate the overcommitment excluding the autosize, and generate a custom event.
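
The calculation such a script would perform might look like the sketch below. It covers only the arithmetic: how the script obtains volume sizes and how it raises a custom event in Operations Manager are left out, and every name and number here is hypothetical.

```python
ALERT_THRESHOLD_PCT = 90.0  # the level asked for in the question above


def committed_pct_without_autosize(volume_sizes_gb, aggregate_size_gb):
    """Commitment from current volume sizes only.

    Autosize maximums are deliberately ignored, and every volume is counted at
    its full size whatever its guarantee (as if all were set to 'volume').
    """
    return sum(volume_sizes_gb.values()) * 100 / aggregate_size_gb


def should_alert(volume_sizes_gb, aggregate_size_gb, threshold_pct=ALERT_THRESHOLD_PCT):
    """True when the autosize-free commitment reaches the threshold."""
    return committed_pct_without_autosize(volume_sizes_gb, aggregate_size_gb) >= threshold_pct


# Hypothetical aggregate holding live volumes and a SnapMirror destination.
sizes_gb = {"live_vol1": 400, "live_vol2": 300, "sm_dest1": 250}

pct = committed_pct_without_autosize(sizes_gb, aggregate_size_gb=1000)
print(f"committed without autosize: {pct:.1f}%")
print("raise custom event:", should_alert(sizes_gb, aggregate_size_gb=1000))
```

Counting the SnapMirror destination at its full size also gives the "what if it were guaranteed as 'volume'" figure asked about for the DR case.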

Regards

adai

adaikkap
14,837 Views

Even if you create your SnapMirror Destination volume with none guarantee, after the snapmirror initialize it will have the same guarantee settings as the source volume.

So you can't have different guarantee for source and destination of a snapmirror.

Also the Aggregate over commitment does not take into consideration the volume guarantee.

Regards

adai

tyrone_owen_1
14,837 Views

Thanks Adai, I think there's some confusion about what I'm asking:

Adaikkappan Arumugam wrote:

Even if you create your SnapMirror Destination volume with none guarantee, after the snapmirror initialize it will have the same guarantee settings as the source volume.


If you break the mirror in order to start using the SnapMirror destination as a live volume, won't the guarantee become 'volume' if the SnapMirror source volume is also set at 'volume'?

Adaikkappan Arumugam wrote:

So you can't have different guarantee for source and destination of a snapmirror.

Yes you can, depends on the version https://kb.netapp.com/support/index?page=content&id=2011568

Adaikkappan Arumugam wrote:

Also the Aggregate over commitment does not take into consideration the volume guarantee.



Yes it does, this is why having SnapMirror destination volumes with a guarantee of 'none' within the aggregate clouds the issue. Aggregate overcommitment is the prediction of space required within the aggregate for volumes it contains when those volumes are guaranteed as 'volume', i.e. fully fat provisioned - I sound pretty confident here, but I'm happy to be corrected

This is a confusing issue and I haven't been able to find any clarity with any of the NetApp tools. My aggregates contain live volumes and SnapMirror destination volumes to cater for a DR scenario. I want to ensure I have the space available within the aggregate when the time arises I need to use these SnapMirror destination volumes in a DR situation. Other than manually going through the volumes and totting up the fully guaranteed space, I haven't found a NetApp tool that will help me.

Cheers

adaikkap
14,837 Views
Adaikkappan Arumugam wrote:

Even if you create your SnapMirror Destination volume with none guarantee, after the snapmirror initialize it will have the same guarantee settings as the source volume.


If you break the mirror in order to start using the SnapMirror destination as a live volume, won't the guarantee become 'volume' if the SnapMirror source volume is also set at 'volume'?

No.

Adaikkappan Arumugam wrote:

So you can't have different guarantee for source and destination of a snapmirror.

Yes you can, depends on the version https://kb.netapp.com/support/index?page=content&id=2011568

Thanks, I learnt something new.

Adaikkappan Arumugam wrote:

Also the Aggregate over commitment does not take into consideration the volume guarantee.



Yes it does, this is why having SnapMirror destination volumes with a guarantee of 'none' within the aggregate clouds the issue. Aggregate overcommitment is the prediction of space required within the aggregate for volumes it contains when those volumes are guaranteed as 'volume', i.e. fully fat provisioned - I sound pretty confident here, but I'm happy to be corrected

This is a confusing issue and I haven't been able to find any clarity with any of the NetApp tools. My aggregates contain live volumes and SnapMirror destination volumes to cater for a DR scenario. I want to ensure I have the space available within the aggregate when the time arises I need to use these SnapMirror destination volumes in a DR situation. Other than manually going through the volumes and totting up the fully guaranteed space, I haven't found a NetApp tool that will help me.

Cheers


Aggr overcommitment basically says this:

First example:

I have an aggr of 100G. If I create two volumes of 100G with guarantee none, then my aggr is overcommitted by 100%.

Second example:

Same 100G aggr, two volumes of 100G, one with guarantee volume and one with none: the aggr is also overcommitted by 100%.

Third example:

Same 100G aggr: two volumes of 100G with guarantee volume cannot both be created, because there is not enough space in the aggr. Only one volume with a volume guarantee can be created, so here there is no question of overcommitment.
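
The three examples can be replayed with a toy model. This is a deliberately simplified sketch: it assumes a volume with guarantee 'volume' reserves its full size at creation, a volume with guarantee 'none' reserves nothing, and data actually written is ignored. The `Aggregate` class is hypothetical and is not how ONTAP or Operations Manager is implemented.

```python
class Aggregate:
    """Toy aggregate that tracks volume sizes and guarantees only."""

    def __init__(self, size_gb):
        self.size_gb = size_gb
        self.volumes = []  # list of (size_gb, guarantee)

    def reserved_gb(self):
        """Space set aside up front by volumes with guarantee 'volume'."""
        return sum(size for size, guarantee in self.volumes if guarantee == "volume")

    def create_volume(self, size_gb, guarantee):
        """Return True if the volume could be created, False if space is short."""
        if guarantee == "volume" and self.reserved_gb() + size_gb > self.size_gb:
            return False
        self.volumes.append((size_gb, guarantee))
        return True

    def overcommitted_by_pct(self):
        """Committed space beyond the aggregate size, as a percentage of that size."""
        committed = sum(size for size, _ in self.volumes)
        return max(0.0, (committed - self.size_gb) * 100 / self.size_gb)


# First example: two 100G volumes, both with guarantee none.
first = Aggregate(100)
first_results = [first.create_volume(100, "none"), first.create_volume(100, "none")]

# Second example: one volume with guarantee volume, one with none.
second = Aggregate(100)
second_results = [second.create_volume(100, "volume"), second.create_volume(100, "none")]

# Third example: two volumes with guarantee volume; the second cannot be created.
third = Aggregate(100)
third_results = [third.create_volume(100, "volume"), third.create_volume(100, "volume")]

for label, aggr in (("first", first), ("second", second), ("third", third)):
    print(f"{label}: overcommitted by {aggr.overcommitted_by_pct():.0f}%")
```

The guarantee only decides whether a volume can be created; once it exists, its size counts towards overcommitment in the same way whatever the guarantee.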

Have you taken a look at the reports provided by Operations Manager on overcommitment?

Regards

adai

tyrone_owen_1
13,270 Views

Thanks Adai

Adaikkappan Arumugam wrote:

Also the Aggregate over commitment does not take into consideration the volume guarantee.



Yes it does, this is why having SnapMirror destination volumes with a guarantee of 'none' within the aggregate clouds the issue. Aggregate overcommitment is the prediction of space required within the aggregate for volumes it contains when those volumes are guaranteed as 'volume', i.e. fully fat provisioned - I sound pretty confident here, but I'm happy to be corrected

This is a confusing issue and I haven't been able to find any clarity with any of the NetApp tools. My aggregates contain live volumes and SnapMirror destination volumes to cater for a DR scenario. I want to ensure I have the space available within the aggregate when the time arises I need to use these SnapMirror destination volumes in a DR situation. Other than manually going through the volumes and totting up the fully guaranteed space, I haven't found a NetApp tool that will help me.

Cheers


Aggr overcommitment basically says this:

First example:

I have an aggr of 100G. If I create two volumes of 100G with guarantee none, then my aggr is overcommitted by 100%.

Second example:

Same 100G aggr, two volumes of 100G, one with guarantee volume and one with none: the aggr is also overcommitted by 100%.

Third example:

Same 100G aggr: two volumes of 100G with guarantee volume cannot both be created, because there is not enough space in the aggr. Only one volume with a volume guarantee can be created, so here there is no question of overcommitment.

Have you taken a look at the reports provided by Operations Manager on overcommitment?

Regards

adai

Yes, exactly, so the Ops Mgr Aggregate Overcommitment alert calculates space usage on fully guaranteed volumes. The alert, according to this post, also includes volume autosize in the calculation. I was asking for a report which gives the Aggregate Overcommitment figure (fully guaranteed volumes) without the autosize calculation.

I think the confusion has stemmed from the fact that I'm talking about the 'overcommitment alert' and you are talking about the practice of 'overcommitment'. I understand overcommitment; I just want a report to show me space usage based on what space I could use up if all my volumes were fully guaranteed now, without the autosize.

Here's my initial question:

Interesting post. I have an aggregate that contains volumes with guarantees set to 'volume' and SnapMirror destination volumes with guarantees set to 'none'. Is there an Ops Mgr report that can show me how committed the aggregate would be if the SnapMirror volumes guarantees were set to 'volume'? If not, what's the alternative?

....and I followed up with:

....another thought, I'd like to alert via Ops Mgr when the aggregate is committed at 90% (excluding volume autosize calcs). How can I achieve this (without switching autosize off!)?

Hope I'm making some sort of sense now!!!

Thanks, Ty
