ONTAP Discussions

What does ONTAP Snapshot-Only Capacity tier do?

Steve_A
1,124 Views

The use case is SQL DB backups using SnapCenter.  What data in the snapshots gets moved to the capacity tier with, for example, a 14-day setting and one weekly backup?  I understand the current snapshot occupies 0 blocks until blocks are changed on the volume, and that the snapshot grows with the changes (plus metadata).

 

To make it simple, let's say a new 100 GB table is added every week and an older 100 GB table is deleted every week, while the DB itself remains around 1 TB.  After 4 weekly backups, is there 1,200 GB on SSD and 200 GB on the capacity tier?   I understand data tiering of cold blocks in regular storage but can't wrap my head around snapshot tiering.


6 REPLIES

elementx
1,096 Views

What's moved to the Capacity Tier is "post-efficiency" changed data. If you have 32 changed blocks where 10 blocks are identical, 10 blocks can be compressed by 50%, and the rest (12) stay as-is, that becomes 1 + 10*0.5 + 12 = 18 blocks. These are packed with other blocks set for tiering into 4MB chunks before upload.
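As a sketch of that arithmetic (a toy model using the illustrative block counts above, not actual ONTAP storage-efficiency accounting):

```python
# Toy model of "post-efficiency" changed data, using the example
# numbers above. Not real ONTAP internals -- just the arithmetic.
def post_efficiency_blocks(identical, compressible, as_is, ratio=0.5):
    """Blocks actually packed for upload after dedupe and compression."""
    deduped = 1 if identical else 0        # identical blocks collapse to one copy
    compressed = compressible * ratio      # e.g. 50% compression
    return deduped + compressed + as_is

print(post_efficiency_blocks(identical=10, compressible=10, as_is=12))
# 1 + 10*0.5 + 12 = 18 blocks
```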

 

> To make it simple, let's say a new 100 GB table is added every week and an older 100 GB table is deleted every week, while the DB itself remains around 1 TB. After 4 weekly backups, is there 1,200 GB on SSD and 200 GB on the capacity tier?

 

If each week's changes completely overlapped (say, the old 100 GB table was completely overwritten by the new 100 GB one, making an equivalent 100 GB weekly "delta"), then your weekly snapshot data tiered to the cloud would total 400 GB (100 GB/week x 4 weeks).

 

Normally, added and deleted data won't completely overlap (although they might). It always depends, but in many cases of a growing storage footprint, some blocks are changed and some are added. Both are included in Snapshot-only tiering.

Say your weekly change is 100 GB: 20 GB is new log data (transaction logs, for example), 20 GB was deleted (old transaction logs and some table rows), and 60 GB is new and modified table data.

 

In this case only the new and modified data would be tiered out, so 80 GB.

The 20 GB of data removed relative to the previous weekly snapshot wouldn't have to be tiered. It would remain in older snapshot data already tiered to S3, though, as long as those snapshots are retained. Once they are deleted on ONTAP, and the blocks are no longer referenced by any snapshot, they are deleted from S3 as well.
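Putting the weekly numbers above into a toy model (purely illustrative; real tiering also depends on cooling days and storage-efficiency savings):

```python
# Toy model of Snapshot-only tiering for the example above.
# Each week: 100 GB of change, of which 20 GB is deletions (never tiered)
# and 80 GB is new/modified data captured in the weekly snapshot.
WEEKLY_CHANGE_GB = 100
WEEKLY_DELETED_GB = 20
tiered_per_week = WEEKLY_CHANGE_GB - WEEKLY_DELETED_GB  # 80 GB

weeks_retained = 4
capacity_tier_gb = tiered_per_week * weeks_retained
print(capacity_tier_gb)  # 320 GB on the capacity tier after 4 retained weekly snapshots
```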

Steve_A
1,032 Views

This is the first explanation that I can understand, thank you.  The goal is to get the benefit of SnapCenter for application-consistent SQL backups (52 weekly backups, one per week) while not spending SSD on older snapshots.

 

Many databases are used very little after 2-4 weeks of heavy use but need to remain online for quick reference.     In a scenario where a 100 GB database is untouched for a few weeks, do any of its blocks end up on the capacity tier after several weekly snapshots?  I would think this DB would stay on SSD unless the database was deleted or there were changed blocks.  If it was deleted, I think those blocks would end up in a snapshot that might get tiered?

 

I didn't know if the excerpt below applies to the scenario above.  Would any of the snapshot blocks above get cold?

> 3. Local Storage Snapshot-based Backups
> In this case you could have a volume with data that needs to be in the performance tier with a defined NetApp ONTAP Snapshot policy. By applying Cloud Tiering’s Snapshot-only tiering policy to this volume, you can save space by just tiering the cold Snapshot blocks from the point-in-time backups. This way you keep Snapshot copies available in case you need to perform an instant local restore and have storage savings of around 10%-15% weekly, depending on the case.

 

elementx
1,025 Views

You can see https://www.netapp.com/pdf.html?item=/media/17239-tr-4598.pdf (page 33 in current version):

 

> The default tiering-minimum-cooling-days setting for the Snapshot-Only tiering policy is two days. A two-day minimum provides additional time for background processes to provide maximum storage efficiency and prevents daily data-protection processes from needing to read data from the cloud tier.

 

This means that unless you change settings, snapshot blocks are marked cold after two days.

That may be sooner than the two weeks you mention, but I wouldn't sweat it: since those are snapshot data, they don't need reading unless you need to restore a snapshot, and even if you restore from S3 after, say, 10 or 15 days, it will happen only seconds slower than with non-tiered snapshots. Imagine FP needs to pull back 15% of a 100 GB database: at 1 GB/s that may take 15 seconds.
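The back-of-the-envelope restore estimate works out like this (the database size, tiered fraction, and throughput are the assumed numbers from the paragraph above):

```python
# Rough estimate of extra restore time when pulling tiered snapshot
# blocks back from the capacity tier. All numbers are assumptions.
db_size_gb = 100
tiered_fraction = 0.15     # ~15% of the blocks are cold in S3
throughput_gb_s = 1.0      # assumed read rate from the capacity tier

extra_seconds = db_size_gb * tiered_fraction / throughput_gb_s
print(extra_seconds)  # roughly 15 seconds of added restore time
```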

 

Even if you automated everything else in the snapshot-restore workflow, you probably wouldn't care whether the whole thing takes 75 seconds instead of 60. On top of that, while seconds may matter when restoring a snapshot created today or yesterday, for snapshots 3 or 13 days old it's usually not that critical.

 

But as mentioned in the PDF, you can change the cooling period on a per-volume basis. For Snapshot-only the default looks the same as for other policies, i.e. "-" (not set, meaning the default, which for the Snapshot-only policy is 2 days). You could raise the cooling period to 14 days, for example, if you want fast restores of snapshots younger than 14 days.

This KB has a bit more on that.

https://kb.netapp.com/onprem/ontap/os/The_volume_show_command_displays_no_value_for_tiering-minimum-cooling-days

 

Steve_A
996 Views

Thank you for your answer. At this point I will run the system in the wild and monitor what happens with a small PoC.  Is it possible to pull a report that shows a list of snapshots with size and storage type?

 

This gives a good list with snapshot sizes but doesn't show storage type:

Get-NcSnapshot -ontapi | Sort-Object -Property Created

 

I have tried a few scripts that come up in searches here but haven't found anything that lists all snapshots with individual size + storage type.   I have seen answers in the community indicating that there are many scripts available ("try X"), but I haven't been able to find a directory of scripts, if one exists.

 

elementx
993 Views

> doesn't show storage type

 

Do you mean it doesn't show whether the snapshot is tiered out or still local? You won't see that there, I believe. Tiering is done on the underlying RAID blocks, which is "below" WAFL and files (snapshots).

 

I think you should look for a PS equivalent of this:

- Before you enable:

https://docs.netapp.com/us-en/ontap/fabricpool/determine-data-inactive-reporting-task.html

- After you enable, watch FP utilization:

https://docs.netapp.com/us-en/ontap/fabricpool/monitor-space-utilization-task.html

- More:

https://kb.netapp.com/onprem/ontap/dm/FabricPool/How_does_FabricPool_inactive_data_reporting_work

 

If you enable Snapshot-only tiering and start taking daily snapshots, you should see FP usage start growing from zero on day 3, and if you clone or restore this test volume you should see some reads from S3 as well. Then, if you delete some test snaps, FP utilization should drop, provided the deleted snapshot data amounts to more than 10-20% of all data (see the FabricPool TR on S3 "chunk" fragmentation).

 

If you start playing with this, I suggest starting with a simple non-SQL volume where you create files with fio (https://fio.readthedocs.io/en/latest/fio_doc.html#binary-packages) or some such tool in a controlled manner, e.g. use a timer to add 1 GB per day, and watch it for 3-4 days while trying different things including delete/restore from FP. This makes it very easy to understand what's being added, removed or overwritten.
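If fio is more than you need, a minimal script along these lines (a hypothetical helper, not something from the thread) can add a fixed amount of incompressible data per run, so you know exactly how much changed between snapshots:

```python
import os

def grow_file(path, add_bytes, chunk=1 << 20):
    """Append `add_bytes` of random (incompressible) data to `path`,
    so each run adds a known, controlled amount of change."""
    written = 0
    with open(path, "ab") as f:
        while written < add_bytes:
            n = min(chunk, add_bytes - written)
            f.write(os.urandom(n))   # random data defeats compression/dedupe
            written += n
    return os.path.getsize(path)

# Example: run once a day (cron/systemd timer) to add 1 GB to a file
# on the test volume; the path is just a placeholder:
# grow_file("/mnt/testvol/growth.bin", 1 << 30)
```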

Then after that works as you expect, do SQL.

Steve_A
983 Views

Thanks, you have been a great help.
