Data Backup and Recovery

Consistency Group (NTAP_CONSISTENCY_GROUP_SNAPSHOT)

ANDREAS_JANKOWIAK

Hi all,

Can someone tell me what the advantage of using Consistency Groups is? What is their purpose?

Our environment where we use SnapCreator is DB2 and NFS. When creating snapshots we use the db2 plugin to quiesce the database. We snapshot the following Volumes:

Archive : Db2 Archive logs

Logs: Db2 Online Logs

Data: Db2 data files

Soft:  DB2 and other binaries

Do we need the feature of CG in this case?

Thanks a lot!

Best Regards,

Andreas

1 ACCEPTED SOLUTION

ktenzer

In your case I wouldn't use CG; it adds no value. I am a minimalist like you, so I would only want to do what is required to meet the requirements, which in this case is a consistent backup of DB2.

Keith

oommen

Andreas,

Yes, I would use CG. But I wouldn't add the SOFT volume.

Bobby.

ANDREAS_JANKOWIAK

Hi Bobby,

thanks for your response!

Can you please explain why I should use it? I'm not sure about this feature, but doesn't it handle filesystem consistency (for FC, iSCSI), i.e. flushing buffers to the filesystem?

We use NFS, so the NFS filesystem should take care of that, or am I wrong?

Or do I understand this feature wrong?

Thanks a lot!

BR,

Andreas

oommen

Andreas,

CG is a Data ONTAP feature that ensures all the volumes that are part of your database have the same consistency point during a snapshot operation.

Regards,

Bobby.

oommen

Also, CG has a default timeout mode (medium) and will make sure the database is put back into normal mode if the snapshot operation takes more than 7 seconds. That way the application is protected.

ANDREAS_JANKOWIAK

Hi Bobby,

Thanks. Hmm, I don't get the point... What is the advantage of having a consistency point for the data, archive and logs volumes (in an NFS environment)?

How does NetApp create the consistency point technically?

Thanks again.

BR,

Andreas

oommen

Each volume is a separate entity, and snapshots are done at the volume level. To make sure all the volumes get the same consistency point, you will need to use this option. Also, the database is protected from being in the "I/O suspend" state for more than seven seconds.

ANDREAS_JANKOWIAK

So consistency point = same timestamp for the individual snapshots, right? So if the consistency feature is enabled, all volumes have the same I/O state? No writes from buffer to filesystem on any volume, and then the snapshot is taken?

If this is correct, do I need CG for the data volume? Because I/O on the data and log volumes is suspended by db2 write suspend, right?

ktenzer

Consistency Group snapshots are snapshots that are I/O consistent across volumes within a storage controller. The way it works is that writes in ONTAP are held in NVRAM for all volumes that are part of the group. Once I/O is fenced for the volumes, a snapshot is created on all of them, and then writes to the volumes can resume. This is also why there is a timeout; the maximum for this timeout is 20 seconds from the time the fence goes up until the process is finished. If the CG snapshot process does not finish within the timeout, ONTAP will fail the operation and resume writes to the volumes.
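The fence, snapshot, and release sequence described above can be sketched as a small simulation. This is only a rough model and not ONTAP code: the `CgSnapshot` class, the `FakeClock`, and the per-volume snapshot cost are all hypothetical, and the 7 second default simply mirrors the medium timeout mentioned earlier in this thread.

```python
from dataclasses import dataclass, field


class CgTimeoutError(Exception):
    """Raised when the simulated CG snapshot exceeds its timeout."""


@dataclass
class FakeClock:
    """Deterministic clock so the sketch runs without real sleeping."""
    now: float = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds


@dataclass
class CgSnapshot:
    volumes: list
    timeout_s: float = 7.0  # assumption: "medium" mode as quoted in the thread
    clock: FakeClock = field(default_factory=FakeClock)
    fenced: bool = False

    def run(self, snapshot_cost_s: float) -> list:
        """Fence writes, snapshot every volume, then release the fence.

        snapshot_cost_s is a made-up per-volume duration for the simulation.
        """
        start = self.clock.now
        self.fenced = True  # writes to all group volumes are held from here
        taken = []
        try:
            for vol in self.volumes:
                self.clock.advance(snapshot_cost_s)
                if self.clock.now - start > self.timeout_s:
                    raise CgTimeoutError(f"timed out at volume {vol}")
                taken.append(f"{vol}@cg")
            return taken
        finally:
            self.fenced = False  # writes resume on success or on failure
```

The point of the `finally` clause is the behaviour described in the post: whether the group snapshot completes or times out, the fence comes down and writes resume.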

Keep in mind that a CG cannot span volumes on multiple controllers; it is a group of volumes within a controller. If you have multiple volumes on multiple controllers, you end up with a CG per controller.
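To illustrate the "one CG per controller" point, here is a minimal sketch that groups volumes by their owning controller. The function name, the volume names, and the controller names (`filer1`, `filer2`) are made up for illustration and do not come from any NetApp API.

```python
from collections import defaultdict


def consistency_groups(volume_to_controller: dict) -> dict:
    """Group volumes by owning controller: one CG per controller."""
    groups = defaultdict(list)
    for volume, controller in volume_to_controller.items():
        groups[controller].append(volume)
    return dict(groups)


# Hypothetical layout: controller names are invented for this example.
layout = {
    "data": "filer1",
    "logs": "filer1",
    "archive": "filer2",
}
print(consistency_groups(layout))
# → {'filer1': ['data', 'logs'], 'filer2': ['archive']}
```

With this layout, the data and logs volumes would be snapshotted together, while the archive volume ends up in its own group and is not I/O consistent with the other two.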

CG has nothing to do with DB2 and probably adds little value as far as consistency goes. You are already putting DB2 in backup mode by suspending writes, so DB2 is consistent; nothing more is needed.

An example where CG would add value is if you had, say, a DB2 database and an application and wanted some consistency between them: you could include the application volume and snapshot it together with the database volumes using CG. The same applies if you had multiple databases. Another use case is Oracle ASM. CG use cases are limited, and most people think, hey, it sounds better than a normal snapshot, so let me use it. Don't do this. If you don't have a reason to use CG, don't; that is my advice.

Hope this helps

Keith

ANDREAS_JANKOWIAK

Hello Keith,

Thank you very much for your detailed response. This is exactly my mindset... I didn't see an advantage in using CG when suspending my DB2 database, because in that case I have already suspended I/O... so I was a bit confused.

I just have one DB2 database to snap, so I will go without CG on our four involved volumes.

Thanks!

BR,

Andreas

oommen

I would still use CG; that way the operation times out if a snapshot takes more than 7 seconds. I have seen the agent timeout not kicking in. As far as snapshot consistency goes, you don't necessarily need CG.

ANDREAS_JANKOWIAK

Hi Bobby,

By agent timeout, do you mean SC_AGENT_UNQUIESCE_TIMEOUT?

BR,

Andreas

oommen

Correct. My concern is that I/O suspend literally freezes everything in DB2; I argued with the DB2 development team that the LOGWR (log writer) process should not be frozen during write suspend. I have tried snapshots with CG and without 'write suspend'. The database comes up fine by doing a crash recovery, which is okay for a non-recoverable database.

ktenzer

OK, yes, that is a good point and a CG use case; however, it isn't a use case that applies to this situation.

If you don't want to do a DB2 write suspend, you can just do a snapshot with CG, which should achieve I/O consistency. I would only recommend this if the customer can't afford to do the write suspend (since that is what IBM wants you to do). In this case the customer is fine with write suspend and it is working, so I would again not recommend CG, as it adds no value.

Keith

ANDREAS_JANKOWIAK

Hi Bobby and Keith,

Thanks! Now I understand your statements and I'm not confused anymore ;o) I knew about the CG option for Oracle recovery... but it's also a good point for non-recoverable DB2 databases ;o)

Thanks again for your patient responses and help. Perfect community for SnapCreator!

BR,

Andreas

oommen

Yes, anytime; we are here to help. Please feel free to reach out to us. We much appreciate your feedback.

ktenzer

SC_AGENT_TIMEOUT is the general timeout used for all communications.

SC_AGENT_UNQUIESCE_TIMEOUT is the timeout used only for quiesce/unquiesce of the application. If it isn't set, we use the SC_AGENT_TIMEOUT value; the point is that these can be different if you want. For example, if you have a very sensitive application, 60 seconds may be too long, so you would set SC_AGENT_TIMEOUT=60 and SC_AGENT_UNQUIESCE_TIMEOUT=20.
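The fallback rule described here can be sketched in a few lines. This is not Snap Creator's actual parsing code; the `effective_timeouts` helper and the dict-based config are assumptions made for illustration, and only the two parameter names come from the post.

```python
def effective_timeouts(config: dict) -> tuple:
    """Return (general_timeout, unquiesce_timeout) in seconds.

    If SC_AGENT_UNQUIESCE_TIMEOUT is missing or empty, fall back to
    SC_AGENT_TIMEOUT, as described in the post above.
    """
    general = int(config["SC_AGENT_TIMEOUT"])
    raw = config.get("SC_AGENT_UNQUIESCE_TIMEOUT")
    unquiesce = int(raw) if raw not in (None, "") else general
    return general, unquiesce


# Only the general timeout is set: both values are 60 seconds.
print(effective_timeouts({"SC_AGENT_TIMEOUT": "60"}))

# Sensitive application: unquiesce after 20 seconds at the latest.
print(effective_timeouts({
    "SC_AGENT_TIMEOUT": "60",
    "SC_AGENT_UNQUIESCE_TIMEOUT": "20",
}))
```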

Not sure what Bobby means when he says he has seen the agent timeout being ignored. If this is the case, the issue should be reproduced and a bug opened. At this point I can say we have no open issues, and I can say few customers configure CG and nobody is reporting that applications aren't unquiesced when timers are up.

Keith

ANDREAS_JANKOWIAK

Hi Keith,

This is the way I implemented it. We also don't have a problem with the unquiesce; SC_AGENT_UNQUIESCE_TIMEOUT works perfectly.

BR,

Andreas

ktenzer

Awesome. Hopefully this conversation was useful nevertheless, as at some point you may have a use case for CG, and now you understand the technology and its intention better.

Happy Snap Creating!

Keith
