Data Backup and Recovery

How to manage consistency groups in config files in Snap Creator 3.3.0

ldomenella

Hi guys,

I'm currently doing my Oracle backups with a Perl script that performs a consistency-group-based snapshot.

I'd like to start using Snap Creator to do the same, but I haven't understood how to use a consistency group in Snap Creator.

Let me explain better: I know how to specify in the config file that a consistency group should be used and what its timeout is, but I'm not sure whether it is applied to all the controllers and all the volumes at the same time.

Just to be more clear, this is a brief summary of my environment:

2 filer heads; each head has 2 volumes containing my LUNs (for ASM).

This is an extract of what I configured in my Snap Creator config file/profile:

SNAME=orareplica03

SNAP_TIMESTAMP_ONLY=N

VOLUMES=ntap4a:rac_data,rac_log;ntap4b:rac_data,rac_log

NTAP_SNAPSHOT_RETENTIONS=daily:1

NTAP_CONSISTENCY_GROUP_SNAPSHOT=Y

NTAP_CONSISTENCY_GROUP_TIMEOUT=medium

I take snapshots once a day (except for testing purposes).

Can you confirm that running: ./snapcreator --profile replica03 --action snap --policy daily --verbose

is the correct syntax to do it?

Any better trick or idea is very welcome.

best regards,

Luca

12 REPLIES

ktenzer

Hi Luca,

Your config file looks good

I would recommend using SNAP_TIMESTAMP_ONLY=Y: with _recent we need to rename snapshots, which can slow down the process and cause you to reach the CG timeout (medium is 7 seconds), so not doing the rename is preferred.
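In other words, the only change to the config you posted would be:

SNAP_TIMESTAMP_ONLY=Y

so the snapshot gets its timestamp name directly and SC does not have to rename snapshots as part of the backup.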

Consistency groups are done on a per-storage-controller basis, since that is what Data ONTAP supports. There is no such thing as a consistency group that spans controllers; the main thing, however, is doing cg-start on all controllers before doing the commits, so that you have a consistency point.

Since Oracle will not send a write without the previous write being complete, you just need cg-start on all controllers, which SC does.

SC does the following process for your config:

cg-start ntap4a

cg-start ntap4b

cg-commit ntap4a

cg-commit ntap4b

This is pretty much the best you can do. One small improvement would be to send the start and commit calls in parallel, which would increase speed; SC currently does them serially. We are planning that for a future release.

Also, in SC 3.4.0 (releasing at the end of June) we added an option to do a wafl-sync before cg-start, which can also increase speed and prevent timeouts.

We recommend combining this with our Oracle plugin; it certainly doesn't hurt putting Oracle in backup mode.

Does this help?

Regards,

Keith

ldomenella

Hi Keith,

Wow, very quick answer!

Using cg-start/cg-commit is what I do today with my Perl script and the API.

I really don't understand how to do what you suggested with Snap Creator... can you give me an example?

For example, I get this message when I run snapcreator with the syntax I mentioned:

INFO: NetApp Snap Creator Framework detected that SnapDrive is not being used. File system consistency cannot be guaranteed for SAN/iSAN environments

Reading the docs, I understood that I didn't need SnapDrive for that!

Please let me know what I missed...

best regards,

Luca

ktenzer

SC already does what I described for you; I was just explaining how it works. There is no more configuring to do than what you already did,

except setting the option SNAP_TIMESTAMP_ONLY=Y.

As for the message about SnapDrive, this is guidance: if you are in a LUN environment you should use SnapDrive in combination with SC, since SC does not provide file system consistency.

Regards,

Keith

ldomenella

Hi Keith,

I don't need FS consistency; ASM and Oracle give me that. I just need the CG coordinated so that all the LUNs are snapshotted in the same group,

and it's not yet clear to me how Snap Creator does that.

br,

Luca

ktenzer

I already tried to explain this, so let me try one last time; otherwise someone else will need to explain.

You need to do the following:

Set all volumes on all storage controllers that need to be snapshotted in the VOLUMES parameter

Set NTAP_CONSISTENCY_GROUP_SNAPSHOT=Y

Set SNAP_TIMESTAMP_ONLY=Y

Set NTAP_CONSISTENCY_GROUP_TIMEOUT=medium

Now SC will take a CG snapshot of all your volumes.
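Concretely, for your environment that is just the config you already posted with SNAP_TIMESTAMP_ONLY flipped to Y:

SNAME=orareplica03
SNAP_TIMESTAMP_ONLY=Y
VOLUMES=ntap4a:rac_data,rac_log;ntap4b:rac_data,rac_log
NTAP_SNAPSHOT_RETENTIONS=daily:1
NTAP_CONSISTENCY_GROUP_SNAPSHOT=Y
NTAP_CONSISTENCY_GROUP_TIMEOUT=medium

and the command you already use is the right one:

./snapcreator --profile replica03 --action snap --policy daily --verbose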

If you want more info on how SC does it, here it is.

There are two APIs, cg-start and cg-commit. They both operate on a per-storage-controller basis.

cg-start - fences IO to all volumes, creates snapshot

cg-commit - finishes snapshot creation and unfences IO to all volumes

SC does the following, assuming two storage controllers, filer1 and filer2:

cg-start (filer1 volumes)

cg-start (filer2 volumes)

cg-commit (filer1 volumes)

cg-commit (filer2 volumes)

If you run SC with --debug you will see the APIs we perform.
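If you are curious what those two calls look like outside of SC, for example from a script like the Perl one Luca already has, here is a rough sketch against the Manageability SDK Python bindings (NaServer/NaElement). Hostnames, credentials and the snapshot name are placeholders, and this only illustrates the two-phase ordering; it is not literally what SC runs internally:

# Rough sketch only -- assumes the NetApp Manageability SDK Python bindings
# (NaServer/NaElement). Hostnames, credentials and the snapshot name below
# are placeholders; error handling is minimal.
from NaServer import NaServer
from NaElement import NaElement

FILERS = {
    "ntap4a": ["rac_data", "rac_log"],
    "ntap4b": ["rac_data", "rac_log"],
}

def connect(filer):
    srv = NaServer(filer, 1, 7)               # ONTAPI 1.7, 7-Mode
    srv.set_style("LOGIN")
    srv.set_admin_user("root", "password")    # placeholder credentials
    srv.set_transport_type("HTTPS")
    return srv

servers = dict((f, connect(f)) for f in FILERS)
cg_ids = {}

# Phase 1: cg-start on every controller first (fences I/O, creates the snapshot)
for filer, volumes in FILERS.items():
    start = NaElement("cg-start")
    start.child_add_string("snapshot", "orareplica03_cg")  # placeholder snapshot name
    start.child_add_string("timeout", "medium")            # medium = 7 seconds
    vols = NaElement("volumes")
    for vol in volumes:
        vols.child_add(NaElement("volume-name", vol))
    start.child_add(vols)
    out = servers[filer].invoke_elem(start)
    if out.results_status() != "passed":
        raise RuntimeError("cg-start failed on %s: %s" % (filer, out.results_reason()))
    cg_ids[filer] = out.child_get_string("cg-id")

# Phase 2: only after every controller has its fence up, cg-commit everywhere
# (finishes the snapshots and unfences I/O)
for filer, srv in servers.items():
    commit = NaElement("cg-commit")
    commit.child_add_string("cg-id", cg_ids[filer])
    out = srv.invoke_elem(commit)
    if out.results_status() != "passed":
        raise RuntimeError("cg-commit failed on %s: %s" % (filer, out.results_reason()))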

Hope this helps

Keith

pascalc

Hello Keith

I am running Oracle on AIX with LUNs, and SnapDrive is installed.

Currently I see 3 methods of snapshotting Oracle with Snap Creator:

1. use quiesce/unquiesce/pre-exit custom commands (Oracle in/out of hot backup mode) + custom SnapDrive commands to take the snapshots

2. same as 1, but let Snap Creator take the snapshots (no custom SnapDrive commands)

3. use the Oracle plugin of Snap Creator

I am using method 2.

I suppose the integrated Oracle plugin gives the best of methods 1 and 2? This is not very clear in the documentation.

How does the Oracle plugin work?

ktenzer

Hi

You are correct, all three methods are valid. Methods 1 and 2 rely on you creating Oracle scripts or SQL commands and using APP_QUIESCE_CMDs, APP_UNQUIESCE_CMDs, and PRE_EXIT_CMDs.

Option three uses the SC built-in Oracle plugin, APP_NAME=oracle.

The SC Oracle plugin will do the following:

Normal backup

quiesce - begin backup mode

unquiesce - end backup mode, switch logs

Archive log only

Optionally, with ARCHIVE_LOG_ONLY=Y, SC will do the following:

quiesce - switch logs

unquiesce - nothing

This allows you to separate data backups from archive log backups if desired. The best practice is to create 2 configs: 1) for data files and 2) for archive logs. If you want to back up everything, you would run the archive log config from within the data config as a POST_APP_UNQUIESCE_CMD01 or NTAP_POST_CMD01.
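As a very rough sketch of the two-config approach (profile names, volumes and paths below are placeholders, and I am leaving out the plugin's database connection settings, so check the docs for the full Oracle plugin parameter list):

Data config, e.g. a profile called "oradata":

APP_NAME=oracle
VOLUMES=filer1:oracle_data_vol
NTAP_SNAPSHOT_RETENTIONS=daily:1
POST_APP_UNQUIESCE_CMD01=/path/to/snapcreator --profile oraarch --action snap --policy daily

Archive log config, e.g. a profile called "oraarch":

APP_NAME=oracle
ARCHIVE_LOG_ONLY=Y
VOLUMES=filer1:oracle_arch_vol
NTAP_SNAPSHOT_RETENTIONS=daily:1

The POST_APP_UNQUIESCE_CMD01 line in the data config is what chains the archive log backup onto the data backup when you want to back up everything in one run.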

The advantage of using the Oracle plugin is that it is supported. SC is supported, but if an issue occurs with commands or scripts you are running, those wouldn't be supported; of course we would still try to help. Another advantage is discovery, VALIDATE_VOLUMES=data: SC determines whether we are backing up the correct data files. This is optional, but it does require SnapDrive for non-NFS.

In the end it is your decision; that is what is nice about SC, you have lots of choices and flexibility.

There is also an unsupported Oracle backup script on the communities site that we have used for Oracle 8 and 9, which aren't supported by the SC built-in plugin. That may be interesting; it isn't well documented, but if you are interested I can try to help you out.

Regards,

Keith

pascalc

Thanks, I'll give the Oracle plugin a try as well then.

jakub_wartak

Hi,

If you are on AIX you can freeze any activity on a VG by using chvg -S/-R... that's actually what SnapDrive does. A single VG can span multiple LUNs on multiple controllers. I'm not so sure what to do with NFS across controllers or on other OSes.

-J.

ktenzer

If volumes are spread across storage controllers, it depends on the application. CG snapshots are within a storage controller, so SC will end up creating a CG per controller, meaning the CGs are not 100% consistent with one another.

However, if the application issues write-dependent writes (meaning the first write must be successful before it sends another write), then they will be consistent: as long as one CG still has its fence up, the application will hold writes and wait, which is what we want it to do.

You can control the wait time in SC, since the CG has a timeout.

For ASM and Oracle, SnapDrive puts the DB in backup mode and does a CG snapshot. It does nothing to the file system, so this also applies for NFS. In ASM, volumes are spread all over the file systems.

In summary, it is the combination of CG and application behavior (write-dependent writes) that makes things work.

Regards,

Keith

jakub_wartak

Keith,

Well, your thinking about the application is correct when you look at it from the storage point of view... however, if you are using LVMs then you must take into consideration that something like the following might be happening:

a) /bin/HelloWorld writes 1MB to /test/file1

b) AIX receives the write() syscall and passes it to the VMM/FS layer

c) the FS layer identifies that the /test FS is on testLV, which is on testVG

d) testVG is, e.g., striped across >= 2 physical volumes

e) you write 1MB to one of the physical volumes (PVs), because (in e.g. AIX) the Physical Partition (PP) size is, let's say, 32MB

Now, taking into consideration that HelloWorld can run in parallel (or anything else writes to /test/), even if you freeze the FlexVol on controller1 (handling the 1st PV/LUN), there is absolutely no guarantee that the 2nd controller is not going to be used.

Enter Oracle: yes, it appears that DBWR can issue ordered application writes, but how do you know that? Do you have access to the source code for every Oracle C source file? And what happens if the direct write path is used, e.g. during loading with SQL*Loader or during INSERT /*+ APPEND */ ...?

That's why I'm thinking that suspending the VG (especially writes) is a very good idea. You have frozen writes at the VG level at an identical point across all VGs.

Now, when it is mixed with hot backups, it can give yet another layer of protection, in my opinion. Your opinion and testing may vary. Even just thinking about it is pretty interesting.

-J.

ktenzer

Good points, and I agree.

Like I said, a CG is only I/O consistent inside the CG, and a CG doesn't span controllers: 1) with one controller and many volumes there is no problem under any condition using a CG; 2) with many volumes on many controllers you need write-dependent writes, or write ordering as you refer to it, from the application.

If we have write ordering, you can spread volumes across controllers and use a CG; if not, then you need to look at freezing the LVM as you mentioned, or something else. All of which can be done through Snap Creator, since it is a framework, by providing simple commands or scripts to handle the LVM layer.
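For example, the AIX VG freeze Jakub described could be wired in with the quiesce/unquiesce/pre-exit command parameters mentioned earlier in the thread. A rough sketch (hostname, VG name and the ssh setup are placeholders; drop the ssh if SC runs on the AIX host itself):

APP_QUIESCE_CMD01=ssh aixhost chvg -S oravg
APP_UNQUIESCE_CMD01=ssh aixhost chvg -R oravg
PRE_EXIT_CMD01=ssh aixhost chvg -R oravg

The pre-exit command is the usual place to put the resume so the VG is unfrozen even if the backup aborts with an error.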

Keith
