Data Backup and Recovery

Snap Creator and Oracle RAC support

LEIGHTONN
13,976 Views

Hi,

I've been trying to clone our 11.2 Oracle RAC databases on AIX 6.1 using Snap Creator 4.0p1d2 without success. The database uses ASM for database storage with each disk group using a separate volume. Example diskgroup layout:

DG_DATA - datafiles

DG_ORL - online redo

DG_FLASH - controlfiles

DG-ARCH - archived redo logs

A config was created for the data disk group which in turn calls another config for the archivelog disk group. Once snapshot copy is complete and volumes have been Snap mirrored to a remote server I then mount the ASM diskgroups and recover the database. However, the recovery process always fails. Here's an output of recovery operation after issuing the command RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL:

SQL> ORA-00279: change 7725379529097 generated at 12/18/2013 16:47:10 needed for
thread 1
ORA-00289: suggestion :
+DG_ARCH_CLRMR/clrprd/archivelog/2013_12_18/thread_1_seq_21913.1570.834511647
ORA-00280: change 7725379529097 for thread 1 is in sequence #21913


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 7725379529097 generated at 12/18/2013 07:04:46 needed for
thread 2
ORA-00289: suggestion :
+DG_ARCH_CLRMR/clrprd/archivelog/2013_12_18/thread_2_seq_20670.1938.834481725
ORA-00280: change 7725379529097 for thread 2 is in sequence #20670


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00328: archived log ends at change 7725368217050, need later change
7725379529097
ORA-00334: archived log:
'+DG_ARCH_CLRMR/clrprd/archivelog/2013_12_18/thread_2_seq_20670.1938.834481725'


ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01195: online backup of file 1 needs more recovery to be consistent

I've successfully used the same process for a single instance database. Is Oracle RAC supported for use with Snap Creator with the Oracle plug-in?

Thanks,

Leighton

22 REPLIES 22

sivar
13,410 Views

Hello Leighton,

I see only 2 volumes in the config file. (asmdata_prod from filer14a and filer14b)

Where are your other volumes?

Also, once you add all the volumes, you may add this option.

META_DATA_VOLUME=filer:archivevol

example:

META_DATA_VOLUME=FILER14A:volfaix002_asmarchive_prod

Thanks,
Siva Ramanathan

LEIGHTONN
13,410 Views

Hi Siva,

I'm cloning the database using different disk group names (because same disk groups already exist on the target server) so I don't need the diskgroups with the control files and online redo logs since they're being recreated.

This is the procedure I use:

1. Put database in hot backup mode
2.      Take snapshot copy of datafiles
3.      Take database out of hot backup mode
4.      Spool controlfile to trace.
5.      Switch archivelogs
6.      Take snapshot of archivelogs
7.      Mirror all snapshot volumes to remote server
8.      Mount snapshot copies on remote server
9.      Rename and mount ASM diskgroup
10.  Mount and recreate controlfile from trace
11.  Recover database using backup controlfile until cancel
12.  Open database resetlogs
13.  Shutdown and mount database
14.  Rename with NID
15.  Open database resetlogs


The NetApp Snap Creator software has a plug-in for Oracle that takes care of steps 1-8. The remaining steps are done using scripts.
All steps up to 10 complete successfully. Step 11 is where it’s failing during the recovery.

Also I'm going to try to clone all disk groups but keep the names the same (cloning to new server). Using this method should I create a separate config for all the diskgroups and call them from the data config?

What does the META_DATA_VOLUME parameter do? I've searched the compatibility matrix but didn't see any RAC mentioned.

Thanks,

Leighton


sivar
13,410 Views

As of 4.0p1d2 you can designate one volume in META_DATA_VOLUME=filer:volume

This tells Snap Creator NOT to take backup of the volume during normal snapshot operation.

The workflow then becomes like this.

quiesce database

snapshot data volumes

unquiesce database  (this makes Oracle to flush logs to archive logs)

take snapshot of META_DATA_VOLUME (in your case archive logs)

You may also please refer this thread by https://communities.netapp.com/people/steiner

https://communities.netapp.com/thread/30498

LEIGHTONN
13,411 Views

Hi Siva,

Thanks for your response.

Isn't this the equivalent of calling the archivelog config from the data config POST_CLONE_CREATE_CMD?

POST_CLONE_CREATE_CMD03=SERVER:/opt/NetApp/snapcreator --profile prod1 --config prod1_archive --policy daily --action cloneVol

The archivelog config has ARCHIVE_LOG_ONLY=Y

Thanks,

Leighton

sivar
13,410 Views

As long as prod1/prod1_archive has only the archive volume, I don't see a problem there.

Please refer to this thread, where has a script written.

https://communities.netapp.com/community/netapp-blogs/databases/blog/2013/09/18/integrating-snap-creator-into-asmlib-clones

I am looking for another article that Neto has written as well, and will soon include here.

sivar
12,989 Views

Here is another to refer.

https://communities.netapp.com/community/netapp-blogs/databases/blog/2012/09/13/theses-queries-can-save-your-life--for-recovery-purposes-and-clones

Please let me know if you have further questions.

I will forward this to Oracle experts.

LEIGHTONN
13,410 Views

Actually Neto, Mike and myself were working in parallel to create a script for this purpose. My script is not using ASMLib however since we're on AIX.

In any case, the ASM cloning piece works fine - it's just the recovery and more specifically RAC recovery that doesn't seem to work.

sivar
13,410 Views

I have just forwarded this thread to Neto and Jeff Steiner.

Since most of the folks are on holiday/vacation please expect delays.

clilescapario
13,410 Views

I am not using ASM, but clustered filesystem and have successfully cloned RAC databases using snapshots created by snapcreator, so it does work.

The procedure you outlined is very similar to the one I use. META_DATA_VOLUME is your friend to only have 1 config file instead of managing arclogs in a separate config. I also create the clone on a single instance and then modify init.ora to be clustered after recovery is complete.

The query listed in https://communities.netapp.com/community/netapp-blogs/databases/blog/2012/09/13/theses-queries-can-save-your-life--for-recovery-purposes-and-clones will tell you the arc logs needed for recovery. I would suggest running the query on the source database to discover which arclogs are needed, and then verifying that those logs are present in the mounted clone. (Isn't there some way to look at the content of the ASM diskgroup?)

I have also ran into the same error when I apply the log files in the incorrect order.

--

Chris

LEIGHTONN
12,222 Views

Hi Chris,

Do you use multiple volumes for the archivelogs? We use two volumes (one per filer) and I don't know if it's possible to specify more than one volume in META_DATA_VOLUME. Also we do mirroring from a remote datacenter before cloning from the secondary volumes. What parameters should be set in the data config? Specifically SNAPMIRROR_VOLUMES and NTAP_CLONE_SECONDARY_VOLUMES is set for each config but if we use a single config how would this be handled since separate volumes are involved? (See attached config).

Also I can see all the archivelogs generated from each thread and instance in the ASM disk groups using ASMCMD. They're not missing but the database seems to only apply redo from a single thread during recovery.

Would you be able to send me examples of your config and parameter files that you use to clone RAC to a single instance?

Thanks,

Leighton

clilescapario
12,223 Views

My volume layout is a bit simpler, 1 volume for data, 1 for arclog, and 1 for undo/temp/redo. Since undo/temp/redo are not supported by oracle in recovery I only snapshot the data and arclog. An example for META_DATA_VOLUME and the corresponding VOLUMES is

META_DATA_VOLUME=filer1:ora_arclog
VOLUMES=files1:ora_data,ora_arclog

I won't be able to help with the snapmirror stuff since I am registering my backups in protection manager and allowing it to handle the mirror, but I suspect that if you just add in the arclog volumes to the appropriate snapmirror parameters it will work. When using META_DATA_VOLUME snapcreator will (abridged version):

- backup control file

- put oracle in hot backup mode

- snapshot volumes not in META_DATA_VOLUME <--- This would only be your data volumes

- exit hot backup mode

- backup control file

- log switch on all RAC nodes

- snapshot META_DATA_VOLUME volumes <--- This would only be your log volumes

Since I have a clustered filesystem and a 2 node cluster it is a bit easier for me to discover the arclogs necessary for recovery from the filesystem (pick the l last 2 arclogs in the mounted arclog snapshot), but I have difficulty judging the order without oracle telling me something.

When cloning, I restore a modified version of the backuped control file, and modify params to have "*.cluster_database=FALSE" and then start only a since instance.

-RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL

-When prompted, apply the suggested log file

-When prompted for the next log, I go and look at my mounted arclog snapshot, I pick the other arclog that has the same timestamp as the suggested on (since they were both generated at the same log swtich)

-It will say something like "No longer needed for recovery" and at that point we are ready to cancel and open with resetlogs

-Modify params back to cluster_database=TRUE and restart/start each node

I have had the same error before if I apply the logs in the wrong order or possibly just let oracle pick the 2nd log, unfortunately I can't refresh my clone at the moment to check when it happens.

I think it would be pretty cool is there was a way to query the source database after the data snapshot, write to a text file on the arclog volume the necessary arclogs needed for recovery of the change sequence, and then take the arclog snapshot. This would make the pair of snapshots a bit more self-contained for recovery.


LEIGHTONN
12,223 Views

My procedure is nearly identical. The main difference being that I don't use META_DATA_VOLUMES. We use the separate profile for archivelogs which is called from the data profile, mainly because of the Snap Mirror and multiple volume requirements. By all accounts this should also work and I've used it before without problems. I could certainly test using META_DATA_VOLUMES as well.

BTW - I was finally able to get this to work. The only change I made was to restart the instance after re-creating the control file.

Thanks for all the feedback.

ktenzer
12,223 Views

Starting in SC 4.1 which should release any week the META_DATA_VOLUME parameter supports one or more volumes. We would recommend using this method instead of splitting configs and now that we can support multiple archive log volumes there should be no reason to split configs.

Best Regards,

Keith

LEIGHTONN
12,224 Views

That is great new Keith! Could you kindly notify me when the new version becomes available? I still notice that the clone fails at times during recovery. I'm hoping using the META_DATA_VOLUMES will help.

Thanks,

Leighton

ktenzer
11,801 Views

SC 4.1 should be available this week on NOW. January 9th!

Let us know if 4.1 addresses your issue

Keith

Chris_Stevenson
10,085 Views

Hi Leightonn

 

Yes, multiple FRA volumes (8). I added them to META_DATA_VOLUME and they seem to be backing up last based on the timestamp of the snaps. We are going to test recovery this next week. I am thinking since recovery will include archive log restores and we are in hot backup via SC Agent on one of the RAC nodes, we don't need them in the consistency group (and don't have to waste storage on the volume siziing). But I am trying to get valiation on that.

I was also cautioned by my local Netapp RSE (admittedly not a SC expert) about multiple FRA volumes in META_DATA_VOLUME to back them up last. And I think it does NOT work in earlier versions of SC. But I am using 4.1.1p3 and they are all getting snapped as mentioned.

 

We are going to test recovery this next week. So will keep this thread posted with the pertinent details.

 

Thanks for your response.

LEIGHTONN
11,261 Views

Do you know if this version fixes the SENDTRAP issue? https://communities.netapp.com/thread/32361

Chris_Stevenson
10,644 Views

I am trying to get a fairly large Oracle RAC instance backed up. I have two questions>>>
> In order for backup jobs to succeed without errors, I commented the ' PRE_APP_UNQUIESCE_CMD01=/u00/app/orautil/bin/db_physical_ext.sh TEST post FAILED' entry. Is this entry not needed for RAC or am I missing the configuation details somehow?

>> My DB is fairly well fanned out each datalun, archive log lun, redo lun, etc have their own volume. My DBA has me include redo log volumes. I read in Netos whitepaper they are never needed to be backed up. Am I missing something here or do we actually need the redos as part of the consistency group for restoral purposes. To me, they should not be included. (tr-3858 Using Crash-Consistent Snapshot Copies as Valid Oracle Backups July2010)

Thanks

aborzenkov
10,638 Views
It probably is more useful to open new thread with clear subject instead of reviving years old catch all one ... Anyway

You need online redo logs if you are creating *conistent* online backup without putting database in backup mode. That gives you crash-consistent image.

You do not really need online redo logs if you put database in backup mode. In this case you need to replay archive logs past "end backup" anyway, so any online redo log saved as part of backup is outdated and of no use.

Or may be he means to save online redo logs in addition to archived logs? That makes sense too as it gives you more recent data, thus less data is lost.

steiner
10,630 Views

Why not use the usual Oracle plugin? It works equally well on RAC.

Public