That is the behavior right now. The change I would propose is a new config file option that helps the plugin figure out which instance name is running locally. For example, a config file for database NTAP would include an option listing the instances NTAP1, NTAP2, and NTAP3. When the backup runs, the plugin would consult the /etc/oratab file and figure out which of those instances are running locally. It could also look at the PMON processes running locally and get the data that way. It's also possible a simple srvctl command would work, but then the plugin would need to know the Grid home as well. It's really just a question of name resolution: the plugin needs to accept a db_unique_name and then figure out which instance is available on the node where the plugin is executing.
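None of this exists in the plugin today, but a rough shell sketch of the three resolution approaches could look like this (the database name NTAP and the Grid home path are placeholders):

DB=NTAP
# Option 1: check /etc/oratab for locally registered instances of this database
grep -v '^#' /etc/oratab | cut -d: -f1 | grep "^${DB}[0-9]\+$"
# Option 2: a running PMON process reveals the local instance name
ps -ef | grep -o "[o]ra_pmon_${DB}[0-9]*" | sed 's/ora_pmon_//'
# Option 3: ask clusterware which instance runs on this node (requires knowing the Grid home)
/u01/app/11.2.0/grid/bin/srvctl status database -d ${DB} | grep -i "node $(hostname -s)"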
The main limitation is that a config file points to one and only one agent. That prevents you from placing both nodes in a single config file. The config file is also coded with an instance name, not a database name, so if the target instance is down the config file no longer works. I have an idea on how to address this. I can't commit to any timeline, but I would like your opinion on it.

We could establish a new config file parameter called “VERIFY_LOCAL_SID” or something like that. You would then populate ORACLE_DATABASES with every SID in the RAC database that is a permissible backup target. The requirements would be:

-The agent would need to be installed on all nodes in the cluster.
-The agent would have to be listening and accessible on a VIP or a SCAN.
-The local SID for a given database would need to be listed in the /etc/oratab file. Not all DBAs do that.
-Only hot backups would be possible. Cold backups would require a much larger effort.

By listening on a VIP or a SCAN, we can ensure that only one config file is needed. If there is a node failure, SC will still be able to contact the agent because the SCAN/VIP will move to a different server. You could actually make RAC work with a VIP/SCAN now without any changes to SC, but it's not a good workaround. Let's say you have a 4-node RAC database. When the plugin ran, it would connect to an agent, and only one of the 4 SIDs you specified would succeed because the other 3 SIDs are on a different server. To compensate, you'd have to set the IGNORE_ERROR parameter (or something like that). That makes it difficult to know whether your nightly backups truly succeeded. The point of the VERIFY_LOCAL_SID flag would be to avoid the need to suppress those errors. When that parameter was set, the plugin would consult the /etc/oratab file to see which SIDs actually exist locally, and it would only attempt to perform the backup on that SID. A hypothetical config fragment is sketched below.

Possible limitations include:

-There could be a short period during RAC failovers where the SCAN/VIP is present on a server with no running instance. The backup would fail in this case.
-Not all network configurations would allow SC to access the SCAN/VIP. We can't avoid this requirement without a major rework of the SC framework. Right now, a config file operates on one and only one IP address. If that IP address is the local IP rather than the VIP/SCAN, then the operation will fail if that node fails.

What do you think?
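To make the idea concrete, a hypothetical config fragment might look something like this. Note that VERIFY_LOCAL_SID does not exist today, and the SCAN name, port, and SID list shown here are purely illustrative:

SC_AGENT=rac-scan.example.com:9090
ORACLE_DATABASES=NTAP1:oracle;NTAP2:oracle;NTAP3:oracle
VERIFY_LOCAL_SID=Y

With that in place, a single config file could be used from any node: the agent is reached through the SCAN, and the plugin backs up whichever of the listed SIDs it finds running locally.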
Most customers using RAC set up SnapCreator to act on a single database. The plugin can only act on a single instance, which means a cold backup cannot be done. It can, however, do a hot backup, and this does not affect recoverability: you can place a database in hot backup mode from any instance in the RAC cluster. It does mean that if the particular instance chosen for SnapCreator is down, the backups will not succeed, and the config file would need to be changed to point at a different instance on a different agent.
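The reason a hot backup driven from one instance is enough is that BEGIN BACKUP is a database-wide operation, so issuing it from any single instance covers all of them. A minimal sqlplus illustration of the principle follows (the plugin drives these steps itself, so this is only to show why one instance suffices):

sqlplus / as sysdba
SQL> alter database begin backup;
(the storage snapshot is taken here)
SQL> alter database end backup;
SQL> alter system archive log current;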
The main value of the plugin isn't registering the snapshots; it's the ability to instantly create backups and perform restores. It allows RMAN itself to drive the snapshot process. I don't see a lot of value with Exadata, though. The usual approach for using NetApp technology to back up an Exadata system is to use Data Guard to replicate to a NetApp system and then use SnapManager for Oracle.
I’ve never scripted any LDOM work before, plus the need to re-write ASM headers makes this even more difficult. I don’t see a good option here.
You'll have to do quite a bit of scripting. The problem with LDOMs is that they essentially re-virtualize LUNs as Sun LUNs, which hides where they came from. You can do this with SnapCreator, but you'd have to write some scripts that map and unmap the LUNs through the LDOM, along the lines of the sketch below.
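I haven't scripted this myself, but assuming Oracle VM Server for SPARC, the map/unmap steps on the control domain would be roughly along these lines (the backend device, volume, service, disk, and domain names are all placeholders):

# map a backend LUN into the guest domain as a virtual disk
ldm add-vdsdev /dev/dsk/c3t1d0s2 oradata_vol@primary-vds0
ldm add-vdisk oradata_disk oradata_vol@primary-vds0 dbldom1
# unmap it again, for example before restoring the underlying LUN from a snapshot
ldm remove-vdisk oradata_disk dbldom1
ldm remove-vdsdev oradata_vol@primary-vds0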
I can’t think of any explanation for this. Any software configuration that can survive a power failure without corruption should also work with a snapshot. This applies to any vendor technology. Restoring a snapshot is essentially the same as starting up a database after a power failure. Even if you had something like EMC SRDF replication, your recovery procedure would be the same because if you lost the primary site you’d be left with a copy of the data on the remote site. That copy would be frozen at a moment in time with no special preparation of the filesystem or database. Can you supply the exact procedures used to create the backup and then perform the restoration? I can’t believe that Essbase becomes corrupt simply because of a power failure.
Question - you say "volumes". If the database exists across more than one volume, this can get more complicated. The simplest approach is to generate a consistency group (CG) snapshot of all the volumes. You can do this with the SnapDrive utility, SnapCreator, or a simple script (see the sketch below). Once you have this, you can easily snapmirror this set of snapshots anywhere you want. If you can match the directory structure of the source, then you can just start the database. It will have the same name as the original, but if that's okay then this is a VERY easy way to spin up a clone.
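For example, with SnapDrive for UNIX the multi-volume CG snapshot can be a one-liner, something along these lines (the mount points and snapshot name are placeholders, and the exact flags depend on your SnapDrive version):

snapdrive snap create -fs /oradata /oralogs -snapname clone_source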
That's definitely not an SC problem, then. The only reason that should happen is a full archive log destination. The Oracle alert log should show more details on the cause of the hang; a couple of quick checks are below.
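Something along these lines would confirm it (the archive destination path, database name, and alert log location are just examples, and the v$recovery_area_usage query only applies if you archive to the flash recovery area):

df -h /oraarch                   # is the archive log destination full?
sqlplus -s / as sysdba <<'EOF'
archive log list
select * from v$recovery_area_usage;
EOF
tail -100 $ORACLE_BASE/diag/rdbms/orcl/orcl/trace/alert_orcl.log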
A ‘switch logfile’ is a different operation. It's not synchronous with the archiving; it simply switches the active log file and returns. Are you saying that ‘alter system archive log current’ is hanging at the sqlplus level too?
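If you want to compare the two outside of SC, the difference is easy to see in sqlplus:

sqlplus / as sysdba
SQL> alter system switch logfile;
(returns as soon as the switch happens; it does not wait for the archiver)
SQL> alter system archive log current;
(waits until the current log has been archived; this is the statement that will appear to hang if the archive destination is stuck)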
You’ll have to test this. Delphix is effectively a filesystem. I would be surprised if it had problems. Could I ask why you want to make snapshots? Is this to create backups of databases running under Delphix control?
What's the application? The only app I know of where a CG snapshot might not work is Teradata, and it's also possible to configure reiserfs in a really dangerous way. For mainstream filesystems and applications, there's no problem. If the SC operation completed, your snapshots are valid.
SnapCreator will take CG snapshots across as many controllers as you configure. It sets up the write fencing controller by controller, which might seem odd until you look closely at how it works; the end result is that dependent write order is preserved across all of them. I've personally done snapshots across up to 6 controllers, going back as far as 6 years, and never had a problem.
I concur - we should get this fixed. I would expect an occasional need to create symlinks for libraries when using very old software with a very new OS, or vice versa, but the CLI should just work on RHEL 6.4. It may not technically be a bug, but it's definitely a problem: this requirement will interfere with scalability. I wouldn't want to have to create custom symlinks on 100 different servers and then deal with the ongoing maintenance of putting the symlinks back after patching, server rebuilds, and so on. I'll submit a bug/RFE report for this.
Bobby might have more to add on this in a bit. I'd recommend going with around 4 LUNs for the datafiles and maybe 2 LUNs for the logs and other stuff. If you're going with Windows, divide the datafiles up among drive letters. If you're going with Linux, use a volume group. The important thing is to isolate the datafiles into a single volume group; that allows you to restore those LUNs, and thus the datafiles, independently of the other data. You'll have to balance that against the limited number of RDMs you get.

If you go with fewer than 4 LUNs you can run into some performance bottlenecks related to SCSI operations. It's not that the 'one big LUN' approach is bad, but you will lower the performance ceiling. As long as IO is low, no problem. I don't usually bother with striping unless you know you're going to have a ton of sequential IO, in which case you probably wouldn't be virtualizing anyway. Just make a regular volume group with the logical volumes sitting on extents distributed across all of the LUNs (see the sketch below).

I've worked with a number of projects testing databases on VMDK/VMFS, and it shows a marked decrease in potential performance compared to RDMs or iSCSI/NFS mounted to the guest itself. Again, it's not bad performance, but it will lower the ceiling, especially if you have a lot of writes.
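On Linux, a minimal LVM sketch of that layout might look like this (device names, sizes, and mount point are placeholders; add "-i 4" to the lvcreate call if you do want the extents striped evenly across all four LUNs):

pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate oradata_vg /dev/sdb /dev/sdc /dev/sdd /dev/sde
lvcreate -l 100%FREE -n oradata_lv oradata_vg    # one LV spanning the four datafile LUNs
mkfs -t ext4 /dev/oradata_vg/oradata_lv
mount /dev/oradata_vg/oradata_lv /oradata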
The problem here is that when SC is called, it sets the environment variables, which are then promptly overwritten by the login script. With configurations using bash, the login scripts must not override them. With csh, you're stuck having to use .login to set variables, which means the ORACLE_HOME, ORACLE_SID, and other parameters passed in by SC have no effect. We're looking at options to allow both csh and bash to be used, but that would still prohibit any use of login scripts that alter ORACLE_HOME or ORACLE_SID.
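For bash users, one way to keep a login script from clobbering what SC passes in is to set defaults only when the variables are not already defined. A sketch for ~/.bash_profile (the paths and SID are examples):

# keep any values handed in by the caller (such as the SC agent); only fill in defaults
export ORACLE_HOME=${ORACLE_HOME:-/u01/app/oracle/product/11.2.0/dbhome_1}
export ORACLE_SID=${ORACLE_SID:-PROD}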
The online logs aren't part of a hot backup. Recovering a hot backup requires two things: 1) an image of the datafiles while in hot backup mode, and 2) every archive log generated while in hot backup mode. When SC completes a hot backup, it will force a log archival. Most of the time, if you need to recover, the required archive logs are already present in the archive log location, so you don't generally need to restore them from anywhere. You just restore the datafiles from the hot backup snapshot and replay archive logs to the desired point in time (see the sketch below).

With SC, you need to be careful about how you restore. If you restore the entire snapshot in which the datafiles were in hot backup mode, and that volume also contains those archive logs, you will have erased the archive logs required to bring the datafiles online. There are multiple options. Some users put the archive logs on a separate volume and schedule a snapshot within ONTAP. Other customers leverage the META_DATA_VOLUME option to get a separate snapshot of the archive logs after the hot backup completes. The metadata volume needs to be defined via META_DATA_VOLUME and also listed in the regular VOLUMES parameter. Snapshots are taken of the VOLUMES (excluding the META_DATA_VOLUME), and after the database is unquiesced the META_DATA_VOLUME gets its snapshot.
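For reference, the replay itself is ordinary Oracle media recovery. Once the datafile snapshot is back in place, and assuming the current control files and online logs were not overwritten, it is roughly this in sqlplus (the timestamp is just an example):

sqlplus / as sysdba
SQL> startup mount
SQL> recover automatic database until time '2012-12-18:17:00:00';
SQL> alter database open resetlogs;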
The architectural overlords of SC tell me this is a bug. SC doesn't like the fact that the archive log volume is the only volume on that particular filer in the VOLUMES section. What version of SC are you using?
These are the items in the config file that need to be set:

-The archive log volume must be defined as one of the FlexVols:
VOLUMES=etc2a:hps16_11g_nfs_oractl,hps16_11g_nfs_oradata,hps16_11g_nfs_oraarch

-The metadata volume parameter must be set to the FlexVol that needs to be snapshotted after the unquiesce:
META_DATA_VOLUME=etc2a:hps16_11g_nfs_oraarch

The Oracle plug-in continues to be used and configured as usual. You should see something like this in the logs:

########## Taking Snapshot on Primary etc2a:hps16_11g_nfs_oradata ##########
[Tue Dec 18 16:35:33 2012] INFO: Creating Snapshot for hps16_11g_nfs_oradata on etc2a
[Tue Dec 18 16:35:33 2012] INFO: Snapshot Create of nfs11g-daily_recent on etc2a:hps16_11g_nfs_oradata Completed Successfully
[Tue Dec 18 16:35:33 2012] WARN: A meta data volume hps16_11g_nfs_oraarch was detected, skipping snapshot, it wil be taken after unquiesce
########## PRE APPLICATION UNQUIESCE COMMANDS ##########
[Tue Dec 18 16:35:33 2012] INFO: No commands defined
########## PRE APPLICATION UNQUIESCE COMMANDS FINISHED SUCCESSFULLY ##########
########## Application unquiesce ##########
[Tue Dec 18 16:35:33 2012] INFO: Unquiescing databases
[Tue Dec 18 16:35:33 2012] INFO: Unquiescing database nfs11g
[Tue Dec 18 16:35:35 2012] INFO: Unquiescing database nfs11g finished successfully
[Tue Dec 18 16:35:35 2012] INFO: Unquiescing databases finished successfully
########## Creating meta data snapshot for etc2a:hps16_11g_nfs_oraarch ##########
[Tue Dec 18 16:35:35 2012] INFO: Creating Snapshot for hps16_11g_nfs_oraarch on etc2a
[Tue Dec 18 16:35:35 2012] INFO: Snapshot Create of nfs11g-daily_recent on etc2a:hps16_11g_nfs_oraarch Completed Successfully
It shouldn't be doing all 3 volumes at the same time. The archive log volume needs to be last. There are two things you need for a valid hot backup – an image of the datafiles while in hot backup mode and every single archive log generated while in hot backup mode. It will make your life easier if you also protect the control files, but strictly speaking it's not required. The archive log volume should be defined as a metadata volume; that will make it get snapshotted last. If you need to restore, don't forget to shut down the database and the ASM diskgroup itself before doing any snaprestore operations, roughly as sketched below.
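Assuming 11.2-style clusterware, the shutdown/restore/startup sequence would typically be something like this (the database and diskgroup names are placeholders):

# stop the database and dismount the diskgroup on all nodes
srvctl stop database -d NTAP
srvctl stop diskgroup -g DATA
# ... perform the snaprestore of the underlying volumes or LUNs here ...
srvctl start diskgroup -g DATA
srvctl start database -d NTAP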