Most customers using RAC set up SnapCreator to act on a single database. The plugin can only act on a single instance, which means a cold backup cannot be done (that would require shutting down every instance in the cluster). It can, however, do a hot backup, and this does not affect recoverability: you can place a RAC database in hot backup mode from any instance in the cluster.
This does mean that if the particular instance chosen for SnapCreator is down, backups will not succeed. The config file would need to be changed to point at a different instance on a different agent.
The main limitation is that a config file points to one and only one agent, which prevents you from placing both nodes in a single config file. The config file is also coded with an instance name, not a database name, so if the target instance is down the config file no longer works.
I have an idea on how to address this. I can't commit to any timeline, but I would like your opinion on this.
We could establish a new config file parameter called “VERIFY_LOCAL_SID” or something like that.
You would then populate ORACLE_DATABASES with every SID in the RAC database that is a permissible backup target.
The agent would need to be installed on all nodes in the cluster.
The agent would have to be listening and accessible on a VIP or a SCAN.
The local SID for a given database would need to be listed in the /etc/oratab file. Not all DBAs do that.
Only hot backups would be possible. Cold backups would require a much larger effort.
By listening on a VIP or a SCAN, we can ensure that only one config file is needed. If there is a node failure, SC will still be able to contact the agent because the SCAN/VIP will move to a different server.
You could actually make RAC work with a VIP/SCAN now without any changes to SC, but it’s not a good workaround. Let’s say you have a 4-node RAC database. When the plugin ran, it would connect to one agent, and only one of the 4 SIDs you specified would succeed because the other 3 SIDs are on different servers. To compensate, you’d have to set the IGNORE_ERROR (or something like that) parameters, which makes it difficult to know whether your nightly backups truly succeeded.
The point of the VERIFY_LOCAL_SID flag would be to avoid the need to suppress those errors. When that parameter was set, the plugin could consult the /etc/oratab file to see which of the configured SIDs actually exists locally, and it would only attempt to perform the backup on that SID.
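To make the idea concrete, here is a rough sketch of that check in Python. It is purely illustrative and not the plugin's actual code: VERIFY_LOCAL_SID is the proposed parameter described above, and the SID list stands in for whatever would be populated in ORACLE_DATABASES.

    def local_sids(oratab_path="/etc/oratab"):
        """Return the set of SIDs registered in the local oratab file."""
        sids = set()
        with open(oratab_path) as oratab:
            for line in oratab:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                # oratab entries look like SID:ORACLE_HOME:startup_flag
                sids.add(line.split(":")[0])
        return sids

    def pick_backup_targets(configured_sids, verify_local_sid=True):
        """Return the configured SID(s) that actually exist on this node."""
        if not verify_local_sid:
            return configured_sids              # today's behavior: try every SID
        matches = [sid for sid in configured_sids if sid in local_sids()]
        if not matches:
            raise RuntimeError("none of the configured SIDs exist in /etc/oratab on this node")
        return matches                          # normally one SID per node

    # ORACLE_DATABASES would list every permissible SID in the RAC database
    print(pick_backup_targets(["NTAP1", "NTAP2", "NTAP3"]))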
Possible limitations include:
There could be a short period during RAC failovers where the SCAN/VIP is present on a server with no running instance. The backup would fail in this case.
Not all network configurations would allow SC to access the SCAN/VIP. We can’t avoid this requirement without a major rework of the SC framework. Right now, a config file operates on one and only one IP address. If that IP address is the local IP rather than the VIP/SCAN, then the operation will fail if that node fails.
That is the behavior right now. The changes I would propose would introduce a new config file option that helps the plugin figure out which instance name is running locally. For example, a config file for database NTAP would require a config file option specifying the instances NTAP1, NTAP2, and NTAP3. When the backup runs, the plugin would consult the /etc/oratab file and figure out which instance exists locally. It could also look at the PMON processes running locally and get the data that way. It's also possible a simple srvctl command would work, but then the plugin would need to know the Grid home as well.
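For the PMON route, a rough illustration (again Python, not the actual plugin code) could simply look for the ora_pmon_<SID> background processes on the local host:

    import subprocess

    def running_local_instances(configured_sids):
        """Return the configured SIDs whose PMON background process runs on this host."""
        ps = subprocess.run(["ps", "-eo", "args"], capture_output=True, text=True, check=True)
        names = {line.strip() for line in ps.stdout.splitlines()}
        # Oracle background processes rename themselves, e.g. "ora_pmon_NTAP1"
        return [sid for sid in configured_sids if "ora_pmon_" + sid in names]

    print(running_local_instances(["NTAP1", "NTAP2", "NTAP3"]))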
It's really just a question of name resolution. The plugin needs to accept a db_unique_name and then figure out which instance is available on the node where the plugin is executing.
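A sketch of that resolution step using srvctl might look like the following. It is illustrative only: the Grid home path is a placeholder, and the parsing assumes the usual "Instance X is running on node Y" wording, which can vary between Oracle versions.

    import socket
    import subprocess

    # Placeholder path; the plugin would need to discover the real Grid home.
    SRVCTL = "/u01/app/19.0.0/grid/bin/srvctl"

    def local_instance_for(db_unique_name):
        """Ask Clusterware which instance of the database is running on this node."""
        status = subprocess.run([SRVCTL, "status", "database", "-d", db_unique_name],
                                capture_output=True, text=True, check=True)
        node = socket.gethostname().split(".")[0]
        for line in status.stdout.splitlines():
            # Typical output line: "Instance NTAP1 is running on node rac01"
            if "is running on node" in line and line.rstrip().endswith(node):
                return line.split()[1]
        return None

    print(local_instance_for("NTAP"))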