Yet another SME w/DAG question

Craig_McKellar

Hiya,

I know backing up DAGs with SnapManager for Exchange (SME) has been discussed quite a lot, but I was wondering whether anyone has advice on the following setup:

2 Mailbox servers in production and 1 in DR over a LES (sub 250ms), all part of the DAG. There are 2 databases in the DAG: each production server holds one active and one passive copy, while the DR server holds an additional passive copy of each. So MB1 has DB1 (active) and DB2 (passive), MB2 has DB1 (passive) and DB2 (active), and MB3 (DR) has DB1 (passive) and DB2 (passive).
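
For readers following along, the layout above can be written down as a small Python sketch. It only restates the copy placement from the post and checks that each database has exactly one active copy; the helper names `copies_of` and `active_server` are made up for illustration and have nothing to do with SME or Exchange tooling.

```python
# Hypothetical model of the DAG layout described in the post.
# Server and database names mirror the post; the helpers are illustrative only.
dag = {
    "MB1": {"DB1": "active", "DB2": "passive"},
    "MB2": {"DB1": "passive", "DB2": "active"},
    "MB3": {"DB1": "passive", "DB2": "passive"},  # DR node
}


def copies_of(db):
    """Return {server: role} for every copy of the given database."""
    return {server: roles[db] for server, roles in dag.items() if db in roles}


def active_server(db):
    """Return the single server holding the active copy of db."""
    actives = [s for s, role in copies_of(db).items() if role == "active"]
    if len(actives) != 1:
        raise ValueError(f"{db}: expected exactly one active copy, found {len(actives)}")
    return actives[0]


for db in ("DB1", "DB2"):
    print(db, "is active on", active_server(db), "with copies", copies_of(db))
```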

What I'm aiming for is the DR server performing the verification and keeping up to 8 weeks' retention (as we have additional space on SATA), while the production servers keep only up to 1 week's retention.

Currently we do 1 full backup at midnight, which is verified, then a few full backups without verification during the day, along with FRPs.

I was thinking along the lines of having the DR server run the scheduled tasks using the -clusteraware option, but I'm not sure how to handle the different retention levels. Would it be possible to use the gapless copy feature so that the production servers only take a copy backup? If so, will that impact log truncation in live?

Appreciate any advice 😃

Cheers

1 ACCEPTED SOLUTION

christoph

Hi,

If you add the DAG as a member in the SME view, you will see all databases in the DAG for each server. In my case, you see the active and passive copies (thpsvie0107/thpsvie0106) on each server. You should see three copies of each database on all of your servers.

On the Backups and Verify window, select Advanced; there you'll find the relevant options.

This allows you to select the passive DBs in the first place and create a copy backup of the active DBs with a different retention and without verification, which takes the burden of verification off your active servers. I guess this is the behaviour you'd like to achieve; please correct me if I misunderstood the request.
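
The split described above can be sketched as a small planning function. This is only an illustration in Python, not SME's own logic or API: the `BackupJob` type, the `plan_jobs` function, and the 56-day and 7-day retention values (the 8 weeks and 1 week from the original question) are assumptions, and the server and database names mirror the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupJob:
    server: str
    database: str
    backup_type: str      # "full" or "copy"
    verify: bool
    retention_days: int


# Copy placement as described in the question (MB3 is the DR node).
DAG = {
    "MB1": {"DB1": "active", "DB2": "passive"},
    "MB2": {"DB1": "passive", "DB2": "active"},
    "MB3": {"DB1": "passive", "DB2": "passive"},
}
DR_SERVER = "MB3"


def plan_jobs(dag, dr_server, dr_retention_days=56, prod_retention_days=7):
    """Verified full backups of the DR passive copies, unverified copy
    backups of the active copies on the production servers."""
    jobs = []
    for db, role in sorted(dag[dr_server].items()):
        if role == "passive":
            jobs.append(BackupJob(dr_server, db, "full", True, dr_retention_days))
    for server, roles in sorted(dag.items()):
        if server == dr_server:
            continue
        for db, role in sorted(roles.items()):
            if role == "active":
                jobs.append(BackupJob(server, db, "copy", False, prod_retention_days))
    return jobs


for job in plan_jobs(DAG, DR_SERVER):
    print(job)
```

Keeping the plan as plain data makes it easy to compare against what the scheduled jobs actually do after a failover, when the active copies may have moved to a different node.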

Please test the behaviour of the backup job if it is scheduled on all cluster nodes, and especially if the cluster group of the DAG fails over to a different host. I have seen very different behaviour, ranging from hanging backup jobs to errors creating backups, if the job is not scheduled on the owner node of the cluster group, and I would like to hear whether this is just my test environment or a generic problem.

Thanks.


BG Christoph

Craig_McKellar

That was exactly it - I set this up and it's working almost perfectly 😃  The passive databases are backed up on the DR side but it also kicks off the copy backup on live.

The only issue I now have is that the verification on DR is taking upwards of 11 hours to complete. The aggregate that contains the databases is only a half-populated SATA shelf, so I don't think it has the IOPS to cope with the verification process (the disk queue length is averaging 1000!). I think we'll have to look at disabling verification until we can get some more spindles in that aggregate.

Craig_McKellar

As a side note, I've noticed some of the verification FlexClones aren't being disconnected from the DR DAG node; they're showing as "busy, vclone - Busy".

I'll check the logs, but I'd hazard a guess that this is because the verification runs over into the next scheduled backup, so the clone gets locked?
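
A quick way to sanity-check that guess is to see which scheduled backups would start while an 11-hour verification is still running. The Python sketch below assumes the verification starts with the midnight backup; the daytime backup times (06:00, 12:00, 18:00) are hypothetical, since the thread only says "a few" run during the day, and `overlapped_backups` is an illustrative helper, not part of SME.

```python
from datetime import datetime, timedelta


def overlapped_backups(verify_start, verify_duration, backup_starts):
    """Return the scheduled backup start times that fall inside the
    window in which verification is still running."""
    verify_end = verify_start + verify_duration
    return [t for t in backup_starts if verify_start < t < verify_end]


# Midnight full backup with verification (from the thread); the date is arbitrary.
midnight = datetime(2024, 1, 1, 0, 0)

# Hypothetical daytime schedule; the real times are not given in the thread.
daytime = [midnight + timedelta(hours=h) for h in (6, 12, 18)]

hits = overlapped_backups(midnight, timedelta(hours=11), daytime)
print([t.strftime("%H:%M") for t in hits])  # ['06:00']
```

With this assumed schedule, one daytime backup starts while verification is still in progress, which fits the suspicion that the clone is still in use when the next job runs.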

christoph

Busy vclones mean the disk still seems to be mounted on a host. Check whether the verification failed to disconnect the disk. This happens in rare cases, and the clone should then be cleared out manually.
