SnapMirror causing intermittent event ID 1135 cluster errors in Exchange 2010

Hi,

 

We have NetApp OnCommand System Manager 8.3.1 p2. The daily SnapMirror is intermittently causing Exchange mailbox servers to drop out of the DAG, logging event ID 1135 errors in the System event log:

 

  • Cluster node 'NNNNNN' was removed from the active failover cluster membership. The Cluster service on this node may have stopped.

It occurs approximately 10 minutes after the backup starts. The affected mailbox server varies. It doesn't happen daily; sometimes there are 8 days between episodes.

The backup was set to start at 10pm, and the issue occurred at roughly 10:10pm. To prove the backup was the cause, we moved it to 11pm. The issue moved to 11:08pm.

 

Exchange is 2010 SP3 with the latest rollups and hotfixes. The DAG is multisite (3 subnets). Same-subnet cluster heartbeats are set to 2/10 and cross-subnet to 4/10, i.e. Microsoft best practice.
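
For reference, here is a rough sketch of how I read those heartbeat values. It assumes "2/10" means a 2000 ms heartbeat delay with a threshold of 10 missed heartbeats (and 4000 ms with a threshold of 10 across subnets); that reading and the function name are illustrative assumptions, not something taken from the cluster itself.

```python
def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate time a node can miss heartbeats before the cluster removes it.

    Assumes tolerance = heartbeat delay x missed-heartbeat threshold.
    """
    return delay_ms * threshold / 1000.0


# Assumed interpretation of the "2/10" and "4/10" settings.
same_subnet = heartbeat_tolerance_seconds(2000, 10)
cross_subnet = heartbeat_tolerance_seconds(4000, 10)

print(f"Same-subnet tolerance: {same_subnet:.0f} s")
print(f"Cross-subnet tolerance: {cross_subnet:.0f} s")
```

If that reading is right, a node would have to be unreachable for roughly 20 seconds (40 seconds across subnets) before it is removed from the cluster membership.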

 

Anyone experienced this?

Any recommendations?

Should we be using SnapManager for Exchange instead of OnCommand System Manager?

Is OnCommand System Manager fully compatible with Exchange 2010 SP3?

Are any specific settings required when backing up Exchange?

Are there any best practices for configuration/settings of OnCommand SM?

 

BTW: I'm a Microsoft Windows/AD/Exchange/etc person, not NetApp, so be gentle with me here.

 

Thanks for your help.

Martin