We have a large number of unmanaged SnapMirror and Vault relationships that are monitored by OCUM 9.5. Most of these are vaults that have a 12 hour lag, but we have some mirror volumes that replicate every 5 minutes.
Unfortunately, the global "Lag Thresholds for Unmanaged Relationships" setting doesn't fit both types of relationship. For example:
Warning is set to 150%
Error is set to 250%
Our 5 minute SnapMirror relationship produces an error if the lag exceeds 12 min 30 sec
Our 12 hour Vault relationship would produce an error after 30 hours.
There's one particular mirror relationship that exceeds the error threshold overnight whilst it processes data. We aren't concerned by the lag, but we do find it onerous that it generates an alert every day which in turn goes to our incident ticketing system.
If we increased the error threshold to only alert after 30 mins on the 5 min relationship (600%), it would mean we only receive a notification for failed backups on the 12 hour vaults after 72 hours, i.e. 3 days!
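To make the arithmetic above easy to check, here is a minimal Python sketch. It assumes the threshold is simply a percentage of the relationship's update interval, as described in this post; the function name `lag_limit` is made up for illustration and is not part of OCUM.

```python
from datetime import timedelta


def lag_limit(interval: timedelta, threshold_percent: float) -> timedelta:
    """Return the lag at which a percentage threshold is crossed.

    Assumption: the threshold is a plain percentage of the update interval.
    """
    return interval * (threshold_percent / 100)


mirror = timedelta(minutes=5)   # 5 minute SnapMirror relationship
vault = timedelta(hours=12)     # 12 hour Vault relationship

# Compare the same global percentage against both relationship types.
for pct in (150, 250, 600):
    print(f"{pct}%: mirror {lag_limit(mirror, pct)}, vault {lag_limit(vault, pct)}")
```

The same percentage always scales both relationship types together, which is why no single global value suits a 5 minute mirror and a 12 hour vault at once.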
Is there a way to set thresholds per relationship? If not, are there any plans to introduce this in future versions?
I agree that the threshold is pretty generic. Nonetheless it follows a rule.
150% means you missed one update, 200% means you missed two. That's irrespective of the time between updates (as you correctly noted). Unfortunately there is nothing you can do about it in OCUM.
Now - you say the lag is expected as updates just take longer during the night.
So you can't change the thresholds in OCUM on a per-relationship basis, but how about accounting for the longer updates in your SnapMirror schedule?
I assume you currently have a pretty simple schedule, meaning just every 5 min.
How about building a slightly more sophisticated schedule that accounts for the longer update times at night?
With that, OCUM will apply the 150% and 200% thresholds to each interval individually. So let's say you keep 5 min from 8am to 8pm, but change to 20 min from 8pm to 8am: alerts would then only be triggered by lag times of 30 min or more at night, but 7.5 min and 10 min respectively during the day.
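As a rough illustration of that idea, here is a small Python sketch. It assumes OCUM evaluates the percentages against whichever interval is active at the time (as suggested above) and uses the 150% and 200% figures from this example; the names `interval_at` and `thresholds_at` are hypothetical helpers, not OCUM or ONTAP functions.

```python
from datetime import time, timedelta

DAY_START, DAY_END = time(8, 0), time(20, 0)  # 8am to 8pm, as in the example


def interval_at(t: time) -> timedelta:
    """Update interval in effect at time of day t (5 min by day, 20 min at night)."""
    if DAY_START <= t < DAY_END:
        return timedelta(minutes=5)
    return timedelta(minutes=20)


def thresholds_at(t: time, warning_pct: float = 150, error_pct: float = 200):
    """Return (warning_lag, error_lag) for the interval active at time t."""
    interval = interval_at(t)
    return interval * (warning_pct / 100), interval * (error_pct / 100)


for label, t in (("day", time(12, 0)), ("night", time(23, 0))):
    warning, error = thresholds_at(t)
    print(f"{label}: warning after {warning}, error after {error}")
```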
Kind regards, Niels
I'm intrigued by your suggestion of creating a more sophisticated schedule, but I'm not sure how it can be done.
From what I can see, a SnapMirror relationship can only have a single cron schedule associated with it, so I guess the schedule is where we need to apply the sophistication. I'd love to see some examples please, because at the moment the only solution I can see is to have a very long schedule that specifies every time option, e.g. 8:05, 8:10, 8:15, 8:20, 8:25, 8:30, 8:35, 8:40, 8:45, 8:50, 8:55, 9:00, 9:05, etc.
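To get a feel for how long such an explicit schedule would be, here is a minimal Python sketch that enumerates every run time for the 5 min by day / 20 min at night example. It only lists times of day; it does not generate ONTAP commands, and `run_times` is a hypothetical helper written purely for illustration.

```python
from datetime import time


def run_times(day_step=5, night_step=20, day_start=8, day_end=20):
    """List every run time in a 24 hour period for a mixed-interval schedule."""
    times = []
    for hour in range(24):
        # Daytime hours use the short step, all other hours the long one.
        step = day_step if day_start <= hour < day_end else night_step
        for minute in range(0, 60, step):
            times.append(time(hour, minute))
    return times


times = run_times()
print(f"{len(times)} run times per day")
print(", ".join(t.strftime("%H:%M") for t in times if t.hour in (7, 8)))
```

Printing the hours either side of 8am shows where the schedule switches from the night interval to the day interval.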
Apologies, I should have said that I can create a job schedule interval, but when I try to attach this to a SnapMirror relationship's schedule using "snapmirror modify -destination-path <path> -schedule <interval_name>", it gives me that error message.
Thanks for your help with this! I'm just setting up mirrors between different versions of sims on my laptop to see if I can replicate the error on 9.1 and 9.3, just in case there's a setting on our prod clusters preventing it from working with intervals.
Phew, thank you for confirming you also see the same problem, because I still had the error on my 9.6 sim! I've since been looking on the support pages to try to see what I was doing wrong before raising a support case. No problem about the confusion, I'm glad it wasn't just me!
Hopefully this thread can be used as evidence to request an RFE with OCUM Product Management for SnapMirror Lag Thresholds per Relationship please?
Chiming in here too. This feature is not currently on the roadmap for Active IQ Unified Manager. I have raised an RFE (1258908) for consideration by engineering. I have also briefly spoken to the PM. All of that said, this definitely won't make 9.7, which means 9.8/9.9 at the earliest! So you may want to try those long schedules for those tricky individual relationships. You can play about with the "Advanced" schedule pane in "Schedules" in System Manager: it allows you to be very specific with Month/Day/Week/Hour/Minute, and these can definitely be applied to individual volume relationships.
Let us know if this gives you a temporary workaround!