ONTAP Discussions
NetApp best practices, at least in earlier versions, recommend configuring event notifications on a cluster with the "important-events" filter. After upgrading from ONTAP 9.3 to 9.5 last week, I've noticed that this mechanism has become much more chatty, and not in a good way. I now get two "NOTICE" severity alerts every time I create a new LUN, for example. One says:
------------------------------
LUN.nvfail.option.on: The nvfail option for the volume <volume> has been turned on. The volume contains LUN or NVMe namespaces and it is recommended that this option be kept on for such volumes.
-------------------------------
The other says:
-------------------------------
LUN.space.resv.not.honored: Space reservations in volume idc_v_pobdb01t_dblogs1 (DSID 5359) are not honored, either because the volume space guarantee is set to 'none' or it is disabled due to lack of space in its containing aggregate.
-------------------------------
Any ideas on how to turn off particular annoying alerts like this but keep the others?
You can modify the severity filter rules to get fewer alerts (EMERGENCY, ALERT, ERROR, NOTICE).
How to configure event filter and event notification for email destination
Thank you @Mjizzini ! After checking, my current settings appear to be the default. Unless I'm misreading the 'exclude' line, NOTICEs shouldn't generate email alerts but they are. Am I misunderstanding it?
Also, there is actually ONE notice type, for autogrow, that I *do* want to get. I've been happy to see those alerts pop up so I'm aware of them.
Should I delete the exclude rule and replace it with one that specifies NOTICES and other informational alerts, but put a new INCLUDE rule further up that includes the one for autogrow? Will that do the trick?
Hi Tmadocthomas,
You can add an exclude rule for NOTICE messages to the filter "important-events" and re-position the rule to the top. We don't recommend this, because self-correcting events such as under/over temperature or under-voltage from power outages come through as NOTICE (for example, chassisTemperature.ok). Excluding those self-correcting messages might lead you to think an issue is still ongoing and investigate it unnecessarily. Perhaps creating a rule that moves NOTICE alerts to a folder is a better approach, in case you do need to verify something.
Here is an example of what an exclude rule for NOTICE events should look like in your configuration:
cluster::> event filter show
Filter Name Rule     Rule      Message Name           SNMP Trap Type  Severity
            Position Type
----------- -------- --------- ---------------------- --------------- --------
important-events
            1        exclude   *                      *               NOTICE
            2        include   *                      *               EMERGENCY, ALERT
            3        include   callhome.*             *               ERROR
            4        exclude   *                      *               *
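To see why rule position matters, here is a minimal Python sketch of the rule ordering. It is a model only, not ONTAP code, and it rests on a few assumptions: rules are evaluated top-down and the first match wins, message-name patterns behave like shell globs, and the SNMP trap type column is ignored. The include rule at position 1 and the event name wafl.vol.autoSize.done are hypothetical, added to illustrate how a single NOTICE type such as autogrow could be kept while other NOTICE events are excluded.

```python
from fnmatch import fnmatchcase

# Each rule is (type, message-name pattern, severities); "*" means any severity.
# Positions 2-5 mirror the filter shown above; position 1 is a hypothetical
# include for autosize events, placed ahead of the NOTICE exclude.
RULES = [
    ("include", "wafl.vol.autoSize.*", {"*"}),
    ("exclude", "*", {"NOTICE"}),
    ("include", "*", {"EMERGENCY", "ALERT"}),
    ("include", "callhome.*", {"ERROR"}),
    ("exclude", "*", {"*"}),
]


def evaluate(rules, message_name, severity):
    """Return the type of the first rule that matches (assumed first-match-wins)."""
    for rule_type, pattern, severities in rules:
        name_matches = fnmatchcase(message_name, pattern)
        severity_matches = "*" in severities or severity in severities
        if name_matches and severity_matches:
            return rule_type
    return "exclude"  # assumed default when no rule matches


print(evaluate(RULES, "LUN.nvfail.option.on", "NOTICE"))    # exclude
print(evaluate(RULES, "wafl.vol.autoSize.done", "NOTICE"))  # include
```

If the hypothetical include were placed below the NOTICE exclude instead, the autosize event would be excluded too under this model, which is why the order of the two rules is the important part.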
If you are still running into issues after modifying the filter, you could be encountering bug 1045538 (EMS notification configuration is not synchronized across all nodes), where the configuration isn't propagated to all nodes.
The workaround for this is:
::> set advanced
::*> event config force-sync -node <node-name-not-applying-rule>
Regards,
Team NetApp
Thank you @ttran ! If this is still a best practice I don't want to change anything, however I'm not following why the two specific messages I mentioned in my primary post are being sent. They are purely informational, essentially telling me something I already know, and don't appear to add any value. Any thoughts on these two messages in particular? This only started after we upgraded from 9.3 to 9.5.
Hi Tmadocthomas,
I tested the "exclude" functionality in the lab and it is working as designed, where the two NOTICE level messages are excluded from the filter "important-events." Something you can check is if you have the filter "no-info-debug-events" configured to send events, as that does include severity level NOTICE.
You can view all your enabled filters using:
::*> event notification show
ID   Filter Name                    Destinations
---- ------------------------------ -----------------
1    default-trap-events            snmp-traphost
2    important-events               snmp-traphost
2 entries were displayed.
Here are my configured event filters:
::*> event filter show
Filter Name Rule     Rule      Message Name           SNMP Trap Type  Severity
            Position Type
----------- -------- --------- ---------------------- --------------- --------
default-trap-events
            1        include   *                      *               EMERGENCY, ALERT
            2        include   callhome.*             *               ERROR
            3        include   *                      Standard, Built-in *
            4        exclude   *                      *               *
important-events
            1        include   *                      *               EMERGENCY, ALERT
            2        include   callhome.*             *               ERROR
            3        exclude   *                      *               *
no-info-debug-events
            1        include   *                      *               EMERGENCY, ALERT, ERROR, NOTICE
            2        exclude   *                      *               *
test1
            1        include   *                      *               EMERGENCY, ALERT, ERROR
            2        exclude   *                      *               *
11 entries were displayed.
You can test whether a specific event will be included or excluded by a specific filter using the following:
::*> event filter test -filter-name default-trap-events -message-name LUN.nvfail.option.on
The message-name "LUN.nvfail.option.on" is excluded from the given filter.
::*> event filter test -filter-name default-trap-events -message-name LUN.space.resv.not.honored
The message-name "LUN.space.resv.not.honored" is excluded from the given filter.
::*> event filter test -filter-name important-events -message-name LUN.nvfail.option.on
The message-name "LUN.nvfail.option.on" is excluded from the given filter.
::*> event filter test -filter-name important-events -message-name LUN.space.resv.not.honored
The message-name "LUN.space.resv.not.honored" is excluded from the given filter.
Running the same tests for both LUN events against the filter "no-info-debug-events" returns "included":
::*> event filter test -filter-name no-info-debug-events -message-name LUN.nvfail.option.on
The message-name "LUN.nvfail.option.on" is included in the given filter.
::*> event filter test -filter-name no-info-debug-events -message-name LUN.space.resv.not.honored
The message-name "LUN.space.resv.not.honored" is included in the given filter.
Here is a reference document for "event filter test" command:
Regards,
Team NetApp
@ttran this was super-helpful!! Thank you very much. My settings match yours, except I have one extra filter that also includes NOTICE alerts. This is the one that is generated when adding EMS events from OnCommand Unified Manager (or now Active IQ Unified Manager):
CITHQIFNACL01P-EBSCO-COM_filter
            1        include   objstore.host.*        *               *
            2        include   objstore.interclusterlifDown
                                                      *               *
            3        include   cloud.aws.*            *               *
            4        include   qos.monitor.memory.*   *               *
            5        include   NVMeNS.*               *               *
            6        include   fg.space.member.*      *               *
            7        include   fg.inodes.member.*     *               *
            8        include   arl.netra.ca.check.*   *               *
            9        include   gb.netra.ca.check.*    *               *
            10       include   monitor.vol.*          *               *
            11       include   sms.*                  *               *
            12       include   wafl.vol.autoSize.*    *               *
            13       include   LUN.*                  *               *
            14       include   Nblade.*               *               *
            15       include   cifs.shadowcopy.*      *               *
            16       include   fabricpool.*           *               *
            17       include   osc.signatureMismatch  *               *
            18       include   nvmf.graceperiod.*     *               *
            19       exclude   *                      *               *
The "LUN.*" entry is what's doing it of course. It also shows up in no-info-debug-events, however I have no destinations set for that one so I think I'm getting the alerts from the Unified Manager one.
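The reasoning here (an event is only sent if some filter that has a destination includes it) can be sketched in Python. This is a simplified model under stated assumptions, not ONTAP behavior verified against the product: first-match-wins rule evaluation, glob-style message-name patterns, SNMP trap type ignored, and the Unified Manager filter trimmed to three of its rules. The destination names "email-admins" and "um-server" are hypothetical placeholders.

```python
from fnmatch import fnmatchcase

# Filter name -> ordered rules of (type, message-name pattern, severities).
FILTERS = {
    "important-events": [
        ("include", "*", {"EMERGENCY", "ALERT"}),
        ("include", "callhome.*", {"ERROR"}),
        ("exclude", "*", {"*"}),
    ],
    "no-info-debug-events": [
        ("include", "*", {"EMERGENCY", "ALERT", "ERROR", "NOTICE"}),
        ("exclude", "*", {"*"}),
    ],
    # Trimmed copy of the Unified Manager filter listed above.
    "CITHQIFNACL01P-EBSCO-COM_filter": [
        ("include", "wafl.vol.autoSize.*", {"*"}),
        ("include", "LUN.*", {"*"}),
        ("exclude", "*", {"*"}),
    ],
}

# Filter name -> destinations (hypothetical names). no-info-debug-events has
# no entry because no destination is configured for it.
NOTIFICATIONS = {
    "important-events": ["email-admins"],
    "CITHQIFNACL01P-EBSCO-COM_filter": ["um-server"],
}


def is_included(rules, message_name, severity):
    """Apply the rules in order; the first matching rule decides (assumption)."""
    for rule_type, pattern, severities in rules:
        if fnmatchcase(message_name, pattern) and (
            "*" in severities or severity in severities
        ):
            return rule_type == "include"
    return False


def delivering_filters(message_name, severity):
    """List the filters that both include the event and have a destination."""
    return [
        name
        for name, destinations in NOTIFICATIONS.items()
        if destinations and is_included(FILTERS[name], message_name, severity)
    ]


print(delivering_filters("LUN.nvfail.option.on", "NOTICE"))
```

In this model only the Unified Manager filter delivers the LUN notice: important-events excludes it, and no-info-debug-events includes it but has nowhere to send it.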
So: I can delete that filter, but I do like being aware of truly critical EMS events. Any suggestions? Should I remove this from OCUM and then re-create it under 9.5 to see if it "cleans it up"?
Removing it from OCUM and then re-creating it under 9.5 would be a good troubleshooting step.
Thanks @Mjizzini. It turns out that somewhere along the way, this filter lost its connection to Active IQ UM, so it was sort of acting as a 'standalone' filter anyway. I removed it and tried to re-add it from AIQUM but it didn't work. I'll figure that out later. Thanks again.