I work at a site that doesn't allow AutoSupport to send messages to NetApp support. I have set up email alerts internally, but I am looking for ideas and suggestions on what to alert on. To start, I am looking for alerts when the following occur:
a drive failure, a port going down or offline, and a volume exceeding 90% full.
I realize there are additional pieces of software I can buy that will notify me when these events occur, but we have no budget for more software.
I am running ONTAP 9.7P12.
Looking at the ONTAP 9.7 EMS event catalog, I believe "disk.outOfService" and "disk.ioRecoveredError.pfa" are the events I should use for failing or failed disks, unless someone has a better suggestion.
Does anyone have suggestions on the exact events I should monitor for ports or hardware going offline and for volumes over 90% full?
Are you using AIQUM (Active IQ Unified Manager) to capture SNMP traps, or just the general alert notifications? Many of the typical hardware failures (for example hardware faults, takeover, and link down) are already covered by AutoSupport, provided it is configured properly to send you email notifications.
As suggested, AIQUM is free, and you can use it to consolidate alerts from all your storage systems (assuming they are all running ONTAP 9.x). As for alerts, the default settings are usually sufficient. You can have AutoSupport messages sent to an internal email address. If the alerts are too much for you, you can then play around with filters using the "event notification/configuration" commands.
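For the filter approach mentioned above, here is a rough sketch of what the EMS setup could look like. It is a small Python helper that only prints the ONTAP 9 CLI lines for an event filter, an email destination, and a notification that ties them together. The filter name, destination name, and email address are made up for illustration, and the two event names are simply the ones from the original question, so check them (and the command syntax) against the 9.7 EMS catalog and CLI reference before running anything on a cluster. It also assumes the mail server and sender address are already configured.

```python
def ems_email_commands(filter_name, destination, email, events):
    """Build ONTAP CLI lines that route the given EMS events to one email address.

    All names passed in are placeholders chosen by the caller; nothing here
    talks to a cluster, it only returns strings to review and paste.
    """
    # Create an empty filter, then add one include rule per EMS message name.
    lines = [f"event filter create -filter-name {filter_name}"]
    for name in events:
        lines.append(
            f"event filter rule add -filter-name {filter_name} "
            f"-type include -message-name {name}"
        )
    # Define the email destination and link it to the filter.
    lines.append(
        f"event notification destination create -name {destination} -email {email}"
    )
    lines.append(
        f"event notification create -filter-name {filter_name} "
        f"-destinations {destination}"
    )
    return lines


# Hypothetical names; the event list is taken from the question above.
commands = ems_email_commands(
    "disk-alerts",
    "storage-admins",
    "storage-admins@example.com",
    ["disk.outOfService", "disk.ioRecoveredError.pfa"],
)
print("\n".join(commands))
```

Once the port and volume events have been identified in the catalog, they can be appended to the same list or put in a separate filter if they should go to a different mailbox.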