Updated: April 17, 2013 - updated templates attached, please update your systems.
How to: Build a multilevel alert in Performance Advisor
In this article I build on the excellent technical report TR-4090 Performance Advisor Features and Diagnosis (7-mode), which outlines what to monitor with Performance Advisor. If you have not seen this paper, I highly recommend you download it here. It will give you a sound methodology and explain which counters really matter.

Monitoring a performance counter with a threshold is useful, but the typical threshold has a single value and a single severity level: error.
In this article, I'll show you how to build multilevel thresholds as defined in the Performance Advisor Default Thresholds document and assign alarms to those so you get notified as each is breached.
Building and applying a multilevel threshold is a four-step process:
- Build the template
- Set the severity levels
- Apply the template to objects to monitor
- Create alarms for notification
First, decide what you want to monitor. In our case, I'll use the latency for the CIFS protocol and build a single threshold template with three distinct thresholds, each with a different severity level. Here are the thresholds and the severities we'll be using.
Counter | Threshold | Severity
cifs:cifs_latency | > 15ms over 5 minutes | warning
cifs:cifs_latency | > 20ms over 5 minutes | error
cifs:cifs_latency | > 40ms over 5 minutes | critical
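To make the idea of a multilevel threshold concrete, here is a minimal Python sketch of the table above. It is only an illustration of the logic, not how Performance Advisor evaluates counters internally; the names CIFS_LATENCY_LEVELS, breached_levels, and worst_level are made up for this example. It also ignores the 5 minute interval and simply compares a single latency value against each level.

```python
# Thresholds from the table above, in milliseconds, lowest first.
# Illustrative names only; nothing here is a Performance Advisor API.
CIFS_LATENCY_LEVELS = [
    ("warning", 15.0),
    ("error", 20.0),
    ("critical", 40.0),
]


def breached_levels(latency_ms, levels=CIFS_LATENCY_LEVELS):
    """Return every severity whose threshold the value exceeds."""
    return [severity for severity, limit_ms in levels if latency_ms > limit_ms]


def worst_level(latency_ms, levels=CIFS_LATENCY_LEVELS):
    """Return the most severe breached level, or 'normal' if none is breached."""
    hits = breached_levels(latency_ms, levels)
    return hits[-1] if hits else "normal"


if __name__ == "__main__":
    for sample_ms in (12.0, 17.5, 25.0, 55.0):
        print(f"{sample_ms:5.1f} ms -> {worst_level(sample_ms)} {breached_levels(sample_ms)}")
```

Note that a value above 40 ms exceeds all three levels at once, which is why each level gets its own event in the template we build next.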
Step 1: Build the template
Log in to the NetApp Management Console and select Performance Advisor. Go to the bottom of the navigation panel and click Setup. In the upper portion of the panel, click Threshold Templates. Click the Add icon in the main panel to start the Add Threshold Template Wizard.

Use the Next button to step through the wizard, completing this information:
Name: NA_CIFS_Latency
Description: CIFS Latency > 15/20/40ms over 5 min
Threshold Interval(seconds): 300
Event Name: CIFS_Latency_warn
Object: cifs
Counter: cifs_latency
Type: upper
Value: 15
Unit: millisec
Event Name: CIFS_Latency_error
Object: cifs
Counter: cifs_latency
Type: upper
Value: 20
Unit: millisec
Event Name: CIFS_Latency_critical
Object: cifs
Counter: cifs_latency
Type: upper
Value: 40
Unit: millisec
When completed, your template should look like this:

Step 2: Set the Severity Levels
When you create an event in a template, Performance Advisor assigns it a severity of error. You need to modify the warning and critical level events so that they have the correct severity level. To do this, go to the command line on the OnCommand server.
Each performance 'event' consists of two conditions: a normal condition and a breached condition. The normal condition is the expected state, which is not an error, while the breached condition signifies the abnormal state you are monitoring for.
To set the correct severity we will modify the breached condition of each event using the dfm eventType command. The syntax of the command is:
dfm eventType modify -v <event-severity> <event-name>
where
<event-name> = <event type>:<event>:<condition>
In our case, the <event type> is perf since this is a performance event, the event is the name we created, and the condition is breached. The commands to modify our warning and critical events to the correct severity are:
C:\>dfm eventType modify -v critical perf:CIFS_Latency_critical:breached
Modified event "perf:CIFS_Latency_critical:breached".
C:\>dfm eventType modify -v warning perf:CIFS_Latency_warn:breached
Modified event "perf:CIFS_Latency_warn:breached".
The commands to verify our modifications are:
C:\>dfm eventType list perf:CIFS_Latency_warn:breached
Event Name Severity Class
-------------------------------------------------- ------------ ------------------
perf:CIFS_Latency_warn:breached Warning perf:CIFS_Latency_warn
C:\>dfm eventType list perf:CIFS_Latency_critical:breached
Event Name Severity Class
-------------------------------------------------- ------------ ------------------
perf:CIFS_Latency_critical:breached Critical perf:CIFS_Latency_critical
If you are creating a series of these events, create a batch file or shell script to make the severity changes for you. It will make things much easier and quicker.
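As one way to script the severity changes, here is a minimal Python sketch that builds the same dfm eventType modify commands shown above from a small mapping. The function names and the dry_run flag are my own for this example, and it assumes dfm is on the PATH of the OnCommand server. By default it only prints the commands so you can review them before running anything.

```python
import subprocess

# Event names from the template in Step 1 and the severity each should carry.
# The error-level event is left out because error is already the default.
SEVERITIES = {
    "CIFS_Latency_warn": "warning",
    "CIFS_Latency_critical": "critical",
}


def build_command(event, severity):
    """Return the argument list for one 'dfm eventType modify' call."""
    return ["dfm", "eventType", "modify", "-v", severity, f"perf:{event}:breached"]


def apply_severities(severities, dry_run=True):
    """Print (dry run) or execute one modify command per event."""
    commands = [build_command(event, sev) for event, sev in severities.items()]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Assumption: dfm is on the PATH; check=True stops at the first failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    apply_severities(SEVERITIES)
```

Once the printed commands look right, call apply_severities(SEVERITIES, dry_run=False) on the OnCommand server, then verify with dfm eventType list as shown earlier.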
Step 3: Apply the threshold template to what you want to monitor
We designed this template to monitor latency for the CIFS protocol, which is a controller-level value. Our next step is to apply the template to the controllers we wish to monitor.
Select the threshold template NA_CIFS_Latency, right-click, and choose Objects from the drop-down menu. In the Objects pane, select the controllers to monitor in the left panel and use the > button to move them to the right panel. Click the OK button to begin monitoring the controllers for the threshold.

Now the system will generate a warning, error, or critical event entry in the log each time the threshold is breached, and a normal event when the counter returns below the threshold.
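The breached and normal pairing can be pictured as a simple state change. The following Python sketch is only a model of that behavior under simplifying assumptions: each sample stands for a value that has already been evaluated over the threshold interval, and the function name transitions and the sample data are invented for illustration. It is not a description of how Performance Advisor is implemented.

```python
def transitions(samples_ms, limit_ms):
    """Return (index, condition) for each change between normal and breached."""
    state = "normal"
    events = []
    for index, value in enumerate(samples_ms):
        new_state = "breached" if value > limit_ms else "normal"
        if new_state != state:
            # Log only the change, not every sample above the limit.
            events.append((index, new_state))
            state = new_state
    return events


if __name__ == "__main__":
    # Hypothetical CIFS latency samples in milliseconds.
    samples = [10, 12, 18, 22, 16, 14, 9]
    for name, limit in (("warning", 15), ("error", 20), ("critical", 40)):
        print(name, transitions(samples, limit))
```

With these samples the warning level logs a breached entry at the third sample and a normal entry once the value drops back to 14 ms, the error level is breached only briefly, and the critical level never fires.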
OK, we're almost done. You only have one more step!
Step 4: Add Alarms for the Events
Now that you are monitoring the thresholds and generating events in the logs, one step remains - telling the system for which events you need alarms (notifications) generated.
In the Setup navigation panel, select Thresholds. In the main panel, select the threshold for which you want to generate alarms.

Hint: you can use the filter in the 'Event Name' and 'Object' columns to quickly locate the events. For our events, enter 'CIFS' in the filter box and the display will show only events beginning with 'CIFS'.
Once you have selected the threshold, right-click and choose Add Alarm from the drop-down menu. In the Add Alarm Wizard, fill in the fields to notify via e-mail or pager (SMS messaging), to execute a script, or to send a trap to an SNMP host. Click Next to move to the next panel, then enter the time range during which the alert is active and whether to repeat notifications. Click Next one more time, review your choices, then click Finish to save the alarm. Congratulations, you're now set to monitor and alert.

Quick Start
As a quick start, I have uploaded a pre-built set of multilevel threshold templates for OnCommand 5.0 that implement basic multilevel thresholds as outlined in the Thresholds document mentioned at the beginning of this post. The file BasicMultiThresholdTemplates.zip contains these thresholds, ready to import into Performance Advisor.
Updated: Two changes as of April 17, 2013.
The NA_CPU_Busy_HA and NA_CPU_Busy_SGL templates are updated to use the "avg_processor_busy" counter, which presents a more accurate picture of controller CPU utilization.
Threshold Template | Description
NA_CIFS_Latency | CIFS Latency > 15/20/40ms over 5 min
NA_CPU_Busy_HA | CPU busy > 50/70/90 Pct on controller in HA-Pair for 5 min
NA_CPU_Busy_SGL | CPU busy > 60/70/80 Pct on single controller system for 5 min
NA_DISK_Busy | Check for disks busy > 60/70/90 percent
NA_FCP_Latency | FCP avg latency > 10/20/30ms over 5 min
NA_ISCSI_Latency | iSCSI average latency > 15/20/30ms over 5min
NA_LUN_Latency | LUN average read/write latencies > 20/30/40ms for 5 min
NA_MAX_Disk_Busy | Maximum disk busy > 80/90/98 for 5 minutes
NA_NFSV3_Latency | NFS v3 average latency > 15/20/40ms over 5min.
NA_NFSV4_Latency | NFS v4 average latency > 15/20/40ms over 5min.
NA_SYS_Avg_Latency | Average latency across controller for all operations, 20/30/40ms
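Each description packs its three levels into a slash-separated triple such as 15/20/40. If you want to reuse those numbers in your own scripts, a small Python sketch like the one below can pull them apart. It assumes the triple is always ordered warning/error/critical, as it is for the CIFS template built in this article; the function name parse_levels is made up for this example.

```python
import re

# Assumption: triples are ordered warning / error / critical, as in NA_CIFS_Latency.
LEVELS = ("warning", "error", "critical")
_TRIPLE = re.compile(r"(\d+)/(\d+)/(\d+)")


def parse_levels(description):
    """Map the first a/b/c triple in a template description to severity levels."""
    match = _TRIPLE.search(description)
    if match is None:
        raise ValueError(f"no a/b/c threshold triple in: {description!r}")
    return dict(zip(LEVELS, (int(group) for group in match.groups())))


if __name__ == "__main__":
    print(parse_levels("CIFS Latency > 15/20/40ms over 5 min"))
    print(parse_levels("CPU busy > 50/70/90 Pct on controller in HA-Pair for 5 min"))
```

The units (milliseconds or percent) are not captured here, so keep the template name alongside the parsed values when you use them.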
These are based on the monitoring thresholds outlined in the Thresholds document. They do not include application-specific thresholds, but based on this article you should be able to create those very easily.
Good luck and happy monitoring!