OnCommand Performance Manager (OPM) 1.1 gives users access to the underlying data it collects, allowing integration with third-party tools and the creation of custom data presentations. The database details have not yet been published in the Release Candidate documentation, but this post covers what I’ve worked out so far. Everything is of course subject to change before the final release (please see the disclaimer at the end of this post). Since other posts have shown interest in the topic, I wanted to share what I’ve learned.
Database Access Setup
Both the OPM 1.1 and the companion OnCommand Unified Manager (OCUM) 6.2 release candidates support a new user role, “Database User”. This user is configured in the same manner as any other user, through the GUI under the Administration / Manage User menu pulldowns. One limitation of the Database User role is that the user must be locally authenticated. You cannot currently use remote authentication such as LDAP or Active Directory for this role. So you’ll have to add one more line item to the password rotation element of your normal security maintenance plan when you add database users to either OCUM or OPM. Additionally, OCUM and OPM do not share local user definitions: if you hook multiple OPMs under a single OCUM, you need to add and maintain database users separately on each instance.
Accessing the Database
OPM uses the MySQL database engine. You can access the database with any driver or connection mechanism that supports MySQL. For example, you can install the MySQL ODBC drivers on a Windows platform and thus make the database accessible to any application that can use ODBC. OPM accepts TCP connections on the standard MySQL port, 3306.
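As a rough illustration of the ODBC route, here is a minimal Python sketch that assembles a connection string for the MySQL ODBC driver. The host name, user name, and password are made-up placeholders, and the driver name is an assumption that depends on which version of the MySQL ODBC driver you installed; only the port (3306) and the database names come from what I have observed.

```python
def build_odbc_connection_string(host, user, password,
                                 database="netapp_model_view",
                                 port=3306,
                                 driver="MySQL ODBC 8.0 Unicode Driver"):
    """Assemble an ODBC connection string for an OPM database user.

    The driver name is an assumption; check the ODBC Data Source
    Administrator for the exact name registered on your system.
    """
    parts = [
        ("DRIVER", "{" + driver + "}"),
        ("SERVER", host),
        ("PORT", str(port)),
        ("DATABASE", database),
        ("UID", user),
        ("PWD", password),
    ]
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


# Hypothetical host and credentials, for illustration only.
conn_str = build_odbc_connection_string("opm.example.com", "dbuser", "secret")
print(conn_str)
```

The resulting string can be handed to any ODBC-capable client. With the third-party pyodbc package, for instance, it would be passed to pyodbc.connect(conn_str).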
OPM advertises a number of databases. Two are of interest to this discussion – netapp_model_view and netapp_performance. Database netapp_model_view has tables that define the objects and relationships among the objects for which performance data is collected, such as aggregates, SVMs, clusters, volumes, etc. Database netapp_performance has tables which contain the raw data collected as well as periodic rollups used to quickly generate the graphics OPM presents through its GUI.
Understanding the Data
Key to understanding the data layout is that OPM assigns each “object” a unique id. These ids are not coordinated with similar id fields maintained in the OCUM databases but only have meaning within the OPM context. The id fields define the relationships between objects. For instance, the query
SELECT name, objid, clusterId, nodeId FROM aggregates
run against the netapp_model_view database lists the known aggregates along with the ids of the cluster and node to which each aggregate belongs. You can join the various related tables through the object ids. Most tables have a base version and a “full” version, e.g. “aggregates” and “aggregates_full”. The full version seems to add just the column “objstate” to the data already available in the base version. I suspect that the base versions will stay essentially stable and that the full versions may gain additional columns as OPM evolves in future versions.
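To show what joining on object ids looks like, here is a small self-contained sketch. It uses Python's built-in sqlite3 module with an in-memory stand-in for the schema, so it runs without an OPM server. The “aggregates” columns are the ones from the query above; the “cluster” table name, its columns, and all of the sample rows are my own assumptions for illustration, so check the real table list before adapting it.

```python
import sqlite3

# In-memory stand-in for netapp_model_view; NOT the real OPM schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table name and columns for the cluster objects.
    CREATE TABLE cluster (objid INTEGER PRIMARY KEY, name TEXT);
    -- Columns taken from the aggregates query in this post.
    CREATE TABLE aggregates (
        objid INTEGER PRIMARY KEY,
        name TEXT,
        clusterId INTEGER,
        nodeId INTEGER
    );
    -- Made-up sample rows.
    INSERT INTO cluster VALUES (1748, 'cluster-a');
    INSERT INTO aggregates VALUES (2001, 'aggr1', 1748, 1801);
    INSERT INTO aggregates VALUES (2002, 'aggr2', 1748, 1802);
""")

# Resolve each aggregate's clusterId to the owning cluster's name.
JOIN_SQL = """
    SELECT a.name, a.objid, c.name
    FROM aggregates AS a
    JOIN cluster AS c ON c.objid = a.clusterId
    ORDER BY a.name
"""

rows = conn.execute(JOIN_SQL).fetchall()
for aggregate_name, aggregate_id, cluster_name in rows:
    print(aggregate_name, aggregate_id, cluster_name)
```

The same SELECT should carry over to MySQL unchanged once the table and column names are matched to what your OPM instance actually exposes.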
Additional tables of interest in netapp_model_view are “change_delete_detail”, “change_record”, and “changed_attribute”. These tables track object-based changes over time as OPM discovers them. For example, a typical operation is to move a volume between aggregates. For that change, the “change_record” table tracks the type of change (“volume_move” in this case), the object moved (by object id), the cluster affected (by cluster object id), and the time the change was noticed by OPM. The change records are needed for a complete performance analysis over time. For instance, a 30 day view of the I/O load of the aggregate containing a particular volume needs to take into account whether that volume moved between aggregates during that time frame, so that the proper data is used in each part of the analysis.
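The bookkeeping this implies can be sketched in a few lines of Python. The function below splits an analysis window into segments, one per aggregate that held the volume. Its inputs are hypothetical: I am assuming you have already worked out, from the change tables, a list of (timestamp, new aggregate id) pairs for the volume. I have not confirmed exactly which columns provide the destination aggregate.

```python
def split_window_by_moves(start_ms, end_ms, initial_aggregate, moves):
    """Split [start_ms, end_ms) into (start, end, aggregate_id) segments.

    initial_aggregate: aggregate holding the volume before any listed move.
    moves: iterable of (timestamp_ms, new_aggregate_id) pairs (hypothetical
           shape, derived by you from the change tables).
    """
    segments = []
    current_start = start_ms
    current_aggregate = initial_aggregate
    for moved_at, new_aggregate in sorted(moves):
        if moved_at <= start_ms:
            # Move happened before the window: it only changes which
            # aggregate the volume sits on when the window opens.
            current_aggregate = new_aggregate
            continue
        if moved_at >= end_ms:
            # Moves after the window do not affect it.
            break
        segments.append((current_start, moved_at, current_aggregate))
        current_start = moved_at
        current_aggregate = new_aggregate
    segments.append((current_start, end_ms, current_aggregate))
    return segments


# Made-up ids: the volume moves from aggregate 10 to aggregate 20 at t=40.
print(split_window_by_moves(0, 100, 10, [(40, 20)]))
```

Each segment then tells you which aggregate's samples to pull for that slice of the time frame.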
Timestamps within the data set are UNIX style time since epoch. The timestamp includes milliseconds in the value, so be sure that your time conversion utilities take that into account.
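A minimal conversion sketch in Python, assuming the stored value is an integer count of milliseconds since the UNIX epoch (that is my reading of the data; the sample value below is made up):

```python
from datetime import datetime, timedelta, timezone


def opm_timestamp_to_datetime(timestamp_ms):
    """Convert a millisecond epoch timestamp to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid
    # floating point rounding on large values.
    seconds, millis = divmod(int(timestamp_ms), 1000)
    return (datetime.fromtimestamp(seconds, tz=timezone.utc)
            + timedelta(milliseconds=millis))


print(opm_timestamp_to_datetime(1420070400500).isoformat())
# 2015-01-01T00:00:00.500000+00:00
```

Feeding the raw value to a utility that expects seconds will either fail or produce dates thousands of years in the future, which is a quick way to spot the mistake.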
Database “netapp_performance” contains the actual data samples collected. All tables of the form “sample_*” are the raw performance data collected for that period of time. Each table row contains the OPM object id for the element, the timestamp of the collection, and the raw data.
The trick for the performance tables is that not every table type is what I call simple. For instance, there is no single “sample_aggregate” table. Instead, there is one sample_aggregate table per cluster, with the cluster’s object id appended to the base name “sample_aggregate”. For example, “sample_aggregate_1748” contains all aggregate samples for the cluster having OPM object id 1748. Other tables that have an embedded object id appear to be keyed to the highest-level object that owns the sampled object. For instance, aggregates trace physically to clusters, volumes trace logically to SVMs, etc.
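Because the table name has to be computed before you can query it, a small helper is handy. The Python sketch below builds the per-owner table name from a base name and an object id, and validates both since table names cannot be passed as ordinary query parameters. The function name is my own, and the SELECT is only a placeholder because I am not listing the sample columns here.

```python
import re

# Base names are restricted to lowercase letters and underscores so the
# result is always safe to splice into a SQL statement.
_BASE_NAME = re.compile(r"^[a-z_]+$")


def per_owner_table(base_name, owner_objid):
    """Return e.g. 'sample_aggregate_1748' for ('sample_aggregate', 1748)."""
    if not _BASE_NAME.match(base_name):
        raise ValueError(f"unexpected base table name: {base_name!r}")
    if not isinstance(owner_objid, int) or owner_objid < 0:
        raise ValueError(f"object id must be a non-negative integer: {owner_objid!r}")
    return f"{base_name}_{owner_objid}"


table = per_owner_table("sample_aggregate", 1748)
query = f"SELECT * FROM {table}"
print(query)
```

In practice you would first read the cluster object ids from netapp_model_view, then loop over them to build one query per cluster.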
The “summary” versions of the tables contain roll-ups of the sample data. OPM is supposed to track 90 days of historical data for display purposes. I haven’t analyzed how long the detail is kept versus the rollup summaries. Any really long-term data collection would need to extract data from OPM and store it in another system.
I hope this basic discussion of the OPM data layout is useful to those wanting to customize performance data presentation. OPM 1.1 also includes two additional data access methods: a regular push feed to external tools and a REST API to gather data on demand. So far I’ve not seen detailed documentation on either of these (such as the format of the push data stream, etc.), but I understand that such documentation is expected sometime during the first quarter of 2015.
I remind everyone that I’m just a customer – I don’t speak for NetApp nor do I make any representations on their behalf. Everything discussed is based on either direct personal research or other information gathered that is not subject to NDA or similar restrictions.