Welcome to the third part of the “Self-Managing Storage” blog series. In this post, I explain Workload Performance Lifecycle Management and how Active IQ Unified Manager simplifies it.
Workload Performance Management
Workload Performance Lifecycle Management in NetApp® Active IQ Unified Manager is performed through performance service levels (PSLs), a framework inherited from NetApp Service Level Manager in the 9.7 release of Active IQ Unified Manager.
Workload PSL management means assigning each workload an appropriate PSL so that its performance consumption stays within configured boundaries. This prevents a workload's consumption from growing abnormally until it uses all available performance resources and degrades performance for other workloads.
In Active IQ Unified Manager, a PSL consists of peak and expected limits, which define the quality of service (QoS) thresholds on the ONTAP cluster, and a target latency, which Active IQ Unified Manager uses to alert customers to latency violations. Customers can define their own PSL policies or use the predefined ones available on the Performance Service Level UI page in Active IQ Unified Manager.
Figure 1: Workload Performance Lifecycle Management
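To make the three parts of a PSL concrete, here is a minimal Python sketch of one as a small data structure. The class name, field names, and numbers are hypothetical and chosen only for illustration; they do not reflect the Active IQ Unified Manager schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceServiceLevel:
    """Illustrative PSL model; all names here are assumptions, not product fields."""
    name: str
    expected_iops: float      # expected QoS limit
    peak_iops: float          # peak QoS limit
    target_latency_ms: float  # latency above this is a violation worth alerting on

    def __post_init__(self):
        # A peak limit below the expected limit would make no sense in this model.
        if self.peak_iops < self.expected_iops:
            raise ValueError("peak limit must be >= expected limit")

    def latency_violated(self, observed_latency_ms: float) -> bool:
        """Return True when observed latency exceeds the PSL target latency."""
        return observed_latency_ms > self.target_latency_ms


# Hypothetical policy values, for illustration only.
example_psl = PerformanceServiceLevel("example-psl", 1000, 4000, 2.0)
```

In this sketch the two limits describe the QoS boundary, while the target latency is used only for alerting, which mirrors the split described above.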
Preventive Management
Preventive performance management consists of recommending the PSL best suited to each workload based on historical trends in workload demand. The recommendations are re-estimated every 24 hours using data collected in 5-minute increments over the past 30 days. The estimation aims to ensure that the suggested QoS limits do not throttle workloads or add latency.
Figure 2: Recommended Performance Service Level IOPS for workloads
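The product's estimation method is not described in this post, so the following Python sketch shows only one plausible approach under stated assumptions: take a high percentile of the last 30 days of 5-minute IOPS samples, add headroom, and pick the smallest PSL whose peak limit covers that demand. The function names, the 95th percentile, and the 20% headroom are all hypothetical choices.

```python
import math

SAMPLES_PER_DAY = 24 * 60 // 5  # 288 five-minute samples per day
WINDOW_DAYS = 30                # look-back window from the text


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_psl(iops_samples, psl_peaks, pct=95, headroom=1.2):
    """Return the name of the smallest PSL whose peak covers estimated demand.

    iops_samples: chronological 5-minute IOPS readings (oldest first).
    psl_peaks: mapping of PSL name -> peak IOPS limit (hypothetical values).
    pct and headroom are assumptions, not product parameters.
    """
    window = iops_samples[-SAMPLES_PER_DAY * WINDOW_DAYS:]
    if not window:
        raise ValueError("no samples in the look-back window")
    demand = percentile(window, pct) * headroom
    for name, peak in sorted(psl_peaks.items(), key=lambda kv: kv[1]):
        if peak >= demand:
            return name
    return None  # no available PSL is large enough
```

Rerunning a function like this once a day over a sliding window would match the 24-hour re-estimation cadence; sizing to a high percentile plus headroom is one way to keep the suggested limit from throttling normal peaks.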
Proactive Workload Performance Management
Proactive workload performance management in Active IQ Unified Manager consists of identifying workloads whose performance does not match their assigned PSLs and highlighting them for the customer on the Workloads Inventory page. The user can then take appropriate action for those workloads. The goal is to warn the customer of workload demand trends and to sustain growth without throttling caused by unnecessarily low QoS limits. The best remediation is to change the PSL assigned to the workload. To simplify remediation, Active IQ Unified Manager 9.8 introduces a Bulk-Assign feature that lets the user change or bulk assign recommended PSLs to workloads with one click. We suggest bulk assigning system-recommended PSLs only after reviewing the affected workloads, their business criticality, and the corresponding PSL.
Figure 3: Bulk Assignment of Performance Service Levels
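The proactive flow above can be sketched in Python as two steps: flag workloads whose assigned PSL differs from the recommended one, then apply the recommendation only to workloads a reviewer has approved. This is an illustrative model, not the Unified Manager API; the `Workload` class and both function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Workload:
    """Illustrative workload record; field names are assumptions."""
    name: str
    assigned_psl: str
    recommended_psl: str


def find_mismatched(workloads):
    """Return workloads whose assigned PSL differs from the recommended PSL."""
    return [w for w in workloads if w.assigned_psl != w.recommended_psl]


def bulk_assign_recommended(workloads, approved_names):
    """Assign the recommended PSL to every workload named in approved_names.

    approved_names models the review step: only workloads a person has
    checked (for example, for business criticality) are changed.
    """
    return [
        replace(w, assigned_psl=w.recommended_psl) if w.name in approved_names else w
        for w in workloads
    ]
```

For example, passing the names returned by `find_mismatched` after review as `approved_names` updates those workloads in one call and leaves the rest untouched.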
Reactive Workload Performance Management
Reactive workload performance management consists of alerting the user when one or more workloads do not conform to the PSL target latency. Generally, the higher latencies are due to QoS throttling. Nonconformance events appear in the list of inventory events, indicating that workload latency conformance has been violated. After the threshold has been violated, an actionable event with appropriate recommendations is generated. The recommendation can be to assign the workload the system-recommended performance service level or another, higher PSL to avoid further throttling of the workload.
Figure 4: Reactive Actions
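As a rough illustration of the reactive path, the Python sketch below compares an observed latency with the PSL target latency and, on a violation, returns an event that carries a recommendation. The event fields, function name, and message text are hypothetical and are not the format Active IQ Unified Manager uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConformanceEvent:
    """Illustrative nonconformance event; field names are assumptions."""
    workload: str
    observed_latency_ms: float
    target_latency_ms: float
    recommendation: str


def check_latency_conformance(
    workload: str,
    observed_latency_ms: float,
    target_latency_ms: float,
    recommended_psl: str,
) -> Optional[ConformanceEvent]:
    """Return an actionable event if latency exceeds the target, else None."""
    if observed_latency_ms <= target_latency_ms:
        return None  # workload conforms to its PSL target latency
    return ConformanceEvent(
        workload=workload,
        observed_latency_ms=observed_latency_ms,
        target_latency_ms=target_latency_ms,
        recommendation=f"Assign performance service level '{recommended_psl}'",
    )
```

A caller could run this check per workload on each collection interval and surface any returned events in an events list.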
There is More!
We hope that you now have an overall understanding of the Workload Performance Lifecycle Management feature introduced in Active IQ Unified Manager. Keep an eye on this blog series for an in-depth understanding of performance, capacity, and security lifecycle management.
- Self-Managing Storage: Part 1 – Understanding Active IQ Unified Manager LifeCycle Management
- Self-Managing Storage: Part 2 – Understanding Storage Resource Performance LifeCycle Management
- Self-Managing Storage: Part 3 – Understanding Workload Performance LifeCycle Management
- Self-Managing Storage: Part 4 – Understanding Capacity LifeCycle Management
- Self-Managing Storage: Part 5 – Understanding Security Manager LifeCycle Management
We know that you may have questions because we couldn’t cover the entire topic, so please connect with us if needed.