2017-03-30 08:37 AM
We monitor "sysstat -x" output to look at the sum of "Kahuna" and "Kahu". We would like to display this information in Graphite. Does Harvest collect these counters? If so, what are their names?
2017-03-31 07:19 AM
Hope you're doing well!
Harvest collects CPU domain utilization, but it misses the number in "()" from sysstat because there is no counter for it. Honestly though, when looking at a system from the CLI these days I no longer use that "()" number anyway; it displays the % of time that either Kahuna or WAFL_Ex was being scheduled.
Some background: Kahuna handles serialized WAFL operations (and a few other things) and can run on at most 1 CPU, while WAFL_Ex handles parallelized WAFL operations and can typically run on (number of CPUs - 1) CPUs. They are mutually exclusive when it comes to scheduling, so if Kahuna is active then WAFL_Ex will not be scheduled, and vice versa. With each new release, Engineering works to increase the overall throughput of the system by taking operations that were first implemented in a serialized manner and making them parallel.
My rule of thumb: if low latency is important to you, consider it a 'warning' level if Kahuna is > 50% or total CPU is > 70% for a continuous period. High levels of Kahuna can start to 'squeeze' the runtime of WAFL_Ex as well as introduce queueing for CPU. High levels of total CPU will also increase the amount of queueing for CPU, adding latency to every request.
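To make the rule of thumb concrete, here is a minimal Python sketch of how you might check it against a series of utilization samples. The function name, the (kahuna_pct, total_cpu_pct) sample layout, and the choice of 5 consecutive samples as a "continuous period" are all assumptions for illustration; they are not Harvest metric names or anything built into Harvest or Graphite.

```python
from typing import Iterable, Tuple

# Thresholds from the rule of thumb above (percent utilization).
KAHUNA_WARN_PCT = 50.0
TOTAL_CPU_WARN_PCT = 70.0


def sustained_warning(
    samples: Iterable[Tuple[float, float]],
    min_consecutive: int = 5,  # assumption: what counts as a "continuous period"
) -> bool:
    """Return True if Kahuna > 50% or total CPU > 70% holds for at least
    `min_consecutive` samples in a row.

    Each sample is a hypothetical (kahuna_pct, total_cpu_pct) pair.
    """
    run = 0  # length of the current streak of breaching samples
    for kahuna_pct, total_cpu_pct in samples:
        if kahuna_pct > KAHUNA_WARN_PCT or total_cpu_pct > TOTAL_CPU_WARN_PCT:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # streak broken, start counting again
    return False


if __name__ == "__main__":
    steady = [(55.0, 40.0)] * 5               # Kahuna high for 5 samples in a row
    spiky = [(55.0, 40.0), (10.0, 20.0)] * 5  # short spikes only
    print(sustained_warning(steady))
    print(sustained_warning(spiky))
```

Note that this treats a sample as breaching if either threshold is exceeded, so a streak can be made up of a mix of high-Kahuna and high-total-CPU samples. If you would rather track the two conditions separately, run the check once per metric.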
I discuss this topic in more detail in this post, including a more useful approach: using the QoS breakdown view of the QoS counters to determine the impact of a busy node on the latency of a volume.
Solution Architect - 3rd Platform - Systems Engineering NetApp EMEA (and author of Harvest)
Blog: It all begins with data