Environment: 4-node Cluster w/8.1.1 and OCUM 5.1 in cluster mode
I am trying to create useful summarized performance data by extracting perf data with the dfm perf data retrieve command. All the data is on a 5-minute polling interval, which is fixed in 5.1 for cluster mode. I would like to create output for each hour. I can do that by adding a -s 3600 option on the command line. However, the -s description says "the last sample in each of those regions will be displayed", and that matches what I've been seeing (examples below). I find that ("the last sample in the period") useless. Why would I want to report on just what happened in the last 5-minute period of an hour (or pick your favorite -s value)?
I might want the max of a value in the hour, or the average or mean of a value in the hour, but I'm struggling to think of any case where "last sample" would be useful. I'm attempting to reduce the amount of data for a week from 2016 individual samples (every 5 minutes over 7 days) to more like 168 values (24x7).
So I started testing the -m and -S options. It appears the combination of -m mean and -S step returns what I'd like. However, I'd really like to obtain the mean and the max in the same table extract. It looks like I have to run separate dfm perf data retrieve commands to get first the mean, then the max, right?
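One workaround for the two-command problem is to retrieve the raw 5-minute samples once and compute both summaries yourself. Here is a hedged sketch in plain Python (not part of dfm; the function name and sample data are invented) that buckets (timestamp, value) pairs into fixed hourly windows and reports both mean and max per window:

```python
from statistics import mean

def hourly_summaries(samples, window=3600):
    """Group (epoch_seconds, value) samples into fixed windows and
    return (window_start, mean, max) for each window."""
    buckets = {}
    for ts, val in samples:
        start = ts - (ts % window)          # floor the timestamp to the window boundary
        buckets.setdefault(start, []).append(val)
    return [(start, mean(vals), max(vals))
            for start, vals in sorted(buckets.items())]

# Two hours of made-up 5-minute samples: timestamps 0, 300, 600, ...
raw = [(t * 300, float(t)) for t in range(24)]
for start, m, mx in hourly_summaries(raw):
    print(start, m, mx)
# prints: 0 5.5 11.0
#         3600 17.5 23.0
```

The same pass could also collect min, percentiles, or anything else, which avoids re-running the extract per statistic.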
Also, I do not understand this description of -S. What do simple, step and rolling return?
-S For computing on fixed size data, this defines the method to advance
the chunks of data. Valid values are simple, step and rolling. Default
value is simple. Valid only when a statistical-method is specified.
Simple and step methods are available for all statistical methods.
Rolling is valid only for mean statistical method.
My questions: could someone:
1. Confirm my thinking that -s 3600 -m mean -S step returns the mean (or average) value of all samples within the hour (3600-second sample window)?
2. Provide a better explanation of the -S options, and examples of when I might use each?
=== Here is an example to illustrate my questions. This looks at one counter (cpu_busy) over a 3-hour period, from 1am to 4am. ===
dfmc-atx$ dfm perf data retrieve -x timeindexed -o atxcluster01-04 -C system:cpu_busy -b "2013-03-04 01:00:00" -e "2013-03-04 04:00:00" # raw data over 3 hours
-m mean tells it to compute a statistical mean on the data.
-S step tells it to calculate the mean in steps rather than over the entire range of data (i.e. simple).
-s 3600 tells the command what the statistical step interval is. In this case, it will provide the mean for each 3600 seconds' worth of data (3600 seconds = 1 step).
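As a sanity check on the arithmetic (plain Python, not dfm output; the sample values are invented): with 300-second polling, a 3600-second step means each returned value is the mean of twelve raw samples.

```python
from statistics import mean

POLL = 300      # 5-minute polling interval, in seconds
STEP = 3600     # the -s value: one result per 3600-second step

samples = list(range(24))            # two hours of made-up 5-minute samples
per_step = STEP // POLL              # 12 samples fall in each step
means = [mean(samples[i:i + per_step])
         for i in range(0, len(samples), per_step)]
print(means)   # one mean per hour: [5.5, 17.5]
```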
As for the -S options:
"Simple" just applies a statistical metric to the entire range of data and gives you one result. For example, the mean across 3 days worth of data.
"Step" allows you to apply a statistical metric to a range of data in steps or chunks. For example, give me the mean for each hour across 3 days worth of data. You'd get multiple values back from this command.
"Rolling" allows you to apply a rolling average or a running average to a range of data. It is used to analyze a set of data points by creating a series of averages of different subsets of the full data set. It is only valid with the "mean" metric. Here is an example of a rolling average for every 600 seconds. To be honest, I'm not sure when you'd use a rolling average vs. a step.
We use the rolling average within PA, but I've never used it from the CLI. It's been helpful to have the rolling average displayed on top of the normal data to visualize a mid- to long-term trend, but I'm not sure when we would use this in the CLI.