
dfm perf data retrieve -s <seconds> "last sample in region"

Environment: 4-node Cluster w/8.1.1 and OCUM 5.1 in cluster mode

I am trying to create useful summarized performance data by extracting perf data with the dfm perf data retrieve command. All the data is on a 5-minute polling interval, which is fixed in 5.1 for cluster mode. I would like to create output for each hour. I can do that by adding a -s 3600 option on the command line. However, the -s description says "the last sample in each of those regions will be displayed", and that matches what I've been seeing (examples below). I find that (the "last sample in period") useless. Why would I want to report on just what happened in the last 5-minute period of an hour (or pick your favorite -s value)?

I might want the max of a value in the hour, or the mean (average) of a value in the hour, but I'm struggling to think of any case where "last sample" would be useful. I'm attempting to reduce the amount of data for a week from 2016 individual samples (every 5 minutes over 7 days) to more like 168 values (24x7).

So I started testing the -m and -S options. It appears the combination of -m mean and -S step returns what I'd like. However, I'd really like to obtain mean and max in the same table extract. It looks like I have to run separate dfm perf data retrieve commands to get first mean, then max, right?

Also, I do not understand this description of -S. What do simple, step and rolling return?

-S For computing on fixed size data, this defines the method to advance

        the chunks of data. Valid values are simple, step and rolling. Default

        value is simple. Valid only when a statistical-method is specified.

        Simple and step methods are available for all statistical methods.

        Rolling is valid only for mean statistical method.

My questions: Could someone:

  1. Confirm my thinking that -s 3600 -m mean -S step is returning the mean (or average) value of all samples within the hour (3600 second sample window)
  2. Provide a better explanation of the -S options and examples of when I might use each

=== here is an example to illustrate my questions. This is looking at one counter (cpu_busy) over a 3 hour period from 1am to 4am. ====

dfmc-atx$ dfm perf data retrieve -x timeindexed  -o atxcluster01-04 -C system:cpu_busy -b "2013-03-04 01:00:00" -e "2013-03-04 04:00:00" # raw data over 3 hours

          Timestamp     cpu_busy

2013-03-04 01:09:20        98.843

2013-03-04 01:14:21        99.100

2013-03-04 01:19:20        96.549

2013-03-04 01:24:20        89.707

2013-03-04 01:29:20        90.713

2013-03-04 01:34:20        95.387

2013-03-04 01:39:20        90.508

2013-03-04 01:44:21        63.908

2013-03-04 01:49:20        73.308

2013-03-04 01:54:21        72.283

2013-03-04 01:59:20        73.046

2013-03-04 02:04:20        87.818

2013-03-04 02:09:20        95.400

2013-03-04 02:14:20        94.339

2013-03-04 02:19:21        79.851

2013-03-04 02:24:21        96.548

2013-03-04 02:29:20        99.237

2013-03-04 02:34:21        85.608

2013-03-04 02:39:20        96.100

2013-03-04 02:44:21        86.167

2013-03-04 02:49:21        72.928

2013-03-04 02:54:21        93.406

2013-03-04 02:59:20        78.812

2013-03-04 03:04:21        88.506

2013-03-04 03:09:20        98.307

2013-03-04 03:14:21        97.081

2013-03-04 03:19:20        96.002

2013-03-04 03:24:21        89.469

2013-03-04 03:29:20        92.037

2013-03-04 03:34:21        90.465

2013-03-04 03:39:21        89.942

2013-03-04 03:44:20        78.060

2013-03-04 03:49:20        89.095

2013-03-04 03:54:21        95.153

2013-03-04 03:59:20        86.885

dfmc-atx$ dfm perf data retrieve  -x timeindexed  -o atxcluster01-04 -C system:cpu_busy -b "2013-03-04 01:00:00" -e "2013-03-04 04:00:00" -s 3600   # sample rate 1 hour (shows last sample)

          Timestamp     cpu_busy

2013-03-04 02:09:20        95.400

2013-03-04 03:09:20        98.307

2013-03-04 03:59:20        86.885

dfmc-atx$ dfm perf data retrieve  -x timeindexed  -o atxcluster01-04 -C system:cpu_busy -b "2013-03-04 01:00:00" -e "2013-03-04 04:00:00" -s 3600 -m mean -S step # mean for each hour

          Timestamp     cpu_busy        cpu_busy (mean)

2013-03-04 02:09:20        95.400                  86.659

2013-03-04 03:09:20        98.307                  89.151

2013-03-04 03:59:20        86.885                  90.419    <= I added up the last 10 values and divided by 10 and get 90.419

dfmc-atx$
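As a cross-check on the numbers above, here is a minimal Python sketch that recomputes the per-window mean from the raw samples and also reports the max in the same pass. It assumes that each timestamp dfm printed for -s 3600 is the inclusive end of a window; that is inferred from the output above, not taken from the dfm documentation. The function names (parse, summarize) are made up for this illustration.

```python
from datetime import datetime

# Raw 5-minute cpu_busy samples copied from the first command's output.
RAW = """\
2013-03-04 01:09:20 98.843
2013-03-04 01:14:21 99.100
2013-03-04 01:19:20 96.549
2013-03-04 01:24:20 89.707
2013-03-04 01:29:20 90.713
2013-03-04 01:34:20 95.387
2013-03-04 01:39:20 90.508
2013-03-04 01:44:21 63.908
2013-03-04 01:49:20 73.308
2013-03-04 01:54:21 72.283
2013-03-04 01:59:20 73.046
2013-03-04 02:04:20 87.818
2013-03-04 02:09:20 95.400
2013-03-04 02:14:20 94.339
2013-03-04 02:19:21 79.851
2013-03-04 02:24:21 96.548
2013-03-04 02:29:20 99.237
2013-03-04 02:34:21 85.608
2013-03-04 02:39:20 96.100
2013-03-04 02:44:21 86.167
2013-03-04 02:49:21 72.928
2013-03-04 02:54:21 93.406
2013-03-04 02:59:20 78.812
2013-03-04 03:04:21 88.506
2013-03-04 03:09:20 98.307
2013-03-04 03:14:21 97.081
2013-03-04 03:19:20 96.002
2013-03-04 03:24:21 89.469
2013-03-04 03:29:20 92.037
2013-03-04 03:34:21 90.465
2013-03-04 03:39:21 89.942
2013-03-04 03:44:20 78.060
2013-03-04 03:49:20 89.095
2013-03-04 03:54:21 95.153
2013-03-04 03:59:20 86.885
"""

# Timestamps dfm printed with -s 3600. Assumption: each one is the
# inclusive end of a window, and the previous one is the exclusive start.
WINDOW_ENDS = ["2013-03-04 02:09:20", "2013-03-04 03:09:20", "2013-03-04 03:59:20"]
FMT = "%Y-%m-%d %H:%M:%S"


def parse(raw):
    """Turn 'date time value' lines into (datetime, float) pairs."""
    samples = []
    for line in raw.strip().splitlines():
        date, time, value = line.split()
        samples.append((datetime.strptime(f"{date} {time}", FMT), float(value)))
    return samples


def summarize(samples, window_ends):
    """Return (window end, sample count, mean, max) for each window."""
    rows = []
    start = None  # exclusive lower bound; None means "from the first sample"
    for end_text in window_ends:
        end = datetime.strptime(end_text, FMT)
        values = [v for t, v in samples if (start is None or t > start) and t <= end]
        rows.append((end_text, len(values), round(sum(values) / len(values), 3), max(values)))
        start = end
    return rows


SUMMARY = summarize(parse(RAW), WINDOW_ENDS)
for end_text, count, mean, peak in SUMMARY:
    print(f"{end_text}  n={count:2d}  mean={mean:.3f}  max={peak:.3f}")
```

Under that assumption the three windows hold 13, 12 and 10 samples, and the recomputed means come out to 86.659, 89.151 and 90.419, the same values as the cpu_busy (mean) column. Working from the raw extract like this is one way to get mean and max side by side without a second dfm run.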

Re: dfm perf data retrieve -s <seconds> "last sample in region"

David,

Based on what you're trying to accomplish, I believe you have the correct command:

# dfm perf data retrieve  -x timeindexed  -o atxcluster01-04 -C system:cpu_busy -b "2013-03-04 01:00:00" -e "2013-03-04 04:00:00" -m mean -S step -s 3600

-m mean tells it to do a statistical mean on the data

-S step tells it to calculate the mean in steps rather than once for the entire range of data (which is what simple does)

-s 3600 tells the command what the statistical step-interval is.  In this case, it will provide the mean for each 3600 seconds worth of data (3600 = 1 step).

As for the -S options:

  • "Simple" just applies a statistical metric to the entire range of data and gives you one result.  For example, the mean across 3 days worth of data.
  • "Step" allows you to apply a statistical metric to a range of data in steps or chunks.  For example, give me the mean for each hour across 3 days worth of data.  You'd get multiple values back from this command.
  • "Rolling" allows you to apply a rolling average or a running average to a range of data. It is used to analyze a set of data points by creating a series of averages of different subsets of the full data set. It is only valid with the "mean" metric.  Here is an example of a rolling average for every 600 seconds.  To be honest, I'm not sure when you'd use a rolling average vs. a step.

          Timestamp     cpu_busy        cpu_busy (mean)

2013-03-19 01:11:10         2.798                   4.351

2013-03-19 01:12:10             -                   3.623

2013-03-19 01:13:10             -                   3.509

2013-03-19 01:14:10             -                   3.247

2013-03-19 01:15:10             -                   3.244

2013-03-19 01:16:10             -                   3.243

2013-03-19 01:17:10             -                   3.310

2013-03-19 01:18:10             -                   3.244

2013-03-19 01:19:10             -                   3.190

2013-03-19 01:20:10             -                   3.137

2013-03-19 01:21:10         3.451                   3.201

2013-03-19 01:22:10             -                   3.250

2013-03-19 01:23:10             -                   3.240

2013-03-19 01:24:10             -                   3.252

2013-03-19 01:25:10             -                   3.263

2013-03-19 01:26:10             -                   3.267

2013-03-19 01:27:10             -                   3.254

2013-03-19 01:28:10             -                   3.198

2013-03-19 01:29:10             -                   3.193

2013-03-19 01:30:10             -                   3.234

2013-03-19 01:31:10         3.803                   3.291

2013-03-19 01:32:10             -                   3.319

2013-03-19 01:33:10             -                   3.317

2013-03-19 01:34:10             -                   3.360

2013-03-19 01:35:10             -                   3.294

2013-03-19 01:36:10             -                   3.344

2013-03-19 01:37:10             -                   3.355

2013-03-19 01:38:10             -                   3.383

2013-03-19 01:39:10             -                   3.391

2013-03-19 01:40:10             -                   3.563

2013-03-19 01:41:10         3.204                   3.529

2013-03-19 01:42:10             -                   3.482

2013-03-19 01:43:10             -                   3.448

2013-03-19 01:44:10             -                   3.451

2013-03-19 01:45:10             -                   3.452

2013-03-19 01:46:10             -                   3.503

2013-03-19 01:47:10             -                   3.438

2013-03-19 01:48:10             -                   3.434

2013-03-19 01:49:10             -                   3.468

2013-03-19 01:50:10             -                   3.494

2013-03-19 01:51:10         3.700                   3.374
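To make the difference between the three -S methods concrete, here is a small Python sketch of the general ideas. It is only an illustration of the concepts described above, not how dfm implements them: the function names are hypothetical, and the windows here are sized by sample count, whereas dfm sizes them in seconds via -s.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simple(values, stat):
    """Simple: one result computed over the entire range."""
    return stat(values)


def step(values, size, stat):
    """Step: one result per consecutive, non-overlapping chunk of `size` samples."""
    return [stat(values[i:i + size]) for i in range(0, len(values), size)]


def rolling_mean(values, size):
    """Rolling: mean of the most recent `size` samples, recomputed at every sample."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - size + 1):i + 1]  # shorter at the start of the series
        out.append(mean(window))
    return out


data = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # made-up sample values

print(simple(data, mean))      # 7.0
print(step(data, 3, mean))     # [4.0, 10.0]
print(step(data, 3, max))      # [6.0, 12.0]
print(rolling_mean(data, 3))   # [2.0, 3.0, 4.0, 6.0, 8.0, 10.0]
```

Step shrinks six samples to two values, which is the kind of reduction wanted for hourly summaries. Rolling keeps one value per sample and only smooths the series, so it suits trend lines rather than data reduction.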

Reid

Re: dfm perf data retrieve -s <seconds> "last sample in region"

Thanks Reid. That confirms what I was not sure about, and I agree: I'm not sure when I'd use rolling either.

Re: dfm perf data retrieve -s <seconds> "last sample in region"

We use rolling average within PA, but I've never used it within the CLI.  It's been helpful to have the rolling average displayed on top of the normal data to visualize a mid/long-term trend, but I'm not sure when we might use it in the CLI.