VMware Solutions Discussions

ONTAP 8.1.2 7-Mode cpu_busy counter

GARDINEC_EBRD

Hi All,

I must be having a bad day or something, but I can't get my head around this today.  I've got a FAS3240 running at close to 100% on the cpu_busy counter.  Latency looks fine, so whatever it is, it's not causing a performance issue, but it is generating CPU alerts in DFM.

I have a number of NDMP tape-to-tape operations running, which seem to be generating this, as the VMware over NFS load is relatively light (1000-2000 NFS ops/sec).

So, sysstat 1 looks like this:

  CPU     NFS    CIFS    HTTP     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache

                                   in    out    read  write    read  write    age

92%    1165       0       0   15273   4320   31932  47209  145214 145214    51s

98%    2662       0       0    7119   1015   14508  55220  149094 149094    51s

99%    1251       0       0   13042   5001    3756  51936  151912 151978    51s

99%     650       0       0    5901    671     456   7308  155058 154993    51s

99%    1190       0       0    8936    757     296      8  154927 154927    51s

99%    2134       0       0    5885    899     256      0  154206 154206    51s

It doesn't appear to be a CPU core that's at 98% (sysstat -m)...

ANY  AVG  CPU0 CPU1 CPU2 CPU3

100%  58%   58%  57%  57%  58%

100%  54%   54%  54%  53%  55%

100%  57%   57%  56%  56%  58%

100%  55%   56%  55%  54%  56%

100%  77%   78%  77%  77%  77%

...so I assume it's a CPU domain that's maxed out.  Looking at the statit output, I can't see any particular domain that's at high utilization:

   NetApp Release 8.1.2 7-Mode: Tue Oct 30 19:56:51 PDT 2012

    <1O>

  Start time: Wed Mar 27 09:32:55 GMT 2013

                       CPU Statistics

      14.141772 time (seconds)       100 %

      28.341048 system time          200 %

       0.611241 rupt time              4 %   (211010 rupts x 3 usec/rupt)

      27.729807 non-rupt system time 196 %

      28.226036 idle time            200 %

       2.514662 time in CP            18 %   100 %

       0.104331 rupt time in CP                4 %   (37134 rupts x 3 usec/rupt)

                       Multiprocessor Statistics (per second)

                          cpu0       cpu1       cpu2       cpu3      total

sk switches          161727.33  163272.40  163634.80  165143.94  653778.47

hard switches         78693.46   79794.03   80383.21   83490.46  322361.16

domain switches       45627.52   45924.02   46501.74   49155.37  187208.65

CP rupts                730.25     355.47     406.10    1134.02    2625.84

nonCP rupts            3352.41    1643.64    1731.04    5568.11   12295.21

IPI rupts                 0.00       0.00       0.00       0.00       0.00

grab kahuna               0.00       0.00       0.00       0.00       0.00

grab kahuna usec          0.00       0.00       0.00       0.00       0.00

CP rupt usec           4324.63     127.00     509.55    2416.25    7377.51

nonCP rupt usec       20245.62     579.42    2214.01   12805.68   35844.80

idle                 502422.54  502765.85  503554.86  487190.15 1995933.54

kahuna                12694.87   12387.56   12685.04   13127.49   50895.11

storage               28276.80   29771.09   30174.72   26242.11  114464.79

exempt               120301.76  122678.12  121024.58  124690.03  488694.63

raid                   4533.66    4430.14    4064.55    4970.66   17999.16

target                  626.23     744.81     609.47     849.12    2829.77

dnscache                  0.00       0.00       0.00       0.00       0.00

cifs                     26.87      46.10      55.79      65.20     194.04

wafl_exempt           33716.85   30728.04   29905.73   28599.32  122950.08

wafl_xcleaner          2258.34    2442.83    2412.92    1487.15    8601.33

sm_exempt                13.15      19.23      17.47      20.79      70.78

cluster                   0.00       0.00       0.00       0.00       0.00

protocol                 34.93      33.16      50.63      36.63     155.43

nwk_exclusive           629.98     857.46     483.53     895.08    2866.26

nwk_exempt            34561.79   51149.32   49054.67   51691.61  186457.47

nwk_legacy           222720.96  227699.82  229450.67  244670.12  924541.78

hostOS                12610.30   13539.39   13731.16     241.98   40122.98

       13.862242 seconds with one or more CPUs active   ( 98%)

       9.259903 seconds with 2 or more CPUs active     ( 65%)

       3.400337 seconds with 3 or more CPUs active     ( 24%)

        4.602338 seconds with one CPU active            ( 33%)

       5.859565 seconds with 2 CPUs active             ( 41%)

       2.451869 seconds with 3 CPUs active             ( 17%)

       0.948468 seconds with all CPUs active           (  7%)

                        Domain Utilization of Shared Domains (per second)

      0.00 idle                         106174.04 kahuna

      0.00 storage                           0.00 exempt

      0.00 raid                              0.00 target

      0.00 dnscache                          0.00 cifs

      0.00 wafl_exempt                       0.00 wafl_xcleaner

      0.00 sm_exempt                         0.00 cluster

      0.00 protocol                     956506.30 nwk_exclusive

      0.00 nwk_exempt                        0.00 nwk_legacy

      0.00 hostOS
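As a quick cross-check of the statit figures above, here is a minimal Python sketch of how the "seconds with N CPUs active" lines relate to an "any CPU busy" figure versus an average per-CPU figure. The numbers are copied from the paste; the variable names are made up for illustration, and treating the first figure as what cpu_busy reports is an assumption, not something statit states.

```python
# Figures copied from the statit paste above (4 CPUs, 14.141772 s sample).
ELAPSED_S = 14.141772
NUM_CPUS = 4

# Seconds during which exactly k CPUs were active.
exactly_k_active = {1: 4.602338, 2: 5.859565, 3: 2.451869, 4: 0.948468}

# "ANY"-style figure: share of wall-clock time with at least one CPU busy.
any_busy_s = sum(exactly_k_active.values())
any_busy_pct = 100 * any_busy_s / ELAPSED_S

# "AVG"-style figure: CPU-seconds consumed over CPU-seconds available.
cpu_seconds = sum(k * secs for k, secs in exactly_k_active.items())
avg_busy_pct = 100 * cpu_seconds / (NUM_CPUS * ELAPSED_S)

print(f"at least one CPU active:     {any_busy_pct:.1f}%")
print(f"average per-CPU utilization: {avg_busy_pct:.1f}%")
```

The first figure lands on the 98% that statit prints for "one or more CPUs active", while the second comes out a little under 50%, in line with the 200% system time spread over four CPUs.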

Can anyone see what I'm missing????

Thanks,

Craig

16 REPLIES

maske

Where is the rest of the output of sysstat -m?  I'd like to see what the CPU domains are doing for protocol, etc...

GARDINEC_EBRD

I don't believe sysstat -m gives domain stats (?).  The statit output shows some domain info...

maske

Sorry, what does sysstat -m -x 1 look like?

GARDINEC_EBRD

Whatever was going on with that filer has stopped now, but I still have a sysstat -x 1 in my PuTTY history from earlier.  It seems to be the same output with -m -x as it is with just -x:

CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                      in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
96%   4409      0      0    4440   47525  20247   21071  78693  139256 139321    36s    94%   82%  :    10%       0     31      0     468    401       0      0
99%   5479      0      0    5496   84828  28699   10989     24  146130 145999    36s    96%    0%  -     6%       5     12      0      40      5       0      0
99%   5118      1      0    5123   51956  16436    6283      0  148077 148274    36s    98%    0%  -     4%       0      4      0      99      4       0      0
85%   4122      0      0    4353   60060  15606   22340  51172  129171 129106    36s    98%   50%  Ff   13%       0    231      0    4392   3971       0      0
95%   4258      0      0    4264   45687  27781   18496  61452  140313 140247    41s    97%  100%  :f    8%       0      6      0      11      0       0      0
97%   4312      0      0    4314   74270  13122   17403  54282  142726 142726    41s    98%  100%  :f    9%       0      2      0       9      0       0      0
97%   4623      0      0    4644   60221  17715   13232  75404  143458 143458    41s    98%  100%  :f   10%       5     16      0     113      0       0      0
97%   3878      1      0    3887   65123  15201   12303  14945  142554 142489    41s    98%   35%  Fn    6%       0      8      0      11      5       0      0
89%   3893      0      0    3939   58449  10234   19375  83097  131594 131659    41s    98%  100%  :f   12%       0     46      0     763    658       0      0
95%   4375      0      0    4410   59059  29603   26335  86619  134277 134277    41s    92%  100%  :f   13%       0     35      0     519    396       0      0
95%   3987      0      0    3995   54986   8975    9708  56304  141930 141930    45s    90%  100%  :f   11%       0      8      0      27      0       0      0
98%   2624      0      0    2634   36462  12530   15296  20542  144231 144166    45s    88%   64%  :    15%       5      5      0      29      0       0      0
97%   2662      0      0    2682   69821   4072   25248  93228  116828 116959    42s    90%   53%  Ms   24%       0     20      0     329    211       0      0
90%   2707      0      0    2721   62560   7956   20814  57417  135585 135585    37s    89%  100%  :f   29%       0     14      0     141    137       0      0
90%   2901      0      0    2910   25538  11683   33010  68322  136544 136480    38s    89%  100%  :f   33%       0      9      0      41      0       0      0
92%   2880      0      0    2883   27997  11600   19170  46875  140609 140738    34s    90%   90%  :    31%       0      3      0       2      0       0      0
94%   2538      1      0    2556   18032   9506   17258     24  145203 145074    35s    88%    0%  -    30%      10      7      0      15      0       0      0

maske

Can you attach that output in a text document?  It's all wordwrapped.  CPU utilization on 8.x and above can go to 100% at any given time as the system uses all CPU when the filer is less busy to run background operations.  I do understand your concern though that DFM is tripping when this happens.  If you want to PM me your serial number I can see if we have autosupport perf data I can take a look at.

maske

Hi Craig,

I have the data I said I'd collect for you.  Please PM me your email address and I'll send you what I have.

Thanks,

Doug

chad_petrie

I am having a similar issue. Was there anything specific that jumped out to resolve it?

thomas_glodde

Try the following:

priv set diag

sysstat -M 1 (capital M)

GARDINEC_EBRD

Hi,

I still haven't found out why cpu_busy is so high.  I have a system exhibiting this again right now.  I know this controller is running some NDMP tape operations for TSM at the moment, so I suspect this is the reason. However, I'd like to see which domain is at 100%.

@Thomas: I've run sysstat -M 1 at diag level (thanks, didn't know this one) but it's still unclear which domain/core is at 100%.  Apologies for the paste below, but this is a FAS3240 that is not busy enough to warrant 100% CPU, so what is causing it?  As before, latency is fine and seems to be unaffected:

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

100%   69%   27%    7%  52%  52%  52%  52%  54%    118%       0%      0%     12%   2%     1%     6%     14%( 13%)          0%        0%   0%    49%   5%   4%   2369   0%

100%   75%   35%   10%  57%  58%  57%  57%  57%    126%       0%      0%     11%   1%     0%     9%     20%( 19%)          0%        0%   0%    48%   7%   5%   3927   0%

100%   69%   27%    7%  52%  52%  52%  52%  53%    119%       0%      0%     11%   1%     0%     6%     13%( 12%)          0%        0%   0%    50%   5%   4%   2152   0%

100%   71%   31%   10%  55%  55%  54%  55%  56%    121%       0%      0%     12%   1%     0%     8%     18%( 16%)          0%        0%   0%    50%   5%   4%   2338   0%

100%   70%   28%    8%  53%  53%  52%  52%  54%    120%       0%      0%     11%   1%     0%     6%     14%( 13%)          0%        0%   0%    50%   5%   4%   2177   0%

100%   74%   33%    9%  56%  56%  56%  56%  57%    124%       0%      0%     11%   1%     0%     8%     18%( 17%)          0%        0%   0%    50%   6%   5%   3773   0%

100%   70%   29%    9%  54%  54%  53%  53%  55%    122%       0%      0%     11%   1%     0%     5%     15%( 14%)          0%        0%   0%    49%   6%   5%   3111   0%

100%   84%   58%   37%  71%  71%  71%  71%  72%    101%       0%      0%     17%  11%     0%     9%     64%( 43%)         23%        0%   0%    52%   6%   4%   2886  49%

100%   74%   36%   15%  58%  57%  57%  57%  59%    117%       0%      0%     13%   7%     0%     6%     19%( 16%)          0%        0%   0%    59%   5%   4%   2478 100%

100%   78%   44%   20%  62%  63%  62%  62%  63%    119%       0%      0%     14%   8%     0%     7%     31%( 24%)          0%        0%   0%    60%   6%   4%   3316 100%

100%   73%   35%   14%  57%  57%  56%  57%  58%    115%       0%      0%     14%   8%     0%     5%     18%( 15%)          0%        0%   0%    59%   5%   4%   2006 100%

100%   80%   44%   17%  64%  65%  64%  64%  64%    122%       0%      0%     14%   6%     0%     8%     28%( 24%)          0%        0%   0%    59%   7%  13%   3969 100%

100%   78%   41%   16%  61%  62%  61%  60%  62%    124%       0%      0%     15%   4%     0%     6%     27%( 24%)          0%        0%   0%    55%   8%   6%   3583  18%

100%   76%   38%   13%  59%  60%  58%  58%  60%    122%       0%      0%     15%   5%     0%     5%     24%( 21%)          0%        0%   0%    52%   7%   5%   3453   0%

100%   74%   35%   12%  57%  58%  57%  56%  58%    120%       0%      0%     15%   4%     0%     4%     20%( 18%)          0%        0%   0%    53%   6%   5%   2458   0%

100%   78%   41%   15%  62%  62%  62%  61%  62%    122%       0%      0%     17%   6%     0%     6%     25%( 22%)          0%        0%   0%    52%   7%  10%   3212   0%

100%   78%   42%   16%  61%  62%  61%  61%  61%    124%       0%      0%     17%   7%     0%     4%     28%( 24%)          0%        0%   0%    51%   8%   6%   3599   0%

100%   80%   44%   17%  63%  63%  62%  62%  63%    122%       0%      0%     18%   7%     2%     5%     29%( 25%)          0%        0%   0%    54%   8%   6%   3825   0%

100%   84%   53%   26%  68%  69%  68%  67%  68%    125%       0%      0%     20%   8%     0%     8%     41%( 32%)          0%        0%   0%    54%   9%   7%   6012   0%

100%   82%   49%   21%  66%  67%  65%  65%  66%    125%       0%      0%     20%   9%     0%     4%     33%( 29%)          0%        0%   0%    55%   9%   7%   4600   0%

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

100%   86%   58%   31%  72%  72%  71%  71%  72%    129%       0%      0%     20%   9%     1%     6%     51%( 38%)          0%        0%   0%    53%  10%   7%   5297   0%

100%   84%   51%   23%  67%  69%  66%  67%  67%    126%       0%      0%     21%  10%     0%     3%     36%( 30%)          0%        0%   0%    55%  10%   7%   4873   0%

100%   93%   77%   59%  84%  85%  84%  84%  84%     92%       0%      0%     24%  20%     0%     7%    104%( 57%)         22%        0%   0%    55%   7%   5%   4177  78%

100%   87%   59%   30%  73%  74%  72%  72%  72%    121%       0%      0%     23%  15%     0%     5%     41%( 33%)          0%        0%   0%    64%  10%  12%   5077 100%

100%   89%   64%   36%  76%  77%  75%  75%  75%    121%       0%      0%     23%  16%     0%     8%     52%( 38%)          0%        0%   0%    63%  10%   9%   5349 100%

100%   89%   63%   33%  74%  76%  74%  74%  74%    124%       0%      0%     24%  16%     0%     8%     43%( 35%)          0%        0%   0%    63%  11%   9%   5560 100%

100%   89%   63%   36%  75%  76%  75%  74%  74%    122%       0%      0%     23%  14%     0%     5%     56%( 40%)          0%        0%   0%    61%  10%   8%   6042 100%

100%   89%   62%   33%  74%  76%  74%  74%  74%    129%       0%      0%     25%  13%     0%     4%     48%( 39%)          0%        0%   0%    57%  12%  10%   6886   1%

100%   90%   66%   39%  77%  79%  77%  76%  76%    126%       0%      0%     25%  13%     0%     5%     61%( 44%)          0%        0%   0%    57%  12%   9%   7179   0%

100%   89%   63%   34%  74%  76%  74%  74%  74%    131%       0%      0%     25%  13%     0%     4%     49%( 40%)          0%        0%   0%    56%  11%   9%   7226   0%

100%   74%   38%   16%  59%  60%  59%  58%  60%    121%       0%      0%     17%   6%     0%     4%     25%( 21%)          0%        0%   0%    51%   7%   5%   3626   0%

100%   63%   20%    4%  48%  47%  47%  47%  49%    112%       0%      0%     11%   1%     0%     4%      7%(  6%)          0%        0%   0%    49%   4%   2%    829   0%

100%   67%   25%    6%  51%  51%  51%  51%  52%    117%       0%      0%     11%   1%     0%     5%     12%( 11%)          0%        0%   0%    49%   5%   3%   2277   0%

100%   69%   28%    8%  53%  53%  52%  52%  54%    119%       0%      0%     11%   1%     0%     6%     15%( 13%)          0%        0%   0%    50%   5%   3%   2478   0%

100%   65%   23%    5%  49%  49%  49%  49%  51%    115%       0%      0%     12%   1%     0%     4%      9%(  9%)          0%        0%   0%    50%   4%   2%   1247   0%

100%   66%   24%    7%  50%  50%  50%  49%  52%    116%       0%      0%     11%   1%     0%     5%     11%( 10%)          0%        0%   0%    50%   4%   3%   1214   0%

100%   66%   23%    6%  51%  50%  50%  50%  52%    116%       0%      0%     11%   1%     0%     4%     10%(  9%)          0%        0%   0%    50%   4%   5%   1488   0%

100%   80%   53%   35%  68%  68%  67%  68%  69%     94%       0%      0%     18%  13%     1%     8%     56%( 34%)         18%        0%   0%    57%   4%   2%   1357  97%

100%   74%   38%   18%  59%  59%  59%  58%  60%    119%       0%      0%     13%   7%     1%     7%     23%( 19%)          0%        0%   0%    57%   6%   3%   3751 100%

100%   67%   26%    9%  51%  51%  51%  50%  54%    112%       0%      0%     12%   6%     0%     4%      9%(  8%)          0%        0%   0%    56%   4%   2%    929 100%

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

100%   80%   44%   17%  63%  63%  62%  62%  63%    126%       0%      0%     12%   3%     1%    13%     27%( 24%)          0%        0%   0%    55%   7%   6%   3182  61%

100%   69%   27%    7%  52%  51%  52%  52%  53%    118%       0%      0%     12%   2%     0%     6%     12%( 11%)          0%        0%   0%    50%   5%   3%   2184   0%

100%   67%   24%    6%  51%  51%  50%  50%  52%    117%       0%      0%     11%   1%     0%     5%     11%( 10%)          0%        0%   0%    50%   5%   3%   1883   0%

100%   80%   43%   16%  62%  62%  62%  61%  62%    130%       0%      0%     12%   1%     0%    11%     30%( 25%)          0%        0%   0%    49%   7%   6%   5291   0%

100%   72%   30%    8%  54%  54%  54%  53%  55%    122%       0%      0%     11%   1%     0%     8%     16%( 15%)          0%        0%   0%    49%   6%   4%   3123   0%

100%   80%   40%   12%  60%  60%  60%  59%  61%    128%       0%      0%     11%   1%     0%    13%     28%( 24%)          0%        0%   0%    49%   6%   5%   3490   0%

100%   62%   18%    4%  47%  46%  46%  46%  49%    112%       0%      0%     11%   1%     0%     3%      6%(  5%)          0%        0%   0%    49%   4%   2%    881   0%

100%   74%   42%   26%  62%  61%  61%  62%  63%    105%       0%      0%     12%   6%     0%     6%     47%( 30%)         14%        0%   0%    49%   4%   3%   1525  16%

100%   78%   48%   29%  65%  64%  64%  65%  66%    101%       0%      0%     16%  11%     0%     6%     49%( 31%)          8%        0%   0%    62%   4%   2%   1682 100%

100%   73%   34%   12%  56%  56%  56%  56%  57%    114%       0%      0%     14%   8%     0%     5%     16%( 14%)          0%        0%   0%    59%   5%   3%   1881 100%

100%   70%   31%   12%  54%  54%  54%  54%  56%    111%       0%      0%     14%   8%     1%     4%     13%( 11%)          0%        0%   0%    59%   4%   2%   1135 100%

100%   69%   29%   10%  53%  53%  52%  53%  55%    112%       0%      0%     14%   6%     0%     3%     12%( 10%)          0%        0%   0%    59%   4%   2%   1119 100%

100%   66%   24%    7%  51%  50%  50%  50%  52%    114%       0%      0%     12%   2%     0%     4%     11%( 10%)          0%        0%   0%    52%   4%   3%   1320  19%

100%   64%   22%    5%  49%  49%  48%  49%  51%    113%       0%      0%     12%   1%     0%     5%      8%(  7%)          0%        0%   0%    50%   4%   2%   1157   0%

100%   65%   22%    5%  49%  49%  49%  49%  51%    115%       0%      0%     11%   1%     0%     4%      9%(  8%)          0%        0%   0%    50%   4%   3%   1594   0%

100%   65%   22%    5%  50%  50%  49%  49%  51%    115%       0%      0%     11%   1%     0%     5%      9%(  8%)          0%        0%   0%    49%   4%   5%   1530   0%

100%   62%   19%    4%  47%  47%  47%  47%  49%    113%       0%      0%     11%   1%     0%     3%      7%(  6%)          0%        0%   0%    48%   4%   2%   1022   0%

100%   63%   20%    5%  48%  48%  48%  48%  50%    114%       0%      0%     11%   1%     0%     3%      9%(  8%)          0%        0%   0%    48%   4%   2%   1073   0%

100%   62%   19%    4%  48%  47%  47%  47%  50%    114%       0%      0%     11%   1%     0%     3%      8%(  7%)          0%        0%   0%    49%   4%   2%   1199   0%

100%   61%   18%    4%  47%  46%  46%  46%  49%    113%       0%      0%     11%   1%     0%     3%      7%(  6%)          0%        0%   0%    48%   4%   2%    803   0%

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

100%   67%   24%    6%  51%  50%  50%  50%  52%    116%       0%      0%     11%   1%     1%     5%     12%( 11%)          0%        0%   0%    49%   5%   3%   2117   0%

100%   65%   21%    4%  49%  48%  48%  48%  50%    115%       0%      0%     11%   1%     0%     4%      9%(  8%)          0%        0%   0%    49%   4%   3%   1638   0%

thomas_glodde

Looking at the data above I'd say everything is fine, especially since you said you don't notice any bad performance. These ANYx+ values are possibly sums of individual values. I'd be alarmed if any CPU or listed process goes to 100 or above. Be aware, though, that some processes scale over more CPUs, so you should only worry if it's a round value like 100, 200, 300 or 400, since then it's maxing out that number of CPUs.
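For anyone who wants to apply that rule of thumb to a capture rather than eyeball it, here is a rough Python sketch that parses one sysstat -M row from this thread and flags domain columns sitting at roughly a whole number of CPUs. The list of columns treated as domains, the split of the "WAFL_Ex(Kahu)" column into two values (the name "Kahu" is made up here), and the 2-point tolerance are all assumptions for illustration, not documented behaviour.

```python
import re

# Header and one data row copied from the sysstat -M paste in this thread.
HEADER = ("ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol "
          "Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean "
          "SM_Exempt Cifs Exempt Intr Host Ops/s CP")
ROW = ("100% 69% 27% 7% 52% 52% 52% 52% 54% 118% 0% 0% 12% 2% 1% 6% "
       "14%( 13%) 0% 0% 0% 49% 5% 4% 2369 0%")

# Columns treated as CPU domains in this sketch (an assumption).
DOMAINS = ["Network", "Protocol", "Cluster", "Storage", "Raid", "Target",
           "Kahuna", "WAFL_Ex", "WAFL_XClean", "SM_Exempt", "Cifs", "Exempt",
           "Intr", "Host"]


def parse_row(header: str, row: str) -> dict:
    """Map column names to integer values for one sysstat -M row."""
    names = []
    for name in header.split():
        if name == "WAFL_Ex(Kahu)":
            # This column prints two numbers, e.g. "14%( 13%)".
            names += ["WAFL_Ex", "Kahu"]
        else:
            names.append(name)
    values = [int(v) for v in re.findall(r"\d+", row)]
    if len(names) != len(values):
        raise ValueError("row does not match header")
    return dict(zip(names, values))


def near_whole_cpus(pct: int, tolerance: int = 2) -> bool:
    """True if pct is within `tolerance` points of 100, 200, 300, ..."""
    if pct < 100 - tolerance:
        return False
    nearest = round(pct / 100) * 100
    return abs(pct - nearest) <= tolerance


parsed = parse_row(HEADER, ROW)
suspects = [d for d in DOMAINS if near_whole_cpus(parsed[d])]
print(suspects)
```

For the sample row nothing is flagged, since Network at 118% is not close to a multiple of 100. Rows in the paste where Network reads 101% would be flagged under this tolerance, so the threshold is worth tuning before trusting it.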

GARDINEC_EBRD

Thanks for the response Thomas, I agree totally, and I'm not worried, just curious.  Well...that and people keep asking why DFM is alerting for high CPU and I'm having trouble explaining why it's nothing to worry about!!  It would be good to have an understanding of what is happening, either an error in the cpu_busy counter, or a domain that isn't included in the sysstat/statit output???

chad_petrie

How would you say this compares? I don't know that I have any latency issues, no complaints, nothing I can see....but the DFM alerts and these stats are a little worrisome.

prod-v3140-2*> sysstat -M 3

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

  99%   79%   47%   19%  68%  62%  57%  69%  83%     50%       0%      0%      6%   5%     0%    47%     69%( 46%)          0%        4%  41%    16%  15%  18%  30459   7%

  99%   82%   49%   19%  69%  63%  58%  71%  85%     52%       0%      0%      6%   5%     0%    52%     64%( 43%)          0%        5%  45%    12%  16%  19%  34936   0%

  99%   81%   48%   17%  68%  63%  59%  72%  80%     49%       0%      0%      6%   5%     0%    51%     63%( 44%)          0%        5%  45%    13%  16%  19%  34022   0%

100%   87%   65%   33%  76%  74%  71%  80%  79%     52%       0%      0%      6%   6%     0%    43%     96%( 56%)          5%        4%  44%    22%  14%  15%  40605  59%

  99%   89%   71%   40%  80%  78%  75%  83%  83%     54%       0%      0%      6%   6%     0%    42%    107%( 57%)          0%        3%  45%    29%  14%  13%  42087  53%

100%   89%   67%   37%  78%  75%  72%  81%  86%     54%       0%      0%      6%   5%     0%    40%    105%( 59%)          0%        4%  43%    27%  16%  15%  36804   0%

100%   86%   62%   32%  76%  73%  68%  77%  87%     58%       0%      0%      6%   5%     0%    42%     91%( 55%)          0%        5%  44%    21%  17%  17%  36414   0%

  95%   71%   38%   14%  60%  56%  51%  60%  75%     42%       0%      0%      5%   4%     0%    38%     65%( 47%)          4%        5%  34%    15%  14%  16%  24124  13%

100%   84%   63%   36%  76%  72%  69%  77%  85%     51%       0%      0%      6%   6%     0%    42%     97%( 56%)          1%        4%  41%    28%  13%  13%  35809  99%

  99%   87%   66%   35%  77%  74%  71%  81%  81%     54%       0%      0%      5%   4%     0%    40%    101%( 57%)          0%        4%  45%    26%  14%  14%  40047   0%

  99%   89%   67%   35%  78%  75%  72%  82%  82%     54%       0%      0%      5%   4%     0%    44%    102%( 55%)          0%        4%  45%    26%  14%  14%  40027   0%

100%   89%   66%   32%  77%  74%  69%  83%  83%     58%       0%      0%      5%   4%     0%    48%     90%( 50%)          0%        4%  51%    20%  15%  15%  46956   0%

100%   87%   67%   40%  78%  74%  71%  80%  87%     53%       0%      0%      5%   5%     0%    41%    104%( 58%)          4%        3%  43%    26%  14%  13%  39152  67%

100%   85%   64%   37%  77%  72%  70%  78%  86%     52%       0%      0%      7%   6%     0%    39%    102%( 59%)          0%        4%  39%    29%  15%  14%  33303  29%

100%   88%   65%   35%  78%  73%  69%  81%  89%     59%       0%      0%      5%   4%     0%    49%     90%( 50%)          0%        4%  49%    20%  17%  16%  43233   0%

100%   85%   65%   37%  77%  73%  70%  80%  86%     57%       0%      0%      5%   4%     0%    43%     97%( 56%)          0%        4%  46%    23%  15%  15%  43031   0%

100%   86%   66%   40%  78%  75%  72%  79%  87%     51%       0%      0%      6%   5%     0%    38%    107%( 59%)          4%        4%  39%    30%  14%  14%  35299  38%

  99%   84%   60%   32%  75%  70%  66%  76%  87%     53%       0%      0%      6%   6%     0%    43%     90%( 54%)          0%        4%  41%    24%  16%  15%  33791  62%

100%   88%   66%   36%  78%  75%  71%  81%  87%     59%       0%      0%      6%   5%     0%    44%     96%( 53%)          0%        4%  47%    21%  16%  16%  41673   0%

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

100%   89%   68%   36%  79%  76%  73%  82%  83%     55%       0%      0%      6%   5%     0%    44%    101%( 55%)          0%        4%  45%    26%  15%  14%  39683   0%

  98%   85%   60%   29%  75%  70%  68%  76%  83%     55%       0%      0%      6%   5%     0%    41%     87%( 54%)          3%        4%  43%    22%  15%  18%  35617   5%

100%   87%   64%   35%  77%  74%  69%  79%  88%     56%       0%      0%      6%   6%     0%    42%     92%( 55%)          2%        4%  44%    25%  16%  16%  39133  97%

  99%   90%   72%   39%  80%  79%  76%  84%  81%     57%       0%      0%      6%   5%     0%    37%    110%( 61%)          0%        4%  44%    29%  14%  15%  41319   0%

  99%   87%   64%   32%  76%  74%  71%  79%  82%     53%       0%      0%      6%   5%     0%    39%     99%( 58%)          0%        5%  41%    27%  15%  17%  33659   0%

  99%   84%   57%   27%  73%  70%  64%  74%  84%     58%       0%      0%      6%   5%     0%    42%     80%( 53%)          0%        5%  44%    20%  16%  17%  35250   0%

  99%   84%   57%   26%  72%  70%  65%  74%  79%     49%       0%      0%      7%   7%     0%    40%     87%( 56%)          4%        4%  39%    21%  15%  15%  31586  70%

  99%   85%   62%   29%  74%  72%  68%  78%  80%     53%       0%      0%      6%   5%     0%    42%     92%( 55%)          0%        4%  43%    23%  15%  15%  35775  17%

thomas_glodde

Chad, seems fine to me too. And I'm not sure an alert on ANY CPU usage is of any help. I haven't checked myself, but maybe there are different triggers available.

scottgelb

Agreed. Any CPU can cause concern and if all CPUs are at 1% it is pegged at that metric. Not my favorite counter.

DER_OEST_

Hello Craig,

I just stumbled across this post and it's still flagged unresolved.

We have the same situation - DFM generates alerts for cpu_busy near 100%.

What I learned is that cpu_busy measures the percentage of "CPU active time" over a timeframe, let's say 1 second.

Though each core may only be utilized 10%, if you have 10 cores and each core is active at an individual time, you would get 100%.

So if you have 4 CPU cores and each core works for 0.1s at a different time during that 1s interval, then you'll have a cpu_busy counter of 40% (4 x 0.1s).

For us, it looks like this (sysstat -m 1):

ANY  AVG  CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11

100%  15%   15%  14%  14%  14%  14%  14%  16%  15%  16%  16%  18%  16%

That just shows that there is very light load on the system. But because over that imaginary 1sec window there was always one (or more) cores working, the total CPU busy (or active) is 100%.
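The reasoning above can be sketched in a few lines of Python. This is only a toy model under the assumption that the ANY figure is the union of all cores' busy time over the window; the function name busy_stats and the staggered schedules are made up for illustration.

```python
def busy_stats(intervals_per_core, window=1.0):
    """Return (any_pct, avg_pct) for per-core lists of (start, end) busy spans."""
    # AVG: total busy time across cores over total available core time.
    total_busy = sum(end - start
                     for spans in intervals_per_core
                     for start, end in spans)
    avg_pct = 100 * total_busy / (len(intervals_per_core) * window)

    # ANY: length of the union of all busy spans over the window.
    merged_len = 0.0
    current_start = current_end = None
    for start, end in sorted(s for spans in intervals_per_core for s in spans):
        if current_end is None or start > current_end:
            if current_end is not None:
                merged_len += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        merged_len += current_end - current_start
    any_pct = 100 * merged_len / window
    return any_pct, avg_pct


# Four cores, each busy for 0.1 s at a different time in a 1 s window.
four_cores = [[(i * 0.1, (i + 1) * 0.1)] for i in range(4)]
print(busy_stats(four_cores))

# Twelve cores, each busy for 0.15 s, staggered so the spans cover the window.
step = (1.0 - 0.15) / 11
twelve_cores = [[(i * step, i * step + 0.15)] for i in range(12)]
print(busy_stats(twelve_cores))
```

The first case gives roughly 40% ANY with 10% AVG, matching the 4 x 0.1s example; the second gives roughly 100% ANY with 15% AVG, the same shape as the sysstat -m line above.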

I hope this helps anybody else stumbling over this post.

Greetings, Timo.

cscott1

Looks to me like nwk_legacy is high. I don't know what it means, but I have ANY1+ stuck hard at 100% and nwk_legacy very high, along with piles of domain switches from storage to nwk_legacy.
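To put a number on that observation, here is a small Python sketch that converts the "total" column of the per-second Multiprocessor Statistics in the statit paste at the top of the thread into percent of one CPU. It assumes those figures are microseconds of CPU time per second, which is an inference from the data (the idle total of 1995933.54 lines up with the "idle time 200%" line) rather than something stated in the output.

```python
# "total" column from the statit paste earlier in the thread.
# Assumed unit: microseconds of CPU time per second of wall-clock time.
usec_per_sec = {
    "CP rupt usec": 7377.51,
    "nonCP rupt usec": 35844.80,
    "idle": 1995933.54,
    "kahuna": 50895.11,
    "storage": 114464.79,
    "exempt": 488694.63,
    "raid": 17999.16,
    "target": 2829.77,
    "dnscache": 0.00,
    "cifs": 194.04,
    "wafl_exempt": 122950.08,
    "wafl_xcleaner": 8601.33,
    "sm_exempt": 70.78,
    "cluster": 0.00,
    "protocol": 155.43,
    "nwk_exclusive": 2866.26,
    "nwk_exempt": 186457.47,
    "nwk_legacy": 924541.78,
    "hostOS": 40122.98,
}
NUM_CPUS = 4

# Sanity check on the unit assumption: the entries should add up to about
# NUM_CPUS * 1,000,000 microseconds per second.
total = sum(usec_per_sec.values())
print(f"accounted for: {total / 1e6:.2f} CPU-seconds per second")

# Express each non-idle entry as a percentage of one CPU, busiest first.
busy = {name: 100 * usec / 1e6
        for name, usec in usec_per_sec.items() if name != "idle"}
for name, pct in sorted(busy.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{name:16s} {pct:6.1f}%")
```

Under that assumption the entries add up to almost exactly four CPU-seconds per second, and nwk_legacy works out to a little over 92% of one CPU in that sample, with exempt next at about 49%. That would fit the remark that nwk_legacy is the busy one, though it does not by itself explain what the domain is doing.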
