CPU usage troubleshooting

DAUGAVPILS

Hi

We have a few FAS3240 systems, and recently we started seeing high CPU usage alerts from our monitoring systems.

Running sysstat shows these results:

ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host  Ops/s   CP

  84%   55%   29%   13%  50%  53%  50%  51%  48%     32%       0%      0%     18%  27%     7%    26%     39%( 30%)          9%        0%   0%    19%   8%  16%   6623  42%

  77%   45%   22%    9%  43%  45%  43%  44%  41%     29%       0%      0%     18%  24%     5%    20%     35%( 27%)          0%        0%   0%    19%   7%  15%   6218 100%

  86%   62%   37%   21%  55%  58%  54%  56%  53%     25%       0%      0%     24%  40%     5%    32%     31%( 23%)         13%        0%   0%    33%   7%  11%   5841  85%

  71%   32%   11%    3%  35%  37%  35%  36%  32%     28%       0%      0%     14%  13%     4%    16%     33%( 27%)          0%        0%   0%     6%   8%  18%   5425  24%

  90%   62%   36%   19%  56%  58%  55%  57%  54%     26%       0%      0%     22%  39%     4%    35%     32%( 25%)         12%        0%   0%    34%   7%  12%   5462  73%

  70%   30%    9%    2%  34%  35%  34%  34%  31%     27%       0%      0%     15%  14%     3%    16%     30%( 25%)          0%        0%   0%     6%   7%  17%   4832  54%

  86%   60%   35%   19%  54%  57%  55%  53%  53%     28%       0%      0%     22%  35%     5%    31%     34%( 26%)         10%        0%   0%    31%   7%  13%   5495  94%

  70%   34%   13%    4%  36%  38%  36%  36%  33%     26%       0%      0%     11%  13%     4%    20%     31%( 26%)          6%        0%   0%     8%   7%  17%   5045  14%

  84%   55%   31%   16%  51%  53%  50%  51%  50%     27%       0%      0%     16%  33%     5%    29%     33%( 26%)          7%        0%   0%    33%   7%  14%   5523 100%

  78%   43%   19%    8%  43%  44%  44%  44%  40%     30%       0%      0%     12%  18%     5%    21%     38%( 30%)          6%        0%   0%    15%   8%  18%   6439  64%

  80%   52%   29%   15%  48%  51%  47%  48%  46%     25%       0%      0%     17%  31%     6%    23%     37%( 28%)          3%        0%   0%    30%   7%  13%   5322  86%

  75%   41%   18%    7%  41%  42%  42%  40%  38%     29%       0%      0%     13%  17%     5%    20%     35%( 28%)          7%        0%   0%    14%   8%  16%   6251  20%

  79%   47%   23%   11%  45%  47%  43%  45%  43%     25%       0%      0%     15%  26%     4%    26%     34%( 26%)          4%        0%   0%    24%   7%  13%   5096 100%

  69%   38%   17%    5%  36%  39%  33%  40%  33%     17%       0%      0%     12%  21%     3%    22%     24%( 20%)         12%        0%   0%    17%   6%  11%   4380  33%

  72%   37%   13%    3%  35%  39%  30%  41%  32%     19%       0%      0%     16%  22%     3%    18%     30%( 23%)          0%        0%   0%    16%   7%  11%   5088  98%

  69%   37%   17%    6%  37%  37%  34%  38%  39%     20%       0%      0%     11%  17%     4%    27%     27%( 22%)          7%        0%   0%    16%   7%  13%   4814  24%

  83%   49%   21%    7%  44%  48%  36%  49%  42%     21%       0%      0%     19%  27%     5%    27%     32%( 24%)          5%        0%   0%    21%   7%  11%   5131 100%

  81%   52%   28%   11%  47%  49%  44%  53%  43%     23%       0%      0%     13%  29%     7%    27%     32%( 26%)         11%        0%   0%    29%   7%  12%   5698  59%

  79%   42%   15%    3%  40%  45%  37%  42%  36%     30%       0%      0%     15%  15%     8%    17%     42%( 33%)          0%        0%   0%     9%   9%  16%   6994  28%

As you can see, all CPUs are quite busy; however, it is not clear to me whether this is cause for alarm. AVG hovers around 45% throughout the day and night, more or less constantly.

Where do I start troubleshooting?

Thank you!


ismopuuronen

Hello,

I don't think you need to troubleshoot anything here if CPU usage is about 50%.
At those values the CPU is not causing any slowness for your storage.
If usage on an individual CPU goes above 90%, then you might want to take a closer look at what is causing it.
You can also run >sysstat -x 1 and see whether there is network/protocol load or disk read/write activity on the system that could explain the CPU load.
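
As a rough illustration of the 90% rule of thumb above, here is a minimal Python sketch that reads rows like the ones posted in the question and flags any individual CPU above the threshold. The column order is assumed from the header in the question, only the first nine columns are used, and the names parse_row, busy_cpus and SAMPLE are hypothetical helpers, not part of any NetApp tool.

```python
import re
from statistics import mean

# Column order assumed from the header line posted in the question.
COLUMNS = ["ANY1+", "ANY2+", "ANY3+", "ANY4+", "AVG", "CPU0", "CPU1", "CPU2", "CPU3"]
PER_CPU = ["CPU0", "CPU1", "CPU2", "CPU3"]


def parse_row(line):
    """Return the first nine percentage columns of a sample row, or None."""
    values = re.findall(r"(\d+)%", line)
    if len(values) < len(COLUMNS):
        return None  # header, blank line, or truncated row
    return dict(zip(COLUMNS, (int(v) for v in values)))


def busy_cpus(row, threshold=90):
    """List the individual CPUs whose usage is above the threshold (percent)."""
    return [name for name in PER_CPU if row[name] > threshold]


# First two sample rows from the question, cut to the columns used here.
SAMPLE = """\
ANY1+ ANY2+ ANY3+ ANY4+  AVG CPU0 CPU1 CPU2 CPU3
  84%   55%   29%   13%  50%  53%  50%  51%  48%
  77%   45%   22%    9%  43%  45%  43%  44%  41%
"""

rows = [r for r in map(parse_row, SAMPLE.splitlines()) if r is not None]
for row in rows:
    hot = busy_cpus(row)
    print(f"AVG {row['AVG']}%:", "check " + ", ".join(hot) if hot else "no single CPU above 90%")
print("mean AVG over samples:", mean(r["AVG"] for r in rows))
```

Pasting the full output into SAMPLE should work the same way, since lines without at least nine percentage values are skipped.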

Do you know what threshold triggers the alert?
Do you get these alerts randomly during the week, or do they occur in a cycle?


I would also check the statit command output; it shows CPU and disk statistics, among other things.
The output is an average over the sampling interval, so 30-60 s is enough. This is a good way to gather performance statistics while a problem is occurring.
>priv set advanced
>statit -b
wait....
>statit -e
>priv set


Br.
Ismo.
