VMware Solutions Discussions
Hi All,
I must be having a bad day or something, but I can't get my head around this today. I've got a FAS3240 running at close to 100% on the cpu_busy counter. Latency looks fine, so whatever it is, it's not causing a performance issue, but it is generating CPU alerts in DFM.
I have a number of NDMP tape-to-tape operations running, which seem to be generating this, as the VMware over NFS load is relatively light (1000-2000 NFS ops/sec).
So, sysstat 1 looks like this:
CPU NFS CIFS HTTP Net kB/s Disk kB/s Tape kB/s Cache
in out read write read write age
92% 1165 0 0 15273 4320 31932 47209 145214 145214 51s
98% 2662 0 0 7119 1015 14508 55220 149094 149094 51s
99% 1251 0 0 13042 5001 3756 51936 151912 151978 51s
99% 650 0 0 5901 671 456 7308 155058 154993 51s
99% 1190 0 0 8936 757 296 8 154927 154927 51s
99% 2134 0 0 5885 899 256 0 154206 154206 51s
It doesn't appear to be a CPU core that's at 98% (sysstat -m)...
ANY AVG CPU0 CPU1 CPU2 CPU3
100% 58% 58% 57% 57% 58%
100% 54% 54% 54% 53% 55%
100% 57% 57% 56% 56% 58%
100% 55% 56% 55% 54% 56%
100% 77% 78% 77% 77% 77%
...so I assume it's a cpu domain that's maxed out. Looking at statit output, I can't see any particular domain that's at high utilization:
NetApp Release 8.1.2 7-Mode: Tue Oct 30 19:56:51 PDT 2012
<1O>
Start time: Wed Mar 27 09:32:55 GMT 2013
CPU Statistics
14.141772 time (seconds) 100 %
28.341048 system time 200 %
0.611241 rupt time 4 % (211010 rupts x 3 usec/rupt)
27.729807 non-rupt system time 196 %
28.226036 idle time 200 %
2.514662 time in CP 18 % 100 %
0.104331 rupt time in CP 4 % (37134 rupts x 3 usec/rupt)
Multiprocessor Statistics (per second)
cpu0 cpu1 cpu2 cpu3 total
sk switches 161727.33 163272.40 163634.80 165143.94 653778.47
hard switches 78693.46 79794.03 80383.21 83490.46 322361.16
domain switches 45627.52 45924.02 46501.74 49155.37 187208.65
CP rupts 730.25 355.47 406.10 1134.02 2625.84
nonCP rupts 3352.41 1643.64 1731.04 5568.11 12295.21
IPI rupts 0.00 0.00 0.00 0.00 0.00
grab kahuna 0.00 0.00 0.00 0.00 0.00
grab kahuna usec 0.00 0.00 0.00 0.00 0.00
CP rupt usec 4324.63 127.00 509.55 2416.25 7377.51
nonCP rupt usec 20245.62 579.42 2214.01 12805.68 35844.80
idle 502422.54 502765.85 503554.86 487190.15 1995933.54
kahuna 12694.87 12387.56 12685.04 13127.49 50895.11
storage 28276.80 29771.09 30174.72 26242.11 114464.79
exempt 120301.76 122678.12 121024.58 124690.03 488694.63
raid 4533.66 4430.14 4064.55 4970.66 17999.16
target 626.23 744.81 609.47 849.12 2829.77
dnscache 0.00 0.00 0.00 0.00 0.00
cifs 26.87 46.10 55.79 65.20 194.04
wafl_exempt 33716.85 30728.04 29905.73 28599.32 122950.08
wafl_xcleaner 2258.34 2442.83 2412.92 1487.15 8601.33
sm_exempt 13.15 19.23 17.47 20.79 70.78
cluster 0.00 0.00 0.00 0.00 0.00
protocol 34.93 33.16 50.63 36.63 155.43
nwk_exclusive 629.98 857.46 483.53 895.08 2866.26
nwk_exempt 34561.79 51149.32 49054.67 51691.61 186457.47
nwk_legacy 222720.96 227699.82 229450.67 244670.12 924541.78
hostOS 12610.30 13539.39 13731.16 241.98 40122.98
13.862242 seconds with one or more CPUs active ( 98%)
9.259903 seconds with 2 or more CPUs active ( 65%)
3.400337 seconds with 3 or more CPUs active ( 24%)
4.602338 seconds with one CPU active ( 33%)
5.859565 seconds with 2 CPUs active ( 41%)
2.451869 seconds with 3 CPUs active ( 17%)
0.948468 seconds with all CPUs active ( 7%)
Domain Utilization of Shared Domains (per second)
0.00 idle 106174.04 kahuna
0.00 storage 0.00 exempt
0.00 raid 0.00 target
0.00 dnscache 0.00 cifs
0.00 wafl_exempt 0.00 wafl_xcleaner
0.00 sm_exempt 0.00 cluster
0.00 protocol 956506.30 nwk_exclusive
0.00 nwk_exempt 0.00 nwk_legacy
0.00 hostOS
Can anyone see what I'm missing????
Thanks,
Craig
Where is the rest of the output of sysstat -m? I'd like to see what the CPU domains are doing for protocol, etc...
I don't believe sysstat -m gives domain stats (?). The statit output shows some domain info...
Sorry, what does sysstat -m -x 1 look like?
Whatever was going on with that filer has stopped now, but I still have a sysstat -x 1 in my PuTTY history from earlier. It seems to be the same output with -m -x as it is with just -x:
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s
in out read write read write age hit time ty util in out in out
96% 4409 0 0 4440 47525 20247 21071 78693 139256 139321 36s 94% 82% : 10% 0 31 0 468 401 0 0
99% 5479 0 0 5496 84828 28699 10989 24 146130 145999 36s 96% 0% - 6% 5 12 0 40 5 0 0
99% 5118 1 0 5123 51956 16436 6283 0 148077 148274 36s 98% 0% - 4% 0 4 0 99 4 0 0
85% 4122 0 0 4353 60060 15606 22340 51172 129171 129106 36s 98% 50% Ff 13% 0 231 0 4392 3971 0 0
95% 4258 0 0 4264 45687 27781 18496 61452 140313 140247 41s 97% 100% :f 8% 0 6 0 11 0 0 0
97% 4312 0 0 4314 74270 13122 17403 54282 142726 142726 41s 98% 100% :f 9% 0 2 0 9 0 0 0
97% 4623 0 0 4644 60221 17715 13232 75404 143458 143458 41s 98% 100% :f 10% 5 16 0 113 0 0 0
97% 3878 1 0 3887 65123 15201 12303 14945 142554 142489 41s 98% 35% Fn 6% 0 8 0 11 5 0 0
89% 3893 0 0 3939 58449 10234 19375 83097 131594 131659 41s 98% 100% :f 12% 0 46 0 763 658 0 0
95% 4375 0 0 4410 59059 29603 26335 86619 134277 134277 41s 92% 100% :f 13% 0 35 0 519 396 0 0
95% 3987 0 0 3995 54986 8975 9708 56304 141930 141930 45s 90% 100% :f 11% 0 8 0 27 0 0 0
98% 2624 0 0 2634 36462 12530 15296 20542 144231 144166 45s 88% 64% : 15% 5 5 0 29 0 0 0
97% 2662 0 0 2682 69821 4072 25248 93228 116828 116959 42s 90% 53% Ms 24% 0 20 0 329 211 0 0
90% 2707 0 0 2721 62560 7956 20814 57417 135585 135585 37s 89% 100% :f 29% 0 14 0 141 137 0 0
90% 2901 0 0 2910 25538 11683 33010 68322 136544 136480 38s 89% 100% :f 33% 0 9 0 41 0 0 0
92% 2880 0 0 2883 27997 11600 19170 46875 140609 140738 34s 90% 90% : 31% 0 3 0 2 0 0 0
94% 2538 1 0 2556 18032 9506 17258 24 145203 145074 35s 88% 0% - 30% 10 7 0 15 0 0 0
Can you attach that output in a text document? It's all wordwrapped. CPU utilization on 8.x and above can go to 100% at any given time as the system uses all CPU when the filer is less busy to run background operations. I do understand your concern though that DFM is tripping when this happens. If you want to PM me your serial number I can see if we have autosupport perf data I can take a look at.
Hi Craig,
I have the data I said I'd collect for you. Please PM me your email address and I'll send you what I have.
Thanks,
Doug
I am having a similar issue, was there anything specific that jumped out to resolve this issue?
Try this:
priv set diag
sysstat -M 1 (capital M)
Hi,
I still haven't found out why cpu_busy is so high. I have a system exhibiting this again right now. I know this controller is running some NDMP tape operations for TSM at the moment, so suspect this is the reason, however, I'd like to see which domain is at 100%.
@Thomas: I've run sysstat -M 1 at diag level (thanks, didn't know this one) but still unclear which domain/core is at 100%. Apologies for paste below, but this is a FAS3240, not busy enough to warrant 100% cpu, but what is causing it? As before, latency is fine and seems to be unaffected:
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
100% 69% 27% 7% 52% 52% 52% 52% 54% 118% 0% 0% 12% 2% 1% 6% 14%( 13%) 0% 0% 0% 49% 5% 4% 2369 0%
100% 75% 35% 10% 57% 58% 57% 57% 57% 126% 0% 0% 11% 1% 0% 9% 20%( 19%) 0% 0% 0% 48% 7% 5% 3927 0%
100% 69% 27% 7% 52% 52% 52% 52% 53% 119% 0% 0% 11% 1% 0% 6% 13%( 12%) 0% 0% 0% 50% 5% 4% 2152 0%
100% 71% 31% 10% 55% 55% 54% 55% 56% 121% 0% 0% 12% 1% 0% 8% 18%( 16%) 0% 0% 0% 50% 5% 4% 2338 0%
100% 70% 28% 8% 53% 53% 52% 52% 54% 120% 0% 0% 11% 1% 0% 6% 14%( 13%) 0% 0% 0% 50% 5% 4% 2177 0%
100% 74% 33% 9% 56% 56% 56% 56% 57% 124% 0% 0% 11% 1% 0% 8% 18%( 17%) 0% 0% 0% 50% 6% 5% 3773 0%
100% 70% 29% 9% 54% 54% 53% 53% 55% 122% 0% 0% 11% 1% 0% 5% 15%( 14%) 0% 0% 0% 49% 6% 5% 3111 0%
100% 84% 58% 37% 71% 71% 71% 71% 72% 101% 0% 0% 17% 11% 0% 9% 64%( 43%) 23% 0% 0% 52% 6% 4% 2886 49%
100% 74% 36% 15% 58% 57% 57% 57% 59% 117% 0% 0% 13% 7% 0% 6% 19%( 16%) 0% 0% 0% 59% 5% 4% 2478 100%
100% 78% 44% 20% 62% 63% 62% 62% 63% 119% 0% 0% 14% 8% 0% 7% 31%( 24%) 0% 0% 0% 60% 6% 4% 3316 100%
100% 73% 35% 14% 57% 57% 56% 57% 58% 115% 0% 0% 14% 8% 0% 5% 18%( 15%) 0% 0% 0% 59% 5% 4% 2006 100%
100% 80% 44% 17% 64% 65% 64% 64% 64% 122% 0% 0% 14% 6% 0% 8% 28%( 24%) 0% 0% 0% 59% 7% 13% 3969 100%
100% 78% 41% 16% 61% 62% 61% 60% 62% 124% 0% 0% 15% 4% 0% 6% 27%( 24%) 0% 0% 0% 55% 8% 6% 3583 18%
100% 76% 38% 13% 59% 60% 58% 58% 60% 122% 0% 0% 15% 5% 0% 5% 24%( 21%) 0% 0% 0% 52% 7% 5% 3453 0%
100% 74% 35% 12% 57% 58% 57% 56% 58% 120% 0% 0% 15% 4% 0% 4% 20%( 18%) 0% 0% 0% 53% 6% 5% 2458 0%
100% 78% 41% 15% 62% 62% 62% 61% 62% 122% 0% 0% 17% 6% 0% 6% 25%( 22%) 0% 0% 0% 52% 7% 10% 3212 0%
100% 78% 42% 16% 61% 62% 61% 61% 61% 124% 0% 0% 17% 7% 0% 4% 28%( 24%) 0% 0% 0% 51% 8% 6% 3599 0%
100% 80% 44% 17% 63% 63% 62% 62% 63% 122% 0% 0% 18% 7% 2% 5% 29%( 25%) 0% 0% 0% 54% 8% 6% 3825 0%
100% 84% 53% 26% 68% 69% 68% 67% 68% 125% 0% 0% 20% 8% 0% 8% 41%( 32%) 0% 0% 0% 54% 9% 7% 6012 0%
100% 82% 49% 21% 66% 67% 65% 65% 66% 125% 0% 0% 20% 9% 0% 4% 33%( 29%) 0% 0% 0% 55% 9% 7% 4600 0%
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
100% 86% 58% 31% 72% 72% 71% 71% 72% 129% 0% 0% 20% 9% 1% 6% 51%( 38%) 0% 0% 0% 53% 10% 7% 5297 0%
100% 84% 51% 23% 67% 69% 66% 67% 67% 126% 0% 0% 21% 10% 0% 3% 36%( 30%) 0% 0% 0% 55% 10% 7% 4873 0%
100% 93% 77% 59% 84% 85% 84% 84% 84% 92% 0% 0% 24% 20% 0% 7% 104%( 57%) 22% 0% 0% 55% 7% 5% 4177 78%
100% 87% 59% 30% 73% 74% 72% 72% 72% 121% 0% 0% 23% 15% 0% 5% 41%( 33%) 0% 0% 0% 64% 10% 12% 5077 100%
100% 89% 64% 36% 76% 77% 75% 75% 75% 121% 0% 0% 23% 16% 0% 8% 52%( 38%) 0% 0% 0% 63% 10% 9% 5349 100%
100% 89% 63% 33% 74% 76% 74% 74% 74% 124% 0% 0% 24% 16% 0% 8% 43%( 35%) 0% 0% 0% 63% 11% 9% 5560 100%
100% 89% 63% 36% 75% 76% 75% 74% 74% 122% 0% 0% 23% 14% 0% 5% 56%( 40%) 0% 0% 0% 61% 10% 8% 6042 100%
100% 89% 62% 33% 74% 76% 74% 74% 74% 129% 0% 0% 25% 13% 0% 4% 48%( 39%) 0% 0% 0% 57% 12% 10% 6886 1%
100% 90% 66% 39% 77% 79% 77% 76% 76% 126% 0% 0% 25% 13% 0% 5% 61%( 44%) 0% 0% 0% 57% 12% 9% 7179 0%
100% 89% 63% 34% 74% 76% 74% 74% 74% 131% 0% 0% 25% 13% 0% 4% 49%( 40%) 0% 0% 0% 56% 11% 9% 7226 0%
100% 74% 38% 16% 59% 60% 59% 58% 60% 121% 0% 0% 17% 6% 0% 4% 25%( 21%) 0% 0% 0% 51% 7% 5% 3626 0%
100% 63% 20% 4% 48% 47% 47% 47% 49% 112% 0% 0% 11% 1% 0% 4% 7%( 6%) 0% 0% 0% 49% 4% 2% 829 0%
100% 67% 25% 6% 51% 51% 51% 51% 52% 117% 0% 0% 11% 1% 0% 5% 12%( 11%) 0% 0% 0% 49% 5% 3% 2277 0%
100% 69% 28% 8% 53% 53% 52% 52% 54% 119% 0% 0% 11% 1% 0% 6% 15%( 13%) 0% 0% 0% 50% 5% 3% 2478 0%
100% 65% 23% 5% 49% 49% 49% 49% 51% 115% 0% 0% 12% 1% 0% 4% 9%( 9%) 0% 0% 0% 50% 4% 2% 1247 0%
100% 66% 24% 7% 50% 50% 50% 49% 52% 116% 0% 0% 11% 1% 0% 5% 11%( 10%) 0% 0% 0% 50% 4% 3% 1214 0%
100% 66% 23% 6% 51% 50% 50% 50% 52% 116% 0% 0% 11% 1% 0% 4% 10%( 9%) 0% 0% 0% 50% 4% 5% 1488 0%
100% 80% 53% 35% 68% 68% 67% 68% 69% 94% 0% 0% 18% 13% 1% 8% 56%( 34%) 18% 0% 0% 57% 4% 2% 1357 97%
100% 74% 38% 18% 59% 59% 59% 58% 60% 119% 0% 0% 13% 7% 1% 7% 23%( 19%) 0% 0% 0% 57% 6% 3% 3751 100%
100% 67% 26% 9% 51% 51% 51% 50% 54% 112% 0% 0% 12% 6% 0% 4% 9%( 8%) 0% 0% 0% 56% 4% 2% 929 100%
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
100% 80% 44% 17% 63% 63% 62% 62% 63% 126% 0% 0% 12% 3% 1% 13% 27%( 24%) 0% 0% 0% 55% 7% 6% 3182 61%
100% 69% 27% 7% 52% 51% 52% 52% 53% 118% 0% 0% 12% 2% 0% 6% 12%( 11%) 0% 0% 0% 50% 5% 3% 2184 0%
100% 67% 24% 6% 51% 51% 50% 50% 52% 117% 0% 0% 11% 1% 0% 5% 11%( 10%) 0% 0% 0% 50% 5% 3% 1883 0%
100% 80% 43% 16% 62% 62% 62% 61% 62% 130% 0% 0% 12% 1% 0% 11% 30%( 25%) 0% 0% 0% 49% 7% 6% 5291 0%
100% 72% 30% 8% 54% 54% 54% 53% 55% 122% 0% 0% 11% 1% 0% 8% 16%( 15%) 0% 0% 0% 49% 6% 4% 3123 0%
100% 80% 40% 12% 60% 60% 60% 59% 61% 128% 0% 0% 11% 1% 0% 13% 28%( 24%) 0% 0% 0% 49% 6% 5% 3490 0%
100% 62% 18% 4% 47% 46% 46% 46% 49% 112% 0% 0% 11% 1% 0% 3% 6%( 5%) 0% 0% 0% 49% 4% 2% 881 0%
100% 74% 42% 26% 62% 61% 61% 62% 63% 105% 0% 0% 12% 6% 0% 6% 47%( 30%) 14% 0% 0% 49% 4% 3% 1525 16%
100% 78% 48% 29% 65% 64% 64% 65% 66% 101% 0% 0% 16% 11% 0% 6% 49%( 31%) 8% 0% 0% 62% 4% 2% 1682 100%
100% 73% 34% 12% 56% 56% 56% 56% 57% 114% 0% 0% 14% 8% 0% 5% 16%( 14%) 0% 0% 0% 59% 5% 3% 1881 100%
100% 70% 31% 12% 54% 54% 54% 54% 56% 111% 0% 0% 14% 8% 1% 4% 13%( 11%) 0% 0% 0% 59% 4% 2% 1135 100%
100% 69% 29% 10% 53% 53% 52% 53% 55% 112% 0% 0% 14% 6% 0% 3% 12%( 10%) 0% 0% 0% 59% 4% 2% 1119 100%
100% 66% 24% 7% 51% 50% 50% 50% 52% 114% 0% 0% 12% 2% 0% 4% 11%( 10%) 0% 0% 0% 52% 4% 3% 1320 19%
100% 64% 22% 5% 49% 49% 48% 49% 51% 113% 0% 0% 12% 1% 0% 5% 8%( 7%) 0% 0% 0% 50% 4% 2% 1157 0%
100% 65% 22% 5% 49% 49% 49% 49% 51% 115% 0% 0% 11% 1% 0% 4% 9%( 8%) 0% 0% 0% 50% 4% 3% 1594 0%
100% 65% 22% 5% 50% 50% 49% 49% 51% 115% 0% 0% 11% 1% 0% 5% 9%( 8%) 0% 0% 0% 49% 4% 5% 1530 0%
100% 62% 19% 4% 47% 47% 47% 47% 49% 113% 0% 0% 11% 1% 0% 3% 7%( 6%) 0% 0% 0% 48% 4% 2% 1022 0%
100% 63% 20% 5% 48% 48% 48% 48% 50% 114% 0% 0% 11% 1% 0% 3% 9%( 8%) 0% 0% 0% 48% 4% 2% 1073 0%
100% 62% 19% 4% 48% 47% 47% 47% 50% 114% 0% 0% 11% 1% 0% 3% 8%( 7%) 0% 0% 0% 49% 4% 2% 1199 0%
100% 61% 18% 4% 47% 46% 46% 46% 49% 113% 0% 0% 11% 1% 0% 3% 7%( 6%) 0% 0% 0% 48% 4% 2% 803 0%
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
100% 67% 24% 6% 51% 50% 50% 50% 52% 116% 0% 0% 11% 1% 1% 5% 12%( 11%) 0% 0% 0% 49% 5% 3% 2117 0%
100% 65% 21% 4% 49% 48% 48% 48% 50% 115% 0% 0% 11% 1% 0% 4% 9%( 8%) 0% 0% 0% 49% 4% 3% 1638 0%
Looking at the data above, I'd say everything is fine, especially since you said you don't notice any bad performance. The ANYx+ values are possibly some combination of the individual values. I'd be alarmed if any CPU or listed process reached 100% or above. Be aware, though, that some processes scale across more than one CPU, so you should only worry if a value sits at a round figure like 100, 200, 300 or 400%, since it is then maxing out that number of CPUs.
Thanks for the response Thomas, I agree totally, and I'm not worried, just curious. Well...that and people keep asking why DFM is alerting for high CPU and I'm having trouble explaining why it's nothing to worry about!! It would be good to have an understanding of what is happening, either an error in the cpu_busy counter, or a domain that isn't included in the sysstat/statit output???
How would you say this compares? I don't know that I have any latency issues, no complaints, nothing I can see....but the DFM alerts and these stats are a little worrisome.
prod-v3140-2*> sysstat -M 3
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
99% 79% 47% 19% 68% 62% 57% 69% 83% 50% 0% 0% 6% 5% 0% 47% 69%( 46%) 0% 4% 41% 16% 15% 18% 30459 7%
99% 82% 49% 19% 69% 63% 58% 71% 85% 52% 0% 0% 6% 5% 0% 52% 64%( 43%) 0% 5% 45% 12% 16% 19% 34936 0%
99% 81% 48% 17% 68% 63% 59% 72% 80% 49% 0% 0% 6% 5% 0% 51% 63%( 44%) 0% 5% 45% 13% 16% 19% 34022 0%
100% 87% 65% 33% 76% 74% 71% 80% 79% 52% 0% 0% 6% 6% 0% 43% 96%( 56%) 5% 4% 44% 22% 14% 15% 40605 59%
99% 89% 71% 40% 80% 78% 75% 83% 83% 54% 0% 0% 6% 6% 0% 42% 107%( 57%) 0% 3% 45% 29% 14% 13% 42087 53%
100% 89% 67% 37% 78% 75% 72% 81% 86% 54% 0% 0% 6% 5% 0% 40% 105%( 59%) 0% 4% 43% 27% 16% 15% 36804 0%
100% 86% 62% 32% 76% 73% 68% 77% 87% 58% 0% 0% 6% 5% 0% 42% 91%( 55%) 0% 5% 44% 21% 17% 17% 36414 0%
95% 71% 38% 14% 60% 56% 51% 60% 75% 42% 0% 0% 5% 4% 0% 38% 65%( 47%) 4% 5% 34% 15% 14% 16% 24124 13%
100% 84% 63% 36% 76% 72% 69% 77% 85% 51% 0% 0% 6% 6% 0% 42% 97%( 56%) 1% 4% 41% 28% 13% 13% 35809 99%
99% 87% 66% 35% 77% 74% 71% 81% 81% 54% 0% 0% 5% 4% 0% 40% 101%( 57%) 0% 4% 45% 26% 14% 14% 40047 0%
99% 89% 67% 35% 78% 75% 72% 82% 82% 54% 0% 0% 5% 4% 0% 44% 102%( 55%) 0% 4% 45% 26% 14% 14% 40027 0%
100% 89% 66% 32% 77% 74% 69% 83% 83% 58% 0% 0% 5% 4% 0% 48% 90%( 50%) 0% 4% 51% 20% 15% 15% 46956 0%
100% 87% 67% 40% 78% 74% 71% 80% 87% 53% 0% 0% 5% 5% 0% 41% 104%( 58%) 4% 3% 43% 26% 14% 13% 39152 67%
100% 85% 64% 37% 77% 72% 70% 78% 86% 52% 0% 0% 7% 6% 0% 39% 102%( 59%) 0% 4% 39% 29% 15% 14% 33303 29%
100% 88% 65% 35% 78% 73% 69% 81% 89% 59% 0% 0% 5% 4% 0% 49% 90%( 50%) 0% 4% 49% 20% 17% 16% 43233 0%
100% 85% 65% 37% 77% 73% 70% 80% 86% 57% 0% 0% 5% 4% 0% 43% 97%( 56%) 0% 4% 46% 23% 15% 15% 43031 0%
100% 86% 66% 40% 78% 75% 72% 79% 87% 51% 0% 0% 6% 5% 0% 38% 107%( 59%) 4% 4% 39% 30% 14% 14% 35299 38%
99% 84% 60% 32% 75% 70% 66% 76% 87% 53% 0% 0% 6% 6% 0% 43% 90%( 54%) 0% 4% 41% 24% 16% 15% 33791 62%
100% 88% 66% 36% 78% 75% 71% 81% 87% 59% 0% 0% 6% 5% 0% 44% 96%( 53%) 0% 4% 47% 21% 16% 16% 41673 0%
ANY1+ ANY2+ ANY3+ ANY4+ AVG CPU0 CPU1 CPU2 CPU3 Network Protocol Cluster Storage Raid Target Kahuna WAFL_Ex(Kahu) WAFL_XClean SM_Exempt Cifs Exempt Intr Host Ops/s CP
100% 89% 68% 36% 79% 76% 73% 82% 83% 55% 0% 0% 6% 5% 0% 44% 101%( 55%) 0% 4% 45% 26% 15% 14% 39683 0%
98% 85% 60% 29% 75% 70% 68% 76% 83% 55% 0% 0% 6% 5% 0% 41% 87%( 54%) 3% 4% 43% 22% 15% 18% 35617 5%
100% 87% 64% 35% 77% 74% 69% 79% 88% 56% 0% 0% 6% 6% 0% 42% 92%( 55%) 2% 4% 44% 25% 16% 16% 39133 97%
99% 90% 72% 39% 80% 79% 76% 84% 81% 57% 0% 0% 6% 5% 0% 37% 110%( 61%) 0% 4% 44% 29% 14% 15% 41319 0%
99% 87% 64% 32% 76% 74% 71% 79% 82% 53% 0% 0% 6% 5% 0% 39% 99%( 58%) 0% 5% 41% 27% 15% 17% 33659 0%
99% 84% 57% 27% 73% 70% 64% 74% 84% 58% 0% 0% 6% 5% 0% 42% 80%( 53%) 0% 5% 44% 20% 16% 17% 35250 0%
99% 84% 57% 26% 72% 70% 65% 74% 79% 49% 0% 0% 7% 7% 0% 40% 87%( 56%) 4% 4% 39% 21% 15% 15% 31586 70%
99% 85% 62% 29% 74% 72% 68% 78% 80% 53% 0% 0% 6% 5% 0% 42% 92%( 55%) 0% 4% 43% 23% 15% 15% 35775 17%
Chad, it seems fine to me too. And I'm not sure an alert on ANY CPU usage is of any help. I haven't checked myself, but maybe there are different triggers available.
Agreed. Any CPU can cause concern and if all CPUs are at 1% it is pegged at that metric. Not my favorite counter.
Hello Craig,
I just stumbled across this post and it's still flagged unresolved.
We have the same situation - DFM generates alerts for cpu_busy near 100%.
What I learned is that cpu_busy measures the percentage of "CPU active time" over a timeframe, let's say 1 second.
Though each core may only be utilized 10%, if you have 10 cores and each core is active at a different time, you would get 100%.
So if you have 4 CPU cores and each core works for 0.1 s at a different time during that 1 s interval, then you'll have a cpu_busy counter of 40% (4 x 0.1 s).
For us, it looks like this (sysstat -m 1):
ANY AVG CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11
100% 15% 15% 14% 14% 14% 14% 14% 16% 15% 16% 16% 18% 16%
That just shows that there is very light load on the system. But because over that imaginary 1sec window there was always one (or more) cores working, the total CPU busy (or active) is 100%.
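To make the difference between the ANY and AVG columns concrete, here is a minimal Python sketch of the interpretation described above. It is only an illustration of that reading, not how Data ONTAP actually computes the counter; the function name busy_fractions and the interval representation are made up for this example.

```python
def busy_fractions(intervals_per_core, window=1.0):
    """Return (any_busy, avg_busy) as fractions of the sampling window.

    intervals_per_core holds one list per core; each list contains
    (start, end) tuples, in seconds, during which that core was busy.
    This models the explanation in the post, not the real counter.
    """
    # AVG: total busy time across all cores, divided by cores * window.
    total = sum(end - start for core in intervals_per_core for start, end in core)
    avg_busy = total / (len(intervals_per_core) * window)

    # ANY: length of the union of all busy intervals, i.e. the time
    # during which at least one core was doing something.
    covered = 0.0
    cur_start = cur_end = None
    for start, end in sorted(iv for core in intervals_per_core for iv in core):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / window, avg_busy


# Four cores, each busy for 0.1 s at a different time: ANY is about 40%, AVG about 10%.
staggered = [[(0.0, 0.1)], [(0.1, 0.2)], [(0.2, 0.3)], [(0.3, 0.4)]]
any_staggered, avg_staggered = busy_fractions(staggered)

# Same per-core load, but all four cores busy at the same moment:
# ANY drops to about 10% while AVG stays at about 10%.
aligned = [[(0.0, 0.1)] for _ in range(4)]
any_aligned, avg_aligned = busy_fractions(aligned)

print(f"staggered: ANY={any_staggered:.0%} AVG={avg_staggered:.0%}")
print(f"aligned:   ANY={any_aligned:.0%} AVG={avg_aligned:.0%}")
```

The per-core load is identical in both cases; only the timing differs, which is why ANY can sit at 100% on a lightly loaded system with many cores.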
I hope this helps anybody else stumbling over this post.
Greetings, Timo.
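As a rough cross-check, the same reading can be applied to the statit figures in the first post. The sketch below assumes that cpu_busy corresponds to the "seconds with one or more CPUs active" line and that the average per-core load is system time divided by (number of CPUs x elapsed time); both are assumptions based on this thread, not documented definitions.

```python
# Figures copied from the statit output in the first post (FAS3240, 4 cores).
elapsed_s = 14.141772        # "time (seconds)"
any_active_s = 13.862242     # "seconds with one or more CPUs active"
system_time_s = 28.341048    # "system time", summed over all cores
n_cpus = 4                   # cpu0..cpu3

# Assumed reading: cpu_busy ~ share of the interval with at least one core active.
any_busy = any_active_s / elapsed_s

# Assumed reading: average load ~ busy core-seconds over available core-seconds.
avg_busy = system_time_s / (n_cpus * elapsed_s)

print(f"ANY busy: {any_busy:.1%}")
print(f"AVG busy: {avg_busy:.1%}")
```

Under these assumptions the interval works out to roughly 98% "any CPU busy" against roughly 50% average load per core, which is in line with the sysstat 1 and sysstat -m samples posted alongside it.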
It looks to me like nwk_legacy is high. I don't know what it means, but I have ANY1+ stuck hard at 100% and nwk_legacy very high, along with piles of domain switches from storage to nwk_legacy.