Microsoft Virtualization Discussions
I'm trying to help my team do some semi-automated (scripted) performance reporting in our NetApp environment and I'm really struggling.
Let's say for example the CDOT Cluster is called "CDOT01" and the cluster nodes are called "CDOT01A", "CDOT01B", and so on.
So far I've been able to log into each NetApp CDOT cluster and do this:
Putty into CDOT01 Cluster
> set diag
> node run -node CDOT01A sysstat -c 5 -M 1
(Copy/paste output into text file and do text to columns in Excel.)
This gives us a 5-second snapshot for whenever we run it but we need to run it against over 30 different nodes multiple times a day to capture trends.
I'm pretty good at building reports with PowerShell, so I've been trying to get different perf commands to run with the PowerShell Toolkit. I’ve identified the following performance monitoring cmdlets:
Get-NcPerfData
Get-NcPerfInstance
Get-NcPerfObject
Get-NcPerfCounter
Invoke-NcSysstat (Node column reports Multiple_Values, CPU column always shows 0%, and I haven't been able to get a breakdown of the various CPU counters.)
Invoke-NcPerfstat (Error "Could not determine node mgmt IP")
I haven’t been able to get any of them to return useful data. The closest I got was doing this:
(get-ncperfdata -name system -instance cluster).counters
It returns numbers but I can’t tell what they are, they’re just raw numbers that don't seem to correlate to anything meaningful.
Going in manually to each controller through SSH and running sysstat isn’t really optimal. If I can get the ncperf commands to return data from PowerShell, we can start building reports. That would be the ideal outcome, I think. Even if we can just get point-in-time snapshots throughout the day, we can schedule scripts and at least start capturing trends.
We need a CPU usage breakdown for each controller. If someone can help me get useful data and counters, I can get it formatted into a report. It's just the cmdlets I'm struggling with. Any ideas?
Performance reporting with the API is a bit strange and takes some getting used to. Performance reporting consists of three things: objects, the instances of each object, and the counters that each instance exposes.
To collect CPU information we need to find a few things...
# show processor related perf objects
Get-NcPerfObject | ?{ $_.Name -like "*processor*" }

# results:
#   processor
#   processor:node
#
# "processor" is the individual CPU stats for each CPU on each node (many counters per node)
# "processor:node" is the summary CPU stats for each node (one counter per node)

# get counters associated with the object we're interested in
Get-NcPerfCounter -Name "processor" | %{ $_.Name }

# the results we care about:
#   processor_busy
#   processor_elapsed_time

# get the instances of the object we care about
Get-NcPerfInstance -Name "processor"

# sample output:
# Name        Uuid
# ----        ----
# processor0  VICE-01:kernel:processor0
# processor1  VICE-01:kernel:processor1
# processor2  VICE-01:kernel:processor2
# processor3  VICE-01:kernel:processor3
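Each result from Get-NcPerfData carries a Counters collection of name/value pairs, which is what the function further down reads. A minimal sketch of indexing that collection by counter name so individual values are easier to pick out. The sample object is hand-built mock data with invented values, and ConvertTo-CounterTable is a hypothetical helper name, not part of the toolkit:

```powershell
# Hypothetical helper: index a Counters collection by counter name.
function ConvertTo-CounterTable {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$Counters
    )
    $table = @{}
    foreach ($counter in $Counters) {
        # Assumption: values arrive as numbers or numeric strings; cast to [double].
        $table[$counter.Name] = [double]$counter.Value
    }
    return $table
}

# Mock data shaped like one Get-NcPerfData result (values are invented).
$sample = New-Object PSObject -Property @{
    Uuid     = "VICE-01:kernel:processor0"
    Counters = @(
        (New-Object PSObject -Property @{ Name = "processor_busy";         Value = "1200000" }),
        (New-Object PSObject -Property @{ Name = "processor_elapsed_time"; Value = "9000000" })
    )
}

$lookup = ConvertTo-CounterTable -Counters $sample.Counters
$lookup["processor_busy"]
```

With real data you would pass the Counters property of one returned instance instead of the mock object.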
With CPU busy time, as with a number of other counters, there are two things we need to be aware of: the raw values only mean something as the difference between two samples, and busy time has to be divided by elapsed time.
CPU busy time is the amount of time that the CPU was busy from time "A" until time "B". This means we must collect processor_busy (I'll refer to it as "pb") and processor_elapsed_time ("pet") at two times (t1 and t2), subtract the values at t1 from the values at t2, then divide busy time by elapsed time. It sounds complicated, but it's easier to see as a simple equation:
($t2_pb - $t1_pb) / ($t2_pet - $t1_pet)
Since it is a percentage, we also multiply that value by 100 to get a readable percentage value.
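To make the arithmetic concrete, here is a small sketch of the calculation above as a standalone function. Get-BusyPercent is a hypothetical name and the counter values are invented for illustration; the guard for a zero or negative elapsed delta is my own addition, on the assumption that two samples could be taken too close together or span a counter reset:

```powershell
# Hypothetical helper: percent busy between two samples of the same processor.
function Get-BusyPercent {
    param(
        [double]$Busy1,
        [double]$Elapsed1,
        [double]$Busy2,
        [double]$Elapsed2
    )
    $elapsedDelta = $Elapsed2 - $Elapsed1
    if ($elapsedDelta -le 0) {
        # No time passed (or the counter reset), so there is no meaningful ratio.
        return $null
    }
    return [Math]::Round((($Busy2 - $Busy1) / $elapsedDelta) * 100, 2)
}

# Invented samples: busy grew by 350,000 while elapsed grew by 5,000,000.
$percent = Get-BusyPercent -Busy1 1200000 -Elapsed1 9000000 -Busy2 1550000 -Elapsed2 14000000
$percent   # 7
```

The units of the two counters do not matter for this calculation as long as both use the same unit, since they cancel out in the division.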
There are some complicating factors when looking at detailed CPU stats, namely that the instances are by processor, not by node. We can limit by node using the "FilterData" parameter of Get-NcPerfData, but by default, if you specify "processor0" as the instance, it will return processor0 for each of the nodes, which doesn't give us a very good picture of overall CPU utilization.
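Because the node name is embedded in each instance Uuid (the node:kernel:processorN form in the sample output above), results can also be regrouped per node on the client side. A rough sketch using only string handling; the Uuid list is mock data, and I am assuming the three-part colon-separated layout holds for every instance:

```powershell
# Mock instance UUIDs in the node:kernel:processorN form (invented list).
$uuids = @(
    "VICE-01:kernel:processor0",
    "VICE-01:kernel:processor1",
    "VICE-02:kernel:processor0",
    "VICE-02:kernel:processor1"
)

# Split each UUID into its node and processor parts.
$parsed = $uuids | ForEach-Object {
    $parts = $_.Split(":")
    New-Object PSObject -Property @{
        Node      = $parts[0]
        Processor = $parts[2]
    }
}

# Group by node so each node's processors can be reported together.
$byNode = $parsed | Group-Object -Property Node
$byNode | ForEach-Object {
    "{0}: {1}" -f $_.Name, (($_.Group | ForEach-Object { $_.Processor }) -join ", ")
}
```

This is the same split-on-colon approach the function below uses to build one row per node.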
I have created an example function which you can use to view the CPU utilization breakdown for each of the nodes in a cluster:
function Get-DetailedCpuStats {
    <#
        .SYNOPSIS
        A function to collect and display the breakdown of individual CPU
        utilization for a node or nodes in a cluster.

        .DESCRIPTION
        This function will query the cluster performance API to determine the CPU
        busy percentage time for the interval specified (defaults to 5 seconds).

        This function requires a connection to a clustered Data ONTAP system by a
        user with the ability to use privilege level "admin".

        .PARAMETER Node
        A node name to limit the results. Can only be a single node. If not
        specified all nodes in the cluster will be reported.

        .PARAMETER Interval
        The number of milliseconds to wait between queries for CPU busy time.
        Shorter intervals will cause more CPU utilization on the system as it
        self-reports. The default is 5000 milliseconds (5 seconds).
        Min = 1000, Max = 60000.

        .PARAMETER Iterations
        The number of iterations to display. Execution time will be
        # Iterations * Interval Milliseconds.

        .EXAMPLE
        Get-DetailedCpuStats

        Report on all cluster node CPU stats for one five second interval.

        .EXAMPLE
        Get-DetailedCpuStats -Interval 10000 -Iterations 5

        Report on all cluster node CPU stats for five 10 second intervals

        .EXAMPLE
        Get-DetailedCpuStats -Interval 1000 -Iterations 60 -Node NODE-01

        Report CPU stats for NODE-01 every 1 second for 60 iterations
    #>
    param(
        # filter to a single node
        [parameter(mandatory=$false)]
        [string]$Node
        ,
        # wait time between checks, in milliseconds
        [parameter(mandatory=$false)]
        [ValidateRange(1000,60000)]
        [int]$Interval = 5000
        ,
        # number of iterations to show
        [parameter(mandatory=$false)]
        [int]$Iterations = 1
    )
    process {
        # get all instances of the processors
        $instances = (Get-NcPerfInstance -Name processor | Group-Object -Property Name | %{ $_.Name }) -join ","

        # collect the initial value of the counters
        if ($node -ne "" -and $node -ne $null) {
            $one = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time -FilterData "node_name=$($node)"
        } else {
            $one = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time
        }

        # create the template object
        $processor = New-Object -TypeName PSObject
        $processor | Add-Member -MemberType NoteProperty -Name Node -Value $null

        # create a property for each processor
        $one | %{ ($_.Uuid).Split(":")[2] } | Sort-Object | Get-Unique | %{
            $processor | Add-Member -MemberType NoteProperty -Name $_ -Value $null
        }

        # we will want to do at least one iteration
        $continue = $true

        # iteration count
        $i = 0

        while ($continue -eq $true) {
            # values for this iteration
            $iteration = New-Object System.Collections.ArrayList

            # wait some period of time
            Start-Sleep -Milliseconds $interval

            # collect the second set
            if ($node -ne "" -and $node -ne $null) {
                $two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time -FilterData "node_name=$($node)"
            } else {
                $two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time
            }

            #$two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time

            # loop once to get all the nodes
            $nodes = $one | %{ ($_.Uuid).Split(":")[0] } | Sort-Object | Get-Unique

            foreach ($nodeName in $nodes) {
                $thisProcessor = $processor.PSObject.Copy()
                $thisProcessor.Node = $nodeName

                # for the processors for this node
                $one | ?{ ($_.Uuid).Split(":")[0] -eq $nodeName } | Sort-Object -Property Uuid | %{
                    $thisUuid = $_.Uuid

                    #$pb1 = (($one | ?{ $_.Uuid -eq $_ }).Counters | ?{ $_.Name -eq "processor_busy" }).value
                    #$pet1 = (($one | ?{ $_.Uuid -eq $_ }).Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value

                    # get the counters from the first check
                    $pb1 = ($_.Counters | ?{ $_.Name -eq "processor_busy" }).value
                    $pet1 = ($_.Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value

                    # get the counters from the second check
                    $pb2 = (($two | ?{ $_.Uuid -eq $thisUuid }).Counters | ?{ $_.Name -eq "processor_busy" }).value
                    $pet2 = (($two | ?{ $_.Uuid -eq $thisUuid }).Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value

                    # calculate and store
                    #Write-Host "5 Second Average % Busy for $($thisUuid): $([Math]::Round((($pb2 - $pb1) / ($pet2 - $pet1)) * 100, 2))%"
                    $procId = $thisUuid.Split(":")[2]
                    $percentBusy = [Math]::Round((($pb2 - $pb1) / ($pet2 - $pet1)) * 100)
                    $thisProcessor.($procId) = $percentBusy
                }

                # store the value
                $iteration.Add($thisProcessor.PSObject.Copy()) | Out-Null
            }

            # display the result
            $iteration | Format-Table

            $i++

            # stop looping if max iterations has been met
            if ($i -ge $iterations) {
                $continue = $false
            }

            # set the last iteration to the input values for the next
            $one = $two
        }
    }
}
Some example output:
PS C:\Users\Andrew> Get-DetailedCpuStats

Node    processor0 processor1 processor2 processor3 processor4 processor5 processor6 processor7
----    ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
VICE-01          7          6          8          6
VICE-02          3          2          2          5
VICE-03          3          2          1          4
VICE-04          5          8          4          9
VICE-05          4          3          4          4
VICE-06          3          1          1          3
VICE-07          2          1          1          0          1          1          0          3
VICE-08          2          1          1          0          0          1          0          3

PS C:\Users\Andrew> Get-DetailedCpuStats -Node VICE-01 -Interval 30000

Node    processor0 processor1 processor2 processor3
----    ---------- ---------- ---------- ----------
VICE-01         16         15         14         18
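For the scheduled point-in-time snapshots mentioned in the question, one option is to flatten each per-node row into a timestamped record and append it to a CSV file. This is only a sketch: it assumes the function is changed to emit its row objects instead of piping them to Format-Table, ConvertTo-TrendRecord is a hypothetical helper name, the row below is mock data using the VICE-01 numbers from the first example output, and the file path is a placeholder:

```powershell
# Hypothetical helper: flatten one per-node row into a timestamped record.
function ConvertTo-TrendRecord {
    param(
        [Parameter(Mandatory = $true)]
        [psobject]$Row
    )
    # Collect the non-empty processorN values from the row.
    $values = @($Row.PSObject.Properties |
        Where-Object { $_.Name -like "processor*" -and $_.Value -ne $null } |
        ForEach-Object { [double]$_.Value })

    if ($values.Count -eq 0) {
        # Nothing to average for this node, so emit no record.
        return
    }

    New-Object PSObject -Property @{
        Timestamp  = (Get-Date).ToString("s")
        Node       = $Row.Node
        Processors = $values.Count
        AvgBusy    = [Math]::Round(($values | Measure-Object -Average).Average, 1)
    }
}

# Mock row: a node with four processors; processor4 is empty, as on the smaller nodes.
$row = New-Object PSObject -Property @{
    Node = "VICE-01"; processor0 = 7; processor1 = 6; processor2 = 8; processor3 = 6
    processor4 = $null
}
$record = ConvertTo-TrendRecord -Row $row

# Append to a CSV file (placeholder path). Export-Csv -Append needs PowerShell 3.0 or later.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) "cpu-trend.csv"
$record | Select-Object Timestamp, Node, Processors, AvgBusy |
    Export-Csv -Path $csvPath -Append -NoTypeInformation
```

Averaging across processors loses the per-CPU detail, so keep the individual columns instead if the breakdown matters for your report.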
Hope that helps, let me know if you have any other questions/issues.
Andrew
Can you give more details about your environment? Specifically what version of CDOT, PowerShell, and the PSToolkit?
CDOT 8.3, Powershell 3, Powershell Toolkit 4.0.0
Use -node to get results for an individual node. CPU still comes back 0, but CPUBUSY may be useful.