Microsoft Virtualization Discussions

CDOT: CPU Stats with Powershell (invoke-ncsysstat, get-ncperfdata, etc.)

Magyk
9,652 Views

I'm trying to help my team do some semi-automated (scripted) performance reporting in our Netapp environment and I'm really struggling.

 

Let's say for example the CDOT Cluster is called "CDOT01" and the cluster nodes are called "CDOT01A", "CDOT01B", and so on.

 

So far I've been able to log into each Netapp CDOT Cluster and do this:

 

Putty into CDOT01 Cluster

> set diag

> node run -node CDOT01A sysstat -c 5 -M 1

(Copy/paste output into text file and do text to columns in Excel.)

 

This gives us a 5-second snapshot for whenever we run it but we need to run it against over 30 different nodes multiple times a day to capture trends.

 

I'm pretty good at building reports with Powershell so I've been trying to get different perf commands to run with the Powershell Toolkit. I’ve identified the following performance monitoring cmdlets:

 

Get-NcPerfData

Get-NcPerfInstance

Get-NcPerfObject

Get-NcPerfCounter

 

Invoke-NcSysstat (Node column reports Multiple_Values, CPU column always shows 0%, and I haven't been able to get a breakdown of the various CPU counters.)

Invoke-NcPerfstat (Error "Could not determine node mgmt IP")

 

I haven’t been able to get any of them to return useful data. The closest I got was doing this:

 

(get-ncperfdata -name system -instance cluster).counters

 

It returns numbers but I can’t tell what they are, they’re just raw numbers that don't seem to correlate to anything meaningful.

 

Going in manually to each controller through SSH and running sysstat isn’t really optimal. If I can get the ncperf cmdlets to return data from Powershell, we can start building reports. That would be the ideal outcome, I think. Even if we can only get point-in-time snapshots throughout the day, we can schedule scripts and at least start capturing trends.

 

We need a CPU usage breakdown for each controller. If someone can help me get useful data and counters, I can get it formatted into a report. It's just the cmdlets I'm struggling with. Any ideas?

1 ACCEPTED SOLUTION

asulliva
9,546 Views

Performance reporting with the API is a bit strange and takes some getting used to.  Performance reporting consists of three things:

  • The object to be monitored
  • The instance of the object
  • The counter for the instance of the object

To collect CPU information we need to find a few things...

 

# show processor related perf objects
Get-NcPerfObject | ?{ $_.Name -like "*processor*" }

# results: 
#  processor
#  processor:node
#
# "processor" is the individual CPU stats for each CPU on each node (many counters per node)
# "processor:node" is the summary CPU stats for each node (one counter per node)

# get counters associated with the object we're interested in
Get-NcPerfCounter -Name "processor" | %{ $_.Name }

# the results we care about:
#  processor_busy
#  processor_elapsed_time

# get the instances of the object we care about
Get-NcPerfInstance -Name "processor"

# sample output:
# Name        Uuid
# ----        ----
# processor0  VICE-01:kernel:processor0
# processor1  VICE-01:kernel:processor1
# processor2  VICE-01:kernel:processor2
# processor3  VICE-01:kernel:processor3

With CPU busy time, among others, there are two things we need to be aware of:

  1. It is measured by sampling the counter at two points in time and then subtracting the first value from the second
  2. It has a base counter (processor_elapsed_time) which it must be divided by

CPU busy time is the amount of time that the CPU was busy from time "A" until time "B".  This means we must collect processor_busy (I'll refer to it as "pb") and processor_elapsed_time ("pet") at two times (t1 and t2), subtract the values at t1 from the values at t2, then divide the busy time delta by the elapsed time delta.  It sounds complicated, but it's easier to see as a simple equation:

 

($t2_pb - $t1_pb) / ($t2_pet - $t1_pet)

Since it is a percentage, we would also multiply that value by 100 to get the readable percentage value.
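
As a minimal sketch of that arithmetic, the helper below applies the equation to two samples. The function name Get-PercentBusy and the counter values are made up for illustration (they are not part of the toolkit); real values would come from the Counters property returned by Get-NcPerfData.

```powershell
# Hypothetical helper (not a toolkit cmdlet): percent busy between two samples
function Get-PercentBusy {
    param(
        [double]$Busy1,      # processor_busy at t1
        [double]$Elapsed1,   # processor_elapsed_time at t1
        [double]$Busy2,      # processor_busy at t2
        [double]$Elapsed2    # processor_elapsed_time at t2
    )

    $elapsedDelta = $Elapsed2 - $Elapsed1

    # if the base counter did not advance, no percentage can be computed
    if ($elapsedDelta -le 0) { return $null }

    # busy delta divided by elapsed delta, scaled to a percentage
    [Math]::Round((($Busy2 - $Busy1) / $elapsedDelta) * 100, 2)
}

# made-up counter values, for illustration only
$result = Get-PercentBusy -Busy1 1000 -Elapsed1 50000 -Busy2 1350 -Elapsed2 55000
$result
```

With these made-up numbers the busy counter advanced by 350 while the elapsed counter advanced by 5000, so the result is 7 (percent). The guard on the elapsed delta is an addition of this sketch, to avoid a divide-by-zero if both samples happen to carry the same base value.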

 

There are some complicating factors when looking at detailed CPU stats, namely that the instances are by processor, not node.  We can limit by node using the "FilterData" parameter of Get-NcPerfData, but by default if you specify "processor0" as the instance it will return processor0 for each of the nodes...which doesn't give us a very good idea of the full picture of CPU utilization.
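
One way to sort the per-processor instances back into nodes is to split the instance UUID on ":". The sketch below assumes the UUIDs follow the "node:kernel:processorN" shape shown in the sample output earlier; the VICE-02 entries are invented for illustration, and no cluster connection is needed to run it.

```powershell
# instance UUIDs in the shape shown in the sample output (VICE-02 entries are illustrative)
$uuids = @(
    "VICE-01:kernel:processor0",
    "VICE-01:kernel:processor1",
    "VICE-02:kernel:processor0",
    "VICE-02:kernel:processor1"
)

# split each UUID into its node and processor parts
$parsed = $uuids | ForEach-Object {
    $parts = $_.Split(":")
    New-Object PSObject -Property @{
        Node      = $parts[0]
        Processor = $parts[2]
    }
}

# group by node to see which processor instances belong to which node
$byNode = $parsed | Group-Object -Property Node

$byNode | ForEach-Object {
    "{0}: {1}" -f $_.Name, (($_.Group | ForEach-Object { $_.Processor }) -join ", ")
}
```

The same split is what lets a script match "processor0" on one node against "processor0" on the same node in a later sample, rather than mixing nodes together.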

 

I have created an example function which you can use to view the CPU utilization breakdown for each of the nodes in a cluster:

 

function Get-DetailedCpuStats {
    <#
        .SYNOPSIS
        A function to collect and display the breakdown of individual CPU utilization
        for a node or nodes in a cluster.

        .DESCRIPTION
        This function will query the cluster performance API to determine the CPU
        busy percentage time for the interval specified (defaults to 5 seconds). This
        function requires a connection to a clustered Data ONTAP system by a user
        with the ability to use privilege level "admin".

        .PARAMETER Node
        A node name to limit the results.  Can only be a single node.  If not specified
        all nodes in the cluster will be reported.

        .PARAMETER Interval
        The number of milliseconds to wait between queries for CPU busy time.  Shorter
        intervals will cause more CPU utilization on the system as it self-reports.
        The default is 5000 milliseconds (5 seconds). Min = 1000, Max = 60000.

        .PARAMETER Iterations
        The number of iterations to display.  Execution time will be # Iterations *
        Interval Milliseconds.

        .EXAMPLE
        Get-DetailedCpuStats
        Report on all cluster node CPU stats for one five second interval.

        .EXAMPLE
        Get-DetailedCpuStats -Interval 10000 -Iterations 5
        Report on all cluster node CPU stats for five 10 second intervals

        .EXAMPLE
        Get-DetailedCpuStats -Interval 1000 -Iterations 60 -Node NODE-01
        Report CPU stats for NODE-01 every 1 second for 60 iterations

    #>
    param(
        # filter to a single node
        [parameter(mandatory=$false)]
        [string]$Node
        ,

        # wait time between checks, in milliseconds
        [parameter(mandatory=$false)]
        [ValidateRange(1000,60000)]
        [int]$Interval = 5000
        ,

        # number of iterations to show
        [parameter(mandatory=$false)]
        [int]$Iterations = 1

    )
    process{
        # get all instances of the processors
        $instances = (Get-NcPerfInstance -Name processor | Group-Object -Property Name | %{ $_.Name }) -join ","

        # collect the initial value of the counters
        if ($node -ne "" -and $node -ne $null) {
            $one = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time -FilterData "node_name=$($node)"
        } else {
            $one = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time
        }

        # create the template object
        $processor = New-Object -TypeName PSObject
        $processor | Add-Member -MemberType NoteProperty -Name Node -Value $null

        # create a property for each processor
        $one | %{ ($_.Uuid).Split(":")[2] } | Sort-Object | Get-Unique | %{ $processor | Add-Member -MemberType NoteProperty -Name $_ -Value $null }

        # we will want to do at least one iteration
        $continue = $true

        # iteration count
        $i = 0

        while ($continue -eq $true) {
            # values for this iteration
            $iteration = New-Object System.Collections.ArrayList

            # wait some period of time
            Start-Sleep -Milliseconds $interval

            # collect the second set
            if ($node -ne "" -and $node -ne $null) {
                $two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time -FilterData "node_name=$($node)"
            } else {
                $two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time
            }

            #$two = Get-NcPerfData -Name processor -Instance $instances -counter processor_busy,processor_elapsed_time

            # loop once to get all the nodes
            $nodes = $one | %{ ($_.Uuid).Split(":")[0] } | Sort-Object | Get-Unique

            foreach ($nodeName in $nodes) {
                $thisProcessor = $processor.PSObject.Copy()
                $thisProcessor.Node = $nodeName
        
                # for the processors for this node
                $one | ?{ ($_.Uuid).Split(":")[0] -eq $nodeName } | Sort-Object -Property Uuid | %{
                    $thisUuid = $_.Uuid

                    #$pb1 = (($one | ?{ $_.Uuid -eq $_ }).Counters | ?{ $_.Name -eq "processor_busy" }).value
                    #$pet1 = (($one | ?{ $_.Uuid -eq $_ }).Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value

                    # get the counters from the first check
                    $pb1 = ($_.Counters | ?{ $_.Name -eq "processor_busy" }).value
                    $pet1 = ($_.Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value
            
                    # get the counters from the second check
                    $pb2 = (($two | ?{ $_.Uuid -eq $thisUuid }).Counters | ?{ $_.Name -eq "processor_busy" }).value
                    $pet2 = (($two | ?{ $_.Uuid -eq $thisUuid }).Counters | ?{ $_.Name -eq "processor_elapsed_time" }).value
        
                    # calculate and store
                    #Write-Host "5 Second Average % Busy for $($thisUuid): $([Math]::Round((($pb2 - $pb1) / ($pet2 - $pet1)) * 100, 2))%"

                    $procId = $thisUuid.Split(":")[2]
                    $percentBusy = [Math]::Round((($pb2 - $pb1) / ($pet2 - $pet1)) * 100)

                    $thisProcessor.($procId) = $percentBusy
                }
        
                # store the value
                $iteration.Add($thisProcessor.PSObject.Copy()) | Out-Null
            }

            # display the result
            $iteration | Format-Table

            $i++

            # stop looping if max iterations has been met
            if ($i -ge $iterations) {
                $continue = $false
            }

            # set the last iteration to the input values for the next
            $one = $two
        }
    }
}

Some example output:

 

 

PS C:\Users\Andrew> Get-DetailedCpuStats

Node    processor0 processor1 processor2 processor3 processor4 processor5 processor6 processor7
----    ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
VICE-01          7          6          8          6                                            
VICE-02          3          2          2          5                                            
VICE-03          3          2          1          4                                            
VICE-04          5          8          4          9                                            
VICE-05          4          3          4          4                                            
VICE-06          3          1          1          3                                            
VICE-07          2          1          1          0          1          1          0          3
VICE-08          2          1          1          0          0          1          0          3
PS C:\Users\Andrew> Get-DetailedCpuStats -Node VICE-01 -Interval 30000

Node    processor0 processor1 processor2 processor3
----    ---------- ---------- ---------- ----------
VICE-01         16         15         14         18
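
For scheduled trend capture, table output is awkward to store, so a script may want timestamped CSV rows instead. The following is only a sketch under assumptions: $rows stands in for per-node result objects shaped like the table above (the function as posted formats its output for display rather than returning objects), and the two-processor sample values are copied from the first two rows of the example output.

```powershell
# stand-in rows shaped like the per-node results (assumed shape, sample values)
$rows = @(
    New-Object PSObject -Property @{ Node = "VICE-01"; processor0 = 7; processor1 = 6 }
    New-Object PSObject -Property @{ Node = "VICE-02"; processor0 = 3; processor1 = 2 }
)

# one timestamp per collection run
$timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"

# flatten to one row per node per processor, so the columns stay the same
# even when nodes have different processor counts
$flat = foreach ($row in $rows) {
    $row.PSObject.Properties | Where-Object { $_.Name -ne "Node" } | ForEach-Object {
        New-Object PSObject -Property @{
            Timestamp   = $timestamp
            Node        = $row.Node
            Processor   = $_.Name
            PercentBusy = $_.Value
        }
    }
}

# emit CSV text with a fixed column order
$csv = $flat | Select-Object Timestamp, Node, Processor, PercentBusy | ConvertTo-Csv -NoTypeInformation
$csv
```

To build a history file across runs, the last pipeline stage could be swapped for Export-Csv with -NoTypeInformation and -Append (available in PowerShell 3.0 and later), pointed at a path of your choosing.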

 

Hope that helps, let me know if you have any other questions/issues.

 

Andrew

If this post resolved your issue, please help others by selecting ACCEPT AS SOLUTION or adding a KUDO.


4 REPLIES 4

jpulk
9,615 Views

Can you give more details about your environment? Specifically what version of CDOT, PowerShell, and the PSToolkit?

Magyk
9,603 Views

CDOT 8.3, Powershell 3, Powershell Toolkit 4.0.0

SeanHatfield
9,559 Views

Use -node to get results for an individual node.  CPU still comes back 0, but CPUBUSY may be useful.

If this post resolved your issue, help others by selecting ACCEPT AS SOLUTION or adding a KUDO.
