Get Disk busy-ness and disk latency counter metrics using SDK API

senorcalimero

Hi all,

 

I'm developing a Nagios plugin to get some performance metrics from a NetApp device using the SDK API. I would like to get the disk busy and disk latency values for each disk, but I need some help interpreting the counter values.

 

Does anybody know the arithmetic needed to derive the following metrics from the counters returned by the API?

 

NetApp-Controller01> stats show disk

...

disk:50000C90:00...000000:user_read_latency:0us

disk:50000C90:00...000000:user_write_latency:150.54us

disk:50000C90:00...000000::disk_busy:2%

 

Listing the counters with the API, I can see the following ones, which I suppose are related to the values shown on the CLI. Their values and descriptions are:

 

- For Read Latency:

 

Counter Name = user_read_latency  Counter Value = XXXX

Counter Name = user_read_latency       
Base Counter = user_read_blocks    
Privilege_level = basic    
Unit = microsec    

 

Counter Name = user_read_blocks  Counter Value = XXXX

Counter Name = user_read_blocks       
Base Counter = none    
Privilege_level = basic    
Unit = per_sec    

 

 

- For Write Latency:

 

Counter Name = user_write_latency  Counter Value = XXXX

Counter Name = user_write_latency       
Base Counter = user_write_blocks    
Privilege_level = basic    
Unit = microsec    

 

Counter Name = user_write_blocks  Counter Value = XXXX

Counter Name = user_write_blocks       
Base Counter = none    
Privilege_level = basic    
Unit = per_sec    

 

 

- For Disk Busy:

 

Counter Name = disk_busy  Counter Value = XXXX

Counter Name = disk_busy       
Base Counter = base_for_disk_busy    
Privilege_level = basic    
Unit = percent    

 

Counter Name = base_for_disk_busy  Counter Value = XXXX

Counter Name = base_for_disk_busy       
Base Counter = none    
Privilege_level = basic    
Unit = none    

 

Counter Name = io_pending  Counter Value = XXXX

Counter Name = io_pending       
Base Counter = base_for_disk_busy    
Privilege_level = diag    
Unit = none     

 

Counter Name = io_queued  Counter Value = XXXX

Counter Name = io_queued       
Base Counter = base_for_disk_busy    
Privilege_level = diag    
Unit = none    

 

Any help would be much appreciated!

 

Kind Regards!

EMICHAELSALMON

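The counters you get from the API are not the finished values that their descriptions suggest. They behave like SNMP counters: they increase continuously and wrap around modulo 2^32 or 2^64. To get a usable value, fetch the counters, wait for the interval you are interested in, fetch the counters again and calculate the difference. As long as a counter has wrapped at most once during the interval and the subtraction is done modulo the counter width, the difference is correct even across a wrap-around.

Once you have a delta, divide it by the delta of the corresponding base counter, so the value for io_queued is (io_queued(2) - io_queued(1)) / (base_for_disk_busy(2) - base_for_disk_busy(1)). The same pattern applies to the other pairs in your listing: user_read_latency with user_read_blocks, user_write_latency with user_write_blocks, and disk_busy with base_for_disk_busy. For the ops counters, divide the delta by the elapsed time instead. If you are doing continuous measurements, you can simply save the current values and use them as the old values for the next run.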
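The arithmetic above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names are made up for this example, the counter width (32 or 64 bits) has to be known by the caller, and the `scale` parameter is a guess at how a percent counter such as disk_busy might be turned into a percentage, which this thread does not confirm.

```python
def wrapped_delta(new, old, bits=64):
    """Difference between two readings of a counter that wraps modulo 2**bits.

    Correct only if the counter wrapped at most once between the readings.
    """
    return (new - old) % (1 << bits)


def ratio_of_deltas(counter_old, counter_new, base_old, base_new,
                    bits=64, scale=1.0):
    """Delta of a counter divided by the delta of its base counter.

    scale is a hypothetical knob: the assumption is that a percent counter
    needs scale=100.0; the thread itself does not state this.
    """
    base_delta = wrapped_delta(base_new, base_old, bits)
    if base_delta == 0:
        return 0.0  # base did not move, so nothing to average over
    return scale * wrapped_delta(counter_new, counter_old, bits) / base_delta


def rate_per_second(counter_old, counter_new, elapsed_seconds, bits=64):
    """Delta of a counter divided by the elapsed time between the readings."""
    return wrapped_delta(counter_new, counter_old, bits) / elapsed_seconds
```

For example, `ratio_of_deltas(lat1, lat2, blocks1, blocks2)` would give an average latency over the interval if `lat1`/`lat2` are two readings of user_read_latency and `blocks1`/`blocks2` the matching readings of user_read_blocks.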

I am pretty sure that this is how the stats command works as well: it copies the counters, waits for a second, copies the counters again and calculates the deltas.
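A sample-wait-sample loop of that kind might look like the sketch below. The `fetch` argument is a hypothetical callable standing in for whatever SDK call returns the raw counters as a name-to-integer mapping; it is not a real SDK function, and the 64-bit default width is an assumption.

```python
import time


def sample_ratio(fetch, counter, base, interval=1.0, bits=64, sleep=time.sleep):
    """Take two samples `interval` seconds apart and return delta(counter) / delta(base).

    fetch: hypothetical callable returning a dict of raw counter values.
    """
    first = fetch()
    sleep(interval)
    second = fetch()

    modulus = 1 << bits  # counters wrap modulo 2**bits
    base_delta = (second[base] - first[base]) % modulus
    if base_delta == 0:
        return 0.0  # base counter did not advance during the interval
    return ((second[counter] - first[counter]) % modulus) / base_delta
```

In a Nagios plugin that runs once per check, the second sample of one run could be stored on disk and reused as the first sample of the next run instead of sleeping.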