Get CPU usage and CPU ready values for all VMs in a cluster

Standard

When working with VMWare it is crucial to monitor the performance of your HA clusters and VMs. It is important to be able to get performance data so you can make sure your VMs have the resources they need and to know if they are sized correctly. To determine this I often use the CPU usage and CPU ready performance counters.

– What are the cpu usage/ready perf. counters and what do they show us?

The CPU usage value is sort of like the value you see in the task manager in windows, the main difference being that this value is not measured by the VM’s OS but by the ESXi host it is running on. As in the task manager this value indicates how much CPU is being used by your VM.
The CPU ready value is the amount of time your VM’s CPU has been “ready”, meaning how much time it has been waiting, doing nothing, for CPU cycles to be assigned to it by the host. If this value is high it is often an indicator of cluster health problems and/or bad VM sizing. I urge you to search for information about these counters and learn about them as they are highly useful when managing a VMWare environment. You can read more about CPU ready time here:
http://blogs.totalcaos.com/understanding-rdy-cpu-ready/

– Why should I use the script found in this blog post?

Gathering data like the performance counters described above can be done with Vcenter Operations Manager or with Veeam if you have this. However these applications are complicated to use and configure (and cost a lot of money). The script I have written below will give you a simple list/csv of the VMs in a given cluster with their CPU usage and CPU ready values in percent. You can then review your VMs and see check for any high/problematic values you find and improve the efficiency of your infrastructure. Any VMs with an average cpu usage over 70 % should probably be assigned more resources. If there are any VMs with ready times of 10% or more it probably means that your cluster/host is overcommitted or that the VM is oversized.

– The script itself and how I wrote it

To simplify the structure of the script I have divided it into three sections variables, functions and script main.

Variables

#VARIABLES
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$CSV = "$dir\Average_CPU_Usage_Peak_Hours.csv"
#Parameters
$vcenter = "Some_Vcenter_server" #Name of your vcenter server
$ClusterName = "Some_cluster" #CPU usage and CPU ready values will be collected for all VMs in this cluster.
$DaysBack = 14 #Number of days back to collect performance counters.
$PeakTimeStart = 8 #hour of the day in 24 hour format 
$PeakTimeEnd = 20 #hour of the day in 24 hour format
$rdy_interval = 7200 #interval of rdy time values aggregation/averaging in seconds. This value should be changed according to your vcenter statistics settings.

Before running the script you will need to change the values in the #parameters section according to your needs/environment. $vcenter should be the name of your vcenter server and $ClusterName the name of the cluster you wish to collect VM performance data from. $DaysBack is the amount of time you wish to collect performance from. E.g if it is set to 14 days you will collect performance data going back 14 days to the time you are running the script. Make sure you set this value to a valid number of days (your vcenter DB must retain the info for the amount of time specified).

I have found that performance data is more useful when it is gathered in business hours so the results show values reflecting a busy system. To gather performance counters from the relevant times of day you must change the values for $peakTimeStart and $peakTimeEnd to suit your needs.

The last parameter is the $rdy_interval. When gathering the cpu ready time performance counters the values are returned as a number of milliseconds. To convert this number correctly to a percent value it is necessary to take into account the aggregation interval (the time between the data points stored in the vcenter database). On my vcenter installation the ready time values are aggregated/averaged every two hours and therefore I have set the $rdy_interval to 7200 seconds. You can read more about this here:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2002181

Now to the functions.

Functions

1. Connect-Vcenter

function Connect-Vcenter {
	param(
		$vcenter
	)
	#Load snap-in
	if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core)) {
		Add-PSSnapin VMware.VimAutomation.Core
	}
	#Connect to vCenter
	if ($global:DefaultVIServers.Count -lt 1) {
		Connect-VIServer $vCenter
	}
}

Connect-Vcenter is a small function that simply adds the snap-in for VMware cmdlets and connects to a vcenter server making the powershell instance ready to fire commands.
2. Get-VMCPUAverage

function Get-VMCPUAverage {
    param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
    
	$stats = get-vm $VM | get-stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat "cpu.usage.average" | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd }
	$aggr_stats = $stats | Measure-Object -Property Value -Average
	$avg = $aggr_stats.Average
	return $avg
}

This function uses the vmware cmdlet Get-Stat to collect the performance counter cpu.usage.average in the specified timeframe. The average value of all the collected data points are then returned.
3. Get-VMCpuRDY

function Get-VMCpuRDY {
	param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
	
	$stats = get-vm $VM | Get-Stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat Cpu.Ready.Summation | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd -and $_.Instance -eq ""}	
	$aggr_stats = $stats | Measure-Object -Property Value -Average	
	$rdy = [Math]::Round(((($aggr_stats.Average)/1000)/$rdy_interval) * 100,1)
	return $rdy
}

Get-VMCpuRDY collects the average values of the performance counter Cpu.Ready.Summation in the specified timeframe. As explained earlier in the post this counter returns a summation of the milliseconds in which the CPU was waiting for resources. In order to convert this into a percentage value I first divide the amount of milliseconds by 1000 to convert to seconds. I divide this amount of seconds by the $rdy_interval, which is the amount of seconds between each data point, and then multiply by 100. I then use [Math]::Round to round off the value to one decimal which is the number returned by the function.

Script Main

#SCRIPT MAIN
clear
#Load the VMWare module and connect to vCenter
Connect-Vcenter -vcenter $vcenter
$AvgCPUValues = @() #Create array to hold the CPU usage and CPU ready values
Get-Cluster $ClusterName | Get-VM | ? {$_.PowerState -eq "PoweredOn"} | % {	#loop through all powered on VMs in the cluster
	$AvgCPUValue = "" | Select "VM","CpuAvg","CpuRdy" #create a custom object with these properties.
	$AvgCPUValue.VM = $_.Name
	$AvgCPUValue.CpuAvg = "{0:N2}" -f $(Get-VMCPUAverage -VM $_) #Get VM CPU usage and round to two decimals
	$AvgCPUValue.CpuRdy = Get-VMCpuRDY -VM $_
	$AvgCPUValues += $AvgCPUValue
}
$AvgCPUValues | Export-Csv $CSV -NoTypeInformation -Force

In the script main I first connect to the vcenter server using my function Connect-Vcenter. I then create an array $AvgCPUValues. The next step is to loop through all the VMs in the cluster using Get-Cluster $ClusterName | Get-VM. In the loop I create a custom object for each VM with the properties VM, CPUAvg and CPURdy. I use the functions Get-VMCPUAverage and Get-VMCpuRDY to get cpu usage and cpu ready values and then assign these to the corresponding properties on the custom object. The custom object is then added to the array $AvgCPUValues which I created in the beginning of the script main.
When the loop has completed I then export this array to CSV in the directory the script was run.

– That’s it! You’re ready to go!

I really hope you find this script useful. Remember to fill out the #parameters section before you run the script.
I have copied in the full script below.

#############################################################################################################
##script:			Get-VMCPUAverage.ps1
##
##Description:		Gets "CPU usage" and "cpu ready" for all VMs in a given cluster and exports the results
#+					to a CSV file in the script directory.
##Created by:		Noam Wajnman
##Creation Date:	March 11, 2014
##Updated on:		May 20, 2014
##############################################################################################################
#VARIABLES
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$CSV = "$dir\Average_CPU_Usage_Peak_Hours.csv"
#Parameters
$vcenter = "Some_Vcenter_server" #Name of your vcenter server
$ClusterName = "Some_cluster" #CPU usage and CPU ready values will be collected for all VMs in this cluster.
$DaysBack = 14 #Number of days back to collect performance counters.
$PeakTimeStart = 8 #hour of the day in 24 hour format 
$PeakTimeEnd = 20 #hour of the day in 24 hour format
$rdy_interval = 7200 #interval of rdy time values aggregation/averaging in seconds. This value should be changed according to your vcenter statistics settings.
#FUNCTIONS
function Connect-Vcenter {
	param(
		$vcenter
	)
	#Load snap-in
	if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core)) {
		Add-PSSnapin VMware.VimAutomation.Core
	}
	#Connect to vCenter
	if ($global:DefaultVIServers.Count -lt 1) {
		Connect-VIServer $vCenter
	}
}
function Get-VMCPUAverage {
    param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
    
	$stats = get-vm $VM | get-stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat "cpu.usage.average" | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd }
	$aggr_stats = $stats | Measure-Object -Property Value -Average
	$avg = $aggr_stats.Average
	return $avg
}
function Get-VMCpuRDY {
	param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
	
	$stats = get-vm $VM | Get-Stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat Cpu.Ready.Summation | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd -and $_.Instance -eq ""}	
	$aggr_stats = $stats | Measure-Object -Property Value -Average	
	$rdy = [Math]::Round(((($aggr_stats.Average)/1000)/$rdy_interval) * 100,1)
	return $rdy
}
#SCRIPT MAIN
clear
#Load the VMWare module and connect to vCenter
Connect-Vcenter -vcenter $vcenter
$AvgCPUValues = @() #Create array to hold the CPU usage and CPU ready values
Get-Cluster $ClusterName | Get-VM | ? {$_.PowerState -eq "PoweredOn"} | % {	#loop through all powered on VMs in the cluster
	$AvgCPUValue = "" | Select "VM","CpuAvg","CpuRdy" #create a custom object with these properties.
	$AvgCPUValue.VM = $_.Name
	$AvgCPUValue.CpuAvg = "{0:N2}" -f $(Get-VMCPUAverage -VM $_) #Get VM CPU usage and round to two decimals
	$AvgCPUValue.CpuRdy = Get-VMCpuRDY -VM $_
	$AvgCPUValues += $AvgCPUValue
}
$AvgCPUValues | Export-Csv $CSV -NoTypeInformation -Force