Get CPU usage and CPU ready values for all VMs in a cluster

When working with VMware it is crucial to monitor the performance of your HA clusters and VMs. Performance data tells you whether your VMs have the resources they need and whether they are sized correctly. To determine this I often use the CPU usage and CPU ready performance counters.

– What are the CPU usage/ready performance counters and what do they show us?

The CPU usage value is similar to the value you see in Task Manager in Windows. The main difference is that it is not measured by the VM’s OS but by the ESXi host the VM is running on. As in Task Manager, this value indicates how much CPU is being used by your VM.
The CPU ready value is the amount of time your VM’s CPU has been “ready”, meaning how much time it was ready to run but had to wait for the host to assign CPU cycles to it. If this value is high it is often an indicator of cluster health problems and/or bad VM sizing. I urge you to search for information about these counters and learn about them, as they are highly useful when managing a VMware environment. You can read more about CPU ready time here:
http://blogs.totalcaos.com/understanding-rdy-cpu-ready/

– Why should I use the script found in this blog post?

Gathering data like the performance counters described above can be done with vCenter Operations Manager or with Veeam if you have them. However, these applications are complicated to use and configure (and cost a lot of money). The script below gives you a simple list/CSV of the VMs in a given cluster with their CPU usage and CPU ready values in percent. You can then review your VMs, check for any high/problematic values and improve the efficiency of your infrastructure. As a rule of thumb, a VM with an average CPU usage over 70% should probably be assigned more resources. A VM with a ready time of 10% or more probably means that your cluster/host is overcommitted or that the VM is oversized.
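To make the review step concrete, here is a minimal sketch of how the rows could be checked against the 70% usage and 10% ready rules of thumb. The function name Select-ProblemVM and the sample rows are hypothetical and not part of the script in this post. It assumes the values use "." as the decimal separator.

```powershell
# Hypothetical helper (not part of the original script): returns the rows whose
# values cross the rule-of-thumb thresholds, with the reason they were flagged.
function Select-ProblemVM {
    param(
        [Parameter(Mandatory)] [object[]] $Rows,
        [double] $UsageThreshold = 70,   # average CPU usage in percent
        [double] $ReadyThreshold = 10    # CPU ready in percent
    )
    foreach ($row in $Rows) {
        $reasons = @()
        if ([double]$row.CpuAvg -gt $UsageThreshold) { $reasons += 'HighUsage' }
        if ([double]$row.CpuRdy -ge $ReadyThreshold) { $reasons += 'HighReady' }
        if ($reasons.Count -gt 0) {
            [pscustomobject]@{
                VM     = $row.VM
                CpuAvg = $row.CpuAvg
                CpuRdy = $row.CpuRdy
                Reason = $reasons -join ','
            }
        }
    }
}

# Made-up rows shaped like the CSV the script produces (values are strings, as Import-Csv returns them)
$sample = @(
    [pscustomobject]@{ VM = 'web01'; CpuAvg = '82.50'; CpuRdy = '3.1' },
    [pscustomobject]@{ VM = 'db01';  CpuAvg = '35.10'; CpuRdy = '12.4' },
    [pscustomobject]@{ VM = 'app01'; CpuAvg = '20.00'; CpuRdy = '1.0' }
)
$flagged = @(Select-ProblemVM -Rows $sample)
$flagged | Format-Table -AutoSize
```

With a real export you would pass the output of Import-Csv as -Rows instead of the sample rows.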

– The script itself and how I wrote it

To simplify the structure of the script I have divided it into three sections: variables, functions and script main.

Variables

#VARIABLES
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$CSV = "$dir\Average_CPU_Usage_Peak_Hours.csv"
#Parameters
$vcenter = "Some_Vcenter_server" #Name of your vcenter server
$ClusterName = "Some_cluster" #CPU usage and CPU ready values will be collected for all VMs in this cluster.
$DaysBack = 14 #Number of days back to collect performance counters.
$PeakTimeStart = 8 #hour of the day in 24 hour format 
$PeakTimeEnd = 20 #hour of the day in 24 hour format
$rdy_interval = 7200 #interval of rdy time values aggregation/averaging in seconds. This value should be changed according to your vcenter statistics settings.

Before running the script you need to change the values in the #Parameters section to match your environment. $vcenter should be the name of your vCenter server and $ClusterName the name of the cluster you wish to collect VM performance data from. $DaysBack is the number of days of history to collect. For example, if it is set to 14 you will collect performance data from the 14 days leading up to the time you run the script. Make sure you set this to a valid number of days (your vCenter DB must retain the data for the period specified).

I have found that performance data is more useful when it is gathered during business hours, so the results reflect a busy system. To gather performance counters from the relevant times of day, change the values of $PeakTimeStart and $PeakTimeEnd to suit your needs.

The last parameter is $rdy_interval. The CPU ready counter is returned as a number of milliseconds. To convert this number correctly to a percentage you must take into account the aggregation interval (the time between the data points stored in the vCenter database). On my vCenter installation the ready time values are aggregated every two hours, so I have set $rdy_interval to 7200 seconds. You can read more about this here:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2002181
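If you are unsure which interval applies, the samples returned by Get-Stat carry an IntervalSecs property you can read instead of hard-coding the value. The sketch below uses synthetic objects in place of real Get-Stat output so it runs without a vCenter connection. The helper name Get-SampleInterval is hypothetical.

```powershell
# Hypothetical helper: returns the distinct sampling intervals (in seconds)
# found on a set of Get-Stat-style samples.
function Get-SampleInterval {
    param([Parameter(Mandatory)] [object[]] $Samples)
    $Samples | Select-Object -ExpandProperty IntervalSecs -Unique
}

# Synthetic samples standing in for Get-Stat output (values are made up)
$samples = @(
    [pscustomobject]@{ Value = 144000; IntervalSecs = 7200 },
    [pscustomobject]@{ Value = 288000; IntervalSecs = 7200 }
)

$intervals = @(Get-SampleInterval -Samples $samples)
if ($intervals.Count -eq 1) {
    $rdy_interval = $intervals[0]    # safe to use a single interval for the conversion
} else {
    Write-Warning "Mixed intervals found: $($intervals -join ', ')"
}
$rdy_interval    # 7200
```

If more than one interval comes back, a single $rdy_interval would give wrong percentages for some samples, which is why the sketch warns instead of picking one.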

Now to the functions.

Functions

1. Connect-Vcenter

function Connect-Vcenter {
	param(
		$vcenter
	)
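	#Load snap-in (SilentlyContinue keeps Get-PSSnapin from writing an error when the snap-in is not loaded yet)
	if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core -ErrorAction SilentlyContinue)) {
		Add-PSSnapin VMware.VimAutomation.Core
	}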
	#Connect to vCenter
	if ($global:DefaultVIServers.Count -lt 1) {
		Connect-VIServer $vCenter
	}
}

Connect-Vcenter is a small function that adds the snap-in for the VMware cmdlets and, if no connection exists yet, connects to a vCenter server, making the PowerShell session ready to run commands.
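Newer PowerCLI releases are delivered as PowerShell modules rather than as a snap-in, so Add-PSSnapin fails on machines that only have the module installed. The following is a rough sketch of how the load step could cope with both cases. Get-PowerCliLoadMethod is a hypothetical helper name, and this is not a tested replacement for Connect-Vcenter.

```powershell
# Hypothetical helper: reports how PowerCLI can be loaded on this machine.
function Get-PowerCliLoadMethod {
    param([string] $Name = 'VMware.VimAutomation.Core')

    # Module-based PowerCLI installs
    if (Get-Module -ListAvailable -Name $Name) { return 'Module' }

    # Snap-in based installs (Get-PSSnapin only exists in Windows PowerShell)
    if ((Get-Command Get-PSSnapin -ErrorAction SilentlyContinue) -and
        (Get-PSSnapin -Registered -Name $Name -ErrorAction SilentlyContinue)) {
        return 'SnapIn'
    }
    return 'None'
}

switch (Get-PowerCliLoadMethod) {
    'Module' { Import-Module VMware.VimAutomation.Core }
    'SnapIn' { Add-PSSnapin VMware.VimAutomation.Core }
    default  { Write-Warning 'PowerCLI does not appear to be installed on this machine.' }
}
```

Checking for the module first means the snap-in path is only tried on older installations.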
2. Get-VMCPUAverage

function Get-VMCPUAverage {
    param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
    
	$stats = get-vm $VM | get-stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat "cpu.usage.average" | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd }
	$aggr_stats = $stats | Measure-Object -Property Value -Average
	$avg = $aggr_stats.Average
	return $avg
}

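This function uses the VMware cmdlet Get-Stat to collect the performance counter cpu.usage.average in the specified timeframe and keeps only the samples that fall within the peak hours. The average of the remaining data points is then returned. Note that the filter compares the hour with -gt and -lt, so with the default values of 8 and 20 it keeps samples stamped from 09:00 through 19:59.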
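To see what the peak-hour filter keeps, here is a small sketch that applies the same Where-Object condition to synthetic samples instead of Get-Stat output. The sample values and the date are made up for illustration.

```powershell
$PeakTimeStart = 8
$PeakTimeEnd = 20

# Synthetic samples: one every two hours across a single day (00:00, 02:00, ... 22:00)
$day = Get-Date -Year 2014 -Month 5 -Day 20 -Hour 0 -Minute 0 -Second 0
$samples = 0..11 | ForEach-Object {
    [pscustomobject]@{ TimeStamp = $day.AddHours(2 * $_); Value = 10 * $_ }
}

# Same condition as in Get-VMCPUAverage
$peak = $samples | Where-Object {
    $_.TimeStamp.Hour -gt $PeakTimeStart -and $_.TimeStamp.Hour -lt $PeakTimeEnd
}

$hours = @($peak | ForEach-Object { $_.TimeStamp.Hour })
$hours -join ','                                                   # 10,12,14,16,18
$average = ($peak | Measure-Object -Property Value -Average).Average
$average                                                           # 70
```

The samples stamped 08:00 and 20:00 are both dropped. If you want the 08:00 sample included, -ge $PeakTimeStart would be the comparison to use.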
3. Get-VMCpuRDY

function Get-VMCpuRDY {
	param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
	
	$stats = get-vm $VM | Get-Stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat Cpu.Ready.Summation | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd -and $_.Instance -eq ""}	
	$aggr_stats = $stats | Measure-Object -Property Value -Average	
	$rdy = [Math]::Round(((($aggr_stats.Average)/1000)/$rdy_interval) * 100,1)
	return $rdy
}

Get-VMCpuRDY collects the average of the performance counter Cpu.Ready.Summation in the specified timeframe. The condition $_.Instance -eq "" keeps only the aggregate value for the whole VM and drops the per-vCPU instances. As explained earlier in the post, this counter returns a summation of the milliseconds in which the CPU was waiting for resources. To convert this into a percentage I first divide the number of milliseconds by 1000 to get seconds. I then divide this by $rdy_interval, which is the number of seconds between data points, and multiply by 100. Finally I use [Math]::Round to round the value to one decimal, which is the number returned by the function.
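The conversion can be tried on its own with a couple of made-up numbers. Convert-ReadyToPercent is a hypothetical wrapper around the same formula used in Get-VMCpuRDY.

```powershell
# Hypothetical wrapper around the formula from Get-VMCpuRDY:
# (milliseconds / 1000) / interval in seconds * 100, rounded to one decimal.
function Convert-ReadyToPercent {
    param(
        [double] $ReadyMs,      # summed ready time for one data point, in milliseconds
        [int] $IntervalSecs     # seconds covered by one data point
    )
    [Math]::Round((($ReadyMs / 1000) / $IntervalSecs) * 100, 1)
}

# 360,000 ms of ready time in a 2 hour (7200 s) data point
Convert-ReadyToPercent -ReadyMs 360000 -IntervalSecs 7200    # 5

# The same 360,000 ms in a 30 minute (1800 s) data point
Convert-ReadyToPercent -ReadyMs 360000 -IntervalSecs 1800    # 20
```

The two calls show why $rdy_interval has to match your statistics settings: the same raw value is 5% at one interval and 20% at another.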

Script Main

#SCRIPT MAIN
clear
#Load the VMWare module and connect to vCenter
Connect-Vcenter -vcenter $vcenter
$AvgCPUValues = @() #Create array to hold the CPU usage and CPU ready values
Get-Cluster $ClusterName | Get-VM | ? {$_.PowerState -eq "PoweredOn"} | % {	#loop through all powered on VMs in the cluster
	$AvgCPUValue = "" | Select "VM","CpuAvg","CpuRdy" #create a custom object with these properties.
	$AvgCPUValue.VM = $_.Name
	$AvgCPUValue.CpuAvg = "{0:N2}" -f $(Get-VMCPUAverage -VM $_) #Get VM CPU usage and round to two decimals
	$AvgCPUValue.CpuRdy = Get-VMCpuRDY -VM $_
	$AvgCPUValues += $AvgCPUValue
}
$AvgCPUValues | Export-Csv $CSV -NoTypeInformation -Force

In the script main I first connect to the vCenter server using my function Connect-Vcenter. I then create an array $AvgCPUValues. The next step is to loop through all the powered-on VMs in the cluster using Get-Cluster $ClusterName | Get-VM. In the loop I create a custom object for each VM with the properties VM, CpuAvg and CpuRdy. I use the functions Get-VMCPUAverage and Get-VMCpuRDY to get the CPU usage and CPU ready values and assign these to the corresponding properties on the custom object. The custom object is then added to the array $AvgCPUValues.
When the loop has completed I export this array to a CSV file in the directory the script is stored in.
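The same collect-and-export pattern can be tried without a vCenter connection. The sketch below is an alternative way to build the rows, using [pscustomobject] and letting the foreach statement collect the output instead of appending with +=. The VM names, numbers and the temporary file name are made up.

```powershell
# Made-up values standing in for the results of Get-VMCPUAverage / Get-VMCpuRDY
$fakeStats = @{
    'vm-a' = @{ Avg = 41.237; Rdy = 2.3 }
    'vm-b' = @{ Avg = 88.912; Rdy = 11.7 }
}

# foreach emits one object per VM; assigning the statement collects them all
$rows = foreach ($name in 'vm-a', 'vm-b') {
    [pscustomobject]@{
        VM     = $name
        CpuAvg = [Math]::Round($fakeStats[$name].Avg, 2)
        CpuRdy = $fakeStats[$name].Rdy
    }
}

# Export to a temporary file and read it back to check the layout
$csv = Join-Path ([System.IO.Path]::GetTempPath()) 'Average_CPU_Usage_demo.csv'
$rows | Export-Csv $csv -NoTypeInformation -Force
$back = @(Import-Csv $csv)
$back | Format-Table -AutoSize
```

The CSV has the same three columns as the one the script writes, so it can be opened or filtered the same way.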

– That’s it! You’re ready to go!

I really hope you find this script useful. Remember to fill out the #parameters section before you run the script.
I have copied in the full script below.

#############################################################################################################
##script:			Get-VMCPUAverage.ps1
##
##Description:		Gets "CPU usage" and "cpu ready" for all VMs in a given cluster and exports the results
#+					to a CSV file in the script directory.
##Created by:		Noam Wajnman
##Creation Date:	March 11, 2014
##Updated on:		May 20, 2014
##############################################################################################################
#VARIABLES
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$CSV = "$dir\Average_CPU_Usage_Peak_Hours.csv"
#Parameters
$vcenter = "Some_Vcenter_server" #Name of your vcenter server
$ClusterName = "Some_cluster" #CPU usage and CPU ready values will be collected for all VMs in this cluster.
$DaysBack = 14 #Number of days back to collect performance counters.
$PeakTimeStart = 8 #hour of the day in 24 hour format 
$PeakTimeEnd = 20 #hour of the day in 24 hour format
$rdy_interval = 7200 #interval of rdy time values aggregation/averaging in seconds. This value should be changed according to your vcenter statistics settings.
#FUNCTIONS
function Connect-Vcenter {
	param(
		$vcenter
	)
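	#Load snap-in (SilentlyContinue keeps Get-PSSnapin from writing an error when the snap-in is not loaded yet)
	if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core -ErrorAction SilentlyContinue)) {
		Add-PSSnapin VMware.VimAutomation.Core
	}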
	#Connect to vCenter
	if ($global:DefaultVIServers.Count -lt 1) {
		Connect-VIServer $vCenter
	}
}
function Get-VMCPUAverage {
    param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
    
	$stats = get-vm $VM | get-stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat "cpu.usage.average" | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd }
	$aggr_stats = $stats | Measure-Object -Property Value -Average
	$avg = $aggr_stats.Average
	return $avg
}
function Get-VMCpuRDY {
	param (
        $VM
    )
	$Start = (get-date).AddDays(-$DaysBack)
	$Finish = get-date
	
	$stats = get-vm $VM | Get-Stat -MaxSamples "1000" -Start $Start -Finish $Finish -Stat Cpu.Ready.Summation | `
    ? { ($_.TimeStamp).Hour -gt $PeakTimeStart -and ($_.TimeStamp).Hour -lt $PeakTimeEnd -and $_.Instance -eq ""}	
	$aggr_stats = $stats | Measure-Object -Property Value -Average	
	$rdy = [Math]::Round(((($aggr_stats.Average)/1000)/$rdy_interval) * 100,1)
	return $rdy
}
#SCRIPT MAIN
clear
#Load the VMWare module and connect to vCenter
Connect-Vcenter -vcenter $vcenter
$AvgCPUValues = @() #Create array to hold the CPU usage and CPU ready values
Get-Cluster $ClusterName | Get-VM | ? {$_.PowerState -eq "PoweredOn"} | % {	#loop through all powered on VMs in the cluster
	$AvgCPUValue = "" | Select "VM","CpuAvg","CpuRdy" #create a custom object with these properties.
	$AvgCPUValue.VM = $_.Name
	$AvgCPUValue.CpuAvg = "{0:N2}" -f $(Get-VMCPUAverage -VM $_) #Get VM CPU usage and round to two decimals
	$AvgCPUValue.CpuRdy = Get-VMCpuRDY -VM $_
	$AvgCPUValues += $AvgCPUValue
}
$AvgCPUValues | Export-Csv $CSV -NoTypeInformation -Force

20 thoughts on “Get CPU usage and CPU ready values for all VMs in a cluster”

  1. Patrick L. Parks

    When you discuss the $rdy_interval, you mention that “in your install” of vCenter, the time values are aggregated/averaged every 2 hours..thus you set the value to 7200 seconds.

    How do I determine the value for my vCenter installation?

    • Patrick L. Parks

      Nevermind, figured it out. From the vCenter settings, I have to look at the Statistics tab and find the value associated for the timeframe I’m looking for. So, for 14 days, that falls into the 1 month category which shows 2 hours. If I was looking at a week or less, it would be every 30 minutes. If i was looking at a day or less, it’s every 3 minutes.

      • Actually I recently discovered that you can get the interval directly from the counters you are collecting. Take a look at the below code:
        $stats = Get-Stat -Stat $metrics -Entity $VMs
        $interval = $stats[0].IntervalSecs
        As you can see, $interval is available as a property of the collected performance statistics. Check out LucD’s blog for more information and great VMware scripts!

        http://www.lucd.info/2011/04/22/get-the-maximum-iops/

        Also, you are right that you can see it in the vcenter server settings in the statistics tab.

  2. Patrick L. Parks

    I’ve got one other question you might be able to provide insight towards. In your script, you mention a startpeaktime and an endpeaktime.

    Do you happen to know, does this use the timezone defined on the ESX servers themselves, or does it use the timezone from the client where you are running the PowerShell CLI commands? In my case, my workstation is in the US with the VMWare PowerCLI, but my vCenter that I am running against is in Australia. So, when I say from 8am to 11am, I don’t know if that’s in Australian Time or US time.

    • Hi,
      The startpeaktime/startendtime declared are compared to the timestamps on the collected performance counters, meaning the vcenter server. You will need to enter values in the same time zone as your vcenter server (Australia as I understand it).
      So if your peak times are 08:00-11:00 AM US time you will need to convert these to Australian time for running the script.
      Hope this answers your question.
      Regards,
      Noam

  4. brianresume@hotmail.com

    When I try to run the script I get the following error message. Where can I get the missing module?

    Get-PSSnapin : No Windows PowerShell snap-ins matching the pattern ‘VMware.VimAutomation.Core’ were found. Check the
    pattern and then try the command again.
    At C:\scripts\Get-VMCPUAverage.ps1:27 char:15
    + if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core)) {
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidArgument: (VMware.VimAutomation.Core:String) [Get-PSSnapin], PSArgumentException
    + FullyQualifiedErrorId : NoPSSnapInsFound,Microsoft.PowerShell.Commands.GetPSSnapinCommand

    Add-PSSnapin : The Windows PowerShell snap-in ‘VMware.VimAutomation.Core’ is not installed on this computer.
    At C:\scripts\Get-VMCPUAverage.ps1:28 char:9
    + Add-PSSnapin VMware.VimAutomation.Core
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidArgument: (VMware.VimAutomation.Core:String) [Add-PSSnapin], PSArgumentException
    + FullyQualifiedErrorId : AddPSSnapInRead,Microsoft.PowerShell.Commands.AddPSSnapinCommand

    Connect-VIServer : The term ‘Connect-VIServer’ is not recognized as the name of a cmdlet, function, script file, or
    operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try
    again.
    At C:\scripts\Get-VMCPUAverage.ps1:32 char:9
    + Connect-VIServer $vCenter
    + ~~~~~~~~~~~~~~~~
    + CategoryInfo : ObjectNotFound: (Connect-VIServer:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

    Get-Cluster : The term ‘Get-Cluster’ is not recognized as the name of a cmdlet, function, script file, or operable
    program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
    At C:\scripts\Get-VMCPUAverage.ps1:66 char:1
    + Get-Cluster $ClusterName | Get-VM | ? {$_.PowerState -eq “PoweredOn”} | % { #loo …
    + ~~~~~~~~~~~
    + CategoryInfo : ObjectNotFound: (Get-Cluster:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

  6. Craig Lawyer

    Excellent script. Two questions: does the $DaysBack = 1 mean that “yesterday’s” metrics are analyzed only or would it also include today’s values? Is there a way to report realtime CPU ready into a CSV file? Thanks!

    • It means the script includes the metrics from the time you run it back to X days earlier. E.g. if it’s set to $DaysBack = 14 then you get metrics from the last 14 days.

      • My blog is about my adventures through computing. Essentially just an aggregation point for information, tutorials and functions I’ve written that essentially make my life easier. Specifically your functions around ‘Get-VMCPUAverage’ & ‘Get-VMCpuRDY’ – I’ve used these and created a Get-VMMaxDiskLatenceyAverage to compliment your scripts while achieving a professional goal.
        I will use this as my first FULL post (other than my intro). And yes, I will be saying good things. Thank you very much!
