
AWS AutoScaling Group Size Metrics (or Lack Thereof)

One metric notably missing from CloudWatch has been the current and previous AutoScaling group size – in other words, how many nodes are in the cluster. I’ve worked around this by using the regular EC2 APIs, querying the current and desired cluster sizes and logging them to Graphite. However, that only gives you the current values – not anything in the past, which regular CloudWatch metrics do provide (up to two weeks of history).
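
For reference, here is a minimal sketch of that old workaround in Python. It assumes boto3 (which postdates the original post) and a made-up group name, so treat it as illustrative rather than the exact script:

import boto3

autoscaling = boto3.client("autoscaling")

def get_group_sizes(group_name):
    """Return (desired, current) instance counts for one AutoScaling group."""
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    group = response["AutoScalingGroups"][0]
    desired = group["DesiredCapacity"]
    current = len(group["Instances"])
    return desired, current

if __name__ == "__main__":
    # Hypothetical group name; the real script logged these values to Graphite.
    desired, current = get_group_sizes("example-prod-cluster")
    print("desired=%d current=%d" % (desired, current))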

My colleague Sean came up with a nice workaround – using the SampleCount statistic of the CPUUtilization metric in the AWS/EC2 namespace, filtered by the AutoScalingGroupName dimension. Here’s an example, using the AWS CLI:

$ aws cloudwatch get-metric-statistics --dimensions Name=AutoScalingGroupName,Value=XXXXXXXXProdCluster1-XXXXXXXX --metric-name CPUUtilization --namespace AWS/EC2 --period 60 --statistics SampleCount --start-time 2015-01-17T00:00:00 --end-time 2015-01-17T00:05:00
{
    "Datapoints": [
        {
            "SampleCount": 69.0,
            "Timestamp": "2015-01-17T00:00:00Z",
            "Unit": "Percent"
        },
        {
            "SampleCount": 69.0,
            "Timestamp": "2015-01-17T00:01:00Z",
            "Unit": "Percent"
        },
        {
            "SampleCount": 69.0,
            "Timestamp": "2015-01-17T00:03:00Z",
            "Unit": "Percent"
        },
        {
            "SampleCount": 69.0,
            "Timestamp": "2015-01-17T00:02:00Z",
            "Unit": "Percent"
        },
        {
            "SampleCount": 67.0,
            "Timestamp": "2015-01-17T00:04:00Z",
            "Unit": "Percent"
        }
    ],
    "Label": "CPUUtilization"
}

Some things to note:

  • Ignore the "Unit" field – the SampleCount value is not a percentage!
  • You will need to adjust your --period parameter to match the metric sampling period of the EC2 instances in the AutoScaling group: with regular monitoring enabled this is one sample per 5 minutes (300 seconds), and with detailed monitoring enabled it is one sample per minute (60 seconds).
  • The previous point also means that if you want to gather less frequent data points for historical data, you’ll need to do some division – e.g. using --period 3600 will require you to divide the resulting sample count by 12 (regular monitoring) or 60 (detailed monitoring) before you store it; see the sketch after this list.
  • Going via CloudWatch in this way means you can see your cluster size history for the last two weeks, just like any other CloudWatch metric!
  • Unfortunately you will lose your desired cluster size metric, which is not captured. In practice I haven’t really required both desired and actual cluster size metrics.
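
As a worked example of that division, here is a minimal sketch in Python that pulls hourly SampleCount datapoints and converts them back into an approximate cluster size. It assumes boto3 and a made-up group name, and hard-codes the regular-monitoring rate of 12 samples per instance per hour:

from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# 12 samples/instance/hour with regular (5-minute) monitoring; use 60 for detailed.
SAMPLES_PER_HOUR = 12

def group_size_history(group_name, hours=24):
    """Yield (timestamp, approximate instance count) for the last `hours` hours."""
    now = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": group_name}],
        StartTime=now - timedelta(hours=hours),
        EndTime=now,
        Period=3600,
        Statistics=["SampleCount"],
    )
    # CloudWatch returns datapoints in no particular order, so sort by timestamp.
    for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
        # Each instance contributes SAMPLES_PER_HOUR samples to an hourly period,
        # so dividing recovers the (average) cluster size for that hour.
        yield point["Timestamp"], point["SampleCount"] / SAMPLES_PER_HOUR

if __name__ == "__main__":
    for timestamp, size in group_size_history("example-prod-cluster"):
        print(timestamp, round(size, 1))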

We’ll start using this almost immediately, as we can remove one crufty metric collection script in the process. Hope it also helps some of you out there in AWS land!

Published at DZone with permission of Oliver Hookins, DZone MVB. See the original article here.
