How to Properly Collect AWS EMR Metrics

When it comes to metrics, AWS does not supply a proper solution for collecting cluster metrics.

by Avi Yehuda · Sep. 06, 18 · Tutorial

Working with AWS EMR has a lot of benefits. But when it comes to metrics, AWS currently does not supply a proper solution for collecting cluster metrics from EMRs.

Well, there is AWS CloudWatch, of course, which works out of the box and gives you loads of EMR metrics. The problem with CloudWatch is that it doesn't give you the ability to follow metrics per business unit or tag, only per specific EMR ID. This means that you can view metrics only for a specific EMR and cannot compare them over time.

Let me explain the problem again. A common use of EMR is to write some code that is executed inside an EMR and triggered at a fixed interval, let's say every five hours.

This means that every five hours, a new EMR, with a new ID, will be spawned. In CloudWatch, you can see each of these EMRs individually but not in a single graph, which is a huge disadvantage.

Just to note, I am referring only to machine metrics, like memory, CPU, and disk. Other metrics, like JVM metrics or business metrics, are usually collected by the process itself and, obviously, can be collected over time per business unit.

Another problem is that some of these metrics cost extra to have CloudWatch collect and display them.

I found a nice and easy solution to this problem. I wrote a small script that collects metrics from the machine it runs on at a fixed interval and sends those metrics to a Graphite host. This script should be added to the EMR as a bootstrap action. In this way, all of the cluster machines will send their metrics to Graphite. Since the script uses the same namespace every time, the resulting graph shows not only the metrics for the current execution but also the history of previous executions.

Also, I used Graphite because this is what we prefer at my job. But the same solution could easily be adapted to other APIs instead, like the AWS CloudWatch API.

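#!/bin/bash

# Graphite endpoint; set graphite_host to your own Graphite server.
export graphite_host="metrics.mydomain.com"
export graphite_port=2003

# Look up the value of the cluster tag whose key contains "tier" (lowercased),
# using the cluster ID stored in job-flow.json.
# Assign first and export afterwards, so a failed lookup is not masked by `export`.
tier=$(aws emr describe-cluster --cluster-id $(sudo cat /mnt/var/lib/info/job-flow.json | jq -r ".jobFlowId") --query Cluster.Tags | jq -r -c '.[] | select(.Key | contains("tier"))? | .Value' | tr '[:upper:]' '[:lower:]') || exit 1

# Check whether this instance is the master node.
is_master=$(cat /mnt/var/lib/info/instance.json | jq -r ".isMaster") || exit 1
export tier is_master

if [[ $is_master == "true" ]]; then
   export namespace_prefix="master"
else
   export namespace_prefix="nodes"
fi

# Send the three metrics to Graphite every ${2} seconds, under the namespace ${1}.
send_loop()
{
   while :
   do
      # Note: the awk field positions assume a `free` output layout that
      # includes the "-/+ buffers/cache" line.
      echo "${1}.${namespace_prefix}_free_memory $(free -m | awk -v RS="" '{print $10 "+" $17 "+" $21}' | bc) $(date +%s)" | nc ${graphite_host} ${graphite_port}
      echo "${1}.${namespace_prefix}_cpu_utilization $(top -b -n1 | grep "Cpu(s)" | awk '{print $2 + $4}' | bc) $(date +%s)" | nc ${graphite_host} ${graphite_port}
      echo "${1}.${namespace_prefix}_free_disk $(df --output=avail / | grep -v Avail | bc) $(date +%s)" | nc ${graphite_host} ${graphite_port}
      sleep ${2}
   done
}

# Run the loop in the background so the bootstrap action itself can finish.
send_loop $1 $2 &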
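
Each `echo ... | nc` line in the script above emits one line in Graphite's plaintext format: a metric path, a value, and a Unix timestamp, separated by spaces. As a minimal sketch, that formatting could be isolated in a helper like the one below. The function name `format_metric` is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): build one line in
# Graphite's plaintext format, "<metric path> <value> <unix timestamp>".
# The timestamp defaults to the current time when the third argument is omitted.
format_metric() {
   local metric_path="$1" value="$2" timestamp="${3:-$(date +%s)}"
   printf '%s %s %s\n' "$metric_path" "$value" "$timestamp"
}

# Example usage (assumes graphite_host and graphite_port are set as in the script):
# format_metric "a.b.master_free_disk" 123456 | nc "${graphite_host}" "${graphite_port}"
```

Keeping the formatting separate from the `nc` call makes it possible to inspect the exact lines before they are sent.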


This script currently sends three types of metrics: available memory, CPU usage, and free disk space.

Those metrics are reported separately for the master node and the worker nodes. So, in fact, we have six metrics here.
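
The field positions that the script reads from `free -m` depend on the output layout of the installed `free` version. As one possible alternative, a minimal sketch that reads the kernel's `MemAvailable` estimate from `/proc/meminfo` (present on Linux 3.14 and later) might look like this. The function name and its optional file argument are assumptions made for illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical alternative to parsing `free -m`: print available memory in MB.
# Reads a meminfo-style file (default: /proc/meminfo), whose values are in kB.
available_memory_mb() {
   local meminfo_file="${1:-/proc/meminfo}"
   awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "$meminfo_file"
}

# Example usage inside the send loop (namespace and prefix as in the script):
# echo "${1}.${namespace_prefix}_free_memory $(available_memory_mb) $(date +%s)"
```

Note that this reports a different quantity than the sum computed in the script, so graphs from the two approaches would not be directly comparable.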

This script has two parameters:

  1. A namespace prefix. This prefix is attached to each of the three metrics, for both the master and the worker nodes. So, if the namespace is a.b, the metrics sent will be:
    a.b.master_free_memory
    a.b.master_cpu_utilization
    a.b.master_free_disk
    a.b.nodes_free_memory
    a.b.nodes_cpu_utilization
    a.b.nodes_free_disk

  2. The frequency, in seconds, at which the metrics should be collected and sent to Graphite.
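
To check which Graphite paths a given namespace prefix will produce before deploying, the naming scheme can be sketched as a small helper. The function name `list_metric_names` is hypothetical; the role and metric names are taken from the script.

```shell
#!/bin/bash
# Hypothetical helper: print the six metric names the script would emit
# for a given namespace prefix (the script's first parameter).
list_metric_names() {
   local namespace="$1" role metric
   for role in master nodes; do
      for metric in free_memory cpu_utilization free_disk; do
         echo "${namespace}.${role}_${metric}"
      done
   done
}

list_metric_names "a.b"
```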

How to Use the Script

  1. Copy it and set the graphite_host value inside the script.

  2. Upload it to S3.

  3. Add it as a bootstrap action to your EMRs.

  4. Set the two input parameters mentioned above.
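
If the cluster is created from the AWS CLI, the steps above might be sketched as follows. The bucket name, script file name, action name, and the helper `bootstrap_action_arg` are all placeholders chosen for illustration; the remaining `create-cluster` options depend on your own setup and are left out.

```shell
#!/bin/bash
# Hypothetical helper: build the value for the AWS CLI --bootstrap-actions
# option from the script's S3 location and its two parameters
# (namespace prefix and frequency in seconds).
bootstrap_action_arg() {
   local script_s3_path="$1" namespace="$2" interval_seconds="$3"
   echo "Path=${script_s3_path},Name=SendMetricsToGraphite,Args=[${namespace},${interval_seconds}]"
}

# Example usage (bucket, file name, and cluster options are placeholders):
# aws s3 cp emr_metrics.sh s3://my-bucket/emr_metrics.sh
# aws emr create-cluster <your other cluster options> \
#    --bootstrap-actions "$(bootstrap_action_arg s3://my-bucket/emr_metrics.sh a.b 60)"
```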

The result of this will look like the following:

[Image: the resulting Graphite graph]

Published at DZone with permission of Avi Yehuda, DZone MVB. See the original article here.
