CPU Profiling - Flame Graphs

This article gives an introduction to flame graphs and discusses how they can be used to analyze CPU hot spots in large systems to reduce cost.

By Parveen Saini · May. 29, 24 · Analysis

For any large-scale distributed software application, keeping cost under control is one of the most important concerns. As the business grows, the cost of running core software applications can become very high.

For cloud-based elastic distributed systems, cost can be managed by monitoring and optimizing the application's CPU usage, provided that scale-out and scale-in are driven by a CPU usage threshold. One way to do this is to generate a flame graph of the application and use it to find CPU hot spots.

In my experience, I have been able to reduce the cost of large distributed systems by 10–15% by analyzing and relieving CPU hot spots through flame graphs. For bigger systems, even a 1% reduction can translate into a large cost saving.
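To make the link between CPU usage and cost concrete, here is a rough sketch. It assumes a simplified autoscaling model in which the fleet is sized so that average CPU stays at or below a target utilization; the function name `instances_needed` and every number are hypothetical and are not taken from any real system.

```python
import math


def instances_needed(cpu_demand_cores, cores_per_instance, target_utilization):
    """Instances required to keep average CPU at or below the scaling target."""
    usable = cores_per_instance * target_utilization
    return math.ceil(cpu_demand_cores / usable)


# Hypothetical fleet: 400 cores of demand, 8-core instances, 60% CPU target.
before = instances_needed(400, 8, 0.60)
# Same traffic after removing hot spots worth 12% of CPU demand.
after = instances_needed(400 * (1 - 0.12), 8, 0.60)
print(before, after)
```

With these made-up inputs the fleet shrinks from 84 to 74 instances. Real autoscalers add cooldowns, minimum sizes, and other constraints, so treat this only as an order-of-magnitude illustration.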

Flame Graph

A flame graph is a visualization of an application's sampled call stacks. It shows the chain of functions called within each call stack, along with the percentage of CPU used by those code segments.

The CPU percentage for a routine or call stack is the number of samples in which that routine or call stack appears, divided by the total number of samples collected during the profiling window.
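The calculation above can be sketched in a few lines of Python. The sketch assumes the profile is available in the collapsed-stack text format (one line per unique stack, frames separated by semicolons, followed by a sample count). The function name `cpu_share` and the sample data are hypothetical and only for illustration.

```python
from collections import Counter


def cpu_share(collapsed_lines):
    """Return {frame: percent of total samples in which the frame appears}."""
    total = 0
    per_frame = Counter()
    for line in collapsed_lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated token on the line.
        stack, _, count = line.rpartition(" ")
        samples = int(count)
        total += samples
        # Count each frame once per stack so recursion is not double counted.
        for frame in set(stack.split(";")):
            per_frame[frame] += samples
    if total == 0:
        return {}
    return {frame: 100.0 * n / total for frame, n in per_frame.items()}


# Hypothetical data, not taken from the article's profile.
sample = [
    "Main.main;SampleA.FunctionA;SampleA.functionC 30",
    "Main.main;SampleB.FunctionA 50",
    "Main.main;Main.FunctionA 20",
]
shares = cpu_share(sample)
print(shares["SampleA.functionC"])  # 30.0
```

Here `SampleA.functionC` appears in 30 of 100 samples, so its share is 30%, while `Main.main` appears in every sample and sits at 100%.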

Understanding Flame Graphs

Below is a sample visualization of a flame graph. A common mistake is to assume that the x-axis shows the passage of time. Instead, the x-axis shows function calls or call stacks in alphabetical order, and the y-axis shows call stack depth. The width of each frame is proportional to the number of samples in which it appears.

[Figure: sample flame graph]

In the sample graph, the root frame is Main.main, which calls Main.FunctionA, SampleA.FunctionA, and SampleB.FunctionA. The left-to-right order does not reflect the order in which they were called; the frames are arranged alphabetically.
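The following sketch shows why the left-to-right order carries no timing information. It merges sampled stacks into a tree and then lays out siblings in sorted order, which is one plausible way a flame graph renderer could work and not necessarily how any particular tool does it. The helper names `build_tree` and `layout` and the sample counts are hypothetical.

```python
def build_tree(stacks):
    """Merge (frames, samples) pairs into a nested dict keyed by frame name."""
    root = {"samples": 0, "children": {}}
    for frames, samples in stacks:
        root["samples"] += samples
        node = root
        for frame in frames:
            node = node["children"].setdefault(
                frame, {"samples": 0, "children": {}}
            )
            node["samples"] += samples
    return root


def layout(node, depth=0, rows=None):
    """Return (depth, frame, samples) rows with siblings sorted by name."""
    if rows is None:
        rows = []
    for name in sorted(node["children"]):
        child = node["children"][name]
        rows.append((depth, name, child["samples"]))
        layout(child, depth + 1, rows)
    return rows


# Stacks listed in the order they were sampled (hypothetical counts).
stacks = [
    (["Main.main", "SampleB.FunctionA"], 50),
    (["Main.main", "Main.FunctionA"], 20),
    (["Main.main", "SampleA.FunctionA"], 30),
]
rows = layout(build_tree(stacks))
for depth, name, samples in rows:
    print("  " * depth + f"{name} ({samples})")
```

Although SampleB.FunctionA was sampled first, it is laid out last among its siblings because the ordering is by name, not by time.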

Hovering the mouse over a routine shows its total samples and CPU usage. In the example below, the SampleA.functionC routine accounts for 858 samples, or 30.58% of CPU usage.

[Figure: CPU profile]

If a routine uses more CPU than expected, it should be inspected for possible optimizations.

Common sources of wasted CPU include unnecessary polling, too many layers of objects, redundant checks, and unnecessary serialization and deserialization of objects.

Flame graphs for large-scale applications, or for applications built on existing frameworks, can become very dense. In that case, it is more efficient to search for a specific routine within the graph than to comb through each call stack. For example, below is a flame graph for a sample Java reactor-based gRPC service.

[Figure: a flame graph for a sample Java reactor-based gRPC service]

It is easier to search for the routine to pinpoint its usage. In this case, the search is for the greet() routine.

[Figure: flame graph with the greet() routine searched]
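The same kind of search can be approximated offline. The sketch below assumes the profile is available in the collapsed-stack text format (frames separated by semicolons, followed by a sample count) and sums the samples of every stack that contains a matching frame. The function name `matched_share` and the frame names are hypothetical; they only stand in for a service with a greet() routine.

```python
import re


def matched_share(collapsed_lines, pattern):
    """Return (matched_samples, percent) for stacks containing a matching frame."""
    regex = re.compile(pattern)
    total = 0
    matched = 0
    for line in collapsed_lines:
        stack, _, count = line.strip().rpartition(" ")
        if not stack:
            continue  # skip blank or malformed lines
        samples = int(count)
        total += samples
        if any(regex.search(frame) for frame in stack.split(";")):
            matched += samples
    percent = 100.0 * matched / total if total else 0.0
    return matched, percent


# Hypothetical stacks, not taken from the article's profile.
profile = [
    "Thread.run;GreeterService.greet;Json.serialize 25",
    "Thread.run;GreeterService.greet 15",
    "Thread.run;EventLoop.poll 60",
]
hits, percent = matched_share(profile, r"\.greet$")
print(hits, percent)  # 40 40.0
```

Stacks that pass through the matching routine are counted once each, so the result covers the routine and everything it calls.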

Generation

There are many tools for generating flame graphs. The one used for this article is async-profiler. It is a low-overhead profiler that avoids the safepoint bias problem and is simple to download and use.

Below is the command to generate a flame graph for a 30-second sampling window using async-profiler. The -d option sets the profiling duration in seconds and -f sets the output file.

asprof -d 30 -f cpu_profile.html <process_id>


Conclusion

CPU profiling is an important way to identify CPU hot spots, and flame graphs are a good way to analyze them.

If done correctly, CPU profiling can yield a meaningful cost reduction for large-scale elastic applications. In my experience, I have been able to reduce the cost of large applications by 10–15% by identifying CPU hot spots through flame graphs.

Feel free to reach out if you have any questions or need guidance for your use case.


Published at DZone with permission of Parveen Saini. See the original article here.

Opinions expressed by DZone contributors are their own.
