Optimizing Robotics Application’s Performance

By Ram Lakshmanan · Mar. 14, 24 · Tutorial

In this post, we would like to share our real-world experience optimizing a Java application that controlled the robots in a warehouse. The application told the robots which actions to perform, and the robots carried out their jobs based on those instructions. Occasionally, the application slowed down and stopped issuing instructions. When robots didn’t receive instructions from the application, they started making autonomous decisions, which led to degraded behavior and, in turn, affected deliveries and shipments in the warehouse.

Long Garbage Collection Pause

A good way to start troubleshooting a Java application’s performance is to study its Garbage Collection behavior, especially when the application suffers from slowdowns. We took this application’s Garbage Collection log file and uploaded it to the GCeasy tool. (Note: The Garbage Collection log contains vital statistics that most monitoring tools do not report, and enabling it adds almost no overhead to your application. Thus, it’s a best practice to enable the Garbage Collection log on all your production instances.) The tool analyzed the Garbage Collection log and instantly generated this insightful GC log analysis report.
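As a complement to the GC log, the JVM also exposes coarse GC counters through the standard `java.lang.management` API. The sketch below is only an illustration (the class name `GcSummary` is made up for this example); it prints cumulative totals per collector, which is far less detail than the per-event data in a GC log, so it does not replace enabling the log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the application described in this post.
class GcSummary {

    /** Returns one line per collector: its name, collection count, and cumulative time. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is an approximate accumulated elapsed time in milliseconds.
            lines.add(String.format("%s: count=%d, totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        summarize().forEach(System.out::println);
    }
}
```

Because these counters are only totals, a single multi-minute pause cannot be told apart from many short ones here. That is the kind of detail the GC log preserves.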

The tool reported various interesting metrics and graphs. The Garbage Collection Pause time graph in the report is of most interest to our discussion. Below is that graph:

Fig 1: Garbage Collection Pause Duration Graph generated by GCeasy

Whenever a stop-the-world Garbage Collection event runs, it pauses the entire application. During that pause, no customer transactions are processed, and all in-flight transactions are halted. From the above graph, you can see that at 11:35 am, during peak traffic, a Garbage Collection event paused the entire application for 329 seconds (i.e., 5 minutes and 29 seconds). This means that for that entire 5+ minute window, none of the robots would have received instructions from this application. They would have made decisions autonomously, disrupting the business.
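To get a feel for how such a pause looks from inside the application, here is a minimal sketch of a stall detector. It is an assumption-laden illustration, not something the original application used: a thread sleeps for a short interval and measures how much longer than requested it was actually away. The class and method names are hypothetical, and an overshoot can also come from OS scheduling, not only from GC.

```java
// Hypothetical illustration; names are invented for this example.
class PauseDetector {

    /** Delay beyond the requested sleep, in milliseconds; never negative. */
    static long overshootMillis(long requestedMs, long elapsedNanos) {
        return Math.max(0L, elapsedNanos / 1_000_000L - requestedMs);
    }

    /** Sleeps repeatedly and returns the largest overshoot seen across all iterations. */
    static long worstOvershootMillis(int iterations, long sleepMs) {
        long worst = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
            worst = Math.max(worst, overshootMillis(sleepMs, System.nanoTime() - start));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("Worst stall: " + worstOvershootMillis(100, 10) + " ms");
    }
}
```

During a 329-second stop-the-world pause, a 10 ms sleep like this would not return until the pause ended, so the measured overshoot would be close to the full pause duration.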

What Is Causing the Long GC Pause?

Two primary factors caused such a long Garbage Collection pause:

  1. Large heap size: The application is configured to run with a 126GB heap. Typically, the larger the heap, the longer garbage collection takes, because many objects accumulate and it takes a long time to evict them from memory.
  2. CMS (Concurrent Mark Sweep) algorithm: The CMS GC algorithm runs well and is an apt fit for several applications. However, its major drawback is its occasional long GC pause. Most CMS GC pauses are in the acceptable range, but heap fragmentation occasionally triggers pauses that last for several seconds (sometimes even minutes), as in this situation.

Potential Solutions To Reduce Long GC Pauses

Here is a blog post that highlights potential solutions to reduce long Garbage Collection pauses. We considered a couple of those solutions to address this long GC pause.

1. Reducing Heap Size

Reducing the application’s heap size is a potential solution. However, the object creation rate of this monolithic application was very high, and reducing the heap size could hurt the application’s responsiveness. Heap size can be reduced only if the application’s memory consumption can be reduced, which warrants refactoring the application code. A re-architecture of this monolithic application was already underway: it was being broken down and rewritten as microservices with much smaller heaps. However, the re-architected application was slated to go live only 6 to 9 months later. Thus, the customer was hesitant to reduce the heap size until then.

2. Switching From CMS to G1 GC Algorithm

The other solution was to migrate away from the CMS GC algorithm. Irrespective of its GC performance, the CMS GC algorithm was deprecated in Java 9 and removed entirely in Java 14. If we move away from the CMS GC algorithm, what alternatives do we have? Below are the alternative GC algorithms available in OpenJDK:

  1. Serial GC
  2. Parallel GC
  3. G1 GC
  4. ZGC
  5. Shenandoah GC

The Serial GC algorithm is best suited to single-threaded, desktop-type applications with small heaps. Since this application has multiple concurrent threads with a very heavy object creation rate, we eliminated the Serial GC algorithm. Since this application was running on Java 8, we ruled out ZGC and Shenandoah GC: in mainline OpenJDK, both became production-ready only in Java 15, and Java 17 is the first LTS release that ships them as stable features. Thus, we were left with the choice of either Parallel GC or G1 GC.

We simulated production traffic volume in the performance lab and experimented with Parallel GC and G1 GC settings based on GC tuning best practices. We found that the Parallel GC pause time was not as bad as with CMS GC, but it was not better than with G1 GC. Thus, we ended up switching from the CMS GC algorithm to the G1 GC algorithm. Here is the GC log analysis report of this robotics application in the performance lab when using the G1 GC algorithm. Below is the GC pause duration graph when using the G1 GC algorithm:

Fig 2: G1 GC pause time graph generated by GCeasy

From the graph, you can see that the maximum GC pause time was 2.17 seconds. This is a phenomenal improvement over 5 minutes and 29 seconds. Also, the average GC pause time was only 198 ms, far better than with the CMS GC algorithm for this application.
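To put the two worst-case pauses side by side, the arithmetic can be sketched as follows. The class and method names are hypothetical; the only inputs are the 329-second and 2.17-second maximum pauses reported above.

```java
import java.util.Locale;

// Hypothetical helper for comparing two worst-case pause times.
class PauseImprovement {

    /** How many times shorter the new pause is compared with the old one. */
    static double speedupFactor(double beforeSeconds, double afterSeconds) {
        if (beforeSeconds <= 0 || afterSeconds <= 0) {
            throw new IllegalArgumentException("pause times must be positive");
        }
        return beforeSeconds / afterSeconds;
    }

    /** Percentage by which the pause shrank, relative to the old pause. */
    static double reductionPercent(double beforeSeconds, double afterSeconds) {
        if (beforeSeconds <= 0 || afterSeconds <= 0) {
            throw new IllegalArgumentException("pause times must be positive");
        }
        return (1.0 - afterSeconds / beforeSeconds) * 100.0;
    }

    public static void main(String[] args) {
        double cmsMaxPause = 329.0; // seconds, worst pause observed with CMS
        double g1MaxPause = 2.17;   // seconds, worst pause observed with G1 in the lab
        System.out.println(String.format(Locale.ROOT, "Factor: %.1fx, reduction: %.2f%%",
                speedupFactor(cmsMaxPause, g1MaxPause),
                reductionPercent(cmsMaxPause, g1MaxPause)));
    }
}
```

With these inputs, the worst-case pause comes out roughly 150 times shorter, a reduction of more than 99%.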

Conclusion

After switching to the G1 GC algorithm, the application’s random slowdowns completely stopped. Thus, without major architectural changes, without code refactoring, without any JDK/infrastructural upgrades, just by tweaking the GC arguments in the JVM, we were able to bring this significant optimization to this robotics application’s performance.
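This post does not list the exact JVM arguments that were changed. On HotSpot, `-XX:+UseG1GC` is the standard switch that selects G1, and `-XX:+UseConcMarkSweepGC` is the one that selected CMS. As a minimal sketch, assuming you want to confirm at runtime which collector flag a JVM was started with, the check below reads the JVM's input arguments through the standard `RuntimeMXBean`. The class name and the `"default"` marker are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; it reports only flags passed explicitly on the command line.
class GcFlagCheck {

    /** Returns the first -XX:+Use...GC flag found, or "default" when none was passed. */
    static String explicitCollectorFlag(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            // Matches switches such as -XX:+UseG1GC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC.
            if (arg.startsWith("-XX:+Use") && arg.endsWith("GC")) {
                return arg;
            }
        }
        return "default";
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Collector flag: " + explicitCollectorFlag(jvmArgs));
    }
}
```

Running it with `java -XX:+UseG1GC GcFlagCheck.java` on Java 11 or later (or compiling first on Java 8) should report the G1 switch; without any collector flag it reports `default`, meaning the JVM picked its own default collector.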

Published at DZone with permission of Ram Lakshmanan. See the original article here.
