DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Charge Vertical Scaling With the Latest Java GCs
  • Understanding Root Causes of Out of Memory (OOM) Issues in Java Containers
  • Java Thread Dump Analysis
  • Building Your Own Automatic Garbage Collector: A Guide for Developers

Trending

  • The Future of Java and AI: Coding in 2025
  • Monolith: The Good, The Bad and The Ugly
  • How To Introduce a New API Quickly Using Quarkus and ChatGPT
  • How to Create a Successful API Ecosystem
  1. DZone
  2. Coding
  3. Java
  4. Degradation in String Deduplication Performance in Recent Java Versions

Degradation in String Deduplication Performance in Recent Java Versions

String deduplication is an important feature aimed at optimizing memory usage by eliminating duplicate strings from heap memory.

By 
Ram Lakshmanan user avatar
Ram Lakshmanan
DZone Core CORE ·
May. 07, 24 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
2.4K Views

Join the DZone community and get the full member experience.

Join For Free

String deduplication is an important feature aimed at optimizing memory usage by eliminating duplicate strings from heap memory. However, recent observations suggest a concerning trend – a degradation in string deduplication performance across newer Java versions. Thus, we embarked on a comparative analysis to assess the string deduplication performance behavior in Java versions 11, 17, and 21. This post intends to share our observations and insights gleaned from this analysis.

WebCrawler Spring Boot Application

In order to experiment with string deduplication, we used an open-source Web Crawler application developed in Spring Boot. The web crawler is a REST-based application, which will crawl any given website and archive that site’s information into the H2 Database. When an HTTP POST request is made, WebCrawler will start the crawling job on the site and return a jobId. This jobId can be used later to query the status of the crawling task.

To crawl the Wikipedia website, you need to pass the seed with depth. The depth will decide how deep it should crawl the websites and extract the information. For example, this URL will crawl the Wikipedia site with a depth of 100 in the background.

Once you make a POST request, it will return the jobId as a response and the crawling will start in the background. Once the crawling task reaches the depth level, automatically will stop that particular crawling task.

We are going to load test this web crawler with a POST request in Java 11, 17, and 21 and study the String deduplication performance.

Enabling String Deduplication

To study the behavior of String Deduplication, it must first be enabled in the JVM. This can be achieved by passing the following JVM argument:

-XX:+UseStringDeduplication


String Deduplication events performance characteristics (like how many strings examined, how many of them were deduplicated, how long it took to complete, …) can be printed into the log file by passing the following JVM arguments to the application:

-Xlog:stringdedup*=debug:file=string-dup-logfile.log  -Xloggc:string-dup-logfile.log


Based on the above JVM arguments, we are directing both string deduplication debug events and the GC events log messages into one single ‘string-dup-logfile.log’ file

For detailed information about the JVM arguments, refer to this article for String deduplication.

JMeter Load Test

We conducted load testing on the WebCrawler application using JMeter, simulating a load of 50 users submitting crawling tasks with identical seed URLs and depths for approximately 1 hour. We are submitting the same URL so that a lot of duplicate strings will be simulated in the application. 

Note this test was conducted on my local laptop, whose configuration is:

Operating System: Windows 11
System type: 64-bit operating system, x64-based processor
RAM: 12 GB 
Processor: Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz   1.19 GHz
Java Heap Size (i.e. -Xm512mb)

The image below shows the JMeter configuration in the above case:

POST method configuration in JMeter
Fig: POST method configuration in JMeter

Garbage Collection Analysis Study

After completing the tests, we uploaded the generated Garbage Collection log file to the online GCeasy tool for analysis. The tool promptly generated reports showcasing String Deduplication performance metrics for each Java version tested. Here are the reports generated by the tool:

  • Java 11 GC report  
  • Java 17 GC report  
  • Java 21 GC report   

Comparison Study of String Deduplication Key Metrics Across Java Versions

The table below summarizes String deduplication key metrics across the Java versions:


 Java 11  Java 17  Java 21
Inspected Strings 1,540,837 997,693 650,245
Deduplicated Strings 1,342,865 505,376 299,261
Deduplicated Size 102.4 mb 39.57 mb 15.81 mb
Deduplicated Percentage 34.3 % 9.3 % 3.4 %
Deduplication time 1,264.442 ms 2,323.492 ms 3,439.466 ms

The analytics reveal that Java 11 exhibits the best String Deduplication performance, eliminating 34.3% of strings. Conversely, Java 17 and 21 only managed to eliminate 9.3% and 3.4%, respectively. Moreover, the time taken to deduplicate strings increased in modern Java versions, with Java 11 completing the process in only 1,264.442 ms, compared to 2,323.492 ms for Java 17 and 3,439.466 ms for Java 21.

In essence, modern JVMs are spending more time inspecting a higher number of strings while eliminating fewer duplicates from memory. This underscores a clear degradation in JVM performance.

Conclusion

The investigation into string deduplication performance across Java versions 11, 17, and 21 yields significant insights into the evolution of JVM behavior. While string deduplication serves as a vital mechanism for optimizing memory utilization, our analysis reveals a concerning trend of degradation in performance across newer Java releases. Java 11 emerges as the standout performer, efficiently eliminating a substantial portion of duplicate strings within a shorter time frame. In contrast, Java 17 and 21 exhibit diminished effectiveness, both in terms of the percentage of strings deduplicated and the time taken to execute the process.

Java virtual machine garbage collection Java (programming language) Strings Performance

Published at DZone with permission of Ram Lakshmanan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Charge Vertical Scaling With the Latest Java GCs
  • Understanding Root Causes of Out of Memory (OOM) Issues in Java Containers
  • Java Thread Dump Analysis
  • Building Your Own Automatic Garbage Collector: A Guide for Developers

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!