DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Floyd's Cycle Algorithm for Fraud Detection in Java Systems
  • Java Stream API: 3 Things Every Developer Should Know About
  • Merge Multiple PDFs in MuleSoft
  • Scaling Java Microservices to Extreme Performance Using NCache

Trending

  • A Developer's Guide to Mastering Agentic AI: From Theory to Practice
  • Top Book Picks for Site Reliability Engineers
  • Breaking Bottlenecks: Applying the Theory of Constraints to Software Development
  • Unlocking the Benefits of a Private API in AWS API Gateway
  1. DZone
  2. Data Engineering
  3. Data
  4. Performance Evaluation of Java ArrayLists

Performance Evaluation of Java ArrayLists

Let's do a performance breakdown of the ArrayList add operation to see how to get the most throughout out of your ArrayLists.

By 
Viraj Salaka user avatar
Viraj Salaka
·
Malith Jayasinghe user avatar
Malith Jayasinghe
·
Isuru Perera user avatar
Isuru Perera
·
Srinath Perera user avatar
Srinath Perera
·
Oct. 18, 17 · Analysis
Likes (12)
Comment
Save
Tweet
Share
28.2K Views

Join the DZone community and get the full member experience.

Join For Free

In this article, we will look at the performance characteristics of the Java ArrayList (java.util.ArrayList) add operation.

An ArrayList has an initial capacity, which is simply the size of the array used to store the elements in the list. When you create an ArrayList you can specify the initial capacity. For example:

ArrayList<Integer> arrayList = new ArrayList<>(100);


In this case, the initial size of the ArrayList will be 100. As you add elements to an ArrayList, its capacity grows automatically. The growing factor is 1.5. If we do not specify an initial capacity then an ArrayList object will be created containing an initial array of size ten.

ArrayList<Integer> arrayList = new ArrayList<>();


Performance Evaluation

In this section, we will provide details of how we conducted our experiments. The performance tests are done using JMH (Java Microbenchmark Hardness). It is a tool that can be used to implement benchmarks correctly for the applications run top of the JVM.  JVM's HotSpot VM has the ability to do optimizations like dead code elimination. 

The two main performance metrics considered are 

  1. Average latency (measured in nanoseconds)

  2. Throughput (measured in number of operations per millisecond)

The scenario implemented is sequentially adding elements to an ArrayList using the add operation. We run the benchmark by varying the following parameters:

  1. Initial capacity (size) of the ArrayList

  2. Final size of the ArrayList (the maximum number of elements used)

The part of the benchmark code is shown below:

…………….

@Benchmark

public void test_ArrListAdd(Blackhole bh) {

        ArrayList < Integer > m;

        m = new ArrayList < > (initialCapacity);

        for (int j = 0; j < finalSize; j++) {

            m.add(j, 0);

        }

        bh.consume(m);

        count += 1;

    }

    ………….


The following are the parameters specified for JMH, which are fixed for all runs.

  1. Number of warmup iterations: 10

  2. Numbers of iterations: 10

  3. Number of forks: 2

  4. Number of threads: 1

  5. Time unit ns (nanoseconds)

  6. Mode: benchmark mode

The performance test was run on a 4-core machine with 8 GB RAM and JDK used was Oracle 1.8_131

Performance Results

This section presents the performance results. These are the results obtained:

Final size

Initial Size

Average Latency (ns)

Throughput (operations/ms)





9

10

79.53

13045.111


100

92.945

10471.726


1000

345.138

2957.518


10000

2066.279

495.057





99

10

675.109

1506.031


100

440.578

2244.508


1000

669.46

1378.31


10000

2542.431

426.257





999

10

5797.814

171.068


100

5583.3

170.748


1000

4145.378

232.636


10000

5995.691

171.921





9999

10

54751.808

18.212


100

53713.615

17.492


1000

50556.323

19.876


10000

41174.069

24.14


Let's now make a few plots so that we can understand the behavior. The following graph shows how the average latency varies with the initial array size when the final size is 99.

The following graph shows how the throughput varies with the initial array size when the final size is 99.

The following graph shows how the average latency varies with the initial array size when the final size is 999.

The following graph shows how the throughput varies with the initial array size when the final size is 999.

The following graph shows how the average latency varies with the initial array size when the final size is 9999.

The following graph shows how the throughput varies with the initial array size when the used capacity is 9999.

Discussion

Initial array size > Final size

If the initial array size is greater than the final size, there can be a significant degradation in performance.

For example, when we create an array with an initial capacity of 1000 and use only the first 10 elements, we observed an average latency of 345 ns. On the other hand, when we create an array with an initial capacity of 10 and used the first 10 elements, then we get an average latency of 57 ns.

The main reason for the degradation in performance for the larger initial array size case is the cost of initializing the large array itself. When we profile applications, we noticed array initialization taking a significant amount of processing time.

Initial Array Size < Final Size

If the initial size is smaller than the final size, then there can also be a significant degradation in performance.

For example, when we create an array with an initial capacity of 10 and use the first 1000 elements, then we observed an average latency of 5797 ns. On the other hand, when we create an array with an initial capacity of 1000 and use 1000 elements, then we get an average latency of 4145 ns.

The main reason behind the degradation in performance for the smaller initial array size case is the additional processing needed for copying existing elements into a new array with a large size together. When we profile benchmark code, we noticed array copy operation taking a significant amount of processing time. In addition to this, there may be degradation due to the creation of arrays multiple times and initializing them.

Over Specifying vs. Under Specifying Initial Array Size

We note that the degradation in performance, as a result of over specifying the initial size, is less compared to that of under specifying. Let's now discuss this behavior in a bit more detail. Let's assume that the final size of the array list is 1000. The following figure plots the variation in the throughput with increasing ArrayList size.

imageedit_12_2540778946.png


We note that the maximum throughput is achieved when the initial array list size is 1000 (i.e. when initial size = final size). Let us now assume that TPS_1 and TPS_2 represent the throughput if the initial array size is 500 and 1500 respectively (refer to the figure above). We note that TPS_2 is significantly higher than TPS_1, which validates our claim regarding the performance differences in over specifying vs. under specifying the initial size.

Conclusion

In this article, we have done an in-depth performance analysis of the Java ArrayList add operation.

It is clear from the from the results (considering the add operation of ArrayList) that if the required maximum capacity of the ArrayList is known, we can get the optimal performance (both average latency and throughput) by specifying the initial capacity to the required capacity.

In doing so, we can get 24% to 34% improvement in average latency and 30% to 50% improvement in throughput. In our future work, we hope to consider benchmarking other operations such as insert, remove, and get in addition to the add operation. In our current tests, although we vary the final size between different tests, for a given test scenario final size is fixed. We hope to extend our work to cater for scenarios where the final size varies. In such cases, the final size can come from probability distribution, which represents (actual) access pattern of users.



If you enjoyed this article and want to learn more about Java Collections, check out this collection of tutorials and articles on all things Java Collections.

Data structure Java (programming language) Evaluation

Opinions expressed by DZone contributors are their own.

Related

  • Floyd's Cycle Algorithm for Fraud Detection in Java Systems
  • Java Stream API: 3 Things Every Developer Should Know About
  • Merge Multiple PDFs in MuleSoft
  • Scaling Java Microservices to Extreme Performance Using NCache

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!