
Iteration Over Java Collections With High Performance

Learn more about the forEach loop in Java and how it compares to C style and Stream API in this article on dealing with collections in Java.

by Dang Ngoc Vu · Jul. 13, 18 · Analysis

Introduction

Java developers usually deal with collections such as ArrayList and HashSet. Java 8 came with lambdas and the Stream API, which help us work with collections easily. In most cases, we work with a few thousand items and performance isn't a concern. But in some extreme situations, when we have to iterate over a few million items several times, performance becomes a pain.

I use JMH (the Java Microbenchmark Harness) to measure the running time of each code snippet.

forEach vs. C Style vs. Stream API

Iteration is a basic feature: every programming language has simple syntax for running through a collection. The Stream API can iterate over a collection in a very straightforward manner:

public List<Integer> streamSingleThread(BenchMarkState state){
  List<Integer> result = new ArrayList<>(state.testData.size());
  state.testData.stream().forEach(item -> {
    result.add(item);
  });
  return result;
}

public List<Integer> streamMultiThread(BenchMarkState state){
  List<Integer> result = new ArrayList<>(state.testData.size());
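  // Caution: ArrayList is not thread-safe, so concurrent add() calls
  // from a parallel stream can lose or corrupt entries
  state.testData.stream().parallel().forEach(item -> {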
    result.add(item);
  });
  return result;
}
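
As a side note, adding to a shared ArrayList from a parallel stream is not thread-safe. The following is a minimal sketch of one safer alternative, assuming the goal is simply to copy the items: it lets collect() build and merge per-thread partial lists. The class and method names are hypothetical, and this variant is not one of the benchmarked methods, so the timings in this article do not apply to it.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ParallelCollectSketch {

    // Copies the input with a parallel stream. collect() accumulates into
    // separate partial lists and merges them, so no shared list is mutated.
    static List<Integer> copyInParallel(List<Integer> source) {
        return source.parallelStream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> copy = copyInParallel(source);
        // For an ordered source such as a List, encounter order is preserved.
        System.out.println(copy.equals(source));
    }
}
```

Because the source is an ordered List, the copy should contain the same items in the same order as the input.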


The forEach loop is just as simple:

public List<Integer> forEach(BenchMarkState state){
  List<Integer> result = new ArrayList<>(state.testData.size());
  for(Integer item : state.testData){
    result.add(item);
  }
  return result;
}


C style is more verbose, but still very compact:

public List<Integer> forCStyle(BenchMarkState state){
  int size = state.testData.size();
  List<Integer> result = new ArrayList<>(size);
  for(int j = 0; j < size; j ++){
    result.add(state.testData.get(j));
  }
  return result;
}


Then, the performance:

Benchmark                               Mode  Cnt   Score   Error  Units
TestLoopPerformance.forCStyle           avgt  200  18.068 ± 0.074  ms/op
TestLoopPerformance.forEach             avgt  200  30.566 ± 0.165  ms/op
TestLoopPerformance.streamMultiThread   avgt  200  79.433 ± 0.747  ms/op
TestLoopPerformance.streamSingleThread  avgt  200  37.779 ± 0.485  ms/op


With the C style loop, the JVM simply increments an integer and reads the item at that index directly. This makes it very fast. The forEach loop is different: according to an answer on Stack Overflow and the documentation from Oracle, the compiler translates forEach into an Iterator, and the loop calls hasNext() and next() for every item. This is why forEach is slower than the C style loop in this benchmark.
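
To make the translation concrete, here is a small sketch of the two equivalent forms. The class and method names are hypothetical; the second method shows roughly what the compiler produces for the first when the loop runs over an Iterable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ForEachDesugarSketch {

    // The forEach (enhanced for) loop as written in source code.
    static List<Integer> withForEach(List<Integer> data) {
        List<Integer> result = new ArrayList<>(data.size());
        for (Integer item : data) {
            result.add(item);
        }
        return result;
    }

    // Roughly the form the compiler generates for the loop above:
    // one hasNext() call and one next() call per item.
    static List<Integer> withExplicitIterator(List<Integer> data) {
        List<Integer> result = new ArrayList<>(data.size());
        for (Iterator<Integer> it = data.iterator(); it.hasNext(); ) {
            Integer item = it.next();
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3);
        // Both forms visit the same items in the same order.
        System.out.println(withForEach(data).equals(withExplicitIterator(data)));
    }
}
```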

What Is the Fastest Way to Iterate Over a Set?

We define test data:

    @State(Scope.Benchmark)
    public static class BenchMarkState {
        @Setup(Level.Trial)
        public void doSetup() {
            for(int i = 0; i < 500000; i++){
                testData.add(Integer.valueOf(i));
            }
        }
        @TearDown(Level.Trial)
        public void doTearDown() {
            testData = new HashSet<>(500000);
        }
        public Set<Integer> testData = new HashSet<>(500000);
    }


The Java Set also supports the Stream API and the forEach loop. Given the previous test, if we first copy the Set into an array and then iterate over that copy with a C style loop, might the performance improve?

public List<Integer> forCStyle(BenchMarkState state){
  int size = state.testData.size();
  List<Integer> result = new ArrayList<>(size);
  // Copy the Set into an array so the items can be read by index
  Integer[] temp = state.testData.toArray(new Integer[size]);
  for(int j = 0; j < size; j++){
    result.add(temp[j]);
  }
  return result;
}


How about a combination of the iterator with the C style for loop?

public List<Integer> forCStyleWithIteration(BenchMarkState state){
  int size = state.testData.size();
  List<Integer> result = new ArrayList<>(size);
  Iterator<Integer> iteration = state.testData.iterator();
  for(int j = 0; j < size; j++){
    result.add(iteration.next());
  }
  return result;
}


Or, what about a simple forEach loop?

public List<Integer> forEach(BenchMarkState state){
  List<Integer> result = new ArrayList<>(state.testData.size());
  for(Integer item : state.testData){
    result.add(item);
  }
  return result;
}


Converting the Set first is a nice idea, but it doesn't pay off, because creating and filling the temporary copy also consumes resources.

Benchmark                                   Mode  Cnt  Score   Error  Units
TestLoopPerformance.forCStyle               avgt  200  6.013 ± 0.108  ms/op
TestLoopPerformance.forCStyleWithIteration  avgt  200  4.281 ± 0.049  ms/op
TestLoopPerformance.forEach                 avgt  200  4.498 ± 0.026  ms/op

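HashMap (HashSet is backed by a HashMap<E,Object>) isn't designed for iterating over all of its items. In this benchmark, the fastest way to iterate over the HashSet is to combine an Iterator with the C style for loop, because the loop doesn't have to call hasNext() for every item. The margin over the plain forEach loop is small, though: 4.281 vs. 4.498 ms/op.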
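
The counted Iterator loop skips hasNext(), so it relies on the Set not changing between the size() call and the end of the loop. The sketch below is a sanity check under that assumption, not a benchmark: it confirms that the counted loop and the forEach loop collect the same items. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class SetIterationCheck {

    // Counted loop: calls next() exactly size() times and never hasNext().
    // Only safe while the Set is not modified during the loop.
    static List<Integer> countedIterator(Set<Integer> data) {
        int size = data.size();
        List<Integer> result = new ArrayList<>(size);
        Iterator<Integer> it = data.iterator();
        for (int j = 0; j < size; j++) {
            result.add(it.next());
        }
        return result;
    }

    // Plain forEach loop over the same Set.
    static List<Integer> forEach(Set<Integer> data) {
        List<Integer> result = new ArrayList<>(data.size());
        for (Integer item : data) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> data = new HashSet<>();
        for (int i = 0; i < 1_000; i++) {
            data.add(i);
        }
        // Both loops walk the same unmodified Set, so the results match.
        System.out.println(countedIterator(data).equals(forEach(data)));
    }
}
```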

Conclusion

The forEach loop and the Stream API are convenient ways to work with collections, and they let you write code faster. But when your system is stable and performance is a major concern, you should think about rewriting your loops.




Published at DZone with permission of Dang Ngoc Vu. See the original article here.

Opinions expressed by DZone contributors are their own.
