Should I Parallelize Java 8 Streams?

What do we need to consider before parallelizing Java streams?

By Santhosh Krishnan · Oct. 08, 19 · Analysis
In Java 8, the Streams API makes it easy to iterate over collections, and parallelizing a stream is as simple as calling the parallelStream() method. But should we use parallelStream() wherever we can? What should we consider first?

The following ParallelStreamTester class generates collections of different sizes so that we can test parallel stream performance against a sequential stream.

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ParallelStreamTester {
    // Number of Person objects to generate; varied between test runs.
    static int COLLECTION_SIZE = 100000;

    // Builds a collection of COLLECTION_SIZE randomly assembled Person objects.
    private static Collection<Person> getPersonCollection() {
        List<Person> personList = new ArrayList<>();

        String[] names = {"David", "Marry", "Satya", "Matt", "Patrick", "Bill", "Mike", "Jake", "Amber", "Dianne"};
        int[] age = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        String[] states = {"NY", "MA", "MO", "CA", "TX", "MN", "WA", "PE", "NE", "NH", "OH"};

        // getRandom() and Person are not shown here; getRandom() is used as an array index.
        for (int i = 0; i < COLLECTION_SIZE; i++) {
            personList.add(new Person(names[getRandom()], age[getRandom()], states[getRandom()]));
        }

        System.out.println("Generated the collection \n");
        return personList;
    }

    // more code
}


Now, consider the following code snippet, which tests the performance of the sequential stream. It counts all of the Persons who are older than 50, from "NY" or "TX", with names that start with "M".

private static void sequentialStreamPerformance(Collection<Person> persons) {
    long t1 = System.currentTimeMillis();

    // Count the people from NY or TX who are older than 50 and whose name starts with "M".
    long count = persons.stream()
        .filter(x -> x.getState().equals("NY") || x.getState().equals("TX"))
        .filter(x -> x.getAge() > 50)
        .filter(x -> x.getName().startsWith("M"))
        .count();

    long t2 = System.currentTimeMillis();
    System.out.println("Count = " + count + " Normal Stream Takes " + (t2 - t1) + " ms\n");
}


And for parallel stream performance:

private static void parallelStreamPerformance(Collection<Person> persons) {
    long t1 = System.currentTimeMillis();

    // Same filters as the sequential version; only the stream source differs.
    long count = persons.parallelStream()
        .filter(x -> x.getState().equals("NY") || x.getState().equals("TX"))
        .filter(x -> x.getAge() > 50)
        .filter(x -> x.getName().startsWith("M"))
        .count();

    long t2 = System.currentTimeMillis();
    System.out.println("Count = " + count + " Parallel Stream takes " + (t2 - t1) + " ms\n");
}


Now, let's run some tests by varying the value of COLLECTION_SIZE. Start with a value of 100 and increase it step by step up to 10,000,000, noting the time taken at each size. Here is my observed result:

Figure: Regular vs. parallel streams, time taken

  • Sequential streams outperformed parallel streams when the collection held fewer than 100,000 elements.
  • Parallel streams performed significantly better than sequential streams when the collection held more than 100,000 elements.
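
A size sweep like the one described above could be sketched as follows. This is a minimal illustration, not the author's test harness: the StreamSizeSweep class, its nested Person class, the fixed seed, and the list of sizes are all assumptions made for the example, and it times each run with System.nanoTime() rather than System.currentTimeMillis().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

final class StreamSizeSweep {
    // Hypothetical minimal Person; the article's own class is not shown.
    static final class Person {
        final String name;
        final int age;
        final String state;

        Person(String name, int age, String state) {
            this.name = name;
            this.age = age;
            this.state = state;
        }
    }

    static final String[] NAMES = {"David", "Marry", "Satya", "Matt", "Patrick", "Bill", "Mike", "Jake", "Amber", "Dianne"};
    static final int[] AGES = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
    static final String[] STATES = {"NY", "MA", "MO", "CA", "TX", "MN", "WA", "PE", "NE", "NH", "OH"};

    // Same condition as the article's three chained filters, combined into one predicate.
    static final Predicate<Person> MATCHES = p ->
        (p.state.equals("NY") || p.state.equals("TX")) && p.age > 50 && p.name.startsWith("M");

    // Builds a reproducible random list; the seed is an assumption of this sketch.
    static List<Person> generate(int size, long seed) {
        Random random = new Random(seed);
        List<Person> people = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            people.add(new Person(
                NAMES[random.nextInt(NAMES.length)],
                AGES[random.nextInt(AGES.length)],
                STATES[random.nextInt(STATES.length)]));
        }
        return people;
    }

    static long countSequential(List<Person> people) {
        return people.stream().filter(MATCHES).count();
    }

    static long countParallel(List<Person> people) {
        return people.parallelStream().filter(MATCHES).count();
    }

    public static void main(String[] args) {
        int[] sizes = {100, 1_000, 10_000, 100_000, 1_000_000};
        for (int size : sizes) {
            List<Person> people = generate(size, 42L);
            long t0 = System.nanoTime();
            long sequential = countSequential(people);
            long t1 = System.nanoTime();
            long parallel = countParallel(people);
            long t2 = System.nanoTime();
            System.out.printf("size=%d counts=%d/%d sequential=%d us parallel=%d us%n",
                size, sequential, parallel, (t1 - t0) / 1_000, (t2 - t1) / 1_000);
        }
    }
}
```

Single-shot timings like these are noisy because of JIT warm-up and garbage collection, so the crossover point will differ between machines and runs. A benchmarking harness such as JMH would likely give more trustworthy numbers.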

What about synchronization problems when using parallel streams?

If the predicates or functions used in the stream pipeline touch a shared resource, we need to make sure that access to it is controlled and thread-safe.
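
As a rough illustration of that point, the sketch below shows two ways to keep a parallel pipeline thread-safe. The class and method names (SharedStateInParallelStreams, evensViaCollect, countEvensWithAtomic) are hypothetical and chosen only for this example. The first method avoids shared state altogether by letting collect() build the result; the second keeps a shared counter but makes it an AtomicLong.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SharedStateInParallelStreams {
    // Unsafe pattern, shown only as a comment:
    //   List<Integer> evens = new ArrayList<>();
    //   IntStream.range(0, n).parallel().filter(i -> i % 2 == 0).forEach(evens::add);
    // ArrayList is not thread-safe, so concurrent add() calls can lose elements
    // or throw an exception.

    // Option 1: no shared mutable state; collect() combines per-thread results safely.
    static List<Integer> evensViaCollect(int n) {
        return IntStream.range(0, n)
            .parallel()
            .filter(i -> i % 2 == 0)
            .boxed()
            .collect(Collectors.toList());
    }

    // Option 2: shared state guarded by an atomic type.
    static long countEvensWithAtomic(int n) {
        AtomicLong counter = new AtomicLong();
        IntStream.range(0, n)
            .parallel()
            .filter(i -> i % 2 == 0)
            .forEach(i -> counter.incrementAndGet());
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(evensViaCollect(10));          // prints [0, 2, 4, 6, 8]
        System.out.println(countEvensWithAtomic(100_000)); // prints 50000
    }
}
```

Where it is possible, the first option is usually the simpler choice, since the stream library handles the merging and nothing is shared between threads.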

A parallel stream carries a much higher overhead than a sequential stream: the work has to be split across threads and the partial results combined, and that coordination takes time. Sequential streams are therefore the sensible default, with parallel streams reserved for cases where there is a measured performance problem to address.
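
One way to encode "sequential by default" is a small helper that only switches to a parallel stream above a size cut-off. The sketch below is an assumption-laden illustration: StreamChooser, streamFor, and PARALLEL_THRESHOLD are hypothetical names, and the 100,000 value simply mirrors the observation above rather than a general rule.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

final class StreamChooser {
    // Assumed cut-off taken from the results above; measure your own workload.
    static final int PARALLEL_THRESHOLD = 100_000;

    // Returns a sequential stream unless the source is larger than the threshold.
    static <T> Stream<T> streamFor(Collection<T> source) {
        return source.size() > PARALLEL_THRESHOLD
            ? source.parallelStream()
            : source.stream();
    }

    public static void main(String[] args) {
        List<Integer> small = new ArrayList<>(Collections.nCopies(1_000, 1));
        List<Integer> large = new ArrayList<>(Collections.nCopies(200_000, 1));

        System.out.println(streamFor(small).isParallel()); // prints false
        System.out.println(streamFor(large).isParallel()); // prints true
    }
}
```

Element count is only one factor. The cost of the work done per element and the type of the source collection also affect whether parallelism pays off, so a fixed threshold like this is a starting point at best.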

The code used in this POC can be found on GitHub.

Further Reading

Think Twice Before Using Java 8 Parallel Streams

What's Wrong in Java 8, Part III: Streams and Parallel Streams


