DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • DataWeave Interview Question: Concatenate Elements of an Array
  • ConcurrentHashMap: Call Only One Method Per Key
  • Floyd's Cycle Algorithm for Fraud Detection in Java Systems
  • Merge Multiple PDFs in MuleSoft

Trending

  • Revolutionizing Financial Monitoring: Building a Team Dashboard With OpenObserve
  • Integrating Model Context Protocol (MCP) With Microsoft Copilot Studio AI Agents
  • How Trustworthy Is Big Data?
  • Unlocking the Potential of Apache Iceberg: A Comprehensive Analysis
  1. DZone
  2. Data Engineering
  3. Data
  4. The Way of the Lambda and removeIf()

The Way of the Lambda and removeIf()

Did you know that removing elements from large collections can be really slow and error prone if you do it the wrong way? Here's one way to do it right.

By 
Dan Lawesson user avatar
Dan Lawesson
·
Nov. 15, 16 · Tutorial
Likes (38)
Comment
Save
Tweet
Share
44.0K Views

Join the DZone community and get the full member experience.

Join For Free

The Way of the Lambda: Elegant and Efficient

In Java 8, you can do:

items.removeIf(i -> predicate(i));

Where you used to do:

for (Iterator it = items.iterator(); it.hasNext();) {
  if (predicate(it.next())) {
    it.remove(); 
  } 
}

Which is not only considerably more elegant but also a more efficient solution in some cases. I have the graphs to prove it!

Background

Recently having spent a few years away from Java development, it is great to be back and find that Java 8 indeed has delivered on what was promised in 2013. With lambdas and streams, it is possible to tackle problems in a functional manner and write code that is both readable and efficient. This over due modernization of the language is by far the most important change in Java 8, but there are also other improvements that deserve mentioning. This post elaborates on the topic of Collection.removeIf() which allows for conditional removal of elements of a collection.

The Way of the Iterator

Removing elements of a collection while iterating over it is somewhat like cutting off the branch on which you are sitting, but by using Iterators that can be done without a runtime ConcurrentModificationException. Consider the following typical simplistic example, where the elements of the collection items for which predicate() holds true are removed.

for (Iterator it = items.iterator(); it.hasNext();) { 
  if (predicate(it.next())) { 
    it.remove(); 
  } 
}

If the collection in question is a linked list of size n and m elements are removed, the full iteration with removals is O(n + m) = O(n) since iteration steps and removal from the list are constant time operations and m is limited by n. If, however, the data structure of the collection is one where removal is relatively expensive compared to iteration, the complexity becomes less favorable. For an ArrayList a single removal step is O(n) since all items at higher indices have to be moved. There are still n iteration steps to be made so the full sequence of removals becomes O(m*n) which is O(n^2) assuming worst case with the number of removals being a constant fraction of the array size.

In more pragmatic terms, what happens in the ArrayList case is that for each element that is removed, all elements of higher indices are moved to fill the gap in the backing array. This is clearly suboptimal since that operation is repeated for each removal. Thus, to reach its final destination an element will visit all locations between the starting position and the destination. In case many elements are removed, this may be a considerable amount of work in vain.

The Java 8 Way of the Lambda

In Java 8, the code snippet from above can be expressed as follows.

items.removeIf(i -> predicate(i));

Clearly more concise, this expression is also more efficient on an ArrayList compared to the Iterator approach. Where the Iterator requires completion of each removal operation before continuing to the next one, removeIf operates on a predicate and can therefore iterate over all items once to determine which elements are to be removed and then shift the remaining elements to the correct new indices all in a second pass. Therefore, removeIf is O(n) since it performs two linear passes through the collection. Each retained element in the resulting collection is moved just once, while for the Iterator approach it would be moved one step at a time towards its destination. See this gist for the relevant JDK code.

The Difference Visualized

We compare execution times for the two approaches as a function of the array size for a rather extreme case where 999 out of 1000 items are removed from an ArrayList. For each data point 40 runs are made, the 5 fastest and slowest outliers are removed and the sum of the remaining 30 form the execution time value. For details please refer to this gist.

We expect the execution time to be considerably shorter for the Java 8 way of the lambda and as a matter of fact that is what we find. We also note that within the resolution of the graph, the lines are quite noise free due to the many repeats and filtering of outlier results.


For array sizes of 150,000 elements, the removeIf() approach is more than 3000 times faster. Even more importantly, the factor between the two increases as the size of the array increases which is expected since the way of the lambda has a superior asymptotical time complexity.

To better compare the time complexity of the two, we scale up the green line to match the green one. By multiplying the execution times of the faster algorithm by ~1500, the graphs can be compared in terms of shape. As expected, we find linear behavior for the Java 8 removeIf() design, while the old ways show a quadratic time complexity due to the inefficient removal of elements.

When scaling up the measurements we get to see some noise in the green line. The sensitivity to random fluctuations in execution time is greater since the execution times are several orders of magnitude shorter compared to the blue. Both lines are fitted to polynomials of power two and as can be clearly seen, the light green line is very similar to a straight line while the blue line clearly aims for the sky in a super linear fashion.

Edit: There is a reddit thread about this blog post. Good points are being made there, for example the fact that the removeIf implementation is atomic (in the database sense) as it first evaluates the predicate for all items before making any alterations to the list. Thus, if the predicate throws an exception for any of the items in the list, the list will remain unchanged.

Data structure Element Java (programming language)

Published at DZone with permission of Dan Lawesson, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • DataWeave Interview Question: Concatenate Elements of an Array
  • ConcurrentHashMap: Call Only One Method Per Key
  • Floyd's Cycle Algorithm for Fraud Detection in Java Systems
  • Merge Multiple PDFs in MuleSoft

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!