DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
  1. DZone
  2. Data Engineering
  3. Data
  4. The Way of the Lambda and removeIf()

The Way of the Lambda and removeIf()

Did you know that removing elements from large collections can be really slow and error prone if you do it the wrong way? Here's one way to do it right.

Dan Lawesson user avatar by
Dan Lawesson
·
Nov. 15, 16 · Tutorial
Like (38)
Save
Tweet
Share
40.83K Views

Join the DZone community and get the full member experience.

Join For Free

The Way of the Lambda: Elegant and Efficient

In Java 8, you can do:

items.removeIf(i -> predicate(i));

Where you used to do:

for (Iterator it = items.iterator(); it.hasNext();) {
  if (predicate(it.next())) {
    it.remove(); 
  } 
}

Which is not only considerably more elegant but also a more efficient solution in some cases. I have the graphs to prove it!

Background

Recently having spent a few years away from Java development, it is great to be back and find that Java 8 indeed has delivered on what was promised in 2013. With lambdas and streams, it is possible to tackle problems in a functional manner and write code that is both readable and efficient. This over due modernization of the language is by far the most important change in Java 8, but there are also other improvements that deserve mentioning. This post elaborates on the topic of Collection.removeIf() which allows for conditional removal of elements of a collection.

The Way of the Iterator

Removing elements of a collection while iterating over it is somewhat like cutting off the branch on which you are sitting, but by using Iterators that can be done without a runtime ConcurrentModificationException. Consider the following typical simplistic example, where the elements of the collection items for which predicate() holds true are removed.

for (Iterator it = items.iterator(); it.hasNext();) { 
  if (predicate(it.next())) { 
    it.remove(); 
  } 
}

If the collection in question is a linked list of size n and m elements are removed, the full iteration with removals is O(n + m) = O(n) since iteration steps and removal from the list are constant time operations and m is limited by n. If, however, the data structure of the collection is one where removal is relatively expensive compared to iteration, the complexity becomes less favorable. For an ArrayList a single removal step is O(n) since all items at higher indices have to be moved. There are still n iteration steps to be made so the full sequence of removals becomes O(m*n) which is O(n^2) assuming worst case with the number of removals being a constant fraction of the array size.

In more pragmatic terms, what happens in the ArrayList case is that for each element that is removed, all elements of higher indices are moved to fill the gap in the backing array. This is clearly suboptimal since that operation is repeated for each removal. Thus, to reach its final destination an element will visit all locations between the starting position and the destination. In case many elements are removed, this may be a considerable amount of work in vain.

The Java 8 Way of the Lambda

In Java 8, the code snippet from above can be expressed as follows.

items.removeIf(i -> predicate(i));

Clearly more concise, this expression is also more efficient on an ArrayList compared to the Iterator approach. Where the Iterator requires completion of each removal operation before continuing to the next one, removeIf operates on a predicate and can therefore iterate over all items once to determine which elements are to be removed and then shift the remaining elements to the correct new indices all in a second pass. Therefore, removeIf is O(n) since it performs two linear passes through the collection. Each retained element in the resulting collection is moved just once, while for the Iterator approach it would be moved one step at a time towards its destination. See this gist for the relevant JDK code.

The Difference Visualized

We compare execution times for the two approaches as a function of the array size for a rather extreme case where 999 out of 1000 items are removed from an ArrayList. For each data point 40 runs are made, the 5 fastest and slowest outliers are removed and the sum of the remaining 30 form the execution time value. For details please refer to this gist.

We expect the execution time to be considerably shorter for the Java 8 way of the lambda and as a matter of fact that is what we find. We also note that within the resolution of the graph, the lines are quite noise free due to the many repeats and filtering of outlier results.


For array sizes of 150,000 elements, the removeIf() approach is more than 3000 times faster. Even more importantly, the factor between the two increases as the size of the array increases which is expected since the way of the lambda has a superior asymptotical time complexity.

To better compare the time complexity of the two, we scale up the green line to match the green one. By multiplying the execution times of the faster algorithm by ~1500, the graphs can be compared in terms of shape. As expected, we find linear behavior for the Java 8 removeIf() design, while the old ways show a quadratic time complexity due to the inefficient removal of elements.

When scaling up the measurements we get to see some noise in the green line. The sensitivity to random fluctuations in execution time is greater since the execution times are several orders of magnitude shorter compared to the blue. Both lines are fitted to polynomials of power two and as can be clearly seen, the light green line is very similar to a straight line while the blue line clearly aims for the sky in a super linear fashion.

Edit: There is a reddit thread about this blog post. Good points are being made there, for example the fact that the removeIf implementation is atomic (in the database sense) as it first evaluates the predicate for all items before making any alterations to the list. Thus, if the predicate throws an exception for any of the items in the list, the list will remain unchanged.

Data structure Element Java (programming language)

Published at DZone with permission of Dan Lawesson, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Test Execution Tutorial: A Comprehensive Guide With Examples and Best Practices
  • 5 Steps for Getting Started in Deep Learning
  • Introduction to Container Orchestration
  • How Elasticsearch Works

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: