Collection Performance - Don't Become Too Lazy

Daniel Schneller · Nov. 17, 2010

A few weeks ago I was performance tuning some code that ran quite regularly and took more time than it should have, judging from the complexity of what it was doing. As usual, merely looking at the code revealed nothing blatantly, obviously, completely and ridiculously wrong - after all, it worked correctly.

So I fired up trusty old JProfiler and had a look, only to find out that sometimes you can become just too lazy and reliant on libraries without understanding their characteristics deeply enough.

The Code

The following snippet shows the relevant method of the code under analysis. The surroundings are not really important; the heart of the matter is in here:

protected List<String> getCurrentChunkList() {
    List<String> all = getContext().getFullList();
    List<String> result = new ArrayList<String>(chunkSize + 2);
    int tempCount = 0;
    for (Iterator<String> allIter = all.iterator(); allIter.hasNext()
            && tempCount < chunkSize; tempCount++) {
        result.add(allIter.next());
        if (tempCount == chunkSize - 1) {
            // at the chunk boundary, also pull over any immediately
            // following elements equal (ignoring case) to the last one
            while (allIter.hasNext()) {
                String current = allIter.next();
                if (current.equalsIgnoreCase(result.get(result.size() - 1))) {
                    result.add(current);
                } else {
                    break;
                }
            }
        }
    }
    all.removeAll(result);
    return result;
}

Basically, all this does is take a portion of a long list of Strings and transfer it to a partial list, with some data-specific fiddling at the end of a chunk. Once that has been completed, the transferred elements are removed from the primary list.

The whole thing is part of a batch processing system that is only allowed to run for a limited time slice at a time. The overall list of items still to be processed is stored persistently in a context object; on each run, a certain number of them is extracted, the master list is shortened, and the resulting chunk is handed to the actual processing. Control then returns to the surrounding framework, which at its discretion can assign a new time slice to this kind of job, or to a different one. This is repeated until the master list is empty.
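To make that control flow concrete, here is a minimal sketch of what such a driver loop might look like; hasTimeLeft and process are hypothetical stand-ins, not names from the actual framework:

// Hypothetical driver, invoked once per time slice by the batch framework.
// hasTimeLeft() and process() are assumed names for illustration only.
protected void runTimeSlice() {
    while (hasTimeLeft() && !getContext().getFullList().isEmpty()) {
        // shortens the persistent master list and returns one chunk
        List<String> chunk = getCurrentChunkList();
        // the actual per-item work happens here
        process(chunk);
    }
    // when the slice is up, the framework may schedule this job again
    // (or a different one) until the master list is empty
}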

The all list contains about 10,000 elements; the chunk size is set to 1,000. Now, this does not look too bad, does it?

First run

When running the processing job for the mentioned 10,000 elements in chunks of 1,000, the whole process took 490 seconds; of that, the above method accounted for a whopping 6.3%, or 31 seconds! Rather long, considering that it is just calling a few supposedly well-optimized collection operations…

But looking at it more closely, these “few” operations are quite a few more. With 1,000 elements per chunk, there were:

  • 10 invocations of getCurrentChunkList
  • 10,000 invocations of Iterator.hasNext
  • 10,000 invocations of Iterator.next
  • 10,000 invocations of ArrayList.add
  • 10 new ArrayLists
  • 10 new Iterators
  • 10 invocations of ArrayList.removeAll

Not exactly minimalistic, but 31 seconds seems a little over the top.

The problem lies with the ArrayList.removeAll call. Internally, an ArrayList uses a plain array as its backing store. The implementation of removeAll is just a loop that calls remove once per element. remove in turn iterates the array, doing an equality check against the element to be removed, and finally removes it from the array when found.

Unless the element to be removed sits at the very end of the backing array, there is a lot of bookkeeping to do, because removing something from the middle of an array is not an easy thing to do - in any case, a good chunk of the array needs to be shifted over to fill the gap.

While the implementation uses the fairly efficient System.arraycopy, it still adds up, because the algorithm shown above removes the elements it picks for processing from the beginning of the list, leading to several thousand array shifting operations.
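The effect is easy to reproduce in isolation. The following sketch (a rough illustration, not a rigorous benchmark - timings will vary by JVM and machine) contrasts draining an ArrayList from the front with draining it from the back:

import java.util.ArrayList;
import java.util.List;

public class RemoveCostDemo {

    private static List<String> buildList(int n) {
        List<String> list = new ArrayList<String>(n);
        for (int i = 0; i < n; i++) {
            list.add("element-" + i);
        }
        return list;
    }

    public static void main(String[] args) {
        int n = 100000;

        // removing index 0 shifts all remaining elements left every time,
        // roughly n^2/2 element copies in total
        List<String> front = buildList(n);
        long start = System.nanoTime();
        while (!front.isEmpty()) {
            front.remove(0);
        }
        System.out.println("front removal: "
                + (System.nanoTime() - start) / 1000000 + " ms");

        // removing the last index just nulls a slot and decrements the
        // size - no shifting at all
        List<String> back = buildList(n);
        start = System.nanoTime();
        while (!back.isEmpty()) {
            back.remove(back.size() - 1);
        }
        System.out.println("back removal: "
                + (System.nanoTime() - start) / 1000000 + " ms");
    }
}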

For reference, this is the relevant part of the source of ArrayList (the Apache Harmony implementation in this case, judging by the luni message keys; the OpenJDK version differs in detail but shifts elements the same way):

public E remove(int location) {
    E result;
    int size = lastIndex - firstIndex;
    if (0 <= location && location < size) {
        if (location == size - 1) {
            result = array[--lastIndex];
            array[lastIndex] = null;
        } else if (location == 0) {
            result = array[firstIndex];
            array[firstIndex++] = null;
        } else {
            int elementIndex = firstIndex + location;
            result = array[elementIndex];
            if (location < size / 2) {
                System.arraycopy(array, firstIndex, array, firstIndex + 1,
                        location);
                array[firstIndex++] = null;
            } else {
                System.arraycopy(array, elementIndex + 1, array, elementIndex,
                        size - location - 1);
                array[--lastIndex] = null;
            }
        }
        if (firstIndex == lastIndex) {
            firstIndex = lastIndex = 0;
        }
    } else {
        throw new IndexOutOfBoundsException(
                // luni.0A=Index: {0}, Size: {1}
                Messages.getString("luni.0A", //$NON-NLS-1$
                        Integer.valueOf(location),
                        Integer.valueOf(lastIndex - firstIndex)));
    }
    modCount++;
    return result;
}

Possible optimisations

As one can see from the source of ArrayList.remove above, the ideal case is to remove elements from the end of the array, because that does not require any element shifting or array copying whatsoever. The remove operation simply sets the last element to null and decrements lastIndex.

So I changed the code of the initial method to the following:

protected List<String> getCurrentChunkList() {
    List<String> all = getContext().getFullList();
    List<String> result = new ArrayList<String>(chunkSize + 2);
    int tempCount = 0;
    for (int i = all.size() - 1; i >= 0 && tempCount < chunkSize; i--, tempCount++) {
        result.add(all.remove(i));
    }
    return result;
}

This code removes strictly from the end of the all list and adds each element to the current result chunk. Going this route also allowed the special fiddling in the middle of the old method to be removed, even though that had not contributed any measurable overhead to the method's execution time. The call to removeAll is gone completely, because remove already takes care of each element individually.
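One side effect worth noting: the method now drains the list back to front, so the chunks arrive in reverse order, which evidently does not matter for this job. If it did, a single ranged removal would preserve the original order while still avoiding per-element shifting. A sketch, assuming the standard JDK behavior where subList(0, n).clear() funnels into a single removeRange/System.arraycopy:

// Hypothetical order-preserving variant: copy the front chunk, then drop
// it from the master list with one ranged removal instead of one remove()
// call per element.
protected List<String> getCurrentChunkList() {
    List<String> all = getContext().getFullList();
    int n = Math.min(chunkSize, all.size());
    List<String> result = new ArrayList<String>(all.subList(0, n));
    all.subList(0, n).clear();
    return result;
}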

Second run

Now, with the same data set, profiling comes to a very different conclusion:

  • 10 invocations of getCurrentChunkList
  • 10,000 invocations of ArrayList.remove
  • 10,000 invocations of ArrayList.add
  • 10 new ArrayLists

With this new version, the runtime of this particular piece of code was reduced to a mere 23 ms - less than 0.005% of the total execution time.

Conclusion

Even though the Java Collections framework's classes are well tested and contain lots of performance-enhancing shortcuts, it is still vital to be aware of their characteristics. The original method looked perfectly OK at first glance and would not show up on the radar for a few runs, but the repeated execution time added up to quite a remarkable percentage of the overall process.

With just a few minor modifications - which also made the code easier to understand by getting rid of intermediate Iterators and special handling for edge cases - that single method's performance was boosted significantly. Considering that it is part of a more general batch job framework, the overall improvement is even greater than what this single isolated test case showed.

From http://www.danielschneller.com/2010/11/collection-performance-dont-become-too.html
