Concurrency in JSR-166y Meets Groovy: Process Collections in Parallel
How about parallel processing of elements stored in one of the java.util.* collections? Could we make Groovy methods such as each() and collect() leverage multiple threads to speed them up? In fact, it is very easy.
I wrote briefly about the new concurrency enhancements planned for Java 7 under JSR-166y earlier. I'm still happily experimenting with the library, and here's how you can seamlessly employ ParallelArrays to speed up your Groovy collections. Thanks to Groovy's meta-programming capabilities, we will replace some of the original GDK methods on collections with parallel implementations.
It'll be a short journey; the target is only two steps away.
1. Create the JSR-166y thread pool and a helper method that will create Parallel Arrays for us
// Create a pool with a size close to the number of processor cores
final ForkJoinPool pool = new ForkJoinPool(2)
// Copy the collection into a ParallelArray backed by the given pool
private ParallelArray createParallelArray(pool, collection) {
    return ParallelArray.createFromCopy(
        collection.toArray(new Object[collection.size()]), pool)
}
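The pool size does not have to be hard-coded. Here is a minimal sketch in plain Java of deriving it from the machine. It assumes the ForkJoinPool that later shipped in java.util.concurrent (Java 7 and newer) rather than the jsr166y preview jar used above, and the names PoolSizing and createPool are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;

class PoolSizing {
    // Hypothetical helper: size the pool from the number of cores the JVM reports.
    static ForkJoinPool createPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        return new ForkJoinPool(cores);
    }

    public static void main(String[] args) {
        ForkJoinPool pool = createPool();
        System.out.println("Pool parallelism: " + pool.getParallelism());
        pool.shutdown();
    }
}
```

For tasks that mostly wait on I/O, such as fetching web pages, a pool somewhat larger than the core count may work better, so treat the core count as a starting point to measure against.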
2. Enhance the required collection classes with parallel methods using meta-programming
We can either introduce new methods, such as eachAsync() and collectAsync(), or directly replace the original GDK methods such as each() and collect(). I'll go the second way in this example and replace the GDK methods on the ArrayList class, although replacing them on individual instances instead would probably be more practical in real code.
// Replace collect() on ArrayList with a version that maps elements in parallel
ArrayList.metaClass.collect = {Closure cl ->
    createParallelArray(pool, delegate).
        withMapping({cl(it)} as Op).all().asList()
}
// Replace findAll() on ArrayList with a version that filters elements in parallel
ArrayList.metaClass.findAll = {Closure cl ->
    createParallelArray(pool, delegate).
        withFilter({cl(it)} as Predicate).all().asList()
}
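To make it clearer what the two replacements do, here is a rough plain-Java sketch of a parallel collect and findAll. It is not the ParallelArray implementation: it assumes the java.util.concurrent.ForkJoinPool from the JDK and simply submits one task per element with invokeAll, whereas ParallelArray splits the array recursively. The class name ParallelOps and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Predicate;

class ParallelOps {
    // Apply fn to every element on the pool; results keep the input order.
    static <T, R> List<R> collect(ForkJoinPool pool, List<T> items, Function<T, R> fn)
            throws InterruptedException, ExecutionException {
        List<Callable<R>> tasks = new ArrayList<>();
        for (T item : items) {
            tasks.add(() -> fn.apply(item));
        }
        List<R> results = new ArrayList<>();
        // invokeAll returns the futures in the same order as the tasks.
        for (Future<R> future : pool.invokeAll(tasks)) {
            results.add(future.get());
        }
        return results;
    }

    // Evaluate the predicate in parallel, then keep the matching elements in order.
    static <T> List<T> findAll(ForkJoinPool pool, List<T> items, Predicate<T> test)
            throws InterruptedException, ExecutionException {
        List<Boolean> keep = collect(pool, items, item -> test.test(item));
        List<T> matches = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            if (keep.get(i)) {
                matches.add(items.get(i));
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(2);
        List<String> words = Arrays.asList("groovy", "java", "scala");
        System.out.println(collect(pool, words, String::toUpperCase)); // [GROOVY, JAVA, SCALA]
        System.out.println(findAll(pool, words, w -> w.contains("a"))); // [java, scala]
        pool.shutdown();
    }
}
```

One task per element only pays off when each element takes real work, such as a network call or image processing. For cheap operations the scheduling overhead is likely to outweigh the gain.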
The magic happens now
Although nothing needs to change in our code, the collections now use parallelism under the covers.
def sites = [
    'http://www.jroller.com',
    'http://www.infoq.com',
    'http://java.dzone.com']
def groovySites = sites.findAll {new URL(it).text.toLowerCase().contains('groovy')}
println "These sites talk about Groovy today: ${groovySites}"
Or if you need to process complex images in some way:
// Use the new parallel functionality
List images = loadImages()
List reformattedImages = images.collect {processImage(it)}
Did you count how many times we had to use the Thread class and the synchronized block in the multi-threaded code we've just written?