Parallelism in Spring
I've been wanting to post about parallelism in Spring for ages now. It's an interesting topic that doesn't get the attention it deserves (if any at all), probably because an IoC container, and Spring in particular, really shines at managing dependencies, a task that intuitively promotes serial processing, and also because the Java EE APIs (say, Servlet or EJB) hide the need for it. There is one area, though, where sooner or later you will be looking for concurrency, and that is data retrieval. As long as a couple of different resources are involved, or even a single one if the pieces of data requested are independent, there are efficiency gains to be had by processing the connections in parallel. A common case in today's environments would be, for example, calling several web services.
Standard Java EE does not really offer an API to manage concurrency once inside a request (even though the request itself is served from a thread pool). In fact, it is open for discussion whether the standard forbids spawning new threads in a web context at all (it is explicitly banned for EJBs). WebSphere and WebLogic proposed an alternative called CommonJ, also known as the WorkManager API, and it is a very good option when running under those application servers. Spring offers another, arguably even more powerful, option with its TaskExecutor abstraction. It is often preferable in a Spring environment because it can use CommonJ as the underlying API, but it can just as well use the Java 5 Executor framework (among others), making the switch a matter of changing a couple of configuration lines.
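To illustrate that last point, here is a minimal configuration sketch. The bean id and the WorkManager JNDI name are just examples; the point is that the same taskExecutor bean can be backed by a plain JDK thread pool or by a container-managed CommonJ WorkManager without touching any Java code.

<!-- Plain JDK 5 thread pool -->
<bean id="taskExecutor"
      class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="5" />
</bean>

<!-- Or, under WebSphere/WebLogic, delegate to a container-managed WorkManager -->
<bean id="taskExecutor"
      class="org.springframework.scheduling.commonj.WorkManagerTaskExecutor">
    <property name="workManagerName" value="wm/default" />
</bean>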
Let's review how to use the framework. Our only prerequisite is to have at least two data retrieval services already configured as dependencies of a third bean. All the data retrieval services must share a common interface; I can recommend something like the Command pattern here (beware that this approach is not fully followed below, to better showcase inbound data processing). We're going to replace the individual dependencies with a collection, and add init and destroy methods plus an executor. The shared interface could be as simple as the following sketch, where the generic parameter stands in for whatever input the services consume:
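public interface Command<T> {
    // Each data retrieval service implements this single method;
    // implementations are expected to be safe to run concurrently.
    void execute(T data);
}

With that in place, here is the parallel service itself (let's start with a JDK 5 implementation):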
public class ParallelService implements InitializingBean, DisposableBean {

    private List<Command<Data>> commands;
    private ExecutorService executor;

    public void setCommands(List<Command<Data>> commands) {
        this.commands = commands;
    }
}
With our current implementation based on Java 5 executors, we need to start up the thread pool in the initialization method and shut everything down when the Spring context is closed:
public void afterPropertiesSet() throws Exception {
    executor = Executors.newFixedThreadPool(commands.size());
}

public void destroy() throws Exception {
    executor.shutdownNow(); // Improve this as much as you like
}
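If a more graceful teardown is preferred, one possible refinement of destroy (just a sketch; the timeout is arbitrary) is to let in-flight commands finish before forcing the pool down:

public void destroy() throws Exception {
    executor.shutdown(); // stop accepting new tasks, let running ones complete
    if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
        executor.shutdownNow(); // force termination if tasks are still hanging
    }
}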
We just need to handle the concurrent execution now. That's easy to do with the Future handles returned for asynchronous tasks. An alternative is to submit all tasks at once and wait for them to complete (see ExecutorService); a sketch of that variant follows the code below:
public void execute(Data data) throws InterruptedException, ExecutionException {
    Set<Future<?>> tasks = new HashSet<Future<?>>(commands.size());
    for (Command<Data> command : commands) {
        tasks.add(executor.submit(new RunCommand(command, data)));
    }
    for (Future<?> future : tasks) {
        future.get(); // blocks until the corresponding command has finished
    }
    // Other stuff to execute after all data has been retrieved
}
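The invokeAll variant mentioned above could look roughly like this (a sketch, assuming we wrap each command in a Callable instead of a Runnable):

public void execute(final Data data) throws InterruptedException {
    List<Callable<Object>> tasks = new ArrayList<Callable<Object>>(commands.size());
    for (final Command<Data> command : commands) {
        tasks.add(new Callable<Object>() {
            public Object call() {
                command.execute(data);
                return null;
            }
        });
    }
    // invokeAll blocks until every task has completed (or failed)
    executor.invokeAll(tasks);
}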
The code above just creates a collection of Future objects to find out when the jobs have finished. The tricky part is creating the concurrent job from a custom service and passing it the required data (if needed). An inner wrapper class will suffice:
private static class RunCommand implements Runnable {

    private final Data data;
    private final Command<Data> command;

    public RunCommand(Command<Data> command, Data data) {
        this.data = data;
        this.command = command;
    }

    public void run() {
        command.execute(data);
    }
}
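Wiring everything together is then plain Spring configuration; the bean ids and command implementations below are just placeholders for your own services:

<bean id="parallelService" class="example.ParallelService">
    <property name="commands">
        <list>
            <ref bean="customerServiceCommand" />
            <ref bean="orderServiceCommand" />
        </list>
    </property>
</bean>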
Well, that was pretty easy indeed. We now have a perfectly valid way to invoke beans in parallel. This approach has pros and cons: on the pro side we have independence from the Spring APIs (just imagine the InitializingBean and DisposableBean interfaces replaced by their matching init-method and destroy-method XML attributes), but we are also limited to a Java 5 environment. If we don't mind introducing a dependency on Spring itself, we can transform the source code to use the TaskExecutor framework:
public class ParallelService {

    private TaskExecutor taskExecutor;
    private List<Command<Data>> commands;

    public void setTaskExecutor(TaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }
}
And now the init and destroy methods are replaced by some XML configuration:
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="5" />
    <property name="maxPoolSize" value="10" />
    <property name="queueCapacity" value="25" />
</bean>
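The executor then gets injected into the parallel service like any other dependency, for example with a <property name="taskExecutor" ref="taskExecutor" /> entry in the parallelService bean definition shown earlier (again, the bean ids are just examples).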
But notice that not all implementations of the TaskExecutor interface allow tracking the progress of a task once scheduled for execution!
public void execute(Data data) {
    for (Command<Data> command : commands) {
        taskExecutor.execute(new RunCommand(command, data));
    }
}
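If completion tracking is needed regardless of the chosen implementation, one option (a sketch, nothing Spring-specific) is to wrap each job in a java.util.concurrent.FutureTask, which is both a Runnable and a Future:

public void execute(Data data) throws InterruptedException, ExecutionException {
    List<FutureTask<Object>> tasks = new ArrayList<FutureTask<Object>>(commands.size());
    for (Command<Data> command : commands) {
        FutureTask<Object> task = new FutureTask<Object>(new RunCommand(command, data), null);
        tasks.add(task);
        taskExecutor.execute(task); // any TaskExecutor can run a plain Runnable
    }
    for (FutureTask<Object> task : tasks) {
        task.get(); // blocks until that command has finished
    }
}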