Taking Advantage of Parallelism
A while ago some colleagues attended a lecture where the presenter
introduced the idea that applications may not take full advantage of
the multi-core servers which are available today. The idea was that if
you have two cores but a process which is running on a single thread,
then all the work is done on one single core. Application servers help
in this respect, because they handle multiple incoming requests
simultaneously, by starting a new thread for each request. So if the
server has two cores it can really handle two requests simultaneously,
or if it has 6 cores, it can handle 6 requests simultaneously.
So multi-core CPUs can help the performance of your server if you have
multiple simultaneous requests, which is often the case when your
server is running near its limit. But it's not often the case that you
want your servers running close to the limit, so you typically scale
out, by adding more nodes to your server cluster, which has a similar
effect to adding cores to the CPU (you can continue to handle multiple
requests simultaneously).
So once you have scaled up by adding more cores, and scaled out by adding more servers, how can you improve performance?
Some processes can be designed to be non-serial, especially in enterprise scenarios. The Wikipedia article on multi-core processors
talks about this. Imagine a process which gathers data from multiple
systems while preparing the data which it responds with. An example
would be a pricing system. Imagine your local zoo offered a web service
for purchasing individual tickets, say an entrance ticket, a voucher
for some tasty elephant snacks for feeding time, a voucher for their
shop for getting gifts before you leave, etc. Now imagine, you as an
entrepreneur wanted to open a re-selling business offering packages for
such products, so that you could offer a discount (greater sales
volume, lower per-item profit).
A perfect way to do this would be to partner with the Zoo and offer a
website which integrates with the zoo's web service and sells such
packages. You implement your website, but then while doing performance
tests, you notice that regardless of load, an average sale is taking 3
seconds to complete. You analyse the system and determine that it's the
zoo's web service which is the problem - each product quote is taking a
full second to get. You talk to the zoo's IT department, and they
confirm that this is known, but there is little they can do about it
because in the back end they are using complex reservation systems
which work out a dynamic market price which is based on supply and demand.
So, you go away and have a think, and remember that your data center
has just installed some servers with flashy new multi-core
processors, which are costing you a fortune. Then it hits you! Getting
a product quote is an independent task. So while getting quotes for the
items in your package, you might as well do it in parallel, at the same
time! Hang on now. You implemented your system in Java and the business
logic is in EJBs, so there's no chance of spawning new threads; the
container won't let you! Except... you implemented it on a shiny new
Java EE 6 server! Hooray! Why? Because of the javax.ejb.Asynchronous annotation.
The Javadocs for this annotation state:
Used to mark a method as an asynchronous method or to designate all business methods of a class or interface as asynchronous.
So, what you do, is add this annotation to a business method of your
EJB and the container runs it on a new thread. Brilliant! The only
minor issue is how to get the result from your asynchronous calls, but
that's relatively easy, using the Future<T> interface. Consider the following method declaration:
/**
 * Note that async methods implicitly start new transactions!
 * See the EJB 3.1 spec, chapter 4.5.3.
 */
@Override
@Asynchronous
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public Future<Offer> getOffer(Product product, UUID purchaseId) {
    // ... calls the zoo's web service and returns the wrapped response (see below)
}
This EJB business method takes a product and purchase ID (effectively a
session ID). It gets called by the container on a new thread, which
implicitly means that it cannot re-use any existing transactions and
must hence be run using a new transaction (see EJB specification,
chapter 4.5.3). This method then calls the zoo's web service and when
it gets a response, returns the response:
return new AsyncResult<Offer>(o);
where "o" is an offer - the response from the web service call. The javax.ejb.AsyncResult<T> class is the standard implentation of a Future<T>, and simply wraps the actual object you want to return.
The EJB which started the async calls knows when they have all returned, because it can check the results isDone() method. As soon as the new thread completes, the container sets this method to return true.
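To make the calling side concrete, here is a minimal sketch of how an orchestrating EJB might fan out the quote requests and gather the results. This is not the actual OrchestrationBean from the sample app; the OfferService, Offer and Product names are assumptions for illustration, and only the @Asynchronous / Future mechanics come from the description above.

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import javax.ejb.EJB;
import javax.ejb.EJBException;
import javax.ejb.Stateless;

@Stateless
public class OrchestrationBean {

    // the bean containing the @Asynchronous getOffer method shown above (name assumed)
    @EJB
    private OfferService offerService;

    public List<Offer> getOffersForPackage(List<Product> products, UUID purchaseId) {
        // fire off all quote requests; the container runs each one on its own thread
        List<Future<Offer>> futures = new ArrayList<Future<Offer>>();
        for (Product product : products) {
            futures.add(offerService.getOffer(product, purchaseId));
        }

        // collect the results; get() blocks until the corresponding async call
        // has completed, so the total wait is roughly the slowest single call,
        // not the sum of all of them
        List<Offer> offers = new ArrayList<Offer>();
        for (Future<Offer> future : futures) {
            try {
                offers.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new EJBException(e);
            } catch (ExecutionException e) {
                // any exception thrown inside the async method is available via getCause()
                throw new EJBException(e);
            }
        }
        return offers;
    }
}

Instead of blocking on get(), the caller could also poll isDone() on each future if it has other work to do in the meantime.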
So, the net effect is that instead of taking three seconds to get an
offer for your package of three items, it only takes one second! A
whopping threefold improvement!
The async annotation in the EJB spec is a welcome extension. It also
offers the ability to cancel tasks as well as handle exceptions from
the async calls (via ExecutionExceptions
which wrap the problem). There are, however, some dangers with it, as
suggested in the Wikipedia article on multi-core processors. The fact
that such async methods are forced to start a new transaction can cause
problems. In this case, though, it is actually an advantage, because we are
calling web services! Web services are inherently non-transactional.
That means they are committed as soon as you get their response, not as
soon as your transaction is committed. So in order to make web services
XA (two phase commit) compatible, one normally adds a "commit" method
to the web service which the caller must call when all their business
logic is complete, or one adds a mechanism for cancelling such
transactions.
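As a rough sketch of that pattern, the interface below shows how such a compensating contract might look. The ZooTicketService name and its operations are hypothetical; this is not the actual zoo web service, just an illustration of the commit/cancel idea.

import java.util.UUID;

// Hypothetical contract illustrating the commit/cancel pattern described above.
public interface ZooTicketService {

    // quotes and reserves a single item, but does not finalise anything
    Offer getOffer(Product product, UUID purchaseId);

    // buys a previously offered item
    void buy(UUID offerId);

    // called by the client once all of its own business logic has succeeded,
    // making the purchases for this session definitive
    void commit(UUID purchaseId);

    // compensating action: releases everything reserved or bought in this session
    void cancel(UUID purchaseId);
}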
I have used the new "inner" transaction which the container starts to
track the state of these offers / sales. As soon as the web service
returns, I write the result to a table in the database with its status
"OFFERED", so that I know I have not actually purchased it. Regardless
of errors in other calls to the web service, I am guaranteed to have
such an entry for each successful web service result, due to the inner
transaction. After I get the three offers, the orchestration EJB which
called the async bean (which calls the web services), purchases the
three offers. After each purchase, I update the status in the database
to "BOUGHT", also using an inner transaction (REQUIRES_NEW), so that
regardless of any failures in subsequent purchases, the state of
individual purchases is known. Why go to this effort? Well, I have a
business rule: my packages are sold as complete packages, or not at
all! So if any single purchase fails, I need to cancel my other
purchases. If I don't, I still owe the zoo for what I have purchased,
and I didn't get any money from my customer, because the package failed!
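A minimal sketch of that status tracking might look like the following. The PurchaseRecord entity and the method names are assumptions; the REQUIRES_NEW attribute and the "OFFERED" / "BOUGHT" statuses follow the description above.

import java.util.UUID;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class PurchaseTracker {

    @PersistenceContext
    private EntityManager em;

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public PurchaseRecord recordOffer(UUID purchaseId, UUID offerId) {
        // committed in its own transaction, so the row survives failures
        // in any of the other calls belonging to this purchase
        PurchaseRecord record = new PurchaseRecord(purchaseId, offerId, "OFFERED");
        em.persist(record);
        return record;
    }

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public void markBought(Long recordId) {
        // again an inner transaction: the status change is durable regardless
        // of whether a later purchase in the same package fails
        PurchaseRecord record = em.find(PurchaseRecord.class, recordId);
        record.setStatus("BOUGHT");
    }
}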
The solution is to use the timer service from the EJB container. I
schedule a method in an EJB to analyse my offers/purchases and for any
incomplete session (purchase ID), I cancel all three offers and update
my records to mark the purchases as "CANCELLED".
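A sketch of such a clean-up job, using the EJB 3.1 @Schedule annotation, might look like this. The entity, query and facade names are assumptions, and a real implementation would also need to decide how old a session must be before it counts as incomplete.

import java.util.List;
import javax.ejb.EJB;
import javax.ejb.Schedule;
import javax.ejb.Singleton;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Singleton
public class PurchaseCleanupTimer {

    @PersistenceContext
    private EntityManager em;

    // hypothetical local facade wrapping the zoo web service client
    @EJB
    private ZooServiceFacade zooFacade;

    // runs every five minutes via the container's timer service
    @Schedule(minute = "*/5", hour = "*", persistent = false)
    public void cancelIncompletePurchases() {
        List<PurchaseRecord> incomplete = em.createQuery(
                "select r from PurchaseRecord r where r.status = 'OFFERED'",
                PurchaseRecord.class).getResultList();
        for (PurchaseRecord record : incomplete) {
            zooFacade.cancel(record.getPurchaseId());
            record.setStatus("CANCELLED");
        }
    }
}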
I've put this together as a sample app which includes EJB 3.1, Servlet
3.0, Interceptors, JPA 2.0, the EJB Timer Service, Validation (JSR-303),
JMS and JAX-WS, all based on the GlassFish 3 configuration which was
blogged at GlassFish 3 In 30 Minutes. The source for this mega sample app can be downloaded here.
Some final quick notes:
1) A web service and its client cannot be deployed in the same EAR!
That makes sense in normal conditions, but for a sample app, where you
want your server to call a test WS, it can cause trouble. So the zoo
web service is deployed as a simple web project.
2) The Javadocs for the OrchestrationBean provide some more details of the business process.
From http://blog.maxant.co.uk/pebble/2010/05/16/1274017800000.html