The AgroSense project (agrosense.java.net) has been using JPA as the persistence framework for its NetBeans Platform application since the project started almost a year ago. We have encountered a lot of issues. Some of them we were able to fix, some not. After a year of JPA we are about to pull the plug and switch to an alternative.
AgroSense Data Architecture Overview
In the simplest setup, the AgroSense application consists of only a client. In this case, data only needs to be stored locally, in a place separate from the application installation.
In the second, more advanced setup, the farmer has enabled online storage. In this case, all data stored locally needs to be synchronized with a server somewhere on the internet.
The third, most complex setup consists of multiple application instances working on the same dataset. Data on the server needs to be synchronized with multiple clients, and a locking mechanism should prevent one user from overwriting the changes of another.
In both cases with a server component, the data of the individual farmers needs to be strictly separated.
Issues
Issues we were able to fix:
- Extendable model with multiple persistence units (with isolation of the extensions)
- Use of a file database for runtime and memory database for tests
- Lazy loading with use of long-living EntityManagers
- Hibernate class proxy
Issues we were not able to fix:
- Clear and working transaction demarcation
- Different threads produce a different instance of an entity with the same id.
- Horrible performance with bulk data processing
- Unavoidable merges of entities causing all Bean listeners to be lost
- Exceptions with read only queries right after an insert/update
How JPA Is Designed To Be Used
There is some very good JPA documentation at the JBoss site:
Creating the EntityManagerFactory is an expensive process, so it should be done once at application startup. Creating an EntityManager is cheap; an EntityManager is typically created at the start of a unit of work and closed at the end of it.
This definition of the EntityManager conflicts with solution 3 (lazy loading with long-living EntityManagers). EntityManagers are not supposed to be long-living. This solution might well be the cause of one or more of the problems we could not solve.
Losing the lazy loading functionality is not as bad as it may look at first. We mainly use BeanTreeViews to give a structured representation of the data. The user can perform actions on the nodes in the tree view. The node hierarchy has its own nifty lazy loading mechanism. So it is not such a bad thing to first fetch all car brands from the database and only fetch the models when a brand node is expanded.
But this conflicts with the JPA entity hierarchy. It expects dependencies to be a @OneToMany relation and to hold object references. If we query the children separately from the parents, the integrity will be lost. (Not for real, of course, but the EntityManager will be certain of it.)
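The brand/model example above can be sketched in plain Java. This is only an illustration of the idea, not the NetBeans Nodes API: `LazyNode`, `LazyNodeDemo` and `findModelsByBrand` are hypothetical names, and the "query" is a hard-coded stand-in for a service call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a tree node; the real NetBeans Node/Children classes are not used.
class LazyNode {
    private final String name;
    private final Function<String, List<String>> childLoader; // e.g. a service query by parent name
    private List<LazyNode> children; // stays null until the node is expanded for the first time

    LazyNode(String name, Function<String, List<String>> childLoader) {
        this.name = name;
        this.childLoader = childLoader;
    }

    String getName() {
        return name;
    }

    boolean isLoaded() {
        return children != null;
    }

    // Called when the user expands the node; the loader runs only on the first call.
    List<LazyNode> expand() {
        if (children == null) {
            List<LazyNode> loaded = new ArrayList<>();
            for (String childName : childLoader.apply(name)) {
                // Children are leaves in this sketch: their loader returns nothing.
                loaded.add(new LazyNode(childName, parent -> Collections.<String>emptyList()));
            }
            children = loaded;
        }
        return children;
    }
}

class LazyNodeDemo {
    static int queryCount = 0;

    // Hypothetical service method: fetches only the models of one brand.
    static List<String> findModelsByBrand(String brand) {
        queryCount++;
        if ("Ferrari".equals(brand)) {
            return Arrays.asList("F40", "Testarossa");
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        LazyNode ferrari = new LazyNode("Ferrari", LazyNodeDemo::findModelsByBrand);
        System.out.println(ferrari.isLoaded());        // false: nothing fetched yet
        System.out.println(ferrari.expand().size());   // 2: models fetched on expand
        ferrari.expand();
        System.out.println(queryCount);                // 1: second expand reuses the result
    }
}
```

Note that the parent holds only a name and a loader, not object references to its children, which is exactly the shape a @OneToMany mapping does not expect.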
To investigate this we did a proof of concept with short-living EntityManagers. Instead of storing the created EntityManagers in a ThreadLocal variable, we requested a new EntityManager from the EntityManagerFactory for each unit of work and closed it in a finally block after the work was done. Transactions were explicitly started and committed, and rolled back in a catch block for RuntimeException. It soon became clear that merging detached entities was unavoidable.
Consider the following scenario:
- A Bean is to be displayed in a tree as a node, with the node name reflecting the bean name
- The Bean is requested from the service to be displayed in the application (Service#find())
- The user opens the bean edit screen (right click edit for example)
- The user edits the values (among which the name) of the bean and presses the save button (Service#update(Bean))
- The node representing the bean is updated to reflect the changed name (PropertyChangeListeners)
Both find and update are different units of work in the service. They may or may not be called from the same Thread.
With a long-living EntityManager the entity will still be managed when the second call is made from the same thread, but it will be detached when it is not. This is probably the cause of issue 5 (exceptions with read-only queries right after an insert/update). In some combinations of thread calls, the values of the bean are changed, the bean has become detached, and in the meantime a query is executed on the same EntityManager, which causes a flush and produces an exception.
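The difference between a same-thread and a cross-thread call can be modelled without JPA. The sketch below assumes one long-living context per thread, stored in a ThreadLocal as in our setup. `FakeEntityManager`, `Car` and `ThreadLocalContextDemo` are hypothetical stand-ins, and the identity map only imitates the per-context cache of a real EntityManager.

```java
import java.util.HashMap;
import java.util.Map;

class Car {
    final long id;
    String name;

    Car(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical stand-in for an EntityManager: keeps one instance per id within this context.
class FakeEntityManager {
    private final Map<Long, Car> identityMap = new HashMap<>();

    Car find(long id) {
        // Pretend the row is read from the database the first time it is requested.
        return identityMap.computeIfAbsent(id, key -> new Car(key, "Ferrari"));
    }
}

class ThreadLocalContextDemo {
    // One long-living context per thread.
    static final ThreadLocal<FakeEntityManager> CONTEXT =
            ThreadLocal.withInitial(FakeEntityManager::new);

    static Car find(long id) {
        return CONTEXT.get().find(id);
    }

    // Runs the same lookup on a different thread and hands back the result.
    static Car findOnOtherThread(long id) {
        Car[] result = new Car[1];
        Thread worker = new Thread(() -> result[0] = find(id));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        Car first = find(1L);
        Car again = find(1L);
        Car other = findOnOtherThread(1L);
        System.out.println(first == again); // true: same thread, same context
        System.out.println(first == other); // false: other thread, other context
    }
}
```

An instance obtained on one thread is unknown to the context of another thread, so from that context's point of view it is detached.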
With short-living EntityManagers we are certain that we need to merge the entity. The problem with merging is the PropertyChangeListeners. PropertyChangeEvents triggered before a merge can cause a deadlock (I don't know exactly how and why, but they do). After the merge the listeners are lost on the new managed entity, so changing properties after the merge will not get through to listeners on the "old" entity.
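The loss of listeners can be reproduced in a few lines. JPA's merge copies the state of the given entity into a managed instance and returns that instance; the argument itself stays detached. The `merge` method below is a hypothetical stand-in that imitates only this copy behaviour, and `CarBean` and `MergeDemo` are illustrative names.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

class CarBean {
    // Listener bookkeeping is not persistent state, so a merge does not carry it over.
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private final long id;
    private String name;

    CarBean(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() {
        return id;
    }

    String getName() {
        return name;
    }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        changes.firePropertyChange("name", oldName, newName);
    }

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }
}

class MergeDemo {
    // Hypothetical stand-in for EntityManager#merge: returns a separate "managed" copy.
    static CarBean merge(CarBean detached) {
        return new CarBean(detached.getId(), detached.getName());
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        CarBean detached = new CarBean(1L, "Ferrari");
        // A node would register itself like this to follow name changes.
        detached.addPropertyChangeListener(event -> seen.add(String.valueOf(event.getNewValue())));

        CarBean managed = merge(detached);
        managed.setName("Red Ferrari");
        System.out.println(seen); // []: the listener is still on the old instance

        detached.setName("Blue Ferrari");
        System.out.println(seen); // [Blue Ferrari]
    }
}
```

Re-registering every listener on the returned instance is possible in principle, but it means every node and editor holding the old bean has to be told about the replacement.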
JPA looks heavily engineered towards a web architecture. You have an application server. On initialization, you create your factory. When a request enters, you create an EntityManager for the span of the request. You do your work, and when you send out the response you close the EntityManager. In this scenario there is hardly any use for the troublesome "merge" method. Most likely you will fetch new entities by their IDs and work your magic on them.
In the desktop environment, we know the state of the client because we are the client. We want our red Ferrari bean to be the same bean throughout the entire application. We want to wrap it in a BeanNode, listen to name changes, drag it onto the car editor window, and so on. There is no need for complicated attached/detached scenarios.
The most severe problem with JPA is that we cannot trust a bean to be the same instance throughout the entire application. We can hardly control when children are fetched, which we need in order to make the best use of the tools NetBeans gives us. JPA is far too complicated for what we need from a persistence framework in a desktop application.
- Avaje Ebean (http://avaje.org/)
- Plain JDBC
- Spring JdbcTemplate
- Marauroa (http://arianne.sourceforge.net/engine/marauroa.html)
There are quite a few alternatives to JPA as a persistence provider. But there are also out-of-the-box solutions like Marauroa and CouchDB. Traditional DBMS solutions, although often chosen implicitly, might not be the best solution to embed in a rich client environment. "NetBeans Swing" might be the answer to the first unasked question ("Do we need a web solution?"). The second unasked question is probably "Do we need to store our data in a relational database on the client?"