How to Use JPA and JDO in HBase
Thanks to Google App Engine's work with DataNucleus, GAE users have enjoyed JPA and JDO support. For storage, GAE uses Google's (NoSQL) BigTable implementation. HBase, part of the Apache Hadoop project, is a distributed, column-oriented storage system modeled after BigTable. There are some usage restrictions, but it's generally pretty easy to store data on BigTable using JPA and JDO. What you may not know is that HBase can support these standard APIs as well, in a system you host yourself.
For developers who don't want to host their applications or store their data at Google, HBase provides a viable (and Apache community supported) option for building your own open source system. DataNucleus also lets you use JDO and JPA to persist objects in HBase. To install HBase, just read the documentation, which covers all of the possible pitfalls. Setup is not very difficult. Next, Matthias Wessendorf explains how to use JPA with HBase. You start with a regular persistence.xml file listing your classes and the actual configuration:
<persistence...>
  <persistence-unit...>
    <class>net.wessendorf...</class>
    ...
    <properties>
      <property name="datanucleus.ConnectionURL" value="hbase"/>
      <property name="datanucleus.ConnectionUserName" value=""/>
      <property name="datanucleus.ConnectionPassword" value=""/>
      <property name="datanucleus.autoCreateSchema" value="true"/>
      <property name="datanucleus.validateTables" value="false"/>
      <property name="datanucleus.Optimistic" value="false"/>
      <property name="datanucleus.validateConstraints" value="false"/>
    </properties>
  </persistence-unit>
</persistence>
In most cases you'll want to add @Entity to your classes and work around any limitations. Once your data model is complete, you can, for example, use the EntityManager directly:
EntityManagerFactory emf = Persistence.createEntityManagerFactory(...);
EntityManager entityManager = emf.createEntityManager();
EntityTransaction entityTransaction = entityManager.getTransaction();
entityTransaction.begin();
entityManager.persist(myJPAentity);
entityTransaction.commit();
More often, though, you will want to move the JPA-handling code into a data access object (DAO).
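As a rough illustration of the data access object idea, the sketch below wraps the begin, persist, commit sequence in a single save() method and rolls back if anything fails. So that it runs without HBase or DataNucleus on the classpath, it uses a hypothetical PersistenceSession interface and an in-memory implementation; these names, along with EntityDao and DaoSketch, are assumptions made for this example and are not part of JPA or of the original article. In a real application the same shape would presumably delegate to EntityManager and EntityTransaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JPA calls used in the article. With JPA,
// begin/commit/rollback map to EntityTransaction and persist maps to
// EntityManager.persist.
interface PersistenceSession {
    void begin();
    void persist(Object entity);
    void commit();
    void rollback();
}

// In-memory implementation so the sketch runs without a datastore.
class InMemorySession implements PersistenceSession {
    private final List<Object> committed = new ArrayList<>();
    private final List<Object> pending = new ArrayList<>();

    public void begin() {
        pending.clear();
    }

    public void persist(Object entity) {
        if (entity == null) {
            throw new IllegalArgumentException("entity must not be null");
        }
        pending.add(entity);
    }

    public void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    public void rollback() {
        pending.clear();
    }

    List<Object> committed() {
        return committed;
    }
}

// The DAO owns the transaction handling; callers only see save().
class EntityDao {
    private final PersistenceSession session;

    EntityDao(PersistenceSession session) {
        this.session = session;
    }

    void save(Object entity) {
        session.begin();
        try {
            session.persist(entity);
            session.commit();
        } catch (RuntimeException e) {
            session.rollback(); // discard the partial unit of work
            throw e;
        }
    }
}

class DaoSketch {
    public static void main(String[] args) {
        InMemorySession session = new InMemorySession();
        EntityDao dao = new EntityDao(session);
        dao.save("first entity");
        System.out.println(session.committed().size()); // prints 1
    }
}
```

Keeping the transaction boundaries inside the DAO means the calling code never touches begin, commit, or rollback, which makes it easier to swap the storage layer later.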
During the Maven build you have to enhance the bytecode of your persistent classes. Fortunately, DataNucleus offers a Maven plugin for this:
<plugin>
  <groupId>org.datanucleus</groupId>
  <artifactId>maven-datanucleus-plugin</artifactId>
  <version>2.0.0-release</version>
  <configuration>
    <log4jConfiguration>${basedir}/log4j.properties</log4jConfiguration>
    <verbose>true</verbose>
    <api>JPA</api>
    <persistenceUnitName>nameOfyourPU</persistenceUnitName>
  </configuration>
  <executions>
    <execution>
      <phase>compile</phase>
      <goals>
        <goal>enhance</goal>
      </goals>
    </execution>
  </executions>
</plugin>

There are significant benefits to this approach when hosting a regular Java EE application on HBase. Because Java EE applications use JPA for most of their persistence, integrating them is a lot easier. You can also use the 'native' HBase API to read and store data in a JPA/JDO-managed HBase table, but the code is not as simple.