
JPA Searching Using Lucene - A Working Example with Spring and DBUnit


Working Example on GitHub

There's a small, self-contained, Mavenised example project on GitHub to accompany this post. Check it out here: https://github.com/adrianmilne/jpa-lucene-spring-demo

Running the Demo

See the README file on GitHub for details of running the demo. Essentially, it just runs the unit tests, with the usual Maven build and test results output to the console. The output is the result of running the DBUnit test, which inserts Book data into the HSQL database using JPA and then uses Lucene to query the data, checking that the expected Books are returned (i.e. only those in the SCI-FI category containing the word 'Space') and that any with 'Space' in the title appear before those with 'Space' only in the description.

The Book Entity

Our simple example stores Books. The Book entity class below is a standard JPA Entity with a few additional annotations (from Hibernate Search, as the org.hibernate.search imports show) to identify it to Lucene:

@Indexed - this identifies that the class will be added to the Lucene index. You can define a specific index by adding the 'index' attribute to the annotation. We're just choosing the simplest, minimal configuration for this example. 

In addition to this, you also need to specify which properties on the entity are to be indexed, and how. For our example we again take the default option by adding an @Field annotation with no extra parameters. We add one other annotation to the 'title' field, @Boost, which tells Lucene to give more weight to search term matches in this field than to the same term appearing in the description field.

This example is purposefully kept minimal in terms of the ins-and-outs of Lucene (I may cover that in a later post) - we're really just concentrating on the integration with JPA and Spring for now.

package com.cor.demo.jpa.entity;

import javax.persistence.Entity;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

import org.hibernate.search.annotations.Boost;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;

/**
 * Book JPA Entity.
 */
@Entity
@Indexed
public class Book {

    @Id
    @GeneratedValue
    private Long id;

    @Field
    @Boost(value = 1.5f)
    private String title;

    @Field
    @Lob
    private String description;

    @Field
    @Enumerated(EnumType.STRING)
    private BookCategory category;

    public Book() {
    }

    public Book(String title, BookCategory category, String description) {
        this.title = title;
        this.category = category;
        this.description = description;
    }

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public BookCategory getCategory() {
        return category;
    }

    public void setCategory(BookCategory category) {
        this.category = category;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    @Override
    public String toString() {
        return "Book [id=" + id + ", title=" + title + ", description=" + description + ", category=" + category + "]";
    }
}
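To make the effect of @Boost a little more concrete, here is a toy, plain-Java sketch of boosted ranking. It is not Lucene's actual scoring formula (the real one weighs more than a simple match count); the BoostSketch class, its Doc type and the sample titles are all made up for illustration. It only shows that multiplying title matches by 1.5 pushes a title hit ahead of an otherwise equal description hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Toy model of field boosting. This is NOT Lucene's real scoring formula. */
final class BoostSketch {

    /** Same weight as @Boost(value = 1.5f) on Book.title. */
    static final float TITLE_BOOST = 1.5f;

    /** Hypothetical stand-in for Book, holding only the two searched fields. */
    static final class Doc {
        final String title;
        final String description;

        Doc(String title, String description) {
            this.title = title;
            this.description = description;
        }
    }

    /** Counts case-insensitive whole-word occurrences of term in text. */
    static int occurrences(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    /** Title matches are multiplied by the boost; description matches count once. */
    static float score(Doc doc, String term) {
        return TITLE_BOOST * occurrences(doc.title, term) + occurrences(doc.description, term);
    }

    /** Returns the docs that match at all, highest score first. */
    static List<Doc> rank(List<Doc> docs, String term) {
        List<Doc> hits = new ArrayList<>();
        for (Doc doc : docs) {
            if (score(doc, term) > 0) {
                hits.add(doc);
            }
        }
        hits.sort(Comparator.comparingDouble((Doc d) -> score(d, term)).reversed());
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one title-only hit, one description-only hit
        List<Doc> docs = Arrays.asList(
                new Doc("Orbit", "A novel about space"),
                new Doc("Space Station", "A novel about orbit"));
        for (Doc doc : rank(docs, "Space")) {
            System.out.println(doc.title + " -> " + score(doc, "Space"));
        }
    }
}
```

Without the boost both sample documents would tie, so the ordering the test relies on comes down to the extra weight on the title.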


The Book Manager

The BookManager class acts as a simple service layer for the Book operations: adding books and searching books. As you can see, the JPA EntityManager is injected by Spring, which is configured in application-context.xml. We are just using an in-memory HSQL database in this example.

package com.cor.demo.jpa.manager;

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.PersistenceContextType;
import javax.persistence.Query;

import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;
import org.hibernate.search.query.dsl.QueryBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

import com.cor.demo.jpa.entity.Book;
import com.cor.demo.jpa.entity.BookCategory;

/**
 * Manager for persisting and searching on Books. Uses JPA and Lucene.
 */
@Component
@Scope(value = "singleton")
public class BookManager {

    /** Logger. */
    private static final Logger LOG = LoggerFactory.getLogger(BookManager.class);

    /** JPA Persistence Unit. */
    @PersistenceContext(type = PersistenceContextType.EXTENDED, name = "booksPU")
    private EntityManager em;

    /** Hibernate Full Text Entity Manager. */
    private FullTextEntityManager ftem;

    /**
     * Method to manually update the Full Text Index. This is not required if inserting entities
     * using this Manager as they will automatically be indexed. Useful though if you need to index
     * data inserted using a different method (e.g. pre-existing data, or test data inserted via
     * scripts or DbUnit).
     */
    public void updateFullTextIndex() throws Exception {
        LOG.info("Updating Index");
        // Rebuild the index from the rows currently in the database
        getFullTextEntityManager().createIndexer().startAndWait();
    }

    /**
     * Add a Book to the Database.
     */
    @Transactional
    public Book addBook(Book book) {
        LOG.info("Adding Book : " + book);
        em.persist(book);
        return book;
    }

    /**
     * Delete All Books.
     */
    @Transactional
    public void deleteAllBooks() {

        LOG.info("Delete All Books");

        Query allBooks = em.createQuery("select b from Book b");
        List<Book> books = allBooks.getResultList();

        // We need to delete individually (rather than a bulk delete) to ensure they are removed
        // from the Lucene index correctly
        for (Book b : books) {
            em.remove(b);
        }
    }

    /**
     * List All Books (logs each one).
     */
    public void listAllBooks() {

        LOG.info("List All Books");

        Query allBooks = em.createQuery("select b from Book b");
        List<Book> books = allBooks.getResultList();

        for (Book b : books) {
            LOG.info(b.toString());
        }
    }

    /**
     * Search for a Book.
     */
    public List<Book> search(BookCategory category, String searchString) {

        LOG.info("Searching Books in category '" + category + "' for phrase '" + searchString + "'");

        // Create a Query Builder
        QueryBuilder qb = getFullTextEntityManager().getSearchFactory().buildQueryBuilder().forEntity(Book.class).get();

        // Create a Lucene Full Text Query: the search term must match, and so must the category
        org.apache.lucene.search.Query luceneQuery = qb.bool()
                .must(qb.keyword().onFields("title", "description").matching(searchString).createQuery())
                .must(qb.keyword().onField("category").matching(category).createQuery())
                .createQuery();

        Query fullTextQuery = getFullTextEntityManager().createFullTextQuery(luceneQuery, Book.class);

        // Run Query
        List<Book> result = (List<Book>) fullTextQuery.getResultList();

        // Log the Results
        LOG.info("Found Matching Books :" + result.size());
        for (Book b : result) {
            LOG.info(" - " + b);
        }

        return result;
    }

    /**
     * Convenience method to get Full Text Entity Manager. Protected scope to assist mocking in Unit
     * Tests.
     * @return Full Text Entity Manager.
     */
    protected FullTextEntityManager getFullTextEntityManager() {
        if (ftem == null) {
            ftem = Search.getFullTextEntityManager(em);
        }
        return ftem;
    }

    /**
     * Get the JPA Entity Manager (required for the DBUnit Tests).
     * @return Entity manager
     */
    protected EntityManager getEntityManager() {
        return em;
    }

    /**
     * Sets the JPA Entity Manager (required to assist with mocking in Unit Test)
     * @param em EntityManager
     */
    protected void setEntityManager(EntityManager em) {
        this.em = em;
    }
}
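The search() method restricts results to one category and to Books whose title or description contains the search term, and a Book has to satisfy both conditions. As a rough mental model only, here is the same filter written against an in-memory list in plain Java. The SearchFilterSketch class, its Category enum and Doc type are hypothetical stand-ins, the sample rows are abbreviated from the test data, and the whole-word matching is just an approximation of what Lucene's analyzer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** In-memory stand-in for the search() filter logic. Plain Java, not Lucene. */
final class SearchFilterSketch {

    /** Hypothetical mirror of the BookCategory values used in the test data. */
    enum Category { FANTASY, SCIFI }

    /** Hypothetical stand-in for the Book entity. */
    static final class Doc {
        final String title;
        final String description;
        final Category category;

        Doc(String title, String description, Category category) {
            this.title = title;
            this.description = description;
            this.category = category;
        }
    }

    /** True if text contains term as a whole word, ignoring case. */
    static boolean containsWord(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")).contains(wanted);
    }

    /** Both conditions have to hold: keyword hit AND category match. */
    static boolean matches(Doc doc, Category category, String term) {
        boolean keywordHit = containsWord(doc.title, term) || containsWord(doc.description, term);
        return keywordHit && doc.category == category;
    }

    /** Titles of all matching docs, in input order (no relevance ranking here). */
    static List<String> searchTitles(List<Doc> docs, Category category, String term) {
        List<String> titles = new ArrayList<>();
        for (Doc doc : docs) {
            if (matches(doc, category, term)) {
                titles.add(doc.title);
            }
        }
        return titles;
    }

    /** Abbreviated copies of four rows from the test dataset. */
    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc("The War of the Worlds", "War in space", Category.FANTASY),
                new Doc("Apollo 13", "Seventh manned mission in the American Apollo space program", Category.SCIFI),
                new Doc("2001: A Space Oddysey", "A 1968 science fiction film", Category.SCIFI),
                new Doc("Dune", "A 1984 science fiction film", Category.SCIFI));
    }

    public static void main(String[] args) {
        System.out.println(searchTitles(sampleDocs(), Category.SCIFI, "Space"));
    }
}
```

'The War of the Worlds' mentions space but is filed under FANTASY, and 'Dune' is SCIFI but never mentions space, so neither passes both checks.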



This is the Spring configuration file. In the JPA Entity Manager configuration, the key 'hibernate.search.default.indexBase' is added to the jpaPropertyMap to tell Lucene where to create the index. We have also externalised the database login credentials to a properties file, as you may wish to change these for different environments (for example, by updating the propertyConfigurer to look for and use a different external properties file if it finds one on the file system).

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:p="http://www.springframework.org/schema/p"
    xmlns:camel="http://camel.apache.org/schema/spring" xmlns:tx="http://www.springframework.org/schema/tx"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
        http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd">

    <!-- Spring Component Package Scan -->
    <context:component-scan base-package="com.cor.demo.jpa" />

    <!-- Property configuration -->
    <bean id="propertyConfigurer"
        class="..."
        p:ignoreUnresolvablePlaceholders="true" p:ignoreResourceNotFound="true">
        <property name="locations">
            <list>
                <value>...</value>
            </list>
        </property>
    </bean>

    <!-- JPA Entity Manager Factory -->
    <bean id="entityManagerFactory"
        class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="dataSource" ref="dataSource" />
        <!-- <property name="packagesToScan" value="com.cor.demo.jpa.entity" /> -->
        <property name="jpaVendorAdapter">
            <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
                <property name="showSql" value="true" />
                <property name="generateDdl" value="true" />
            </bean>
        </property>
        <property name="jpaPropertyMap">
            <map>
                <entry key="hibernate.hbm2ddl.auto" value="update" />
                <entry key="hibernate.format_sql" value="true" />
                <entry key="hibernate.use_sql_comments" value="false" />
                <entry key="hibernate.show_sql" value="false" />
                <entry key="hibernate.search.default.indexBase" value="/var/lucene/indexes" />
            </map>
        </property>
    </bean>

    <!-- JPA Data Source -->
    <bean id="dataSource"
        class="...">
        <property name="driverClassName" value="${database.driver}" />
        <property name="url" value="${database.url}" />
        <property name="username" value="${database.username}" />
        <property name="password" value="${database.password}" />
    </bean>

    <!-- Transaction Manager -->
    <bean id="txManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <tx:annotation-driven transaction-manager="txManager" />

</beans>
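The idea of layering an optional external properties file over the bundled defaults can be sketched in plain Java with java.util.Properties. This is only an illustration of the principle, not what Spring's property configurer does internally; the LayeredPropertiesSketch class, the default values and the 'override.properties' path are all assumptions made up for the example (only the database.* key names come from the configuration above).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Sketch of "defaults first, optional external file on top". */
final class LayeredPropertiesSketch {

    /** Copies the defaults, then lets entries parsed from overrideText replace them. */
    static Properties merge(Properties defaults, String overrideText) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        try {
            // For a key present in both, the value loaded later wins
            merged.load(new StringReader(overrideText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return merged;
    }

    /** Uses the external file when present; a missing file is silently ignored. */
    static Properties load(Properties defaults, Path externalFile) {
        if (!Files.isRegularFile(externalFile)) {
            return merge(defaults, "");
        }
        try {
            byte[] bytes = Files.readAllBytes(externalFile);
            return merge(defaults, new String(bytes, StandardCharsets.ISO_8859_1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("database.username", "sa");              // hypothetical value
        defaults.setProperty("database.url", "jdbc:hsqldb:mem:demo");  // hypothetical value
        // Hypothetical location of an environment-specific override file
        Path external = Paths.get(args.length > 0 ? args[0] : "override.properties");
        System.out.println(load(defaults, external));
    }
}
```

Ignoring a missing file here plays the same role as ignoreResourceNotFound="true" on the propertyConfigurer: the application still starts with its defaults when no override is present.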


Testing Using DBUnit

The project includes an example of using DBUnit with Spring to test adding and searching against the database: DBUnit populates the database with test data, the test exercises the Book Manager search operations, and the database is then cleaned down. This is a great way to test database functionality and can be easily integrated into Maven and continuous build environments.

Because DBUnit bypasses the standard JPA insertion calls, the data does not get automatically added to the Lucene index. We have a method exposed on the service interface to update the full text index, 'updateFullTextIndex()'. Calling this causes Lucene to update the index with the current data in the database. This can also be useful when you are adding search to a pre-populated database and need to index the existing content.

package com.cor.demo.jpa.manager;

import java.io.InputStream;
import java.util.List;

import org.dbunit.DBTestCase;
import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;
import org.hibernate.impl.SessionImpl;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

import com.cor.demo.jpa.entity.Book;
import com.cor.demo.jpa.entity.BookCategory;

/**
 * DBUnit Test - loads data defined in 'test-data-set.xml' into the database to run tests against the
 * BookManager. More thorough (and ultimately easier in this context) than using mocks.
 */
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:/application-context.xml" })
public class BookManagerDBUnitTest extends DBTestCase {

    /** Logger. */
    private static final Logger LOG = LoggerFactory.getLogger(BookManagerDBUnitTest.class);

    /** Book Manager Under Test. */
    @Autowired
    private BookManager bookManager;

    @Before
    public void setup() throws Exception {
        DatabaseOperation.CLEAN_INSERT.execute(getDatabaseConnection(), getDataSet());
    }

    @After
    public void tearDown() {
        deleteBooks();
    }

    @Override
    protected IDataSet getDataSet() throws Exception {
        InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("test-data-set.xml");
        FlatXmlDataSetBuilder builder = new FlatXmlDataSetBuilder();
        return builder.build(inputStream);
    }

    /**
     * Get the underlying database connection from the JPA Entity Manager (DBUnit needs this connection).
     * @return Database Connection
     * @throws Exception
     */
    private IDatabaseConnection getDatabaseConnection() throws Exception {
        return new DatabaseConnection(((SessionImpl) (bookManager.getEntityManager().getDelegate())).connection());
    }

    /**
     * Tests the expected results for searching for 'Space' in SCI-FI books.
     */
    @Test
    public void testSciFiBookSearch() throws Exception {

        // DBUnit inserted the rows directly, so index them before searching
        bookManager.updateFullTextIndex();

        List<Book> results = bookManager.search(BookCategory.SCIFI, "Space");

        assertEquals("Expected 2 results for SCI FI search for 'Space'", 2, results.size());
        assertEquals("Expected 1st result to be '2001: A Space Oddysey'", "2001: A Space Oddysey", results.get(0).getTitle());
        assertEquals("Expected 2nd result to be 'Apollo 13'", "Apollo 13", results.get(1).getTitle());
    }

    private void deleteBooks() {
        LOG.info("Deleting Books...-");
        bookManager.deleteAllBooks();
    }
}
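Why rows inserted by DBUnit are invisible to search until updateFullTextIndex() is called can be shown with a toy model in plain Java. Nothing here is Hibernate Search or Lucene code: IndexSyncSketch and its method names are invented for the illustration, with a map standing in for the BOOK table and a second map standing in for the index.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a table plus a separately maintained search index. */
final class IndexSyncSketch {

    /** Stands in for the BOOK table: id -> title. */
    private final Map<Long, String> table = new LinkedHashMap<>();

    /** Stands in for the Lucene index: lower-cased word -> ids of rows containing it. */
    private final Map<String, Set<Long>> index = new HashMap<>();

    /** Like saving through the manager: the row is stored and indexed together. */
    void persist(long id, String title) {
        table.put(id, title);
        addToIndex(id, title);
    }

    /** Like a DBUnit insert: the row goes straight into the table, the index never hears of it. */
    void insertDirectly(long id, String title) {
        table.put(id, title);
    }

    /** Like updateFullTextIndex(): throw the index away and rebuild it from the table. */
    void rebuildIndex() {
        index.clear();
        for (Map.Entry<Long, String> row : table.entrySet()) {
            addToIndex(row.getKey(), row.getValue());
        }
    }

    /** Looks the term up in the index only; the table itself is never scanned. */
    Set<Long> search(String term) {
        Set<Long> ids = index.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.<Long>emptySet() : ids;
    }

    private void addToIndex(long id, String title) {
        for (String token : title.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                index.computeIfAbsent(token, key -> new TreeSet<>()).add(id);
            }
        }
    }

    public static void main(String[] args) {
        IndexSyncSketch store = new IndexSyncSketch();
        store.insertDirectly(3L, "Apollo 13");
        System.out.println("Before rebuild: " + store.search("apollo"));
        store.rebuildIndex();
        System.out.println("After rebuild: " + store.search("apollo"));
    }
}
```

The same reasoning explains the comment in deleteAllBooks(): a change that does not go through the code path that maintains the index leaves the index out of step with the table.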


The source data for the test is defined in an XML file (the 'test-data-set.xml' loaded by the test above).

<?xml version='1.0' encoding='UTF-8'?>
<!-- Test Dataset - mix of FANTASY and SCI-FI to support the BookManagerDBUnitTest -->
<dataset>
  <book id="1" title="The Lord of the Rings"
    description="the Lord of the Rings is an epic high fantasy novel written by English philologist and University of Oxford professor J. R. R. Tolkien"
    category="FANTASY" />
  <book id="2" title="The War of the Worlds" description="War in space"
    category="FANTASY" />
  <book id="3" title="Apollo 13"
    description="Apollo 13 was the seventh manned mission in the American Apollo space program and the third intended to land on the Moon"
    category="SCIFI" />
  <book id="4" title="2001: A Space Oddysey"
    description="2001: A Space Odyssey is a 1968 British-American science fiction film produced and directed by Stanley Kubrick"
    category="SCIFI" />
  <book id="5" title="Dune"
    description="Dune is a 1984 science fiction film written and directed by David Lynch, based on the 1965 Frank Herbert novel of the same name."
    category="SCIFI" />
</dataset>
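In DBUnit's flat XML format each element is one row: the element name is the table and the attributes are the column values. As a small illustration of that mapping, here is a sketch that reads such a file with the JDK's own DOM parser. It is not how DBUnit itself loads the data; FlatXmlSketch and its Row type are hypothetical names used only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Reads a flat XML dataset into simple row objects using only the JDK. */
final class FlatXmlSketch {

    /** One row: the table it belongs to plus its column values. */
    static final class Row {
        final String table;
        final Map<String, String> columns = new LinkedHashMap<>();

        Row(String table) {
            this.table = table;
        }
    }

    /** Turns every child element of the root into a Row. */
    static List<Row> rows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Row> rows = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace and comments between rows
                }
                Row row = new Row(child.getNodeName());
                NamedNodeMap attributes = child.getAttributes();
                for (int a = 0; a < attributes.getLength(); a++) {
                    Node attribute = attributes.item(a);
                    row.columns.put(attribute.getNodeName(), attribute.getNodeValue());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable flat XML dataset", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<dataset>"
                + "<book id=\"3\" title=\"Apollo 13\" category=\"SCIFI\" />"
                + "<book id=\"5\" title=\"Dune\" category=\"SCIFI\" />"
                + "</dataset>";
        for (Row row : rows(xml)) {
            System.out.println(row.table + " " + row.columns);
        }
    }
}
```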


Published at DZone with permission of
