Using Spring Data Neo4j to Build Recommendation Systems

Today we would like to introduce you to Spring Data Neo4j. To this end, we implemented a small showcase application set in the context of a shop system. In such a system it is useful to calculate which products other users also viewed, a feature known from many popular e-commerce websites such as Amazon. As the connections between users and products are naturally represented as a graph, we decided to use Neo4j to store the nodes and the relationships between them.

What is Spring Data Neo4j?

First of all, what is Spring Data? It’s a SpringSource project that aims to provide Spring’s programming model and conventions for NoSQL databases. Spring Data supports different NoSQL databases such as Redis, Riak, MongoDB and others. It also provides an abstraction layer for map-reduce implementations like Hadoop.

A Spring Data Neo4j initiative has existed since 2010. Some resources can be found on the Neo4j website. The best guidebook available at the moment is ‘Good Relationships’ by Michael Hunger, the lead developer of this project. It is free to download and also available as an HTML version. Some Spring Data Neo4j code examples are in the Spring Data Neo4j Git repository. There is also an O’Reilly book about Spring Data Neo4j.

Why Not Use Core Neo4j?

Of course, it would also be possible to just use core Neo4j or build your own integration. But if you have experience with other Spring projects, you will recognize the benefits. As a software engineer, you generally do not want to deal with technical details such as entity mapping or transaction management. You only need to know the concepts, and Spring handles the rest in the background. It is comparable to the Hibernate support in Spring:

  • Common Spring and Spring Data infrastructure. It is very easy to embed Neo4j in existing applications managed by the Spring framework.
  • Annotations to declare the nodes and their relationships.
  • Code much easier to comprehend.
  • Entity state is backed by graph database.
  • Support of Neo4j server.

How to Use Spring Data Neo4j?

If you are using Maven, you can include Spring Data Neo4j in your project by adding this to your pom.xml (besides the dependencies for Spring and Neo4j):

<!-- Spring Data Neo4j -->
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-neo4j</artifactId>
    <version>2.2.0.RELEASE</version>
</dependency>

Neo4j is set up with the following Spring context:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:neo4j="http://www.springframework.org/schema/data/neo4j"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd
        http://www.springframework.org/schema/data/neo4j http://www.springframework.org/schema/data/neo4j/spring-neo4j.xsd">

    <context:spring-configured/>
    <context:annotation-config/>

    <neo4j:config storeDirectory="target/data/db"/>

    <neo4j:repositories base-package="com.comsysto.neo4j.showcase"/>
</beans>

The “storeDirectory” property of the Neo4j configuration can be a folder of your choice. The Neo4j database will be stored there locally. Afterwards, you can start to implement your node and relationship entities representing your graph model.

How to Declare Node Entities?

For node entities create a new class looking like this:

@NodeEntity
public class Product {

    @GraphId
    private Long graphId;

    @Indexed(unique = true)
    private String productId;

    @Indexed(indexType = IndexType.FULLTEXT, indexName = "productName")
    private String productName;

    @RelatedToVia(type = RelationshipTypes.RECOMMEND)
    private Set<RecommendRelationship> productsRecommendRelationships = new HashSet<RecommendRelationship>();

    public Product() {/* NOOP */}

    public Product(String productId, String productName) {
        this.productId = productId;
        this.productName = productName;
    }

    // all getters and setters (for graphId only getter)
    ...

    // further methods
    ...

    // @Override methods like toString(), equals(Object o), hashCode()
    ...
}

The getters and setters are essential for property access in the Spring framework. You can also implement your own additional methods. But a typical Neo4j entity is a classical JavaBean consisting of properties and their accessors. It is also recommended to implement equals and hashCode, as in some cases Spring Data Neo4j compares objects for the entity and relationship mapping.
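The equals and hashCode recommendation could be sketched as follows. This is only an illustration under some assumptions: the class name ProductSketch is hypothetical, the Spring Data Neo4j annotations are left out so that the snippet compiles with the JDK alone, and equality is based on the business key productId rather than on graphId, which is only assigned once the entity has been saved.

```java
import java.util.Objects;

// Hypothetical stand-in for the Product entity; the Spring Data Neo4j
// annotations are omitted so that the sketch compiles with the JDK alone.
final class ProductSketch {

    private Long graphId;            // assigned by the database, null before saving
    private final String productId;  // business key, unique per product
    private final String productName;

    ProductSketch(String productId, String productName) {
        this.productId = productId;
        this.productName = productName;
    }

    Long getGraphId() { return graphId; }
    String getProductId() { return productId; }
    String getProductName() { return productName; }

    // Two products are equal when they share the same business key.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ProductSketch)) return false;
        return Objects.equals(productId, ((ProductSketch) o).productId);
    }

    // Must be consistent with equals, so it hashes the same field.
    @Override
    public int hashCode() {
        return Objects.hashCode(productId);
    }
}
```

With this definition, two ProductSketch instances carrying the same productId collapse into a single element when added to a HashSet, even if one of them has not been saved yet.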

How to Create Relationships Between Node Entities?

Spring Data Neo4j can handle relationships between nodes in three different ways. Which one you choose depends on two aspects: the cardinality of the relationship (1:1 vs. 1:n) and whether it is a simple or a rich relationship. Rich relationships carry additional attributes.

1:1 relationships are very simple: the child node simply has to be referenced in the parent node as a property (accessors have to be implemented). No annotation is required; Spring Data handles the rest.

In simple 1:n relationships you have to add a set containing the child nodes to the parent node (see above). In addition, you have to annotate this set with @RelatedTo(type = "relationshipType") like this:

@NodeEntity
public class Product extends IdentifiableEntity {

    @RelatedTo(type = "buyer")
    Set<Customer> customers;

    //getters and setters, equals and hashCode
}

Rich relationships demonstrate the full graph model power of Spring Data. You can model real world relationships with additional attributes. To do this, you have to create a relationship entity storing the relation properties. Furthermore, both the parent and child node need to be specified by annotations (@StartNode, @EndNode). The following example demonstrates this:

@NodeEntity
public class User extends IdentifiableEntity {

    @RelatedToVia(type = "clickProductsRelationship")
    Set<ClickedProductRelationship> clickedProducts;
}

@RelationshipEntity(type = "clickProductsRelationship")
public class ClickedProductRelationship {

    // the relationship starts at the User that declares it
    @StartNode
    @Fetch
    User start;

    @EndNode
    @Fetch
    Product end;

    int clicksCount;

    //getters and setters, equals and hashCode
}

Pay attention to the @Fetch annotation. In many cases, not all related objects need to be available after you have loaded an entity. By default, Spring Data Neo4j only fetches a list of IDs identifying the related objects when a node entity is loaded. This approach is comparable to the lazy loading mechanism known from other frameworks such as Hibernate. To have the related objects loaded as well, add the @Fetch annotation to the corresponding field.

In general, it is good practice to define the relationship types as constants in a separate class, as these strings will be used in different code fragments in different classes. Our graph model has two relationship types between the nodes. The class looks like this:

public final class RelationshipTypes {
    public static final String CLICKED = "CLICKED";
    public static final String RECOMMEND = "RECOMMEND";
}

How to Load Node Entities and Their Corresponding Relationships?

To access the node entities and relationships, we create our own interface extending Spring Data Neo4j’s GraphRepository interface. It looks like this:

public interface ProductRepository extends GraphRepository<Product> {

    Product findByProductId(String productId);

    List<Product> findByProductNameLike(String productName);

    @Query("START product=node(*) " +
            "WHERE HAS (product.productName) " +
            "RETURN product " +
            "ORDER BY product.productName")
    List<Product> findAllProductsSortedByName();

    @Query("START product=node:Product(productId={productId}) " +
            "MATCH product-[recommend:RECOMMEND]->otherProduct " +
            "RETURN otherProduct " +
            "ORDER BY recommend.count DESC " +
            "LIMIT 5 ")
    List<Product> findOtherUsersAlsoViewedProducts(@Param("productId") String productId);

    // more queries here
    ...
}

As you can see in the code, the graph repository is an interface that only defines the method names, return types and the Cypher query (if required). This is very handy, as the framework does not need a corresponding implementation. Based on these declarations and the parent interface “GraphRepository”, Spring Data can create proxy objects that interact with the Neo4j core API.

For more details about the possibilities, have a look at the repository documentation. If you are not familiar with Cypher, have a look at this Cypher tutorial. Nevertheless, a custom implementation of your GraphRepository may sometimes be necessary. In this case you can write a class that implements the interface.
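To make the idea of an interface without an implementation more tangible, here is a minimal sketch using a plain JDK dynamic proxy. It is not how Spring Data Neo4j is implemented internally; the names NameLookupRepository and RepositoryProxySketch are hypothetical, and an in-memory map stands in for the graph database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;

// Hypothetical repository interface: only a method signature, no implementation.
interface NameLookupRepository {
    String findByProductId(String productId);
}

final class RepositoryProxySketch {

    // Creates a proxy that answers findBy... calls from an in-memory map
    // (the map stands in for the database in this sketch).
    static NameLookupRepository create(Map<String, String> namesByProductId) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Dispatch on the method name, as a finder convention would.
            if (method.getName().startsWith("findBy") && args != null && args.length == 1) {
                return namesByProductId.get(args[0]);
            }
            // Everything else (including toString/equals/hashCode) is not
            // handled in this sketch.
            throw new UnsupportedOperationException(method.getName());
        };
        return (NameLookupRepository) Proxy.newProxyInstance(
                NameLookupRepository.class.getClassLoader(),
                new Class<?>[] {NameLookupRepository.class},
                handler);
    }
}
```

Calling RepositoryProxySketch.create(map).findByProductId("p-1") returns whatever the map holds for "p-1", although no class ever implements the interface by hand. A real framework additionally parses the method name or the @Query annotation to build the actual database query.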

What are the Limitations of Spring Data Neo4j?

During the preparation of this article we found some limitations of Spring Data Neo4j:

  • At the moment it is not possible to run Cypher queries that contain DISTINCT and ORDER BY in the same query.
  • If you run into hard-to-diagnose issues, the log messages are not very helpful.

What is Our Demo About?

The demo shows the graph repository part of a shop system that features products that other users also viewed. For example, if users who visited the ‘Pizza Vegetarian’ page had a look at the ‘Pizza Rustica’ page afterwards because they were looking for a vegetarian pizza, it may also be helpful for other users to have a look at the ‘Pizza Rustica’.
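The “also viewed” logic can be illustrated without a database. The following plain-Java sketch is an assumption about how such counts might be maintained, not the showcase’s actual code: the class AlsoViewedSketch and its methods are hypothetical, and a nested map stands in for the RECOMMEND relationships and their count property.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AlsoViewedSketch {

    // recommendCounts.get(a).get(b) = how often b was viewed directly after a
    private final Map<String, Map<String, Integer>> recommendCounts = new HashMap<>();

    // Records one user's page views in order and strengthens the edge
    // between each product and the one viewed right after it.
    void recordViews(List<String> viewedProductIds) {
        for (int i = 0; i + 1 < viewedProductIds.size(); i++) {
            String from = viewedProductIds.get(i);
            String to = viewedProductIds.get(i + 1);
            if (from.equals(to)) {
                continue; // ignore reloads of the same page
            }
            recommendCounts
                    .computeIfAbsent(from, key -> new HashMap<>())
                    .merge(to, 1, Integer::sum);
        }
    }

    // Returns up to 'limit' product ids, most frequently followed first
    // (ties are broken alphabetically to keep the result deterministic).
    List<String> alsoViewed(String productId, int limit) {
        Map<String, Integer> counts =
                recommendCounts.getOrDefault(productId, new HashMap<>());
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // descending
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (result.size() == limit) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

Sorting by count in descending order and cutting off after a fixed number of entries mirrors the ORDER BY recommend.count DESC and LIMIT 5 clauses of the Cypher query shown earlier.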

You can find some test cases in the SpringDataNeo4jProductUserTest class.

What do you Need to Run this Demo Yourself?

To run the demo you need Maven installed on your computer. If you are not familiar with Maven, please read this article about how to run Maven.

The code is available on GitHub: https://github.com/comsysto/spring-data-neo4j-showcase

What may Help if you Have Problems?

As we prepared this demo, we faced some challenges that you may also run into:

  • Be careful with the indexes. If you are using IDs, they should be declared as unique, as in the code above.
  • If you want a relationship that has attributes, use @RelatedToVia.
  • Initialize all Sets, for example with a HashSet, as in the code above.
  • Don’t use the graphId as a business identifier. GraphIds of deleted nodes can be reused by Neo4j and may then represent a different object.
  • Use @Fetch for the start and end node of relationships. Otherwise they will not be fetched when an entity is loaded.
  • Don’t add logic to getters and setters, as they are used by the framework to save values. Use extra methods for this.
  • Start Cypher queries with a simple statement and extend it until you get what you want.
  • Pay attention to String escaping in Cypher queries. String escaping in the Neo4j core API differs from that in Spring Data.
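Two of these tips, initializing sets and keeping accessors free of logic, can be sketched in plain Java. The class ProductWithRecommendations is hypothetical, and the relationship set is simplified to a set of product ids so that the snippet compiles without Spring Data Neo4j.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified entity: relationship entities are replaced by ids.
final class ProductWithRecommendations {

    // Initialized eagerly so that adding an element never hits a null set.
    private Set<String> recommendedProductIds = new HashSet<>();

    // Plain accessors without extra logic, so a framework can call them safely.
    Set<String> getRecommendedProductIds() {
        return recommendedProductIds;
    }

    void setRecommendedProductIds(Set<String> recommendedProductIds) {
        this.recommendedProductIds = recommendedProductIds;
    }

    // Extra method that carries the validation instead of the setter.
    // Returns true only if the id was valid and not yet present.
    boolean recommend(String productId) {
        if (productId == null || productId.isEmpty()) {
            return false;
        }
        return recommendedProductIds.add(productId);
    }
}
```

Keeping validation in recommend rather than in the setter means that values written back by a mapping framework are stored exactly as they were read.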

How can this Demo be Extended?

You can extend this demo as you wish. For example, you can add more fields to the node entities and use their values for further recommendations.

Any Questions?

If you have any feedback, please write to Roger.Kowalewski@comsysto.com or Elisabeth.Engel@comsysto.com!

Where can you Learn More About Neo4j?

We are offering a Neo4j tutorial, which will take place in our head office on 14th November. A second one will be held on 17th December. This tutorial covers the core functionality of the Neo4j graph database. With a mixture of theory and hands-on practice sessions, attendees will quickly learn how easy it is to develop a Neo4j-backed application. For further information, please have a look at our event page. If you want to register for November’s tutorial, please follow this link to Eventbrite! With the promotional code “JUGM20” you can get 20% off.

Published at DZone with permission of Comsysto Gmbh, DZone MVB. See the original article here.
