Neo4j Applied: Modeling with Spring Data Neo4j

Neo4j Applied: Modeling with SDN

By Jonas Partner, Aleksa Vukotic, and Nicki Watts, authors of Neo4j in Action

The Spring Data Neo4j (SDN) framework was created to make life easier for developers who need to work with POJO-based domain models backed by Neo4j. It aims to handle, and shield you from, all the low-level plumbing and mapping logic involved in reading and writing domain entities into and out of Neo4j, freeing you to focus on the code that makes you (or your company) money: the business logic. In this article, based on chapter 8 of Neo4j in Action, the authors explain domain modeling with SDN.

In this article, we'll step through a process that shows how SDN can be used to turn POJOs into the various entities of your movie-lovers' social network, with the data stored in your trusty Neo4j database.

Your journey will be as follows:

  1. Define a standard POJO object model to represent your domain.
  2. See what is required by SDN to transform these POJOs into entities backed by Neo4j.
  3. Dig a little deeper into various elements of SDN modeling, including:
    • Modeling node entities
    • Modeling relationship entities
    • Modeling relationships between node entities

To recap, in this social network, users can be friends with each other. Users can also mark the movies they've seen, and rate them with one to five stars based on how much they liked them. You're going to add a userId property that users can use to log in to the system. This will also allow you to uniquely identify and refer to each user. Finally, you're also going to add the ability for new users to indicate whether an existing member was the reason he or she is joining the system, that is, whether he or she was referred by anyone at joining time. This could be used, for example, to accumulate points for the referrer for each movie rated by new members within their first month, potentially leading to a free movie ticket or some other benefit. Figure 1 illustrates that John originally joined the network because he was referred by David.

Figure 1 Conceptual overview of movie-lovers' social network with referrals

Initial POJO domain modeling

If you ignore the fact for a moment that your data is actually being stored in Neo4j, and take a simple stab at modeling your conceptual domain as POJOs, your first attempt may look something like the following listing.

Listing 1 Initial POJO modeling attempt

import java.util.Set;

public class User {
    String userId;
    String name;
    Set<User> friends;                      #1
    Set<Viewing> views;                     #A
    User referredBy;                        #2
}

public class Movie {
    String title;
    Set<Viewing> views;                     #B
}

public class Viewing {
    User user;
    Movie movie;
    Integer stars;                          #C
}

#1 All of the user's friends; getters, setters, and constructors are omitted for brevity
#A All movie viewings (with star ratings) that user has performed
#2 User who referred this user to the system
#B All users who have viewed this movie (with star ratings)
#C Rating (one to five stars) user gave for associated movie

There is nothing too complicated here; this is basic object modeling. Both the User and Movie concepts have been modeled as first-class entities, which seems reasonable enough. You've also modeled a user's viewing of a particular movie as an entity, namely the Viewing class. This is primarily because the relationship itself holds information that you want to retain and use: the stars rating provided by the user. If you had modeled this as a simple collection-type relationship between User and Movie, you would lose this information.
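
To make that last point concrete, here is a minimal plain-Java sketch of the same three classes in use. The constructors, the rate helper, and the sample values ("david001", "Example Movie") are assumptions added for illustration; they are not part of listing 1.

```java
import java.util.HashSet;
import java.util.Set;

// Plain POJOs only: nothing here depends on SDN or Neo4j.
class User {
    final String userId;
    final String name;
    final Set<User> friends = new HashSet<User>();
    final Set<Viewing> views = new HashSet<Viewing>();
    User referredBy; // null when nobody referred this user

    User(String userId, String name) {
        this.userId = userId;
        this.name = name;
    }

    // Hypothetical helper: records that this user has seen (and rated) a movie.
    Viewing rate(Movie movie, Integer stars) {
        Viewing viewing = new Viewing(this, movie, stars);
        views.add(viewing);
        movie.views.add(viewing);
        return viewing;
    }
}

class Movie {
    final String title;
    final Set<Viewing> views = new HashSet<Viewing>();

    Movie(String title) {
        this.title = title;
    }
}

// The viewing is its own class so that the star rating has somewhere to live.
class Viewing {
    final User user;
    final Movie movie;
    final Integer stars; // null when the user has not rated the movie

    Viewing(User user, Movie movie, Integer stars) {
        this.user = user;
        this.movie = movie;
        this.stars = stars;
    }
}

class PojoModelDemo {
    public static void main(String[] args) {
        User david = new User("david001", "David");
        User john = new User("john001", "John");
        john.referredBy = david;
        john.friends.add(david);
        david.friends.add(john);

        Movie movie = new Movie("Example Movie");
        Viewing viewing = john.rate(movie, 4);
        System.out.println(viewing.user.name + " gave " + viewing.movie.title
                + " " + viewing.stars + " stars");
    }
}
```

Had views been a plain Set of Movie objects on User, there would be no place to keep the 4 in this example.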

Note that at this point there is still no reference to any SDN- or Neo4j-specific concepts, just POJO stuff. Next, you'll need to map these entities into the underlying Neo4j graph model. So how close have you gotten to creating a POJO model that is easily translatable into your Neo4j world? Can you use it as is, or does it need any modifications? In this particular case you appear to have a good fit, with the User and Movie classes mapping neatly onto the Neo4j node primitive, with an associated name and title property, respectively.

The new referredBy relationship (#2) is represented as a reference to the user who did the referring, while the IS_FRIEND_OF relationship (#1) between users maps nicely to the set of friends. The only tricky part seems to be the modeling surrounding the Viewing class, which is trying to represent the scenario where a user has seen a movie and optionally rated it. Further inspection, however, reveals that this also fits perfectly into the Neo4j relationship concept. The Viewing class represents the HAS_SEEN relationship with its optional stars property, as well as the User who viewed the movie and the Movie reference itself.

So far so good; now it is time to actually do the mapping with SDN!

NOTE It will not always be possible to find a logical POJO model that is so closely tied to the physical Neo4j structure. In this case, you were quite fortunate, but in other cases you may have to adapt your model to fit as needed. In any case, what it does highlight is how general POJO modeling concepts do actually translate relatively well into Neo4j structures.

Annotating the domain model

SDN is an annotation-based object graph-mapping library. This means that it is a library that relies on being able to recognize certain SDN-specific annotations scattered throughout your code. These annotations are then treated as nuggets of information (that is, metadata) with instructions on how to transform the various parts of code they're attached to, to their underlying structure in the graph. Sometimes you may even find that you don't need to annotate certain pieces of code. This is because in such cases, SDN tries to infer some sensible defaults, applying the principle of convention over configuration. So object graph mapping is to graphs what ORM is to an RDBMS!

The next listing shows the next step in the process, where various SDN annotations have been added to the POJOs to identify them as entities backed by Neo4j.

Listing 2 SDN domain model

import java.util.Set;
import org.neo4j.graphdb.Direction;
import org.springframework.data.neo4j.annotation.*;

@NodeEntity                                                         #A
public class User {

    String name;                                                    #B
    @Indexed                                                        #B
    @GraphProperty                                                  #B, #C
    String userId;                                                  #B

    @GraphId                                                        #D
    Long nodeId;                                                    #D

    User referredBy;                                                #E
    @RelatedTo(type = "IS_FRIEND_OF", direction = Direction.BOTH)   #E
    Set<User> friends;                                              #E
    @RelatedToVia                                                   #E
    Set<Viewing> views;                                             #E
}

@NodeEntity                                                         #A
public class Movie {

    String title;                                                   #B
    @GraphId                                                        #D
    Long nodeId;                                                    #D

    @RelatedToVia(direction = Direction.INCOMING)                   #E
    Iterable<Viewing> views;                                        #E
}

@RelationshipEntity(type = "HAS_SEEN")                              #F
public class Viewing {

    Integer stars;                                                  #G
    @GraphId                                                        #H
    Long relationshipId;                                            #H

    @StartNode                                                      #I
    User user;                                                      #I
    @EndNode                                                        #I
    Movie movie;                                                    #I
}

#A Maps to Neo4j node
#B Stored as node properties within graph (where no annotation is present, SDN assumes the name of the Java field is the same as the name of the property stored on the node)
#C Optional annotation indicating this is a property on a node
#D Neo4j node ID
#E Relationships to other node entities involving this node (where no annotation is present, SDN assumes the name of the Java field is the same as the name of the relationship type)
#F Maps to Neo4j relationship
#G Stored as relationship properties within graph
#H Neo4j relationship ID
#I References to node entities on either side of relationship

These annotations, along with sensible defaults assumed by SDN based on field names, and so on, directly tie elements of your Java class to physical entities in Neo4j. This means it is imperative you have a very good understanding of your Neo4j data model when modeling with SDN. Although SDN shields you from having to do the actual low-level mappings yourself, it expects you to be able to describe how it should be done!
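
The combination of explicit annotations and name-based defaults can be illustrated with plain Java reflection. The following is a rough sketch of the general idea only, not SDN's implementation: the @StoredAs annotation, the Person class, and the storedName helper are all hypothetical names invented for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical stand-in annotation; it is NOT part of SDN.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StoredAs {
    String value();
}

class Person {
    @StoredAs("full_name")
    String name;   // explicit metadata wins

    String userId; // no annotation: the field name is the default
}

class MetadataDemo {
    // Returns the annotated name if present, otherwise the Java field name.
    static String storedName(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            StoredAs storedAs = field.getAnnotation(StoredAs.class);
            return storedAs != null ? storedAs.value() : field.getName();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(storedName(Person.class, "name"));   // prints full_name
        System.out.println(storedName(Person.class, "userId")); // prints userId
    }
}
```

The annotation has to be retained at runtime for a library to see it, and the fallback to the field name is the "convention over configuration" part.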

The sections that follow look to break down listing 2 a bit more and explain some of the core modeling concepts in more detail. We'll be covering:

  • Modeling node entities
  • Modeling relationship entities
  • Modeling relationships between node entities

Modeling node entities

Within SDN, a node entity refers to a Java class that is being used to represent a particular domain entity that is ultimately represented and backed by a Neo4j node primitive in the underlying graph database. Figure 2 highlights some candidate nodes within your social network domain model that could be modeled as SDN node entities.

Figure 2 Social network model with nodes highlighted

The Movie and User classes are perfect examples here. The @NodeEntity annotation is used to mark a class as a node entity, generally being placed just before the Java class definition as shown in listing 2 and in the following snippet:

@NodeEntity 						#A         	
public class User { 

    String name; 					#1 
    
    @Indexed                                      #2
    @GraphProperty                                #3            
    String userId;                                #3
    @GraphId						#4
    Long nodeId;                                  #4
    . . .
}

#A Class backed by a Neo4j node
#1 Maps to name property on underlying node
#2 Ensures userId field is indexed
#3 Optional annotation, maps to userId property on underlying node
#4 Property containing Neo4j node ID

Properties

Within a class annotated with @NodeEntity, by default SDN will treat all simple fields as being representative of Neo4j properties on the backing node. In this context, simple is any primitive or its associated wrapper class, strings, or objects capable of being converted to a string through a Spring conversion service. Collections of any of these previously mentioned types are also included, where these collections are ultimately stored as an array on the Neo4j node. In the preceding domain example, this means that the name (#1) and title fields defined on the User and Movie classes, respectively, will get mapped as Neo4j properties with those same names without you needing to lift a finger! Having said that, you can optionally annotate fields with @GraphProperty if you'd like to be more explicit (as you've done with the userId field at #3); however, it is not strictly necessary.

What about custom property types?
The core Spring framework comes with a general type conversion system that allows you to write and register custom type conversion logic that can transform a specific (nonprimitive) class to a string representation and back again. SDN has already registered some conversion services for you that handle enums and dates. So if, for example, you define a java.util.Date field on a node entity, then when the entity is persisted, Spring recognizes that there is a custom converter defined for date fields and uses it to convert the Date property to a string that is then stored in the graph. The reverse occurs when reading the data out of the graph and back into the node entity field.

This means that you can take advantage of this same mechanism to handle any custom class types that you may want to use, say, perhaps, a phone number object. You'll need to write a custom converter and then register this converter with Spring. For more details on how to do this, refer to http://static.springsource.org/spring/docs/current/spring-framework-reference/html/validation.html#core-convert.

Note that if you do not register a converter, SDN will still save a string version of your object into the database; however, it will simply be based on whatever is returned from the toString() method.
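
As a rough sketch of what such a pair of converters could look like for the phone number example, consider the following. The PhoneNumber class and its "+<countryCode> <number>" string format are assumptions made up for illustration. The Converter interface is declared locally as a stand-in so the example compiles without Spring on the classpath; Spring's own org.springframework.core.convert.converter.Converter has the same single convert method. Registering the converters with Spring is not shown.

```java
// Local stand-in with the same single-method shape as Spring's Converter<S, T>.
interface Converter<S, T> {
    T convert(S source);
}

// Hypothetical custom type that might appear as a field on a node entity.
final class PhoneNumber {
    final String countryCode;
    final String number;

    PhoneNumber(String countryCode, String number) {
        this.countryCode = countryCode;
        this.number = number;
    }
}

// Object to string: the form that would be stored as a property value.
class PhoneNumberToStringConverter implements Converter<PhoneNumber, String> {
    public String convert(PhoneNumber source) {
        return "+" + source.countryCode + " " + source.number;
    }
}

// String back to object: expects the "+<countryCode> <number>" format above.
class StringToPhoneNumberConverter implements Converter<String, PhoneNumber> {
    public PhoneNumber convert(String source) {
        int space = source.indexOf(' ');
        if (!source.startsWith("+") || space < 2) {
            throw new IllegalArgumentException("Unexpected phone format: " + source);
        }
        return new PhoneNumber(source.substring(1, space), source.substring(space + 1));
    }
}

class ConverterDemo {
    public static void main(String[] args) {
        PhoneNumber original = new PhoneNumber("44", "2079460000");
        String stored = new PhoneNumberToStringConverter().convert(original);
        PhoneNumber restored = new StringToPhoneNumberConverter().convert(stored);
        System.out.println(stored + " -> " + restored.countryCode + " / " + restored.number);
    }
}
```

The important property is that the two directions round-trip: whatever the first converter writes, the second must be able to read back.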

Now you'll probably have noticed that an additional field appears to have snuck into both the User and Movie node entities, namely nodeId, annotated with @GraphId. This is a mandatory SDN requirement when using simple object mapping. Without going into too much detail just yet, simple object mapping is one of the strategies employed by SDN to perform the physical mapping between your domain entity and the backing graph entity. Using this strategy, SDN needs to have a field where it can store the underlying ID of the node backing the entity. The @GraphId annotation (#4) serves to mark the field that you've set aside for SDN to store this value in.

Indexed properties

As it is generally considered bad practice for the application to rely on the Neo4j node ID as the external unique identifier of an object, it is important to be able to have some other way of looking up a node. The addition of the @Indexed annotation (#2) on the userId field ensures that lookups and queries can be performed against this user based on this field.

Under what index names are these annotated properties stored?
By default, SDN will create an exact lookup index per entity under the simple (not fully qualified) name of the class (User in this case). If you wanted to look up a user (say, John with user ID john001) using native code, it would be coded as follows:

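    // Native Neo4j index lookup; "User" is the default index name (simple class name)
    IndexManager indexManager = aGraphDatabaseService.index();
    Index<Node> userIdIndex = indexManager.forNodes("User");
    // Exact match on the indexed userId property
    IndexHits<Node> indexHits = userIdIndex.get("userId", "john001");
    Node johnNode = indexHits.getSingle();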

So be careful in how you name and structure your domain entities. Because of this default indexing behavior, it is generally not advisable to have two domain entities with the same name, even if they're in separate packages. A bad example would be to have one User domain entity under a "core" package and another under "admin." Though it is possible to override this default by setting the indexName attribute within the @Indexed annotation, the behavior can confuse those who are not aware of it and could produce unforeseen results if you're not careful.
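
The naming clash is easy to see in plain Java. In this small sketch, the nested classes Core.User and Admin.User stand in for two User classes in separate packages (an assumption made so the example fits in one file), and defaultIndexName is a hypothetical helper that mirrors the default described above rather than any SDN API.

```java
// Nested classes stand in for core.User and admin.User in separate packages.
class Core {
    static class User { }
}

class Admin {
    static class User { }
}

class IndexNameDemo {
    // Hypothetical helper: the simple (not fully qualified) class name.
    static String defaultIndexName(Class<?> entityType) {
        return entityType.getSimpleName();
    }

    public static void main(String[] args) {
        // Both entities would end up under the same index name.
        System.out.println(defaultIndexName(Core.User.class));  // prints User
        System.out.println(defaultIndexName(Admin.User.class)); // prints User
    }
}
```

Two distinct classes, one index name: entries for both would land in the same index unless indexName is set explicitly.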

Relationships to other entities

Finally, there are also the relationships between node entities. You'll be pleased to learn that these are also simply mapped as reference fields on the entity. However, before we get into the details for how this is done, you need to learn about relationship entities.

Modeling relationship entities

A relationship entity is simply the relationship version of a node entity. It refers to a Java class that is ultimately represented and backed by a Neo4j relationship in the underlying graph database. Neo4j treats relationships as first-class citizens, which, just like nodes, can have their own identifier and set of properties if required. Thus, SDN also allows them to be represented as entities in their own right.

Figure 3 Social network model with relationships highlighted that could potentially be modeled as relationship entities

In this section, we'll cover what is required to model the physical Neo4j relationship entity as a POJO itself, while in the next section, we'll show what is required from the node entity's perspective to refer to other node entities through these modeled relationships, as well as through more simple mechanisms.

NOTE There will probably be many relationships defined within your physical Neo4j model, but this does not automatically mean that all of them need to be modeled as relationship entities in SDN!

SDN relationship entities are generally only required for relationships that have their own set of properties, and that, together with these properties, provide context to the relationship. We'll refer to these relationships as rich relationships because of the additional data they contain over and above the relationship type.

The HAS_SEEN relationship is a perfect example of this, with its additional stars property providing more context to the relationship, which indicates not just that a user has seen a movie, but also how he or she may have rated it. In the social network model, this relationship with all its associated information has been defined as the Viewing class, as shown in listing 3. Contrast this to the IS_FRIEND_OF relationship, which alone is all that is required to understand the relationship between two users—that is, that they're friends. These simple type relationships (that is, the IS_FRIEND_OF relationships) can still be referenced, and you'll see how this is done in the next section, but there is no additional benefit in defining a whole new class to represent it.

Listing 3 Viewing class: relationship entity

@RelationshipEntity(type = "HAS_SEEN")	               #A
public class Viewing {
    @GraphId							#1
    Long relationshipId;					#1

    @StartNode						#2
    User user;						#2

    @EndNode							#3
    Movie movie;						#3

    Integer stars; 						#B
}

#A Class is backed by a Neo4j relationship
#1 Property containing Neo4j relationship ID
#2 User node from which relationship originates (outgoing)
#3 Movie node at which relationship ends (incoming)
#B Maps to stars property on underlying relationship

The @RelationshipEntity annotation is applied to a class to indicate that it represents a Neo4j relationship. The annotation takes a type property used to indicate the name of the Neo4j relationship type used internally. If it is not specified, the name of the class is taken to be the physical name used within Neo4j as well. As with the node entity, the relationship entity has the same requirement for a @GraphId annotated field (#1), this time for storing the underlying relationship ID.

If you'd like to access the node entities on either side of this relationship, you'll need to provide a field for each of these and annotate them with @StartNode (#2) and @EndNode (#3). For the Viewing class example, the relationship starts at the User node entity (the outgoing side) and ends at the Movie node entity (the incoming side).

In terms of what is required to model a Neo4j relationship, that is it. There is, however, generally not much point in defining relationship entity classes in isolation. They're almost always referred to through one or more fields on associated node entities. In the next section, we go into detail on how node entities can refer to other node entities, both through simple references and through POJO-modeled relationships. To provide full context for this example, however, listing 4 provides a sneak preview of how the User node entity, as well as the Movie node entity, refers to the Viewing relationship entity class through its views field.

Listing 4 User and Movie node entity snippets

@NodeEntity
public class User {
    @RelatedToVia                                       #1
    Set<Viewing> views;                                 #1
    . . .
}

@NodeEntity
public class Movie {
    @RelatedToVia(direction = Direction.INCOMING)       #2
    Iterable<Viewing> views;                            #2
    . . .
}

#1 For each HAS_SEEN relationship between user and all movies he or she has seen, the Viewing class contains relationship information
#2 For each HAS_SEEN relationship between movie and all users who've seen it, the Viewing class contains relationship information

Within the User node entity, the @RelatedToVia annotation (#1) on the views field essentially reads as "all the HAS_SEEN relationships (with any associated properties) between this user and any movies he or she has watched." (The HAS_SEEN relationship type is inferred because that is what is defined on the Viewing class itself.) The Viewing class represents the full context of the relationship between these two entities, including the stars rating.

Within the Movie node entity, the @RelatedToVia annotation (#2) marks the views field as representative of "all the HAS_SEEN relationships (with any associated properties) between this movie and any users who have actually watched it."

In both cases the Viewing class serves to provide that extra context to the relationship that details the rating of each viewing, and can thus be thought of as the object through which (via) the relationship is fully understood by both sides.

In the next section we continue to detail some of the finer points around defining a variety of different types of relationships between node entities, including where the rich relationship details are required to fully understand the context.

Modeling relationships between node entities

Being able to model node and relationship entities with their simple associated properties in isolation will only take you so far. Models start getting interesting when you're able to actually connect them to explore the relationships between them, and in this section, we'll cover how to do that.

The end of the previous section provided a sneak preview of how such a connection was established between the User and Movie entities through the Viewing relationship entity. You saw how the HAS_SEEN relationship between Users and Movies was modeled as a physical POJO (Viewing class) and then referred to from the entities. In that particular case, a whole separate class (Viewing) was used to represent the relationship in context. But what about other, more simple relationships such as "John is a friend of Jack"? Do you also need a dedicated relationship entity class for such cases? You'll be pleased to know the answer is no—they can be dealt with in a much more simple manner. Figure 4 recaps the relationships between node entities that you're potentially interested in referencing from node entities.

Figure 4 Social network model with relationship references between nodes highlighted

From the node entity's perspective, relationships are also simply modeled as normal Java object references and come in a variety of flavors depending on exactly what it is you're trying to convey. We've already previewed how you can use the Viewing class to reference the HAS_SEEN relationship, but let's tackle the other simpler relationships as well, such as referredBy and IS_FRIEND_OF.

The relationships involved on the User and Movie entities are shown in the next listing.

Listing 5 User and Movie node entity snippets

public class User {
    User referredBy;                                                   #1
    @RelatedTo(type = "IS_FRIEND_OF", direction = Direction.BOTH)      #2
    Set<User> friends;                                                 #2
    @RelatedToVia                                                      #3
    Set<Viewing> views;                                                #3
    . . .
}

public class Movie {
    @RelatedToVia(direction = Direction.INCOMING)                      #4
    Iterable<Viewing> views;                                           #4
    . . .
}

#1 User (node entity) associated with this node through referredBy relationship
#2 Users (node entities) associated with this node through IS_FRIEND_OF relationship
#3 Relationship information (relationship entity) that exists between this user and movies (node entities) he or she viewed via HAS_SEEN relationship
#4 Read-only relationship information (relationship entity) that exists between this movie and users (node entities) who rated it via HAS_SEEN relationship

Basic relationships, defined as being represented by an underlying Neo4j relationship with no associated properties, to exactly zero or one other node entity can be modeled as a standard object reference within the node entity. By default, the property name will be used as the name to map to the relationship type in Neo4j in the absence of any meta-information to the contrary. The newly introduced concept of referrals of one user to another (modeled by the referredBy property at #1) is an excellent example of this. Note that this property must be a reference to another node entity.

Basic relationships to zero or more other node entities are modeled with a Set, List, Collection, or Iterable class, with the referenced node entity as the collection type. The use of a Set, List, or Collection class signifies that the field is modifiable from the containing node's perspective, while an Iterable class indicates this should be treated as read-only. Based on the contained node entity class type and its annotations, SDN will be able to work out that your intention is for this field to represent a basic relationship. If, however, you'd like to override any of the defaults inferred, you can add a @RelatedTo annotation. The friends relationship (#2) between users is an example. Note how in this case we added the @RelatedTo annotation to specify the underlying Neo4j relationship type as IS_FRIEND_OF rather than the default value that SDN would have inferred if it were not there, that is, the name of the field, friends.

Rich relationships, defined as being represented by an underlying Neo4j relationship with associated properties, are also modeled with the same Collection class as basic relationships. In this case, however, the type of the contained entity in the collection is a relationship entity rather than a node entity. To recap, a relationship entity represents the underlying Neo4j relationship along with any associated properties that were also modeled in the entity. This provides a neat way to access the rich information on the relationship itself, while still being able to get to the entity or entities on the other end. As with basic relationships, without any annotations, SDN can work out that you're creating such references based solely on the fact that the type of class contained in the collection has been defined as a relationship entity. Again, if you wish to override any of these relationship defaults assumed by SDN, you can apply the @RelatedToVia annotation. As you've already seen, the views field reference (#3) representing the relationship between a User and a Movie is a good example here, as the additional stars rating serves to enhance the information of the relationship between these two entities. Notice the different Collection class used for the views property in the case of the User and Movie nodes, namely Set and Iterable, respectively. This means that the views property can be modified from the User node perspective but not from the Movie node. Conceptually, users rate movies; movies don't apply ratings to users!
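
The difference between the two collection types is visible in plain Java, independent of SDN. The sketch below is an illustration under stated assumptions: the constructors, the rate and averageStars helpers, and the record method (which stands in for the framework populating the read-only side) are all hypothetical and not part of the listings.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Viewing {
    final User user;
    final Movie movie;
    final int stars;

    Viewing(User user, Movie movie, int stars) {
        this.user = user;
        this.movie = movie;
        this.stars = stars;
    }
}

class User {
    final String name;
    // A Set: viewings can be added from the user's side.
    final Set<Viewing> views = new HashSet<Viewing>();

    User(String name) {
        this.name = name;
    }

    Viewing rate(Movie movie, int stars) {
        Viewing viewing = new Viewing(this, movie, stars);
        views.add(viewing);
        movie.record(viewing);
        return viewing;
    }
}

class Movie {
    final String title;
    private final Set<Viewing> recorded = new HashSet<Viewing>();
    // An Iterable: callers can loop over the viewings but have no add() to call.
    final Iterable<Viewing> views = Collections.unmodifiableSet(recorded);

    Movie(String title) {
        this.title = title;
    }

    // Hypothetical stand-in for the framework populating the read-only side.
    void record(Viewing viewing) {
        recorded.add(viewing);
    }

    // Reading through the Iterable is all the movie side needs.
    double averageStars() {
        int total = 0;
        int count = 0;
        for (Viewing viewing : views) {
            total += viewing.stars;
            count++;
        }
        return count == 0 ? 0.0 : (double) total / count;
    }
}

class ViewsDemo {
    public static void main(String[] args) {
        Movie movie = new Movie("Example Movie");
        new User("John").rate(movie, 4);
        new User("Jack").rate(movie, 2);
        System.out.println(movie.title + " average: " + movie.averageStars());
    }
}
```

Because Iterable declares no add method, code holding a Movie cannot attach a viewing through views; ratings only ever originate from a User.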

Both the @RelatedTo (#2) and @RelatedToVia (#3, #4) annotations can take a type element and an optional direction element. The direction specifies which relationships should be included in the collection of entities returned; valid values are INCOMING, BOTH, and the default OUTGOING.

Note that for the friends property you need to specify the direction as BOTH as shown in the following snippet:

    @RelatedTo(type = "IS_FRIEND_OF", direction = Direction.BOTH)
    Set<User> friends;

Logically, a friends relationship is bidirectional. Physically within Neo4j, however, each relationship is stored with a single direction. By specifying BOTH, you're telling SDN to consider any IS_FRIEND_OF relationships associated with the user, regardless of the direction in which the physical relationship is defined.
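
The effect of the direction setting can be sketched without a database. In the following illustration, the Direction enum is a local stand-in that reuses the three value names mentioned above, and the Friendship class and friendsOf helper are hypothetical; this is not how SDN or Neo4j implements traversal.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Local stand-in using the same value names as the text above.
enum Direction { OUTGOING, INCOMING, BOTH }

// One stored IS_FRIEND_OF relationship: it always has exactly one direction.
class Friendship {
    final String from;
    final String to;

    Friendship(String from, String to) {
        this.from = from;
        this.to = to;
    }
}

class DirectionDemo {
    // Hypothetical helper: collects the users on the other end of each
    // relationship touching `user`, filtered by the requested direction.
    static Set<String> friendsOf(String user, List<Friendship> stored, Direction direction) {
        Set<String> result = new LinkedHashSet<String>();
        for (Friendship friendship : stored) {
            if (friendship.from.equals(user) && direction != Direction.INCOMING) {
                result.add(friendship.to);
            }
            if (friendship.to.equals(user) && direction != Direction.OUTGOING) {
                result.add(friendship.from);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Friendship> stored = new ArrayList<Friendship>();
        stored.add(new Friendship("john", "jack"));  // john -> jack
        stored.add(new Friendship("david", "john")); // david -> john

        System.out.println(friendsOf("john", stored, Direction.OUTGOING)); // prints [jack]
        System.out.println(friendsOf("john", stored, Direction.BOTH));     // prints [jack, david]
    }
}
```

With the default OUTGOING, John would appear to have only one friend, even though David's relationship to John is just as real.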

Summary

We introduced you to the Spring Data Neo4j (SDN) framework that provides you with a variety of tools to be able to operate with a standard POJO-based domain model backed by the powerful Neo4j database.
