Neo4j and Cypher: Using MERGE With Schema Indexes/Constraints

By Mark Needham · Aug. 13, 22 · Tutorial · 12.4K Views

I wrote about Cypher's MERGE clause a couple of weeks ago, and over the last few days I've been exploring how it works with schema indexes and unique constraints.


A common use case with Neo4j is to model users and events, where an event could be a tweet, Facebook post, or Pinterest pin. The model looks like this: (User)-[:CREATED_EVENT]->(Event).

We’d like to ensure that we don’t get duplicate users or events, and MERGE provides the semantics to do this:

MERGE (u:User {id: {userId}})
MERGE (e:Event {id: {eventId}})
MERGE (u)-[:CREATED_EVENT]->(e)
RETURN u, e



MERGE ensures that a pattern exists in the graph: either the pattern already exists and is matched, or it is created.
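As a minimal illustration of those semantics (using a hypothetical Person label, not part of the model above), running the same MERGE twice only creates the node on the first run:

```cypher
MERGE (p:Person {id: 1})   // first run: no match, so the node is created

// running the same statement a second time:
MERGE (p:Person {id: 1})   // the pattern already exists, nothing new is created
RETURN p
```

The program below puts this to the test under concurrency: it fires the users-and-events query from the start of the post across a pool of 50 threads.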

import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.impl.util.FileUtils;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
 
public class MergeTime
{
    public static void main(String[] args) throws Exception
    {
        String pathToDb = "/tmp/foo";
        FileUtils.deleteRecursively(new File(pathToDb));
 
        GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase( pathToDb );
        final ExecutionEngine engine = new ExecutionEngine( db );
 
        ExecutorService executor = Executors.newFixedThreadPool( 50 );
        final Random random = new Random();
 
        final int numberOfUsers = 10;
        final int numberOfEvents = 50;
        int iterations = 100;
        final List<Integer> userIds = generateIds( numberOfUsers );
        final List<Integer> eventIds = generateIds( numberOfEvents );
        List<Future> merges = new ArrayList<>(  );
        for ( int i = 0; i < iterations; i++ )
        {
            Integer userId = userIds.get(random.nextInt(numberOfUsers));
            Integer eventId = eventIds.get(random.nextInt(numberOfEvents));
            merges.add(executor.submit(mergeAway( engine, userId, eventId) ));
        }
 
        for ( Future merge : merges )
        {
            merge.get();
        }
 
        executor.shutdown();
 
        ExecutionResult userResult = engine.execute("MATCH (u:User) RETURN u.id as userId, COUNT(u) AS count ORDER BY userId");
 
        System.out.println(userResult.dumpToString());
 
        db.shutdown();
    }
 
    private static Runnable mergeAway(final ExecutionEngine engine,
                                      final Integer userId, final Integer eventId)
    {
        return new Runnable()
        {
            @Override
            public void run()
            {
                try
                {
                    ExecutionResult result = engine.execute(
                            "MERGE (u:User {id: {userId}})\n" +
                            "MERGE (e:Event {id: {eventId}})\n" +
                            "MERGE (u)-[:CREATED_EVENT]->(e)\n" +
                            "RETURN u, e",
                            MapUtil.map( "userId", userId, "eventId", eventId) );
 
                    // throw away
                    for ( Map<String, Object> row : result ) { }
                }
                catch ( Exception e )
                {
                    e.printStackTrace();
                }
            }
        };
    }
 
    private static List<Integer> generateIds( int amount )
    {
        List<Integer> ids = new ArrayList<>();
        for ( int i = 1; i <= amount; i++ )
        {
            ids.add( i );
        }
        return ids;
    }
}


We create a maximum of 10 users and 50 events and then do 100 iterations of random (user, event) pairs with 50 concurrent threads. Afterward, we execute a query that checks how many users of each id have been created and get the following output:

+----------------+
| userId | count |
+----------------+
| 1      | 6     |
| 2      | 3     |
| 3      | 4     |
| 4      | 8     |
| 5      | 9     |
| 6      | 7     |
| 7      | 5     |
| 8      | 3     |
| 9      | 3     |
| 10     | 2     |
+----------------+
10 rows


Next, I added a schema index on users and events to see if that would make any difference, something Javad Karabi recently asked on the user group.

CREATE INDEX ON :User(id)
CREATE INDEX ON :Event(id)


We wouldn't expect this to make a difference, as schema indexes don't ensure uniqueness, but I ran it anyway and got the following output:

+----------------+
| userId | count |
+----------------+
| 1      | 2     |
| 2      | 9     |
| 3      | 7     |
| 4      | 2     |
| 5      | 3     |
| 6      | 7     |
| 7      | 7     |
| 8      | 6     |
| 9      | 5     |
| 10     | 3     |
+----------------+
10 rows


If we want to ensure the uniqueness of users and events, we need to add a unique constraint on the id of both of these labels:

CREATE CONSTRAINT ON (user:User) ASSERT user.id IS UNIQUE
CREATE CONSTRAINT ON (event:Event) ASSERT event.id IS UNIQUE
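With those constraints in place, uniqueness is enforced by the database itself; even a plain CREATE of a duplicate id is rejected (a sketch; the exact error message varies by Neo4j version):

```cypher
CREATE (:User {id: 1})
// a second attempt fails with a constraint violation,
// e.g. "node already exists with label User and property id=1"
CREATE (:User {id: 1})
```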


Now if we run the test, we’ll only end up with one of each user:

+----------------+
| userId | count |
+----------------+
| 1      | 1     |
| 2      | 1     |
| 3      | 1     |
| 4      | 1     |
| 5      | 1     |
| 6      | 1     |
| 7      | 1     |
| 8      | 1     |
| 9      | 1     |
| 10     | 1     |
+----------------+
10 rows


We’d see the same result if we ran a similar query checking for the uniqueness of events.

As far as I can tell, this duplication of nodes that we merge on only happens if you try to create the same node twice concurrently. Once the node has been created, we can use MERGE with a non-unique index, and a duplicate node won’t get created.
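The failure mode is a classic check-then-act race: without a constraint, each concurrent MERGE independently checks whether the node exists and, seeing nothing, creates its own copy. Here is a plain-Java sketch of the same race, with a ConcurrentHashMap standing in for the graph (MergeRace, racyCreates, and atomicCreates are hypothetical names for this analogy, not Neo4j API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// A ConcurrentHashMap stands in for the graph; "creating a node"
// is inserting a value for id 1.
class MergeRace
{
    // Check-then-act, like MERGE backed by only a (non-unique) schema index:
    // several threads can all see the key as absent and each "create" it.
    static int racyCreates( int threads )
    {
        ConcurrentHashMap<Integer, Object> graph = new ConcurrentHashMap<>();
        AtomicInteger creates = new AtomicInteger();
        run( threads, () -> {
            if ( !graph.containsKey( 1 ) )       // check
            {
                creates.incrementAndGet();       // more than one thread may get here
                graph.put( 1, new Object() );    // act
            }
        } );
        return creates.get();
    }

    // Atomic create-if-absent, like MERGE backed by a unique constraint:
    // the mapping function runs at most once, so exactly one "create" happens.
    static int atomicCreates( int threads )
    {
        ConcurrentHashMap<Integer, Object> graph = new ConcurrentHashMap<>();
        AtomicInteger creates = new AtomicInteger();
        run( threads, () -> graph.computeIfAbsent( 1, k -> {
            creates.incrementAndGet();
            return new Object();
        } ) );
        return creates.get();
    }

    private static void run( int threads, Runnable task )
    {
        ExecutorService pool = Executors.newFixedThreadPool( threads );
        List<Future<?>> futures = new ArrayList<>();
        for ( int i = 0; i < threads; i++ )
        {
            futures.add( pool.submit( task ) );
        }
        try
        {
            for ( Future<?> f : futures ) { f.get(); }
        }
        catch ( Exception e )
        {
            throw new RuntimeException( e );
        }
        pool.shutdown();
    }

    public static void main( String[] args )
    {
        System.out.println( "racy creates:   " + racyCreates( 50 ) );   // 1 or more
        System.out.println( "atomic creates: " + atomicCreates( 50 ) ); // always 1
    }
}
```

computeIfAbsent plays the role the unique constraint plays in Neo4j: the existence check and the creation happen atomically, so only one thread ever creates the entry, no matter how many race.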

All the code from this post is available as a gist if you want to play around with it.


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
