DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 2: Understanding Neo4j
  • Spring Data Neo4j: How to Update an Entity
  • Leveraging Neo4j for Effective Identity Access Management
  • The Beginner's Guide To Understanding Graph Databases

Trending

  • Monolith: The Good, The Bad and The Ugly
  • AI Agents: A New Era for Integration Professionals
  • Go 1.24+ Native FIPS Support for Easier Compliance
  • SaaS in an Enterprise - An Implementation Roadmap
  1. DZone
  2. Data Engineering
  3. Databases
  4. Assigning UUIDs to Neo4j Nodes and Relationships

Assigning UUIDs to Neo4j Nodes and Relationships

By 
Stefan Armbruster user avatar
Stefan Armbruster
·
Aug. 22, 13 · Interview
Likes (0)
Comment
Save
Tweet
Share
11.3K Views

Join the DZone community and get the full member experience.

Join For Free

TL;DR: This blog post features a small demo project on github: neo4j-uuid and explains how to automatically assign UUIDs to nodes and relationships in Neo4j. A very brief introduction into Neo4j 1.9′s KernelExtensionFactory is included as well.

A Little Rant on Neo4j Node/Relationship IDs

In a lot of use cases there is demand for storing a reference to a Neo4j node or relationship in a third party system. The first naive idea probably is to use the internal node/relationship id that Neo4j provides. Do not do that! Ever!

You ask why? Well, Neo4j’s id is basically a offset in one of the store files Neo4j uses (with some math involved). Assume you delete couple of nodes. This produces holes in the store files that Neo4j might reclaim when creating new nodes later on. And since the id is a file offset there is a chance that the new node will have exactly the same id like the previously deleted node. If you don’t synchronously update all node id references stored elsewhere, you’re in trouble. If neo4j would be completely redeveloped from scratch the getId() method would not be part of the public API.

As long as you use node ids only inside a request of an application for example, there’s nothing wrong. To repeat myself: Never ever store a node id in a third party system. I have officially warned you.

UUIDs

Enough of ranting, let’s see what we can do to safely store node references in an external system. Basically we need an identifier that has no semantics in contrast to the node id. A common approach to this is using Universally Unique Identifiers (UUID). Java JDK offers a UUID implementation, so we could potentially use UUID.randomUUID(). Unfortunately random UUIDs are slow to generate. A preferred approach is to use the machine’s MAC and a timestamp as base for the UUID – this should provide enough uniqueness. There a nice library out there at http://wiki.fasterxml.com/JugHome providing exactly what we need.

Automatic UUID Assignments

For convenience it would be great if all fresh created nodes and relationships get automatically assigned a uuid property without doing this explicitly. Fortunately Neo4j supports TransactionEventHandlers, a callback interface pluging into transaction handling. A TransactionEventHandler has a chance to modify or veto any transaction. It’s a sharp tool which can have significant negative performance impact if used the wrong way.

I’ve implemented a UUIDTransactionEventHandler that performs the following tasks:

  • Populate a UUID property for each new node or relationship
  • Reject a transaction if a manual modification of a UUID is attempted; either assignment or removal

public class UUIDTransactionEventHandler implements TransactionEventHandler {

    public static final String UUID_PROPERTY_NAME = "uuid";

    private final TimeBasedGenerator uuidGenerator = Generators.timeBasedGenerator();

    @Override
    public Object beforeCommit(TransactionData data) throws Exception {

        checkForUuidChanges(data.removedNodeProperties(), "remove");
        checkForUuidChanges(data.assignedNodeProperties(), "assign");
        checkForUuidChanges(data.removedRelationshipProperties(), "remove");
        checkForUuidChanges(data.assignedRelationshipProperties(), "assign");

        populateUuidsFor(data.createdNodes());
        populateUuidsFor(data.createdRelationships());

        return null;
    }

    @Override
    public void afterCommit(TransactionData data, java.lang.Object state) {
    }

    @Override
    public void afterRollback(TransactionData data, java.lang.Object state) {
    }

    /**
     * @param propertyContainers set UUID property for a iterable on nodes or relationships
     */
    private void populateUuidsFor(Iterable propertyContainers) {
        for (PropertyContainer propertyContainer : propertyContainers) {
            if (!propertyContainer.hasProperty(UUID_PROPERTY_NAME)) {

                final UUID uuid = uuidGenerator.generate();
                final StringBuilder sb = new StringBuilder();
                sb.append(Long.toHexString(uuid.getMostSignificantBits())).append(Long.toHexString(uuid.getLeastSignificantBits()));

                propertyContainer.setProperty(UUID_PROPERTY_NAME, sb.toString());
            }
        }
    }

    private void checkForUuidChanges(Iterable> changeList, String action) {
        for (PropertyEntry removedProperty : changeList) {
            if (removedProperty.key().equals(UUID_PROPERTY_NAME)) {
                throw new IllegalStateException("you are not allowed to " + action + " " + UUID_PROPERTY_NAME + " properties");
            }
        }
    }

}

Setting up Using KernelExtensionFactory

There are two remaining tasks for full automation of UUID assignments:

  • We need to setup autoindexing for uuid properties to have a convenient way to look up nodes or relationships by UUID
  • We need to register UUIDTransactionEventHandler with the graph database

Since version 1.9 Neo4j has the notion of KernelExtensionFactory. Using KernelExtensionFactory you can supply a class that receives lifecycle callbacks when e.g. Neo4j is started or stopped. This is the right place for configuring autoindexing and setting up the TransactionEventHandler. Since JVM’s ServiceLoader is used KernelExtenstionFactories need to be registered in a file META-INF/services/org.neo4j.kernel.extension.KernelExtensionFactory by listing all implementations you want to use:

org.neo4j.extension.uuid.UUIDKernelExtensionFactory

KernelExtensionFactories can declare dependencies, therefore declare a inner interface (“Dependencies” in code) below that just has getters. Using proxies Neo4j will implement this class and supply you with the required dependencies. The dependencies are match on requested type, see Neo4j’s source code what classes are supported for being dependencies. KernelExtensionFactories must implement a newKernelExtension method that is supposed to return a instance of LifeCycle.

For our UUID project we return a instance of UUIDLifeCycle:


package org.neo4j.extension.uuid;

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.PropertyContainer;
import org.neo4j.graphdb.event.TransactionEventHandler;
import org.neo4j.graphdb.factory.GraphDatabaseSettings;
import org.neo4j.graphdb.index.AutoIndexer;
import org.neo4j.graphdb.index.IndexManager;
import org.neo4j.kernel.configuration.Config;
import org.neo4j.kernel.lifecycle.LifecycleAdapter;

import java.util.Map;

/**
 * handle the setup of auto indexing for UUIDs and registers a {@link UUIDTransactionEventHandler}
 */
class UUIDLifeCycle extends LifecycleAdapter {

    private TransactionEventHandler transactionEventHandler;
    private GraphDatabaseService graphDatabaseService;
    private IndexManager indexManager;
    private Config config;

    UUIDLifeCycle(GraphDatabaseService graphDatabaseService, Config config) {
        this.graphDatabaseService = graphDatabaseService;
        this.indexManager = graphDatabaseService.index();
        this.config = config;
    }

    /**
     * since {@link org.neo4j.kernel.NodeAutoIndexerImpl#start()} is called *after* {@link org.neo4j.extension.uuid.UUIDLifeCycle#start()} it would apply config settings for auto indexing. To prevent this we change config here.
     * @throws Throwable
     */
    @Override
    public void init() throws Throwable {
        Map params = config.getParams();
        params.put(GraphDatabaseSettings.node_auto_indexing.name(), "true");
        params.put(GraphDatabaseSettings.relationship_auto_indexing.name(), "true");
        config.applyChanges(params);
    }

    @Override
    public void start() throws Throwable {
        startUUIDIndexing(indexManager.getNodeAutoIndexer());
        startUUIDIndexing(indexManager.getRelationshipAutoIndexer());
        transactionEventHandler = new UUIDTransactionEventHandler();
        graphDatabaseService.registerTransactionEventHandler(transactionEventHandler);
    }

    @Override
    public void stop() throws Throwable {
        stopUUIDIndexing(indexManager.getNodeAutoIndexer());
        stopUUIDIndexing(indexManager.getRelationshipAutoIndexer());
        graphDatabaseService.unregisterTransactionEventHandler(transactionEventHandler);
    }

    void startUUIDIndexing(AutoIndexer autoIndexer) {
        autoIndexer.startAutoIndexingProperty(UUIDTransactionEventHandler.UUID_PROPERTY_NAME);
    }

    void stopUUIDIndexing(AutoIndexer autoIndexer) {
        autoIndexer.stopAutoIndexingProperty(UUIDTransactionEventHandler.UUID_PROPERTY_NAME);
    }
}

Most of the code is pretty much straight forward, l.44/45 set up autoindexing for uuid property. l48 registers the UUIDTransactionEventHandler with the graph database. Not that obvious is the code in the init() method. Neo4j’s NodeAutoIndexerImpl configures autoindexing itself and switches it on or off depending on the respective config option. However we want to have autoindexing always switched on. Unfortunately NodeAutoIndexerImpl is run after our code and overrides our settings. That’s we l.37-40 tweaks the config settings to force nice behaviour of NodeAutoIndexerImpl.

Looking up Nodes or Relationships for UUID

For completeness the project also contains a trivial unmanaged extension for looking up nodes and relationships using the REST interface, see UUIDRestInterface. By sending a HTTP GET to http://localhost:7474/db/data/node/<myuuid> the node’s internal id returned.

Build System and Testing

For building the project, Gradle is used;  build.gradle is trivial. Of course couple of tests are included. As a long standing addict I’ve obviously used Spock for testing. See the test code here.

Final Words

A downside of this implementation is that each and every node and relationships gets indexed. Indexing always trades write performance for read performance. Keep that in mind. It might make sense to get rid of unconditional auto indexing and put some domain knowledge into the TransactionEventHandler to assign only those nodes uuids and index them that are really used for storing in an external system.

Neo4j

Published at DZone with permission of Stefan Armbruster, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 2: Understanding Neo4j
  • Spring Data Neo4j: How to Update an Entity
  • Leveraging Neo4j for Effective Identity Access Management
  • The Beginner's Guide To Understanding Graph Databases

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!