neo4j: Extracting a subgraph as an adjacency matrix and calculating eigenvector centrality with JBLAS

By Mark Needham · Aug. 12, 13

Earlier in the week I wrote a blog post showing how to calculate the eigenvector centrality of an adjacency matrix using JBLAS. The next step was to work out the eigenvector centrality of a neo4j subgraph.

There were 3 steps involved in doing this:

  1. Export the neo4j sub graph as an adjacency matrix
  2. Run JBLAS over it to get eigenvector centrality scores for each node
  3. Write those scores back into neo4j
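
Step 2 is the only numerical part, so here is a rough sketch of what it computes. This is plain-Java power iteration rather than the JBLAS call from the previous post, and the class name, method name and 3x3 matrix are all made up for illustration. It assumes a symmetric, non-negative matrix for a connected graph that isn't bipartite; otherwise the iteration may not settle.

```java
import java.util.Arrays;

class PowerIterationSketch {

    /** Approximates the principal eigenvector, scaled to unit Euclidean length. */
    static double[] principalEigenvector(double[][] matrix, int maxIterations, double tolerance) {
        int n = matrix.length;
        double[] current = new double[n];
        Arrays.fill(current, 1.0 / Math.sqrt(n)); // start from a uniform unit vector

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Multiply the matrix by the current estimate.
            double[] next = new double[n];
            for (int row = 0; row < n; row++) {
                for (int col = 0; col < n; col++) {
                    next[row] += matrix[row][col] * current[col];
                }
            }

            // Rescale back to unit length so the values don't blow up.
            double norm = 0.0;
            for (double value : next) {
                norm += value * value;
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                return current; // no links at all, so there is nothing to rank
            }

            double change = 0.0;
            for (int i = 0; i < n; i++) {
                next[i] /= norm;
                change += Math.abs(next[i] - current[i]);
            }
            current = next;
            if (change < tolerance) {
                break; // the estimate has stopped moving
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical graph: persons 0 and 1 share two groups, and each shares one with person 2.
        double[][] adjacency = {
                {0, 2, 1},
                {2, 0, 1},
                {1, 1, 0}
        };
        System.out.println(Arrays.toString(principalEigenvector(adjacency, 1000, 1e-12)));
    }
}
```

Persons 0 and 1 come out with equal scores and person 2 with a lower one, which is what we'd expect given they have the stronger tie.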

I decided to use the Paul Revere data set from Kieran Healy’s blog post, which consists of people and the groups they were members of.

The script to import the data is on my fork of the revere repository.

Having imported the data, the next step was to write a cypher query which would give me the people in an adjacency matrix, with the number at each column/row intersection showing how many groups that pair of people had in common.
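
To make the target shape concrete, here is a small sketch in plain Java of the matrix we're after. It is not how the data is extracted in this post (that's done in cypher); the class name, the people and the groups are invented, and it assumes a person isn't counted as linked to themselves, so the diagonal stays at 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class CoMembershipMatrixSketch {

    /** Builds a matrix where cell [i][j] is the number of groups that persons i and j share. */
    static int[][] coMembership(Map<String, Set<String>> groupsByPerson) {
        // Sort people by name so that rows and columns follow the same order.
        List<String> people = new ArrayList<>(new TreeMap<>(groupsByPerson).keySet());
        int[][] matrix = new int[people.size()][people.size()];

        for (int i = 0; i < people.size(); i++) {
            for (int j = 0; j < people.size(); j++) {
                if (i == j) {
                    continue; // assumption: no self links
                }
                Set<String> shared = new HashSet<>(groupsByPerson.get(people.get(i)));
                shared.retainAll(groupsByPerson.get(people.get(j)));
                matrix[i][j] = shared.size();
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Hypothetical memberships, not taken from the Paul Revere data set.
        Map<String, Set<String>> groupsByPerson = Map.of(
                "alice", Set.of("groupA", "groupB"),
                "bob", Set.of("groupA", "groupB", "groupC"),
                "carol", Set.of("groupC"));

        // prints [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
        System.out.println(Arrays.deepToString(coMembership(groupsByPerson)));
    }
}
```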

I thought it’d be easier to build this query incrementally so I started out writing a query which would return one row of the adjacency matrix:

MATCH p1:Person, p2:Person
WHERE p1.name = "Paul Revere"
WITH p1, p2
MATCH p = p1-[?:MEMBER_OF]->()<-[?:MEMBER_OF]-p2
 
WITH p1.name AS p1, p2.name AS p2, COUNT(p) AS links
ORDER BY p2
RETURN p1, COLLECT(links) AS row

Here we start with Paul Revere and then find the relationships between him and every other person by way of a common group membership.

We use an optional relationship because every column/row of our adjacency matrix needs a value, so we have to return 0 for anyone he doesn’t share a group with.

If we run that query we get back the following:

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| p1            | row                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Paul Revere" | [2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,1,1,1,1,1,1,3,3,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,3,2,1,1,2,1,2,1,1,1,1,1,0,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,2,1,1,1,1,1,1,2,1,3,1,3,2,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,0,1,0,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,1,1,1,1,1,2,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,3,1,1,2,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,1,2,1,1,1,1,1,1,1,1,3,1,1,1,1,3,1,1,1,1,0,1,2,1,1,1,1,1,1,1] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

As it turns out, we only have to remove the WHERE clause and order everybody and we get the adjacency matrix for everyone:

MATCH p1:Person, p2:Person
WITH p1, p2
MATCH p = p1-[?:MEMBER_OF]->()<-[?:MEMBER_OF]-p2
 
WITH p1.name AS p1, p2.name AS p2, COUNT(p) AS links
ORDER BY p2
RETURN p1, COLLECT(links) AS row
ORDER BY p1

If we run that query we get back the following (only the first two rows are shown):
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| p1                      | row                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Abiel Ruddock"         | [0,1,1,1,0,1,0,1,0,0,1,1,1,0,1,2,0,1,0,1,1,1,2,2,1,0,0,1,1,0,1,1,1,1,1,0,0,0,0,1,1,0,0,2,2,0,0,1,1,2,1,1,1,0,1,0,1,1,0,0,2,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,1,1,0,1,1,1,1,1,1,1,1,1,0,2,1,2,1,0,0,0,0,1,1,0,1,0,0,1,0,2,0,0,1,0,0,0,1,0,0,2,0,1,0,1,1,1,0,0,1,1,0,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,1,1,0,1,1,1,2,0,0,1,1,0,0,2,0,1,2,1,1,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,1,2,1,0,1,1,1,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,0,1,0,0,2,1,0,0,1,1,1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,0,0,1,0,2,1,1,0,0,2,0,1,0,0,0,0,1,0,1,0,1,0,1,0] |
| "Abraham Hunt"          | [1,0,1,1,0,1,0,0,0,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,0,0,0,0,1,0,0,0,1,1,0,0,0,1,1,1,1,1,0,0,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,1,1,0,1,0,1,1,1,1,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0,1,0,0,1,0,1,1,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,0,1,0,1,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,0,1,1,1,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,1,0] |
...
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
254 rows
9897 ms
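
Since sharing a group is mutual, the exported matrix should be square and symmetric, and that's cheap to check before handing it to JBLAS. The following is only a sketch with made-up names and a made-up matrix; it isn't part of the code for this post.

```java
class AdjacencyMatrixChecks {

    /** True if every row has as many columns as there are rows. */
    static boolean isSquare(double[][] matrix) {
        for (double[] row : matrix) {
            if (row.length != matrix.length) {
                return false;
            }
        }
        return true;
    }

    /** True if the matrix is square and cell [i][j] equals cell [j][i] everywhere. */
    static boolean isSymmetric(double[][] matrix) {
        if (!isSquare(matrix)) {
            return false;
        }
        for (int i = 0; i < matrix.length; i++) {
            for (int j = i + 1; j < matrix.length; j++) {
                if (matrix[i][j] != matrix[j][i]) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical 3x3 matrix in the same shape as the query output.
        double[][] matrix = {
                {0, 1, 2},
                {1, 0, 0},
                {2, 0, 0}
        };
        System.out.println("square: " + isSquare(matrix) + ", symmetric: " + isSymmetric(matrix));
    }
}
```

A failure here would most likely mean the rows and columns weren't ordered the same way.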

The next step was to wire up the query results with the JBLAS code that I wrote in the previous post. I ended up with the following:

public class Neo4jAdjacencyMatrixSpike {
    public static void main(String[] args) throws SQLException {
        ClientResponse response = client()
                .resource("http://localhost:7474/db/data/cypher")
                .entity(queryAsJson(), MediaType.APPLICATION_JSON)
                .accept(MediaType.APPLICATION_JSON)
                .post(ClientResponse.class);
 
        JsonNode result = response.getEntity(JsonNode.class);
        ArrayNode rows = (ArrayNode) result.get("data");
 
        List<Double> principalEigenvector = JBLASSpike.getPrincipalEigenvector(new DoubleMatrix(asMatrix(rows)));
 
        List<Person> people = asPeople(rows);
        updatePeopleWithEigenvector(people, principalEigenvector);
 
        System.out.println(sort(people).take(10));
    }
 
    private static double[][] asMatrix(ArrayNode rows) {
        // The matrix is square: one row and one column per person (254 in this data set).
        int size = rows.size();
        double[][] matrix = new double[size][size];
        int rowCount = 0;
 
        for (JsonNode row : rows) {
            // Index 2 holds the collected adjacency row. This assumes the query sent by
            // queryAsJson() (cut for brevity) returns two other columns ahead of it.
            ArrayNode matrixRow = (ArrayNode) row.get(2);
 
            int columnCount = 0;
            for (JsonNode jsonNode : matrixRow) {
                matrix[rowCount][columnCount] = jsonNode.asInt();
                columnCount++;
            }
 
            rowCount++;
        }
        return matrix;
    }
 
    // rest cut for brevity
}

Here we take the query results and convert them into an array of arrays before passing it to our JBLAS code to calculate the principal eigenvector. We then return the top 10 people:

Person{name='William Cooper', eigenvector=0.172604992239612, nodeId=68},
Person{name='Nathaniel Barber', eigenvector=0.17260499223961198, nodeId=18},
Person{name='John Hoffins', eigenvector=0.17260499223961195, nodeId=118},
Person{name='Paul Revere', eigenvector=0.17171142003936804, nodeId=207},
Person{name='Caleb Davis', eigenvector=0.16383970722169897, nodeId=71},
Person{name='Caleb Hopkins', eigenvector=0.16383970722169897, nodeId=121},
Person{name='Henry Bass', eigenvector=0.16383970722169897, nodeId=21},
Person{name='Thomas Chase', eigenvector=0.16383970722169897, nodeId=54},
Person{name='William Greenleaf', eigenvector=0.16383970722169897, nodeId=104},
Person{name='Edward Proctor', eigenvector=0.15600043886738055, nodeId=201}

I get back the same 10 people as Kieran Healy, although they have different eigenvector values. As far as I understand, the absolute values don’t matter; what’s more important is each person’s score relative to the others, so I think we’re OK.
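
One way to convince ourselves of that: an eigenvector multiplied by a positive constant is still an eigenvector, so two tools can scale the scores differently and still agree on the ordering. The sketch below uses made-up scores and hypothetical helper names, and assumes the scores are non-negative with at least one positive entry.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

class EigenvectorScalingSketch {

    /** Returns indices ordered from the highest score to the lowest. */
    static int[] ranking(double[] scores) {
        return IntStream.range(0, scores.length)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> scores[i]).reversed())
                .mapToInt(Integer::intValue)
                .toArray();
    }

    /** Rescales so that the squares of the entries sum to 1. */
    static double[] scaleToUnitLength(double[] vector) {
        double length = Math.sqrt(Arrays.stream(vector).map(v -> v * v).sum());
        return Arrays.stream(vector).map(v -> v / length).toArray();
    }

    /** Rescales so that the largest entry is 1 (assumes at least one positive entry). */
    static double[] scaleToMaxOne(double[] vector) {
        double max = Arrays.stream(vector).max().getAsDouble();
        return Arrays.stream(vector).map(v -> v / max).toArray();
    }

    public static void main(String[] args) {
        double[] scores = {3.0, 1.0, 2.0}; // hypothetical, unscaled scores

        // Each line prints [0, 2, 1]: the ordering survives both rescalings.
        System.out.println(Arrays.toString(ranking(scores)));
        System.out.println(Arrays.toString(ranking(scaleToUnitLength(scores))));
        System.out.println(Arrays.toString(ranking(scaleToMaxOne(scores))));
    }
}
```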

The final step was to write these eigenvector values back into neo4j which we can do with the following code:

    private static void updateNeo4jWithEigenvectors(List<Person> people) {
        for (Person person : people) {
            ObjectNode request = JsonNodeFactory.instance.objectNode();
            request.put("query", "START p = node({nodeId}) SET p.eigenvectorCentrality={value}");
 
            ObjectNode params = JsonNodeFactory.instance.objectNode();
            params.put("nodeId", person.nodeId);
            params.put("value", person.eigenvector);
 
            request.put("params", params);
 
            client()
                    .resource("http://localhost:7474/db/data/cypher")
                    .entity(request, MediaType.APPLICATION_JSON)
                    .accept(MediaType.APPLICATION_JSON)
                    .post(ClientResponse.class);
        }
    }

Now we might use that eigenvector centrality value in other queries, such as one to show who the most central/potentially influential people are in each group:

MATCH g:Group<-[:MEMBER_OF]-p
 
WITH g.name AS group, p.name AS personName, p.eigenvectorCentrality as eigen
ORDER BY eigen DESC
 
WITH group, COLLECT(personName) AS people
RETURN group, HEAD(people) + [HEAD(TAIL(people))] + [HEAD(TAIL(TAIL(people)))] AS mostCentral

If we run that query we get back the following:
+--------------------------------------------------------------------------+
| group             | mostCentral                                          |
+--------------------------------------------------------------------------+
| "StAndrewsLodge"  | ["Paul Revere","Joseph Warren","Thomas Urann"]       |
| "BostonCommittee" | ["William Cooper","Nathaniel Barber","John Hoffins"] |
| "LoyalNine"       | ["Caleb Hopkins","William Greenleaf","Caleb Davis"]  |
| "LondonEnemies"   | ["William Cooper","Nathaniel Barber","John Hoffins"] |
| "LongRoomClub"    | ["Paul Revere","John Hancock","Benjamin Clarke"]     |
| "NorthCaucus"     | ["William Cooper","Nathaniel Barber","John Hoffins"] |
| "TeaParty"        | ["William Cooper","Nathaniel Barber","John Hoffins"] |
+--------------------------------------------------------------------------+
7 rows
280 ms

Our top ten feature frequently, although it’s interesting that only one of them is in the ‘LongRoomClub’ group, which perhaps indicates that people in that group are less likely to be members of the other ones.

I’d be interested if anyone can think of other potential uses for eigenvector centrality once we’ve got it back in the graph.

All the code described in this post is on github if you want to take it for a spin.


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
