DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Related

  • Spring Data Neo4j: How to Update an Entity
  • Leveraging Neo4j for Effective Identity Access Management
  • The Beginner's Guide To Understanding Graph Databases
  • Externalize Microservice Configuration With Spring Cloud Config

Trending

  • Scrum Smarter, Not Louder: AI Prompts Every Developer Should Steal
  • Complete Guide: Managing Multiple GitHub Accounts on One System
  • Lessons Learned in Test-Driven Development
  • How to Marry MDC With Spring Integration
  1. DZone
  2. Data Engineering
  3. Databases
  4. Neo4j: How Do Null Values Even Work?

Neo4j: How Do Null Values Even Work?

Importing CSVs is a great timesaver, but how do you get around the null values lurking within? Fortunately, we can work our way toward a query that handles them in Neo4j.

By 
Mark Needham user avatar
Mark Needham
·
Feb. 24, 17 · Tutorial
Likes (5)
Comment
Save
Tweet
Share
7.2K Views

Join the DZone community and get the full member experience.

Join For Free

Every now and then I find myself wanting to import a CSV file into Neo4j and I always get confused with how to handle the various null values that can lurk within.

Let’s start with an example that doesn’t have a CSV file in sight. Consider the following list and my attempt to only return null values:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value = null
RETURN value
 
(no changes, no records)


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value = null RETURN value (no changes, no records)

Hmm, that’s weird. I’d have expected that at least keep the first value in the collection. What about if we do the inverse?

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value <> null
RETURN value
 
(no changes, no records)


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value <> null RETURN value (no changes, no records)

Still nothing! Let’s try returning the output of our comparisons rather than filtering rows:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
RETURN value = null AS outcome
 
╒═══════╤═════════╕
│"value"│"outcome"│
╞═══════╪═════════╡
│null   │null     │
├───────┼─────────┤
│"null" │null     │
├───────┼─────────┤
│""     │null     │
├───────┼─────────┤
│"Mark" │null     │
└───────┴─────────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value RETURN value = null AS outcome ╒═══════╤═════════╕ │"value"│"outcome"│ ╞═══════╪═════════╡ │null │null │ ├───────┼─────────┤ │"null" │null │ ├───────┼─────────┤ │"" │null │ ├───────┼─────────┤ │"Mark" │null │ └───────┴─────────┘

Ok, so that isn’t what we expected. Everything has an ‘outcome’ of ‘null’! What about if we want to check whether the value is the string “Mark”?

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
RETURN value = "Mark" AS outcome
 
╒═══════╤═════════╕
│"value"│"outcome"│
╞═══════╪═════════╡
│null   │null     │
├───────┼─────────┤
│"null" │false    │
├───────┼─────────┤
│""     │false    │
├───────┼─────────┤
│"Mark" │true     │
└───────┴─────────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value RETURN value = "Mark" AS outcome ╒═══════╤═════════╕ │"value"│"outcome"│ ╞═══════╪═════════╡ │null │null │ ├───────┼─────────┤ │"null" │false │ ├───────┼─────────┤ │"" │false │ ├───────┼─────────┤ │"Mark" │true │ └───────┴─────────┘

From executing this query, we learn that if one side of a comparison is null then the return value is always going to be null.So how do we exclude a row if it’s null? It turns out we have to use the ‘is’ keyword rather than using the equality operator. Let’s see what that looks like:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value is null
RETURN value
 
╒═══════╕
│"value"│
╞═══════╡
│null   │
└───────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value is null RETURN value ╒═══════╕ │"value"│ ╞═══════╡ │null │ └───────┘

And the positive case:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value is not null
RETURN value
 
╒═══════╕
│"value"│
╞═══════╡
│"null" │
├───────┤
│""     │
├───────┤
│"Mark" │
└───────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value is not null RETURN value ╒═══════╕ │"value"│ ╞═══════╡ │"null" │ ├───────┤ │"" │ ├───────┤ │"Mark" │ └───────┘

What if we want to get rid of empty strings?

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value <> ""
RETURN value
 
╒═══════╕
│"value"│
╞═══════╡
│"null" │
├───────┤
│"Mark" │
└───────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value <>"" RETURN value ╒═══════╕ │"value"│ ╞═══════╡ │"null" │ ├───────┤ │"Mark" │ └───────┘

Interestingly that also gets rid of the null value which I hadn’t expected. But if we look for values matching the empty string:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
WITH value WHERE value = ""
RETURN value
 
╒═══════╕
│"value"│
╞═══════╡
│""     │
└───────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value WITH value WHERE value = "" RETURN value ╒═══════╕ │"value"│ ╞═══════╡ │"" │ └───────┘

It’s not there either! Hmm what’s going on here:

WITH [null, "null", "", "Mark"] AS values
UNWIND values AS value
RETURN value, value = "" AS isEmpty, value <> "" AS isNotEmpty
 
╒═══════╤═════════╤════════════╕
│"value"│"isEmpty"│"isNotEmpty"│
╞═══════╪═════════╪════════════╡
│null   │null     │null        │
├───────┼─────────┼────────────┤
│"null" │false    │true        │
├───────┼─────────┼────────────┤
│""     │true     │false       │
├───────┼─────────┼────────────┤
│"Mark" │false    │true        │
└───────┴─────────┴────────────┘


WITH [null, "null", "", "Mark"] AS values UNWIND values AS value RETURN value, value = "" AS isEmpty, value <>"" AS isNotEmpty ╒═══════╤═════════╤════════════╕ │"value"│"isEmpty"│"isNotEmpty"│ ╞═══════╪═════════╪════════════╡ │null │null │null │ ├───────┼─────────┼────────────┤ │"null" │false │true │ ├───────┼─────────┼────────────┤ │"" │true │false │ ├───────┼─────────┼────────────┤ │"Mark" │false │true │ └───────┴─────────┴────────────┘

Null values seem to get filtered out for every type of equality match unless we explicitly check that a value ‘is null’.

So how do we use this knowledge when we’re parsing CSV files using Neo4j’s LOAD CSV tool?

Let’s say we have a CSV file that looks like this:

$ cat nulls.csv
name,company
"Mark",
"Michael",""
"Will",null
"Ryan","Neo4j"


$ cat nulls.csv name,company "Mark", "Michael","""Will",null "Ryan","Neo4j"

So none of the first three rows have a value for ‘company’. I don’t have any value at all, Michael has an empty string, and Will has a null value. Let’s see how LOAD CSV interprets this:

load csv with headers from "file:///nulls.csv" AS row
RETURN row
 
╒═════════════════════════════════╕
│"row"                            │
╞═════════════════════════════════╡
│{"name":"Mark","company":null}   │
├─────────────────────────────────┤
│{"name":"Michael","company":""}  │
├─────────────────────────────────┤
│{"name":"Will","company":"null"} │
├─────────────────────────────────┤
│{"name":"Ryan","company":"Neo4j"}│
└─────────────────────────────────┘


load csv with headers from "file:///nulls.csv" AS row RETURN row ╒═════════════════════════════════╕ │"row" │ ╞═════════════════════════════════╡ │{"name":"Mark","company":null} │ ├─────────────────────────────────┤ │{"name":"Michael","company":""} │ ├─────────────────────────────────┤ │{"name":"Will","company":"null"} │ ├─────────────────────────────────┤ │{"name":"Ryan","company":"Neo4j"}│ └─────────────────────────────────┘

We’ve got the full sweep of all the combinations from above. We’d like to create a Person node for each row but only create a Company node and associated ‘WORKS_FOR’ relationship if an actual company is defined – we don’t want to create a null company.

So we only want to create a company node and ‘WORKS_FOR’ relationship for the Ryan row.

The following query does the trick:

load csv with headers from "file:///nulls.csv" AS row
MERGE (p:Person {name: row.name})
WITH p, row
WHERE row.company <> "" AND row.company <> "null"
MERGE (c:Company {name: row.company})
MERGE (p)-[:WORKS_FOR]->(c)
 
Added 5 labels, created 5 nodes, set 5 properties, created 1 relationship, statement completed in 117 ms.


load csv with headers from "file:///nulls.csv" AS row MERGE (p:Person {name: row.name}) WITH p, row WHERE row.company <>"" AND row.company <>"null" MERGE (c:Company {name: row.company}) MERGE (p)-[:WORKS_FOR]->(c) Added 5 labels, created 5 nodes, set 5 properties, created 1 relationship, statement completed in 117 ms.

And if we visualize what’s been created:

Image title

Perfect. Perhaps this behavior is obvious, but it always trips me up. Hopefully, it’ll be useful to someone else as well!


Neo4j Database CSV

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Spring Data Neo4j: How to Update an Entity
  • Leveraging Neo4j for Effective Identity Access Management
  • The Beginner's Guide To Understanding Graph Databases
  • Externalize Microservice Configuration With Spring Cloud Config

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends: