Neo4j: What is a Node?

One of the first things I needed to learn when I started using Neo4j was how to model my domain using nodes and relationships, and it wasn’t initially obvious to me which things should be nodes.

Luckily Ian Robinson showed me a mini-algorithm which I found helpful for getting started. The steps are as follows:

  1. Write out the questions you want to ask
  2. Highlight/underline the nouns
  3. Those are your nodes!

This is reasonably similar to the way we work out what our objects should be when doing OO modelling, so I thought I’d try it on some questions about data sets that I’ve worked with recently:

  • Female friends of friends that somebody could go out with
  • Goals scored by Arsenal players in a particular season
  • Colleagues who have similar skills to me
  • Episodes of a TV program that a particular actor appeared in
  • Customers who would be affected if a piece of equipment went in for repair

If you’re like me and aren’t that great at English grammar, you can always cheat and get NLTK to help out:

>>> import nltk
>>> nltk.pos_tag(nltk.word_tokenize("Female friends of friends that somebody could go out with"))
[('Female', 'NNP'), ('friends', 'NNS'), ('of', 'IN'), ('friends', 'NNS'), ('that', 'WDT'), ('somebody', 'NN'), ('could', 'MD'), ('go', 'VB'), ('out', 'RP'), ('with', 'IN')]

That gives us the likely part-of-speech tag for each word in the sentence. We can then filter the resulting list so that we only see the nouns:

>>> nouns = ['NN', 'NNS', 'NNP', 'NNPS']
>>> [(word, grammar) for (word, grammar) in nltk.pos_tag(nltk.word_tokenize("Female friends of friends that somebody could go out with")) if grammar in nouns]
[('Female', 'NNP'), ('friends', 'NNS'), ('friends', 'NNS'), ('somebody', 'NN')]

We can ignore the ‘Female’ in this sentence (I think it’s been picked up as a proper noun because of the capitalisation), which leaves us with ‘friends’ and ‘somebody’. In both cases these nouns represent the concept of a person, so we’d want to create nodes representing people in this domain.
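The filtering step can also be sketched as a small helper that works on tokens that have already been tagged, so it runs without NLTK installed. The name `noun_candidates` is hypothetical, and matching on the `NN` prefix is an assumption that the tags follow the Penn Treebank scheme seen in the output above.

```python
# Hypothetical helper: keep only the words whose tag marks them as nouns.
def noun_candidates(tagged_tokens):
    """Return the unique nouns, lower-cased, in order of first appearance."""
    seen = []
    for word, tag in tagged_tokens:
        # Penn Treebank noun tags (NN, NNS, NNP, NNPS) all start with 'NN'.
        if tag.startswith('NN') and word.lower() not in seen:
            seen.append(word.lower())
    return seen


# The tagged output shown above for the first question.
tagged = [('Female', 'NNP'), ('friends', 'NNS'), ('of', 'IN'),
          ('friends', 'NNS'), ('that', 'WDT'), ('somebody', 'NN'),
          ('could', 'MD'), ('go', 'VB'), ('out', 'RP'), ('with', 'IN')]

print(noun_candidates(tagged))  # ['female', 'friends', 'somebody']
```

Removing duplicates makes it easier to see that only a handful of distinct nouns need a modelling decision; ‘female’ still has to be discarded by hand.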

Let’s see how NLTK gets on with our second question:

>>> sentence = "Goals scored by Arsenal players in a particular season"
>>> [(word, grammar) for (word, grammar) in nltk.pos_tag(nltk.word_tokenize(sentence)) if grammar in nouns]
[('Goals', 'NNS'), ('Arsenal', 'NNP'), ('players', 'NNS'), ('season', 'NN')]

In this case we’d have goals, teams (e.g. Arsenal), players and seasons as our nodes.
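Going from nouns to nodes is still a manual modelling decision. As a rough sketch of that last step for the two questions above, the nouns can be looked up in a hand-written table of concepts. The `CONCEPTS` table, the `node_labels` function and the label names are all assumptions made for illustration; nothing here comes from NLTK or Neo4j.

```python
# Hand-written, illustrative mapping from nouns to node labels (an assumption,
# not something NLTK or Neo4j produces for us).
CONCEPTS = {
    'friends': 'Person',
    'somebody': 'Person',
    'goals': 'Goal',
    'arsenal': 'Team',
    'players': 'Player',
    'season': 'Season',
}


def node_labels(nouns, concepts=CONCEPTS):
    """Map nouns to labels, dropping duplicates and nouns with no mapping."""
    labels = []
    for noun in nouns:
        label = concepts.get(noun.lower())
        if label is not None and label not in labels:
            labels.append(label)
    return labels


print(node_labels(['friends', 'friends', 'somebody']))
# ['Person']
print(node_labels(['Goals', 'Arsenal', 'players', 'season']))
# ['Goal', 'Team', 'Player', 'Season']
```

Several nouns collapsing into a single label, as ‘friends’ and ‘somebody’ do here, is a hint that they are the same kind of node playing different roles in the question.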

Although this is a very rough algorithm for working out what should be a node in a graph, I think it’s a good way to get started.

After that, the other queries we want to write may lead us to change the model so that it solves our problem even better.


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

