Summer 2017 Release of the APOC Procedures Library

Neo4j isn't taking the summer off! In this post we take a look at the latest release for Neo4j's APOC library, including some examples! Read on for the goods!

It’s summertime, but that doesn’t mean we’re less active building cool stuff for you to use with Neo4j.

If you haven’t heard of APOC yet (dubbed “Awesome Procedures On Cypher”), it’s a Swiss Army knife of useful utilities that make your life with Neo4j much easier. Besides the documentation, there are a number of past articles that introduce relevant parts of the APOC library.

About three months and almost 50 new features and fixes after our Spring release, we’re happy to announce two new APOC releases, one for Neo4j 3.1 and one for Neo4j 3.2.

Of course, my thanks go to the people contributing to APOC — foremost Alberto, Angelo, Daniele, Omar, and Lorenzo from LARUS in Italy, who did most of the work.

Ron van Weverwijk from GoDataDriven added very useful text comparison features, Brad Nussbaum from AtomRain contributed improvements in load-jdbc, and Stefan Armbruster added new merge procedures.

Valentino Maiorca provided new hashing functions, Max de Marzi built a way of quickly counting the number of different entries for an indexed property, and Andrew Bowman added functions for sorting map-keys.

Highlights of the APOC Release

There are lots of improvements in the Cypher export procedures, which now support neo4j-shell, cypher-shell, and plain formats as well as exporting to separate files or exporting only the constraints. Accordingly, apoc.cypher.runFile and apoc.cypher.runSchemaFile can now consume multiple files.

Three new procedures give you more detailed output of schema indexes and constraints, and apoc.meta.schema returns a nested schema listing of nodes and relationships.

CALL apoc.meta.schema();


// output similar to
"Person": {
  "type": "node",
  "count": "131",
  "labels": [],
  "properties": {
    "name": {"type": "STRING", "indexed": true, "unique": true},
    "born": {"type": "INTEGER", "indexed": true, "unique": false}
  },
  "relationships": {
    "ACTED_IN": {
      "direction": "out",
      "count": "797",
      "labels": ["Movie"],
      "properties": {
        "roles": {"type": "UNKNOWN", "array": true, "existence": false}
      }
    },
    "PRODUCED": {"direction": "out", "count": "27", "labels": ["Movie"], "properties": {}},
    "DIRECTED": {"direction": "out", "count": "49", "labels": ["Movie"], "properties": {}},
    "WROTE": {"direction": "out", "count": "12", "labels": ["Movie"], "properties": {}}
  }
},

There are now procedures to merge nodes and relationships with dynamic labels, relationship-types, and properties (apoc.merge.node/relationship).

CALL apoc.merge.node(['Label'], {id:uniqueValue}, {prop:value,...}) YIELD node;

CALL apoc.merge.relationship(startNode, 'RELTYPE', {[id:uniqueValue]}, {prop:value}, endNode) YIELD rel;

There are also functions for working with large decimal numbers, for example, for finance applications. You can find them in apoc.number.exact.*.

RETURN apoc.number.exact.mul(toString(2^64),"2")

36893488147419104000


RETURN 0.98 - 0.9 as default, apoc.number.exact.sub('0.98','0.9') as exact

╒═══════════════════╤═══════╕
│"default"          │"exact"│
╞═══════════════════╪═══════╡
│0.07999999999999996│"0.08" │
└───────────────────┴───────┘
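The difference between the two columns is ordinary binary floating-point rounding, and it can be reproduced outside Neo4j. As a rough illustration in Python (not APOC's implementation), the standard `decimal` module does exact arithmetic on numbers passed as strings. The helper name `exact_sub` is made up for this sketch.

```python
from decimal import Decimal


def exact_sub(a: str, b: str) -> str:
    """Subtract two decimal numbers given as strings, without float rounding."""
    return str(Decimal(a) - Decimal(b))


default = 0.98 - 0.9              # binary floating point, carries rounding error
exact = exact_sub("0.98", "0.9")  # decimal arithmetic on the string inputs

print(default, exact)
```

Passing the operands as strings matters: once a value has been parsed into a float, the rounding error is already baked in.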

Something that people struggle with in Neo4j is atomic operations on properties of nodes and relationships. That’s why we added procedures to APOC that eagerly take write-locks and retry in case of deadlocks for property updates. Those procedures are located in the apoc.atomic.* namespace.

CREATE (n:Foo {counter:0,owners:[], ownerCount:0});

MATCH (n:Foo)
CALL apoc.atomic.add(n,"counter",1) YIELD newValue AS counter
CALL apoc.atomic.insert(n,"owners",0,"bar") YIELD newValue AS owners
CALL apoc.atomic.update(n,"ownerCount","length(n.owners)") YIELD newValue AS ownerCount
RETURN n, owners, counter, ownerCount

╒════════════════════════════════════════════════════╤══════════════╤═════════╤════════════╕
│"n"                                                 │"owners"      │"counter"│"ownerCount"│
╞════════════════════════════════════════════════════╪══════════════╪═════════╪════════════╡
│{"ownerCount":2,"owners":["foo","bar"],"counter":2} │["foo","bar"] │2        │2           │
└────────────────────────────────────────────────────┴──────────────┴─────────┴────────────┘
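The general idea of "lock first, then read-modify-write, and retry if the lock is not available" can be sketched in plain Python. This is only an analogy using threads and an in-memory object; the `Node` class, `atomic_add` helper, retry count, and timeout are all assumptions for illustration and say nothing about how APOC implements it internally.

```python
import threading


class Node:
    """Hypothetical stand-in for a graph node: a property map plus a write lock."""

    def __init__(self, **props):
        self.props = dict(props)
        self.lock = threading.Lock()


def atomic_add(node, prop, delta, retries=5, timeout=1.0):
    """Take the node's write lock first, then read-modify-write the property.

    If the lock cannot be acquired within `timeout` seconds, try again,
    up to `retries` attempts in total.
    """
    for _ in range(retries):
        if node.lock.acquire(timeout=timeout):
            try:
                node.props[prop] += delta
                return node.props[prop]
            finally:
                node.lock.release()
    raise RuntimeError(f"could not lock node after {retries} attempts")


def worker(node, times):
    # Each worker increments the shared counter `times` times.
    for _ in range(times):
        atomic_add(node, "counter", 1)


n = Node(counter=0)
threads = [threading.Thread(target=worker, args=(n, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(n.props["counter"])
```

Because every increment happens while the lock is held, no update from the eight concurrent workers is lost.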

A new function apoc.map.updateTree allows you to update tree structures based on matching keys.

apoc.map.updateTree(treeMap, key, updateList)

RETURN apoc.map.updateTree(
{name:"Michael",kids:[{name:"Rana"},{name:"Selma"},{name:"Selina"}]},
'name',
[['Michael',{born:1975}],["Selina",{born:1998}],["Rana",{born:2005}],["Selma",{born:2008}]])

{"name":"Michael",
 "born":1975,
 "kids":[
   {"name":"Rana","born":2005},
   {"name":"Selma","born":2008},
   {"name":"Selina","born":1998}]
}
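To make the matching behavior concrete, here is a small Python sketch that reproduces the result above: every map in the tree whose value for the given key appears in the update list gets the corresponding properties merged in. The function name `update_tree` and the handling of edge cases are assumptions of this sketch, not a description of APOC's code.

```python
def update_tree(tree, key, updates):
    """Return a copy of `tree` in which every map whose `key` value appears
    in `updates` is merged with the matching update map."""
    lookup = {match: extra for match, extra in updates}

    def visit(value):
        if isinstance(value, dict):
            # Recurse into nested values first, then merge in a matching update.
            result = {k: visit(v) for k, v in value.items()}
            match = value.get(key)
            if isinstance(match, (str, int)):
                result.update(lookup.get(match, {}))
            return result
        if isinstance(value, list):
            return [visit(item) for item in value]
        return value

    return visit(tree)


family = {"name": "Michael",
          "kids": [{"name": "Rana"}, {"name": "Selma"}, {"name": "Selina"}]}
births = [["Michael", {"born": 1975}], ["Selina", {"born": 1998}],
          ["Rana", {"born": 2005}], ["Selma", {"born": 2008}]]

print(update_tree(family, "name", births))
```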

There is now an apoc.load.jdbcUpdate procedure for doing updates in relational databases.

CALL apoc.load.jdbcUpdate(jdbc-url, statement, params) YIELD row;

MATCH (u:User)-[:BOUGHT]->(p:Product)<-[:BOUGHT]-(o:User)-[:BOUGHT]->(reco)
WHERE u <> o AND NOT (u)-[:BOUGHT]->(reco)
WITH u, reco, count(*) as score
WHERE score > 1000
CALL apoc.load.jdbcUpdate('jdbc:mysql:....',
'INSERT INTO RECOMMENDATIONS values(?,?,?)',[u.id, reco.id, score]) YIELD row;
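The statement uses positional `?` placeholders, with the parameter list supplying one value per placeholder in order. The same pattern can be tried out locally with Python's built-in `sqlite3` module, used here purely as a stand-in for a JDBC connection; the table layout and the sample rows are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recommendations (user_id INTEGER, product_id INTEGER, score INTEGER)"
)

# Hypothetical (user id, product id, score) rows, standing in for the
# rows the Cypher query would produce.
rows = [(1, 42, 1200), (2, 7, 1500)]

# Each tuple is bound to the ? placeholders in order, one INSERT per row.
# Using the connection as a context manager commits on success.
with conn:
    conn.executemany("INSERT INTO recommendations VALUES (?, ?, ?)", rows)

inserted = conn.execute("SELECT COUNT(*) FROM recommendations").fetchone()[0]
print(inserted)
```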

A new apoc.nodes.connected function can be used as an efficient connection test between dense and non-dense nodes.

Relationships can now be merged as a new graph refactoring, and you can control via configuration what happens to the properties.

CALL apoc.refactor.mergeRelationships([rel1,rel2,..., relN], {config}) YIELD rel

Similar to the json-path in the JSON functions and procedures, you can now use XPath to access a subset of an XML document in apoc.load.xml.

CALL apoc.load.xml('books.xml', '/catalog/book[@id="bk102"]/author') YIELD value
WITH value._text as author
RETURN author
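If you want to check what such a path expression selects before running it against Neo4j, a quick experiment in Python works well. The sketch below uses the standard `xml.etree.ElementTree` module, which supports only a subset of XPath, and a tiny made-up document rather than the real books.xml.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; only the structure mirrors the query above.
BOOKS_XML = """
<catalog>
  <book id="bk101"><author>Author One</author><title>First Title</title></book>
  <book id="bk102"><author>Author Two</author><title>Second Title</title></book>
</catalog>
"""

root = ET.fromstring(BOOKS_XML)  # root element is <catalog>

# ElementTree paths are relative to the root element, so
# /catalog/book[@id="bk102"]/author is written as:
authors = [el.text for el in root.findall("./book[@id='bk102']/author")]
print(authors)
```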

New text comparison functions provide fuzzy matching and Levenshtein distance for strings, which is really useful for data matching and merging.

RETURN apoc.text.distance('Berlin','Bärlin'); 
RETURN apoc.text.distance('Neo4j','neoj4'); 
RETURN apoc.text.fuzzyMatch('Cypher','Ciper'); 
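For readers unfamiliar with the metric, Levenshtein distance counts the single-character insertions, deletions, and substitutions needed to turn one string into another. A minimal, case-sensitive Python sketch of the textbook dynamic-programming algorithm follows; it illustrates the metric only and is not APOC's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and substitutions
    needed to turn `a` into `b` (case-sensitive)."""
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("Berlin", "Bärlin"))
```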

A new feature that I learned about from Martin Junghanns is graph grouping. You can take an existing graph and group it into a virtual graph by node labels and properties to get an overview.

The procedure currently only aggregates nodes and relationships with counts, so you can set the caption in your Neo4j Browser to caption: "{count} x {property}";. Going forward, I want to make it more versatile and efficient.

Here, I group the movies and actors of the movie graph by the century they were released or born in.

MATCH (n:Movie) SET n.century = n.released/100*100 return count(*)
UNION
MATCH (n:Person) SET n.century = n.born/100*100  return count(*);

CALL apoc.nodes.group(['Person','Movie'],['century']);
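Conceptually, the grouping collapses all nodes that share a label and a property value into one summary node with a count. The Python sketch below mimics that for the node side only, using invented sample records; the integer division mirrors the n.released/100*100 expression above.

```python
from collections import Counter

# Hypothetical sample records: (label, year released or born).
records = [
    ("Movie", 1999), ("Movie", 2003), ("Movie", 1992),
    ("Person", 1964), ("Person", 1967), ("Person", 2001),
]


def century(year: int) -> int:
    """Integer-divide by 100 and multiply back, e.g. 1999 becomes 1900."""
    return year // 100 * 100


# One group per (label, century) pair, each carrying a count.
groups = Counter((label, century(year)) for label, year in records)

for (label, cent), count in sorted(groups.items()):
    print(f"{count} x {label} {cent}")
```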

Bug Fixes

  • apoc.warmup.run had a bug that made it fail on graphs with more than two billion nodes or relationships.
  • apoc.periodic.iterate should now report nested errors better.
  • Improved error messages for missing database drivers.
  • Dropping the existing schema in apoc.schema.assert is now optional.
  • Faster turnaround in parallelization.
  • apoc.convert.toMap now also works for nodes and relationships.

We got fixes and improvements for code and documentation from Gábor Szárnyas, Chris Willemsen, John Bodley, Elad Wiess, and Nicholas Schiestel. Thank you!

Installation

As usual, you can grab the latest APOC releases from here.

Then just drop the jar-file into your $NEO4J_HOME/plugins directory (note the instructions for different install locations for Neo4j-Community in the readme) and restart your server.

You can find more details on the new procedures and functions in the documentation or via CALL apoc.help('keyword').

Feedback

And, of course, APOC cannot improve if you don’t provide your feedback, so please let us know if you like it and find it useful (especially on Twitter).

If you run into problems or have ideas for improvements, don’t hesitate to open an issue.

Of course, the best thing is to get a pull request with a bug fix or improvement, so don’t be afraid and give it a try.

Have fun connecting and have a great summer!

Topics:
apoc, neo4j, graph database, database, procedures

Published at DZone with permission of
