
Neo Technology and the LDBC Project


Curator's Note: The content of this article was originally written by Alex Averbuch on the Neo4j blog. 

A bit has happened since I (Alex Averbuch) last updated you about progress in the LDBC (Linked Data Benchmark Council) and Neo’s part in it. So, without further ado and in no particular order, here’s what we’ve been doing...

Second Technical User Community meeting

Of high priority for the LDBC is getting industry input on benchmark development - benchmarks that are not interesting to industry are generally not very interesting. To address this we engage with industry via bi-annual Technical User Community (TUC) meetings, where experts from both industry and academia are invited to present their data management use cases and participate in the LDBC benchmark development process.
This past April the  second TUC meeting was hosted in Munich by the Technical University of Munich, an academic partner of the LDBC project.
The two-day meeting, dominated by presentations and subsequent discussion, was a complete success. Many thought leaders from leading graph/RDF data management organizations (both academic and industry) were there to give talks. Among them were: Wolters Kluwer, BBC, R.J. Lee Group, Oracle, Dshini, BNF, St. Judes Medical, UCB, Brox, ACCESO, Actify, Max Planck Institute for Informatics (presenting the YAGO project), Mediapro, University of Cyprus, AGT International, UIBK, and OpenPhacts.
One highlight was the talk by Klaus Großmann (Dshini CTO) entitled  Neo4j at Dshini  (Dshini is a German social network that aims to generate purchasing power through activity only - members earn virtual currency, save up and redeem it to fulfill their wishes).
In his presentation, Klaus shared his experience of using Neo4j as the main data storage technology at Dshini, and provided many insights regarding graph data modeling in the real world. A great talk and very useful input to our benchmark design process - a perfect illustration of the value gained by involving industry in the LDBC!
For anyone that’s interested,  slides from most of the talks are available here. Thanks to all who participated!
Neo Technology in upcoming workshops and conferences
A natural byproduct of Neo’s participation in the LDBC is a general increased presence in academic circles. In the coming months Neo will be present and participating in a number of exciting events, including (but not limited to) the GRADES and GraphLab workshops.
GRADES workshop (23rd of June in NYC): co-sponsored by the LDBC, this workshop is designed to spark discussion and descriptions of application areas and open challenges related to the management of large-scale graph data. Neo will contribute both as organizer (as member of the program committee) and as participant; in collaboration with the Institute for Scientific Interchange Foundation and the SocioPatterns project, Ciro Cattuto, André Panisson, Marco Quaggiotto, and I will present a paper at GRADES about modeling time-varying social graphs in Neo4j.
GraphLab workshop (1st of July in SFO): also co-sponsored by the LDBC, this event will focus on large scale machine learning on sparse graphs. Here too Neo is a member of the program committee, and we will have a number of representatives at the event.
Both I and my colleague  Philip Rathle will be at the event, to represent Neo and the LDBC project.
Not to mention GraphConnect (@GraphConnect)... this will be a series of five conferences across the USA and England, held between June and November of this year!
Recent benchmark efforts, their relevance, and what we're busy building
Lately a number of graph database-related micro-benchmarking efforts have been published; these are obviously interesting to Neo, both in general and in the context of LDBC. Though a growing number of such examples are popping up, a recent one that stands out is  LinkBench from Facebook. More specifically, what stands out is the data generator embedded in LinkBench.  
The general 'problem' with generators is that they generate synthetic data: the data is not real, and its characteristics are perhaps not representative of the real world. LinkBench is unique in that it was developed at Facebook - few organizations have access to a real social network dataset as immense or rich as Facebook’s. This puts Facebook researchers in the unique position of being able to verify the “realisticness” (I just made it a word...) of the data generators they develop - and, now, Facebook have made LinkBench public, along with details of its data generator!

How does this relate to the LDBC?  

It assists us in developing more meaningful benchmarks. 
We (Vrije University and the Polytechnic University of Catalonia in particular) are in the process of developing the LDBC data generator - a continuation of the work performed by Vrije University on the SIB social network generator. We've now gone through the process of evaluating LinkBench (and a number of real datasets) and are modifying the LDBC data generator, applying the lessons learned to improve the generator's "realisticness".
In parallel, we've also started development of a benchmark driver, for future LDBC benchmarks to use. More on that in a later post!  
The first versions of both the LDBC benchmark driver and LDBC data generator will be published on our  public github account as soon as we have something to share!  
In the meantime, stay up to date with the LDBC project via LinkedIn, Twitter (@LDBCproject), Facebook, or the main project page - www.ldbc.eu.


