Graph Explosion and Consolidation. The Year of the Graph Newsletter: June 2019
"The future is graph – knowledge graph."
With the knowledge graph space exploding on all fronts (interest, use cases, funding), centrifugal and centripetal forces are simultaneously at play. While the "wild, early days" of knowledge graph technology are gone, the 20-year anniversary of the Semantic Web is a good opportunity to reflect on what worked and what didn't, and to move forward in a pragmatic way.
A testament to the fact that this space is booming: more offerings are available every day, the quality and quantity of knowledge sharing is rising to meet the demand, and at the same time we are starting to see consolidation — in vendors, models, and standards.
The Future of Data is Connected and Open Minded. This was the takeaway from Connected Data London 2018, and it leads naturally to Connected Data London 2019. In other words, the future is graph – knowledge graph. Connected Data London has announced keynotes from Uber, Microsoft, and Siren.io, and workshops with some of the world's leading experts.
The Future Is Graph – Knowledge Graph
Giovanni Tummarello, Siren.io CPO, is one of Connected Data London's keynote speakers. Here, Tummarello offers a bit of history on the wild, early days of graph databases, where they fell short, why they are hot again, what they are unbeatable at, and the concept of "Enterprise-Wide Knowledge Graphs".
Graph DBs in Enterprise: Top 3 Use Cases in Which They Make Sense
The "wild, early days of graph databases" are to a large extent tied to the concept of the Semantic Web. The 20-year anniversary mark for the Semantic Web has inspired some people to share their thoughts. Pierre-Yves Vandenbussche from Elsevier Labs notes that the Semantic and Web parts may actually be at odds with each other, but pragmatism seems to be on its way to prevailing.
Linked Data – Past, Present (2019) and Future
Ruben Verborgh from Ghent University/MIT/Inrupt identifies the Semantic Web's identity crisis, arguing that "the community has an unconscious bias toward addressing the Paretonian 80% of problems through research — handwavingly assuming that trivial engineering can solve the remaining 20%. In reality, that overlooked 20% could actually require 80% of the total effort and involve significantly more research than we are inclined to think."
Case in point: Verborgh's work on Solid, which promises the separation of data and apps, so we can choose our apps independently of where we store our data. Building such decentralized Linked Data apps necessitates a high level of interoperability, where data written by one app needs to be picked up by another. Rather than relying on the heavier Semantic Web machinery of ontologies, Verborgh believes that shapes are the right way forward, without throwing the added value of links and semantics out of the window. Verborgh will talk about "Connecting people, data, and apps without centralization" at Connected Data London.
Shaping Linked Data Apps
Semantics are very relevant in business data management. The internal and external concepts relevant to the business are what Andreas Ingvar van der Hoeven from GeoPhy calls the semantic universe. Van der Hoeven goes on to define a who's who of the semantic world, describing the professions of Taxonomist, Data Librarian, Ontologist, and Semantic Architect.
The Semantic Heroes
Ontologies are a substrate for knowledge graphs, but building them can be intimidating and complex for beginners. C. Maria Keet published the requirements, design, and content of the African Wildlife Ontology tutorial to breathe some fresh air into ontology building for beginners. And for a hands-on education, don't forget Panos Alexopoulos' session at Connected Data London.
The African Wildlife Ontology Tutorial Ontologies: Requirements, Design, and Content
Seasoned ontologist Chris Mungall continues to share his expertise on mapping and documenting ontologies.
Never Mind the Logix: Taming the Semantic Anarchy of Mappings in Ontologies
Steve Baskauf is a Data Curation Specialist with the Jean & Alexander Heard Libraries at Vanderbilt, whose group is interested in using Wikidata. Baskauf recognizes that Wikidata's SPARQL endpoint, like any other SPARQL endpoint, is a powerful API, and his post showcases how to get data out of Wikidata using SPARQL. In complementary work, Heibi, Peroni, and Shotton show how to enable text search on SPARQL endpoints, also using Wikidata as an example.
Getting Data Out of Wikidata Using Software
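To make the "SPARQL endpoint as an API" idea concrete, here is a minimal Python sketch that is not taken from Baskauf's post. It assumes the public Wikidata endpoint at https://query.wikidata.org/sparql and the standard SPARQL 1.1 JSON results format. The example query, the User-Agent string, and the helper names (build_request, parse_bindings, run_query) are all illustrative assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public Wikidata SPARQL endpoint (assumed here; any SPARQL endpoint works the same way).
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) university (Q3918).
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q3918 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""


def build_request(query, endpoint=WIKIDATA_ENDPOINT):
    """Build an HTTP GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        # Hypothetical client name; replace with something that identifies you.
        "User-Agent": "example-sparql-client/0.1",
    }
    return Request(url, headers=headers)


def parse_bindings(payload):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in payload["results"]["bindings"]
    ]


def run_query(query):
    """Send the query over the network and return the flattened rows."""
    with urlopen(build_request(query), timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling run_query(EXAMPLE_QUERY) requires network access and returns one dict per result row, keyed by the variable names in the SELECT clause.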
DBpedia is another of the main hubs in the Linked Open Data world. At its community day in Leipzig, there was an array of presentations on all things DBpedia. Some of the most interesting ones: Heiko Paulheim on harvesting knowledge graph data from wikis, and Maribel Acosta on crowdsourcing knowledge graph quality assessment.
From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph
What happens when you try to extract knowledge using multiple, sometimes contradictory, data sources? This is the question Mos Zhang from SyncedReview is looking into, using BERT and ERNIE, a pretrained model and a model leveraging knowledge graphs, respectively.
Ask AI: Is Bob Dylan an Author or a Songwriter?
data.world has been working on leveraging the benefits of semantics and knowledge graph technology for enterprise data management. Now it joins forces with Capsenta, adding virtualization and consumer-grade UI to its arsenal, in an acqui-hiring move that seems like a perfect match both technologically and culturally.
Data.world Joins Forces With Capsenta to Bring Knowledge Graph-Based Data Management to the Enterprise
In the last few days, we also had an array of news from vendors who may not be the first that come to mind when thinking of graph databases, but are making a graph data play in one way or another.
Cray will be acquired by Hewlett Packard Enterprise. Cray offers, among other things, an RDF graph database, which will now become HPE's property.
Fluree is a startup working on FlureeDB, promising that rather than building expensive APIs, developers are able to safely expose rich and permissioned query interfaces to data sources including GraphQL, SPARQL, and Fluree’s JSON-based FlureeQL. Fluree just raised $4.73 million in seed funding.
RavenDB, an open-source transactional NoSQL document database vendor, has added data replication and other features to the latest release along with the ability to handle graph queries in its own query language.
Another startup, Oxford Semantic Technologies, is working on RDFox, its own graph database. Angus Addlesee from Wallscope has a go at RDFox and adds it as a contender to his RDF graph database benchmark.
Last but not least, Wolfram now supports importing and exporting RDF data, as well as querying with SPARQL.
RDF and SPARQL in Wolfram 12
DataStax added graph capabilities to its database offering a long time ago. Now, if you have DataStax Enterprise core, DataStax Graph, and Developer Studio, the new DataStax Desktop will configure them all to work together seamlessly, without you having to touch a single line of configuration. DreamWorks is a DataStax user, and although its team had no previous experience with graph databases, having historically relied on relational databases, they were brave enough to try after hearing how graph was applied at Netflix. They used Apache TinkerPop and Gremlin for this.
DreamWorks Picks Gremlin to Weave Digital Marvels
GraphDB 8.10 Makes Knowledge Graph Experience Faster and Richer
Stardog also has not one, but two new versions out: 6.2 and 7.0 Beta 2. Stardog 6.2 brings scalable virtual graph caching, better Kubernetes integration, support for Amazon Redshift, and many new optimizations. Stardog 7.0 features a new storage engine, with substantially faster write performance, especially when running in a cluster.
Stardog 7.0 Beta 2
In Neo4j's world, we have a couple of updates (the NEuler graph algorithm playground and Bloom), use cases (Women's World Cup data as a graph, and graph databases for journalists), and a tip: how to calculate TF-IDF scores using Cypher.
Using Graph Algorithms to Explore the Participation of Shell Companies in Public Procurement
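For readers unfamiliar with the TF-IDF score mentioned above, the following is a plain Python sketch of the calculation. It does not reproduce the Cypher tip. It assumes one common variant: term frequency normalized by document length, multiplied by the natural log of (number of documents / number of documents containing the term). The function name tf_idf and the toy documents are illustrative.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF scores.

    documents: dict mapping a document name to its list of tokens.
    Returns a dict mapping each document name to {term: score}.
    """
    n_docs = len(documents)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            # tf = count / document length; idf = ln(N / df)
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Toy corpus: "graph" appears in every document, so its score is zero.
docs = {
    "a": ["graph", "database"],
    "b": ["graph", "query"],
}
scores = tf_idf(docs)
```

Terms that occur in every document score zero under this variant, which is why smoothed versions of the inverse document frequency are also widely used.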
TigerGraph also has its own suite of open source graph algorithms, written in GSQL, its own query language. If you want to learn more about graph query languages and GSQL, and how they are used in fraud detection in the finance sector, you can also check TigerGraph's webinar with Connected Data London.
And to follow that up: TigerGraph has come up with a way to use the Financial Industry Business Ontology (FIBO) in its property graph platform.
Sink Your Teeth Into FIBO With a Native Parallel Graph Database
The Graphileon application development framework comes with a large set of stylable user interface widgets for software development. Users can develop insightful visualizations as they develop graph analytics. Graphileon has announced new partnerships with Cambridge Semantics and DataStax in the last few days.
Graphileon to Support DataStax Enterprise Graph
One thing that should be clear about graph databases by now is that there are a lot of them around, and choosing one is not easy. Sabah Zdanowska shares her experience of getting to grips with graph databases.
Getting to Grips With Graph Databases
Alastair Green has been driving the effort to converge various graph query languages into a new standard called GQL. As he reports, the effort seems to be well underway, with the GQL standard project up for a final vote in an ISO/IEC ballot.
Critical Milestone for ISO Graph Query Standard GQL
To receive the Year of the Graph newsletter every month in your inbox, sign up here.