Store Semantic Web Triples in Cassandra

Learn how to store triples into a kb_dph table with the help of kb_direct_preds to optimize semantic web searches in Cassandra.

The semantic web extends the web with well-defined, machine-readable data, which makes web searches more intelligent and better matched to what users need. You can find some interesting points on the semantic web here.

A triple is the atomic unit of data in RDF. It is composed of a subject, a predicate, and an object, where the predicate links the subject to the object. You can find some interesting points on triples here.
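To make the subject-predicate-object structure concrete, here is a minimal Scala sketch. The names Triple and TripleParser are hypothetical and are not taken from the project, and the parser only handles simple one-line statements in the N-Triples style, not the full grammar.

```scala
// Illustrative model of an RDF triple; not the project's own class.
final case class Triple(subject: String, predicate: String, obj: String)

object TripleParser {
  // Matches a simple line of the form: <subject> <predicate> <object-or-literal> .
  private val Line = """^\s*(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$""".r

  // Returns None when the line does not look like a single triple statement.
  def parse(line: String): Option[Triple] = line match {
    case Line(s, p, o) => Some(Triple(s, p, o))
    case _             => None
  }
}
```

For example, parsing `<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .` yields a Triple whose object is the literal `"Alice"`, quotes included.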

RDF stands for Resource Description Framework, a framework for representing information about resources as a graph. An RDF store holds triples and is queried with SPARQL. Quetzal, the RDF store used here as a reference, creates a set of relational tables and translates SPARQL queries into ordinary SQL queries against those tables.

That raises an obvious question: what are we doing here?

We are trying to store triples in Cassandra, much as Quetzal stores them in Postgres after creating its tables. Quetzal creates many tables for storing triples, depending on certain conditions. The tables created by Quetzal include:

  1. kb
  2. kb_basestats
  3. kb_direct_preds
  4. kb_dph
  5. kb_ds
  6. kb_dt
  7. kb_lstr
  8. kb_reverse_preds
  9. kb_rph
  10. kb_rs
  11. kb_topkstats

For more information on these tables, you can visit here.

We are trying to implement the kb_direct_preds and kb_dph tables in Cassandra and fetch the object for a given subject and predicate.

The kb_direct_preds table stores the location of each predicate within the kb_dph table, while kb_dph stores the triples themselves: subject, predicate, and object.
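The relationship between the two tables can be sketched in memory. This is a deliberately simplified model, not Quetzal's or the project's actual schema: the object name DirectHashSketch, the type aliases, and the rule of giving each distinct predicate the next free column position are all assumptions made for illustration. A subject with several objects for the same predicate is not handled here; the last one wins.

```scala
// Simplified in-memory model of the two tables; not the real schema.
object DirectHashSketch {
  // kb_direct_preds: predicate -> column position inside a kb_dph row.
  type DirectPreds = Map[String, Int]

  // kb_dph: subject -> (column position -> (predicate, object)).
  type Dph = Map[String, Map[Int, (String, String)]]

  // Assumption: each distinct predicate gets the next free position.
  def buildDirectPreds(triples: Seq[(String, String, String)]): DirectPreds =
    triples.map(_._2).distinct.zipWithIndex.toMap

  // Place every triple in its subject's row, at the predicate's position.
  def buildDph(triples: Seq[(String, String, String)], preds: DirectPreds): Dph =
    triples.foldLeft[Dph](Map.empty) { case (dph, (s, p, o)) =>
      val row = dph.getOrElse(s, Map.empty[Int, (String, String)])
      dph.updated(s, row.updated(preds(p), (p, o)))
    }

  // Fetch the object: find the predicate's position, then read that column.
  def lookup(dph: Dph, preds: DirectPreds, subject: String, predicate: String): Option[String] =
    preds.get(predicate).flatMap { col =>
      dph.get(subject)
        .flatMap(_.get(col))
        .collect { case (p, o) if p == predicate => o }
    }
}
```

The point of the indirection is that a lookup by subject and predicate touches a single row and a known column, rather than scanning all triples.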

Ingest Data Into Cassandra

With Apache Cassandra running on its default port (9042), load a triple file with:

sbt "runMain com.knoldus.TripleLoader /PATH/TO/TRIPLE/FILE"

Fetch Data From Cassandra

Follow these steps to fetch data from the Cassandra table.

Run the project with the command sbt run. Once it is running, open the endpoint in your browser:

localhost:8082/triples?subject=<subject_value>&predicate=<predicate_value>
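Subjects and predicates in RDF are often IRIs containing characters such as ":" and "/", which need percent-encoding when passed as query parameters. The following sketch builds the request URL for the endpoint above. The object name TripleQueryUrl and the http scheme are assumptions, and whether the service expects values with or without surrounding angle brackets is not specified here.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper for building the query URL; not part of the project.
object TripleQueryUrl {
  // Percent-encode a query parameter value as UTF-8.
  private def enc(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8.name())

  // The default base assumes the service is reachable over plain HTTP on port 8082.
  def build(subject: String, predicate: String, base: String = "http://localhost:8082"): String =
    s"$base/triples?subject=${enc(subject)}&predicate=${enc(predicate)}"
}
```

Calling `TripleQueryUrl.build("http://example.org/alice", "name")` produces a URL whose subject parameter is `http%3A%2F%2Fexample.org%2Falice`.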

You can find more information on the Wiki of the project.

We have also published a template for the project. You can generate a new project from the template with this command:

sbt new knoldus/triple-manipulation.g8

These are the steps for storing triples into a kb_dph table with the help of kb_direct_preds.

If you have any questions, you can comment here!

This article originally appeared on the Knoldus blog.

Topics:
database, cassandra, data storage, ingesting, fetching, tutorial, rdf, quetzal, postgres

Published at DZone with permission of
