Comparing Grakn to Semantic Web Technologies — Part 2/3

In part 2, take a look at SPARQL and RDFS and explore inserting and querying data with SPARQL, look at the RDF schema, and more.

This is part two of Comparing Semantic Web Technologies to Grakn. In the first part, we looked at how RDF compares to Grakn. In this part, we look specifically at SPARQL and RDFS.

SPARQL

What Is SPARQL?

SPARQL is a W3C-standardised language for querying information from databases that can be mapped to RDF. Like SQL, SPARQL lets us insert and query data. Unlike SQL, queries aren’t constrained to a single database and can be federated across multiple HTTP endpoints.

Graql is Grakn’s equivalent query language. As in SPARQL, Graql lets us insert and query data. However, given that Graql is not built as an open Web language, it doesn’t natively support querying across multiple endpoints (this can be done with one of Grakn’s client drivers). As such, Graql is more similar to SQL and the query languages of other traditional database management systems.

Inserting Data With SPARQL

The snippet below shows how two RDF triples are inserted into the default graph store with SPARQL:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
INSERT DATA
{ 
  <http://example/book1> dc:title "A new book" ;
                         dc:creator "A.N.Other" .
}



In Graql, we begin with the insert statement to declare that data is to be inserted. The variable $b is assigned to the entity type book, which has a title with value "A new book" and a creator "A.N.Other".

insert
$b isa book, has title "A new book", has creator "A.N.Other";



Querying With SPARQL

In SPARQL, we first declare the namespaces whose vocabularies we want to use and bind each of them to a PREFIX. The actual query starts with SELECT, followed by the variables we want returned. Then, in the WHERE clause, we state the graph pattern that SPARQL matches against the data. In this query, we look for all the persons that "Adam Smith" knows, using the namespaces foaf and vCard:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?whom
WHERE {
     ?person rdf:type  foaf:Person .
     ?person vCard:family-name "Smith" .
     ?person vCard:given-name  "Adam" .
     ?person foaf:knows ?whom .
 }



In Graql, we begin with the match statement to declare that we want to retrieve data. We match an entity of type person that has a family-name "Smith" and a given-name "Adam". Then, we connect it through a knows relation type to $p2. As we want to know who "Adam Smith" knows, we ask for $p2 to be returned by declaring it in the get statement:

match $p isa person, has family-name "Smith", has given-name "Adam"; 
($p, $p2) isa knows; 
get $p2;
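
To make the idea of graph pattern matching more concrete, here is a minimal Python sketch. It is not how a SPARQL engine or Grakn evaluates queries; it only assumes a tiny in-memory set of triples (the sample data, the identifiers p1 to p4, and the match helper are all hypothetical) and finds the bindings for the "whom does Adam Smith know" pattern by naive backtracking.

```python
# Toy triple store: (subject, predicate, object).
# All data here is made up for illustration.
TRIPLES = {
    ("p1", "rdf:type", "foaf:Person"),
    ("p1", "vcard:family-name", "Smith"),
    ("p1", "vcard:given-name", "Adam"),
    ("p1", "foaf:knows", "p2"),
    ("p1", "foaf:knows", "p3"),
    ("p2", "rdf:type", "foaf:Person"),
    ("p2", "vcard:family-name", "Smith"),
    ("p2", "vcard:given-name", "Eve"),
    ("p2", "foaf:knows", "p4"),
}


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if is_var(term):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


QUERY = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "vcard:family-name", "Smith"),
    ("?person", "vcard:given-name", "Adam"),
    ("?person", "foaf:knows", "?whom"),
]

known = sorted(b["?whom"] for b in match(QUERY, TRIPLES))
print(known)  # ['p2', 'p3']
```

Each pattern in QUERY corresponds to one line of the WHERE clause, and the shared ?person variable is what joins them together.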



Let’s look at a different query: Give me the director and movies that James Dean played in, where also a woman played a role, and that woman played in a movie directed by John Ford. Below is the SPARQL code and the visual representation of this traversal type query.

PREFIX  movie: <http://example.com/moviedb/0.1/>
PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  foaf: <http://xmlns.com/foaf/0.1/>

SELECT  ?director ?movie
WHERE{
?actor         rdf:type             foaf:Man ;
               movie:name          "James Dean" ;
               movie:playedIn       ?movie .
?actress       movie:playedIn       ?movie ;
               rdf:type             foaf:Woman ;
               movie:playedIn       ?anotherMovie .
?JohnFord      rdf:type             foaf:Man ;
               movie:name           "John Ford" .
?anotherMovie     movie:directedBy  ?JohnFord .
}


Visual representation of the SPARQL traversal query.

In Grakn, we can ask the same question like this:

match 
$p isa man, has name "James Dean"; 
$w isa woman; 
(actor: $p, actress: $w, casted-movie: $m) isa casting; 
(actress: $w, casted-movie: $m2) isa casting; 
$d isa man, has name "John Ford"; 
($m2, $d) isa directorship;
get $d, $m;



Here, we assign the entity type man with attribute name and value "James Dean" to the variable $p. We then say that $w is of entity type woman. These two are connected with movie in a three-way relation called casting. The woman also plays a role in another casting relation, where the movie entity is connected to "John Ford" who relates to this movie through a directorship relation.

In the example above, the hyper-relation casting in Grakn represents the two playedIn properties in SPARQL. However, in SPARQL we can only have two edges connecting the woman and "James Dean" to the movie, not an edge between the two of them. This shows how fundamentally different modelling in Grakn is from RDF, given Grakn's ability to model hypergraphs. Grakn can natively represent any number of role players in one relation without having to reify the model.

Schematically, this is how the query above is represented visually (note the ternary relation casting):

Visual representation of the query in Grakn.
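
The difference between two binary edges and one ternary relation can be sketched in plain Python. This is only an illustration of the two modelling styles, not Grakn's or any triple store's internal representation; the Relation class, the helper functions, and the identifiers used as sample data are assumptions made for the example.

```python
from dataclasses import dataclass

# Triple-style modelling: two independent binary playedIn edges.
EDGES = {
    ("james_dean", "playedIn", "giant"),
    ("elizabeth_taylor", "playedIn", "giant"),
}


@dataclass(frozen=True)
class Relation:
    """One n-ary relation instance: a type plus (role, player) pairs."""
    type: str
    role_players: tuple  # e.g. (("actor", "james_dean"), ...)

    def players(self, role):
        return [p for r, p in self.role_players if r == role]


# Hypergraph-style modelling: one casting relation with three role players.
CASTINGS = [
    Relation("casting", (
        ("actor", "james_dean"),
        ("actress", "elizabeth_taylor"),
        ("casted-movie", "giant"),
    )),
]


def co_stars_via_edges(person):
    """With binary edges, co-stars are found by joining on the shared movie."""
    movies = {o for s, p, o in EDGES if s == person and p == "playedIn"}
    return sorted({s for s, p, o in EDGES
                   if p == "playedIn" and o in movies and s != person})


def co_stars_via_relation(person):
    """With an n-ary relation, the co-star sits in the same relation instance."""
    return sorted({a for rel in CASTINGS
                   if person in rel.players("actor")
                   for a in rel.players("actress")})


print(co_stars_via_edges("james_dean"))     # ['elizabeth_taylor']
print(co_stars_via_relation("james_dean"))  # ['elizabeth_taylor']
```

Both lookups return the same answer here, but only the second keeps the actor, the actress, and the movie together as one fact with named roles.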

Negation

In SPARQL, we can also specify in our query that certain data isn’t there, using the keyword NOT EXISTS inside a FILTER. A solution is kept only if the given graph pattern doesn't match. In the example below, we look for actors who played in the movie Giant but have not yet passed away:

PREFIX movie: <http://example.com/moviedb/0.1/>
SELECT ?actor
WHERE {
      ?actor movie:playedIn movie:Giant .
      FILTER NOT EXISTS { ?actor movie:diedOn ?deathdate . }
}



Grakn supports negation under the closed-world assumption. This is done using the keyword not followed by the pattern to be negated. The example above is represented like this:

match 
$m isa movie, has name "Giant"; ($a, $m) isa played-in; 
not {$a has death-date $dd;};
get $a;



Here, we’re looking for an entity type movie with name "Giant", which is connected to $a, an actor, through a relation of type played-in. In the not sub-query, we specify that $a must not have an attribute of type death-date with any value. We then get the actor $a.
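
Negation of this kind amounts to keeping only those matches for which a second lookup finds nothing. The following Python sketch illustrates that idea under a closed-world reading, where a missing death date is taken to mean the actor is alive. The data, the actor identifiers, and the living_actors helper are invented for the example.

```python
# Toy data, made up for illustration: who played in which movie,
# and the death dates that happen to be recorded.
PLAYED_IN = {
    ("actor_a", "Giant"),
    ("actor_b", "Giant"),
    ("actor_c", "Another Movie"),
}
DEATH_DATE = {"actor_a": "1955-09-30"}


def living_actors(movie):
    """Actors in `movie` for whom no death date is recorded."""
    return sorted(
        actor for actor, title in PLAYED_IN
        if title == movie and actor not in DEATH_DATE
    )


print(living_actors("Giant"))  # ['actor_b']
```

Under an open-world reading, the absence of a death date would only mean that the date is unknown, which is why the assumption matters for how the result is interpreted.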

RDF Schema

As RDF is just a data exchange model, on its own it’s “schemaless”. That’s why RDF Schema (RDFS) was introduced to extend RDF with basic ontological semantics. These allow, for example, for simple type hierarchies over RDF data. In Grakn, Graql also serves as the schema language.

RDFS Classes

RDFS extends the RDF vocabulary and allows us to describe taxonomies of classes and properties. An RDFS class declares an RDFS resource as a class for other resources; this is written as rdfs:Class. Using RDF/XML, creating a class animal with a sub-class horse would look like this:

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdfs:Class rdf:ID="animal" />
<rdfs:Class rdf:ID="horse">
  <rdfs:subClassOf rdf:resource="#animal"/>
</rdfs:Class>
</rdf:RDF>



To do the same in Grakn, we would write this:

define 
animal sub entity; 
horse sub animal;



RDFS also allows for sub-typing of Properties:

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdfs:Class rdf:ID="mammal" />
<rdfs:Class rdf:ID="human">
  <rdfs:subClassOf rdf:resource="#mammal"/>
</rdfs:Class>
<rdf:Property rdf:ID="employment" />
<rdf:Property rdf:ID="part-time-employment">
  <rdfs:subPropertyOf rdf:resource="#employment"/>
</rdf:Property>
</rdf:RDF>



Which in Grakn would look like this:

define
mammal sub entity; 
human sub mammal;
employment sub relation;
part-time-employment sub employment;



As the examples show, RDFS mainly describes types of objects (Classes) that can inherit from one another (subClassOf), and properties that describe objects (Properties), which can likewise inherit from one another (subPropertyOf). This sub-typing behaviour can be obtained with Graql's sub keyword, which can be used to create type hierarchies of any thing (entities, relations, and attributes) in Grakn.

However, despite their seeming similarities, a class should not always be mapped one-to-one to an entity in Grakn, nor a property to a relation. This is because RDF is built on a lower-level data model that works in triples, while Grakn lets us model at a higher level.
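
What a type hierarchy buys us at query time can be shown with a small Python sketch: asking for instances of a supertype also returns instances of its subtypes. This is a simplified illustration, not the RDFS entailment procedure or Grakn's reasoner. The horse/animal and human/mammal pairs come from the examples above, while the mammal/animal link, the instance names, and both helper functions are assumptions added for the example.

```python
# Single-parent type hierarchy: child type -> parent type.
# "mammal" -> "animal" is an assumption added for this example.
SUBTYPE_OF = {
    "horse": "animal",
    "human": "mammal",
    "mammal": "animal",
}

# Hypothetical instances with their directly declared type.
INSTANCES = {"secretariat": "horse", "alice": "human"}


def supertypes(type_name):
    """Return all ancestors of a type, nearest first."""
    chain = []
    while type_name in SUBTYPE_OF:
        type_name = SUBTYPE_OF[type_name]
        chain.append(type_name)
    return chain


def instances_of(type_name):
    """Instances whose declared type is `type_name` or one of its subtypes."""
    return sorted(
        name for name, declared in INSTANCES.items()
        if declared == type_name or type_name in supertypes(declared)
    )


print(supertypes("human"))     # ['mammal', 'animal']
print(instances_of("animal"))  # ['alice', 'secretariat']
print(instances_of("mammal"))  # ['alice']
```

The dictionary allows only one parent per type, which mirrors single inheritance; an RDFS class hierarchy would need a set of parents per class instead.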

Multiple Inheritance

One important modelling difference between Grakn and the Semantic Web is with regard to multiple inheritance. In RDFS, a class can have as many superclasses as are named or logically inferred. Let’s take this example:

company      rdf:type  rdfs:Class
government   rdf:type  rdfs:Class
employer     rdf:type         rdfs:Class
employer     rdfs:subClassOf  company
employer     rdfs:subClassOf  government



This models an employer as a subclass of both company and government. Although this may look correct, multiple inheritance, as a modelling concept, is often not used in the right way. Multiple inheritance should group things, and not subclass "types", where each type is a definition of something else. In other words, we don't want to represent instances of data. This is a common mistake.

Instead of multiple inheritance, Grakn supports single type inheritance, where we should assign roles instead of multiple classes. A role defines the behaviour and aspect of a thing in the context of a relation, and we can assign multiple roles to a thing (note that roles are inherited when one type subtypes another).

For example, a government can employ a person, and a company can employ a person. One might then suggest creating a class that inherits from both government and company and can employ a person, ending up with an employer class that subclasses both (as shown in the example above).

However, this is an abuse of inheritance. In this case, we should create a role employer, which relates to an employment relation and contextualises how a company or government is involved in that relation (by playing the role of employer).

define
company sub entity,
    plays employer;
government sub entity,
    plays employer;
employment sub relation,
    relates employer;
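
The role-based alternative can also be sketched in Python. The snippet below is a rough illustration of the idea only: it checks role assignments against a small schema table before building a relation, and it says nothing about how Grakn validates data internally. The employer role and the company, government, and employment types come from the schema above; the person type, the employee role, the instance names, and the make_relation helper are assumptions added for the example.

```python
# Which roles each type may play, and which roles each relation relates.
# "person" and "employee" are assumptions added for this example.
PLAYS = {
    "company": {"employer"},
    "government": {"employer"},
    "person": {"employee"},
}
RELATES = {"employment": {"employer", "employee"}}

# Hypothetical instances and their types.
TYPES = {"acme": "company", "ministry": "government", "bob": "person"}


def make_relation(relation_type, role_players, types):
    """Build a relation instance after checking each role against the schema."""
    for role, player in role_players.items():
        if role not in RELATES[relation_type]:
            raise ValueError(f"{relation_type} does not relate role {role!r}")
        if role not in PLAYS.get(types[player], set()):
            raise ValueError(f"{types[player]} cannot play role {role!r}")
    return (relation_type, tuple(sorted(role_players.items())))


# A company and a government can both be employers without sharing a superclass.
job1 = make_relation("employment", {"employer": "acme", "employee": "bob"}, TYPES)
job2 = make_relation("employment", {"employer": "ministry", "employee": "bob"}, TYPES)
print(job1)  # ('employment', (('employee', 'bob'), ('employer', 'acme')))
```

Being an employer is decided per relation here, so no employer class is needed and neither company nor government has to change its place in the type hierarchy.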



rdfs:domain and rdfs:range

Two commonly used instances of rdf:Property are rdfs:domain and rdfs:range. They state that, respectively, the subjects or the values of a property are instances of one or more classes. Below is an example of rdfs:domain:

:publishedOn rdfs:domain :PublishedBook



Here, rdfs:domain assigns the class PublishedBook to the subject of the publishedOn property.

This is an example of rdfs:range:

:hasBrother rdfs:range :Male



Here, rdfs:range assigns the class Male to the object of the hasBrother property.

In Grakn, there is no direct implementation of range and domain. The basic inferences drawn from them are either already natively represented in the Grakn data model through the use of roles, or can be expressed by creating rules for the logic we want to infer.

However, bear in mind that using rules in Grakn gives more expressivity in allowing us to represent the type of inferences we want to make. In short, translating range and domain to Grakn should be done on a case by case basis.

In the example above, rdfs:domain can be translated to Grakn by saying that when an entity has an attribute of type published-date, it plays the role of published-book in a publishing relation type. This is represented in a Grakn rule:

when {
    $b has published-date $pd; 
}, then {
    (published-book: $b) isa publishing; 
};



The example of rdfs:range can be expressed with the following Grakn rule, which adds the attribute type gender with the value "male" to a person only if that person plays the role brother in a siblingship relation, regardless of how many other siblings take part in it.

when {
    $r (brother: $p) isa siblingship; 
}, then {
    $p has gender "male";  
};



Let’s also look at another example. In a maritime setting, if we have a vessel of class DepartingVessel, which has the Property nextDeparture specified, we could state:

ship:Vessel rdf:type rdfs:Class .
ship:DepartingVessel rdf:type rdfs:Class .
ship:nextDeparture rdf:type rdf:Property .
ship:QEII a ship:Vessel .
ship:QEII ship:nextDeparture "Mar 4, 2010" .



With the following rdfs:domain statement, any vessel for which nextDeparture is specified will be inferred to be a member of the DepartingVessel class. In this example, this means QEII is assigned the DepartingVessel class.

ship:nextDeparture rdfs:domain ship:DepartingVessel .
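
The inference that rdfs:domain licenses can be written out as a short Python sketch. It applies only the domain entailment pattern (if a property has a domain class, every subject using that property is an instance of that class) to an in-memory set of triples. The apply_domain helper and the use of plain strings for terms are assumptions made for illustration, and a real RDFS reasoner covers many more entailment patterns.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# The vessel data and the domain statement from the example, as plain tuples.
triples = {
    ("ship:QEII", RDF_TYPE, "ship:Vessel"),
    ("ship:QEII", "ship:nextDeparture", "Mar 4, 2010"),
    ("ship:nextDeparture", RDFS_DOMAIN, "ship:DepartingVessel"),
}


def apply_domain(triples):
    """Return the triples plus the rdf:type triples entailed by rdfs:domain."""
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = {
        (s, RDF_TYPE, cls)
        for prop, cls in domains
        for s, p, _ in triples
        if p == prop
    }
    return triples | inferred


closure = apply_domain(triples)
print(("ship:QEII", RDF_TYPE, "ship:DepartingVessel") in closure)  # True
```

The original triples are left untouched; the inferred type is simply added alongside them, which is why QEII ends up as both a Vessel and a DepartingVessel.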



To do the same in Grakn, we can write a rule that finds all entities with an attribute next-departure and assigns them to a relation departure playing the role of departing-vessel.

when {
    $s has next-departure $nd; 
}, then {
    (departing-vessel: $s) isa departure; 
};



Then, if this data is ingested:

insert
$s isa vessel, has name "QEII", has next-departure "Mar 4, 2010";



Grakn infers that the vessel QEII plays the role of departing-vessel in a departure relation, the equivalent in this case of the DepartingVessel class.

The use of rdfs:domain and rdfs:range is helpful in the context of the web, where federated data can often be found to be incomplete. As Grakn doesn't live on the web, the need for these concepts is reduced. Further, most of this inferred data is already natively represented in Grakn's conceptual model. A lot of this is due to its higher-level model and the usage of rules. Therefore, directly mapping rdfs:range and rdfs:domain to a concept in Grakn is usually naive and leads to redundancies. Instead, translating these concepts into Grakn should be done on a case-by-case basis using rules and roles.

In the final part, Part 3, we look at how Grakn compares to OWL and SHACL. To learn more, make sure to attend our upcoming webinars.

Published at DZone with permission of Tomas Sabat, DZone MVB.
