Comparing Grakn to Semantic Web Technologies — Part 2/3
In part 2, take a look at SPARQL and RDFS and explore inserting and querying data with SPARQL, look at the RDF schema, and more.
This is part two of Comparing Semantic Web Technologies to Grakn. In the first part, we looked at how RDF compares to Grakn. In this part, we look specifically at SPARQL and RDFS.
SPARQL
What Is SPARQL?
SPARQL is a W3C-standardised language for querying information from databases that can be mapped to RDF. Similar to SQL, SPARQL allows us to insert and query data. Unlike SQL, queries aren’t constrained to just one database and can be federated across multiple HTTP endpoints.
Graql is the equivalent query language in Grakn. As with SPARQL, Graql allows us to insert and query data. However, given that Graql is not built as an open Web language, it doesn’t natively support querying across multiple endpoints (although this can be done with one of Grakn’s client drivers). In this respect, Graql is closer to SQL and the query languages of other traditional database management systems.
Inserting Data With SPARQL
The following snippet inserts two RDF triples into the default graph store with SPARQL:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
INSERT DATA
{
  <http://example/book1> dc:title "A new book" ;
                         dc:creator "A.N.Other" .
}
In Graql, we begin with the insert statement to declare that data is to be inserted. The variable $b is assigned to an entity of type book, which has a title with value "A new book" and a creator "A.N.Other".
insert
$b isa book, has title "A new book", has creator "A.N.Other";
Querying With SPARQL
In SPARQL, we first declare the namespaces of the vocabularies we want to use and bind each one to a PREFIX. The actual query starts with SELECT, followed by the data we want returned. Then, in the WHERE clause, we state the graph pattern that SPARQL will match against the data. In this query, we look for all the persons that "Adam Smith" knows, using the namespaces foaf and vCard:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?whom
WHERE {
  ?person rdf:type foaf:Person .
  ?person vCard:family-name "Smith" .
  ?person vCard:given-name "Adam" .
  ?person foaf:knows ?whom .
}
In Graql, we begin with the match statement to declare that we want to retrieve data. We match an entity of type person who has a family-name "Smith" and a given-name "Adam". Then, we connect it through a knows relation type to $p2. As we want to know who "Adam Smith" knows, we want $p2 to be returned, which is declared in the get statement:
match $p isa person, has family-name "Smith", has given-name "Adam";
($p, $p2) isa knows;
get $p2;
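To make the idea of matching a graph pattern more concrete, here is a minimal sketch in Python rather than SPARQL or Graql. It assumes a tiny in-memory list of triples; the data, the identifiers such as adam and beth, and the match helper are all hypothetical, and this is not how a SPARQL engine or Grakn is actually implemented.

```python
# Minimal sketch of basic graph pattern matching over in-memory triples.
# All data and helper names below are hypothetical illustrations.

TRIPLES = [
    ("adam", "rdf:type", "foaf:Person"),
    ("adam", "vCard:family-name", "Smith"),
    ("adam", "vCard:given-name", "Adam"),
    ("adam", "foaf:knows", "beth"),
    ("adam", "foaf:knows", "carl"),
    ("dave", "rdf:type", "foaf:Person"),
    ("dave", "vCard:family-name", "Smith"),
    ("dave", "vCard:given-name", "Dave"),
    ("dave", "foaf:knows", "erin"),
]


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# The same pattern as the "Adam Smith" query, written as triple patterns.
QUERY = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "vCard:family-name", "Smith"),
    ("?person", "vCard:given-name", "Adam"),
    ("?person", "foaf:knows", "?whom"),
]

known = sorted(b["?whom"] for b in match(QUERY, TRIPLES))
print(known)  # ['beth', 'carl']
```

Only the bindings for ?whom are kept at the end, which plays the part of SELECT ?whom in SPARQL or get $p2 in Graql.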
Let’s look at a different query: give me the director and the movies that James Dean played in, where a woman also played a role, and where that woman played in a movie directed by John Ford. Below is the SPARQL code and the visual representation of this traversal-type query.
PREFIX movie: <http://example.com/moviedb/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?director ?movie
WHERE {
  ?actor rdf:type foaf:Man ;
         movie:name "James Dean" ;
         movie:playedIn ?movie .
  # Bind ?director, which the SELECT clause projects.
  ?movie movie:directedBy ?director .
  ?actress movie:playedIn ?movie ;
           rdf:type foaf:Woman ;
           movie:playedIn ?anotherMovie .
  ?JohnFord rdf:type foaf:Man ;
            movie:name "John Ford" .
  ?anotherMovie movie:directedBy ?JohnFord .
}
In Grakn, we can ask the same question like this:
match
$p isa man, has name "James Dean";
$w isa woman;
(actor: $p, actress: $w, casted-movie: $m) isa casting;
(actress: $w, casted-movie: $m2) isa casting;
$d isa man, has name "John Ford";
($m2, $d) isa directorship;
get $d, $m;
Here, we assign an entity of type man with attribute name and value "James Dean" to the variable $p. We then say that $w is of entity type woman. These two are connected with a movie in a three-way relation called casting. The woman also plays a role in another casting relation, where the movie entity is connected to "John Ford", who relates to this movie through a directorship relation.
In the example above, the hyper-relation casting in Grakn represents the two playedIn properties in SPARQL. However, in SPARQL we can only have two edges connecting the woman and "James Dean" to the movie, but no edge between the two of them. This shows how fundamentally different modelling in Grakn is from RDF, given its ability to model hypergraphs. Grakn can natively represent any number of role players in one relation without having to reify the model.
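As a rough illustration of what reifying an n-ary relation means, the Python sketch below holds one ternary casting fact as a single record and then flattens it into binary triples that hang off an intermediate node. The record layout, the reify helper, the blank-node label _:casting1 and the sample identifiers are all assumptions made for this example.

```python
# Hypothetical sketch: one ternary "casting" fact, first as a single n-ary
# record (hypergraph style), then reified into binary triples (RDF style).

casting = {
    "type": "casting",
    "roles": {
        "actor": "james_dean",
        "actress": "elizabeth_taylor",
        "casted-movie": "giant",
    },
}


def reify(relation, node_id):
    """Turn an n-ary relation into triples attached to an intermediate node."""
    triples = [(node_id, "rdf:type", relation["type"])]
    for role, player in relation["roles"].items():
        triples.append((node_id, role, player))
    return triples


triples = reify(casting, "_:casting1")
for triple in triples:
    print(triple)
```

The single record becomes four triples, and every query over the relation then has to go through the extra node, which is the overhead a native n-ary relation avoids.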
Schematically, this is how the query above is represented visually (note the ternary relation casting):
Negation
In SPARQL, we can also specify in our query that certain data isn’t there, using the keyword NOT EXISTS inside a FILTER. A solution is kept only if the enclosed graph pattern does not match. In the example below, we look for actors who played in the movie Giant but have not yet passed away:
PREFIX movie: <http://example.com/moviedb/0.1/>
SELECT ?actor
WHERE {
  ?actor movie:playedIn movie:Giant .
  FILTER NOT EXISTS { ?actor movie:diedOn ?deathdate . }
}
Grakn supports negation under the closed world assumption. This is done using the keyword not followed by the pattern to be negated. The example above is represented like this:
match
$m isa movie, has name "Giant";
($a, $m) isa played-in;
not {$a has death-date $dd;};
get $a;
Here, we’re looking for an entity of type movie with name "Giant", which is connected to $a, an actor, through a relation of type played-in. In the not sub-query, we specify that $a must not have an attribute of type death-date with any value. We then get the actor $a.
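The closed-world reading of negation can be sketched in a few lines of Python: a fact that is absent from the data is treated as false. The triples, the placeholder actor names and the two helper functions are hypothetical and only illustrate the idea.

```python
# Hypothetical data; a missing "diedOn" triple is read as "has not died"
# (closed-world assumption).
TRIPLES = {
    ("actor_a", "playedIn", "Giant"),
    ("actor_a", "diedOn", "1955-09-30"),
    ("actor_b", "playedIn", "Giant"),
    ("actor_c", "playedIn", "OtherMovie"),
}


def played_in(movie, triples):
    """Return every subject that has a playedIn edge to the given movie."""
    return {s for s, p, o in triples if p == "playedIn" and o == movie}


def has_property(subject, prop, triples):
    """True if the subject has at least one value for the property."""
    return any(s == subject and p == prop for s, p, _ in triples)


# Positive pattern first, then drop every actor matching the negated pattern.
alive = sorted(
    actor
    for actor in played_in("Giant", TRIPLES)
    if not has_property(actor, "diedOn", TRIPLES)
)
print(alive)  # ['actor_b']
```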
RDF Schema
As RDF is just a data exchange model, on its own it’s “schemaless”. That’s why RDF Schema (RDFS) was introduced to extend RDF with basic ontological semantics. These allow, for example, simple type hierarchies over RDF data. In Grakn, Graql also serves as the schema language.
RDFS Classes
RDFS extends the RDF vocabulary and allows us to describe taxonomies of classes and properties. An RDFS class declares an RDFS resource as a class for other resources. We can abbreviate this using rdfs:Class. Using XML, creating a class animal with a sub-class horse would look like this:
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xml:base="http://www.animals.fake/animals#">
  <rdfs:Class rdf:ID="animal" />
  <rdfs:Class rdf:ID="horse">
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdfs:Class>
</rdf:RDF>
To do the same in Grakn, we would write this:
define
animal sub entity;
horse sub animal;
RDFS also allows for sub-typing of properties:
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xml:base="http://www.animals.fake/animals#">
  <rdfs:Class rdf:ID="mammal" />
  <rdfs:Class rdf:ID="human">
    <rdfs:subClassOf rdf:resource="#mammal"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="employment" />
  <rdf:Property rdf:ID="part-time-employment">
    <rdfs:subPropertyOf rdf:resource="#employment"/>
  </rdf:Property>
</rdf:RDF>
Which in Grakn would look like this:
define
mammal sub entity;
human sub mammal;
employment sub relation;
part-time-employment sub employment;
As the examples show, RDFS mainly describes constructs for types of objects (Classes), which can inherit from one another (subClasses), and properties that describe objects (Properties), which can inherit from one another (subProperty) as well. This sub-typing behaviour can be obtained with Graql's sub keyword, which can be used to create type hierarchies of any thing (entities, relations, and attributes) in Grakn.
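The effect of a single-parent type hierarchy can be sketched in Python as a lookup table from each type to its direct supertype. The PARENT table mirrors the sub statements used in the examples of this section; the helper names supertypes and is_subtype are hypothetical and are not part of Graql.

```python
# Each type has exactly one direct supertype, as with Graql's `sub` keyword.
PARENT = {
    "animal": "entity",
    "horse": "animal",
    "mammal": "entity",
    "human": "mammal",
    "employment": "relation",
    "part-time-employment": "employment",
}


def supertypes(type_name):
    """Walk up the hierarchy and return all ancestors, nearest first."""
    chain = []
    while type_name in PARENT:
        type_name = PARENT[type_name]
        chain.append(type_name)
    return chain


def is_subtype(child, ancestor):
    """A type counts as a subtype of itself and of all its ancestors."""
    return child == ancestor or ancestor in supertypes(child)


print(supertypes("horse"))                               # ['animal', 'entity']
print(is_subtype("part-time-employment", "employment"))  # True
print(is_subtype("horse", "mammal"))                     # False
```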
However, a one-to-one mapping from a class to an entity in Grakn, or from a property to a relation in Grakn, should not always be made, despite their seeming similarities. This is because RDF is built on a lower-level data model that works in triples, while Grakn enables us to model at a higher level.
Multiple Inheritance
One important modelling difference between Grakn and the Semantic Web is with regard to multiple inheritance. In RDFS, a class can have as many superclasses as are named or logically inferred. Let’s take this example:
company rdf:type rdfs:Class
government rdf:type rdfs:Class
employer rdf:type rdfs:Class
employer rdfs:subClassOf company
employer rdfs:subClassOf government
This models an employer as being both of class company and of class government. However, although this may look correct, the problem is that multiple inheritance, as a modelling concept, is often not used in the right way. Multiple inheritance should group things, and not subclass "types", where each type is a definition of something else. In other words, we don't want to represent instances of data. This is a common mistake.
Instead of multiple inheritance, Grakn supports single type inheritance, where we should assign roles instead of multiple classes. A role defines the behaviour and aspect of a thing in the context of a relation, and we can assign multiple roles to a thing (note that roles are inherited when one type subclasses another).
For example, a government can employ a person, and a company can employ a person. One might then suggest creating a class that inherits from both government and company and can employ a person, ending up with an employer class that subclasses both (as shown in the example above).
However, this is an abuse of inheritance. In this case, we should create a role employer, which relates to an employment relation and contextualises how a company or government is involved in that relation (by playing the role of employer).
define
company sub entity,
  plays employer;
government sub entity,
  plays employer;
employment sub relation,
  relates employer;
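The difference between sharing a role and sharing a superclass can be sketched in Python. The EntityType class, the can_play helper and the startup subtype are hypothetical names introduced for this illustration and do not correspond to any Grakn API; the sketch only assumes single inheritance plus inherited roles, as described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntityType:
    """A type with at most one parent and a set of roles it may play."""
    name: str
    parent: Optional["EntityType"] = None  # single inheritance only
    own_roles: set = field(default_factory=set)

    def roles(self):
        """Roles declared on this type plus those inherited from its parent."""
        inherited = self.parent.roles() if self.parent else set()
        return inherited | self.own_roles


def can_play(entity_type, role):
    return role in entity_type.roles()


company = EntityType("company", own_roles={"employer"})
government = EntityType("government", own_roles={"employer"})
startup = EntityType("startup", parent=company)  # hypothetical subtype
person = EntityType("person")

employers = [
    t.name
    for t in (company, government, startup, person)
    if can_play(t, "employer")
]
print(employers)  # ['company', 'government', 'startup']
```

Here company and government stay unrelated in the type hierarchy, yet both can take part in an employment relation because they share the employer role.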
rdfs:domain and rdfs:range
Two commonly used instances of rdf:Property are domain and range. These are used to state that, respectively, the subjects or the values of a property are instances of one or more classes. Below is an example of rdfs:domain:
:publishedOn rdfs:domain :PublishedBook
Here, rdfs:domain assigns the class PublishedBook to the subject of the publishedOn property.
This is an example of rdfs:range:
:hasBrother rdfs:range :Male
Here, rdfs:range assigns the class Male to the object of the hasBrother property.
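The inferences that rdfs:domain and rdfs:range license can be sketched in Python as a single pass over the data. This is a simplified illustration of the RDFS entailment rules usually labelled rdfs2 (domain) and rdfs3 (range); the sample triples, the identifiers :book1, :alice and :bob, and the infer_types helper are hypothetical.

```python
# Schema statements taken from the two examples above.
DOMAIN = {":publishedOn": ":PublishedBook"}
RANGE = {":hasBrother": ":Male"}


def infer_types(triples):
    """Derive rdf:type triples from domain and range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN:
            # rdfs2: the subject of the property belongs to its domain class.
            inferred.add((s, "rdf:type", DOMAIN[p]))
        if p in RANGE:
            # rdfs3: the object of the property belongs to its range class.
            inferred.add((o, "rdf:type", RANGE[p]))
    return inferred


# Hypothetical instance data.
data = [
    (":book1", ":publishedOn", "2010-03-04"),
    (":alice", ":hasBrother", ":bob"),
]

result = infer_types(data)
for triple in sorted(result):
    print(triple)
```

Note that nothing is rejected here: domain and range add type information to the data rather than validate it.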
In Grakn, there is no direct implementation of range and domain. The basic inferences drawn from them are either already natively represented in the Grakn data model through the use of roles, or we can create rules to represent the logic we want to infer.
However, bear in mind that rules in Grakn are more expressive, allowing us to represent exactly the type of inferences we want to make. In short, translating range and domain to Grakn should be done on a case-by-case basis.
In the example above, rdfs:domain can be translated to Grakn by saying that when an entity has an attribute of type published-date, it plays the role of published-book in a publishing relation type. This is represented in a Grakn rule:
when {
$b has published-date $pd;
}, then {
(published-book: $b) isa publishing;
};
The example of rdfs:range can be created with the following Grakn rule, which adds the attribute type gender with the value "male" only if a person plays the role brother in any siblingship relation, whatever the number N of other siblings.
when {
$r (brother: $p) isa siblingship;
}, then {
$p has gender "male";
};
Let’s also look at another example. In a maritime setting, if we have a vessel of class DepartingVessel, which has the property nextDeparture specified, we could state:
ship:Vessel rdf:type rdfs:Class .
ship:DepartingVessel rdf:type rdfs:Class .
ship:nextDeparture rdf:type rdf:Property .
ship:QEII a ship:Vessel .
ship:QEII ship:nextDeparture "Mar 4, 2010" .
With the following rdfs:domain statement, any vessel for which nextDeparture is specified will be inferred to be a member of the DepartingVessel class. In this example, this means QEII is assigned the DepartingVessel class.
ship:nextDeparture rdfs:domain ship:DepartingVessel .
To do the same in Grakn, we can write a rule that finds all entities with an attribute next-departure and assigns them to a relation departure, playing the role of departing-vessel.
when {
$s has next-departure $nd;
}, then {
(departing-vessel: $s) isa departure;
};
Then, if this data is ingested:
insert
$s isa vessel, has name "QEII", has next-departure "Mar 4, 2010";
Grakn infers that the vessel QEII plays the role of departing-vessel in a departure relation, the equivalent in this case of the DepartingVessel class.
The use of rdfs:domain and rdfs:range is helpful in the context of the web, where federated data is often incomplete. As Grakn doesn't live on the web, the need for these concepts is reduced. Further, most of this inferred data is already natively represented in Grakn's conceptual model, largely thanks to its higher-level model and its use of rules. Therefore, directly mapping rdfs:range and rdfs:domain to a concept in Grakn is usually naive and leads to redundancies. Instead, translating these concepts into Grakn should be done on a case-by-case basis using rules and roles.
In the final part, Part 3, we look at how Grakn compares to OWL and SHACL. To learn more, make sure to attend our upcoming webinars.
Published at DZone with permission of Tomás Sabat, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.