Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

SPARQL and dbpedia: Getting Structured Data from Wikipedia

DZone's Guide to

SPARQL and dbpedia: Getting Structured Data from Wikipedia

· Big Data Zone ·
Free Resource

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

I always wondered if you could extract structured data from Wikipedia. Then I stumbled upon DBPedia and SPARQLDBPedia stores Wikipedia data as a dataset, and it can be accessed using SPARQL. Let me demonstrate this with an example.

DBPedia has a SPARQL endpoint. And you can use SNORQL for exploring DBPedia. Let us execute the below SPARQL query in SNORQL and notice the resultset that is returned:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?film_title ?star_name
where {?film_title rdf:type <http://dbpedia.org/ontology/Film> .
?film_title  foaf:name ?film_name .
?film_title rdfs:comment ?film_abstract .
?film_title dbpedia-owl:starring ?star .
?star dbpprop:name ?star_name
}
LIMIT 5

I get the results as below:

SPARQL results from DBPedia

SPARQL results from DBPedia

As good place to learn SPARQL is http://answers.semanticweb.com/ 

Hortonworks Community Connection (HCC) is an online collaboration destination for developers, DevOps, customers and partners to get answers to questions, collaborate on technical articles and share code examples from GitHub.  Join the discussion.

Topics:

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}