SPARQL and dbpedia: Getting Structured Data from Wikipedia

I always wondered whether structured data could be extracted from Wikipedia. Then I stumbled upon DBPedia and SPARQL. DBPedia extracts structured information from Wikipedia and publishes it as an RDF dataset, which can be queried using SPARQL. Let me demonstrate this with an example.

DBPedia has a public SPARQL endpoint, and you can use SNORQL to explore the data interactively. Let us execute the SPARQL query below in SNORQL and look at the result set that is returned:

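PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# dbpedia-owl: and dbpprop: are declared explicitly so the query also runs outside SNORQL
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

# Films that have a name, a comment, and at least one star with a name
SELECT DISTINCT ?film_title ?star_name
WHERE {
  ?film_title rdf:type <http://dbpedia.org/ontology/Film> .
  ?film_title foaf:name ?film_name .
  ?film_title rdfs:comment ?film_abstract .
  ?film_title dbpedia-owl:starring ?star .
  ?star dbpprop:name ?star_name
}
LIMIT 5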

I get the following results:

SPARQL results from DBPedia
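
The same query can also be sent to the endpoint over HTTP instead of through SNORQL. Below is a minimal sketch in Python using only the standard library. It is not part of the original walkthrough: the endpoint URL https://dbpedia.org/sparql and the helper names build_request, rows_from_results and run_query are assumptions made for illustration, and the dbpedia-owl: and dbpprop: prefixes are declared explicitly because nothing outside SNORQL is assumed to predefine them. The parsing step relies on the standard SPARQL JSON results layout (head.vars and results.bindings).

```python
import json
import urllib.parse
import urllib.request

# Assumed public DBPedia endpoint; the article itself does not give the URL.
ENDPOINT = "https://dbpedia.org/sparql"

# The article's query, with the dbpedia-owl: and dbpprop: prefixes spelled out.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

SELECT DISTINCT ?film_title ?star_name
WHERE {
  ?film_title rdf:type <http://dbpedia.org/ontology/Film> .
  ?film_title foaf:name ?film_name .
  ?film_title rdfs:comment ?film_abstract .
  ?film_title dbpedia-owl:starring ?star .
  ?star dbpprop:name ?star_name
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # A variable can be missing from a binding, so only copy what is there.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


def run_query(query, endpoint=ENDPOINT):
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        return rows_from_results(json.load(resp))
```

Calling run_query(QUERY) performs the network request and returns one dict per result row, keyed by film_title and star_name. The rows that come back depend on the current state of the DBPedia dataset, so they may not match the results shown above.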

A good place to learn SPARQL is http://answers.semanticweb.com/

Published at DZone with permission of Krishna Prasad, DZone MVB. See the original article here.
