Get Real Data from the Semantic Web


Semantic Web this, Semantic Web that. What actual use is the Semantic Web in the real world? I mean, how can you actually use it?

If you haven't heard the term "Semantic Web" over the last couple of years, then you must have been in... well, somewhere without this interweb they're all talking about.

Basically, by using metadata (see RDF), disparate bits of data floating around the web can be joined up. In other words, they stop being disparate. Better than that, theoretically you can query the connections between the data and get lots of lovely information back. This last bit is done via SPARQL, and yes, the QL does stand for Query Language.
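
To make the joining-up idea concrete, here is a toy sketch in plain Python rather than real RDF tooling. Facts are (subject, predicate, object) triples, and two sources link up simply because they use the same identifier. The `ex:` identifiers and the `match` and `works_in` helpers are made up for illustration.

```python
# Toy model of RDF-style triples; every identifier here is hypothetical.
source_a = [
    ("ex:Alice", "ex:worksFor", "ex:Acme"),
]
source_b = [
    ("ex:Acme", "ex:locatedIn", "ex:Leeds"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Merging the two sources is plain concatenation: both refer to "ex:Acme".
graph = source_a + source_b


def works_in(person):
    """Follow worksFor, then locatedIn: a two-step join across both sources."""
    places = []
    for _, _, employer in match(graph, s=person, p="ex:worksFor"):
        for _, _, place in match(graph, s=employer, p="ex:locatedIn"):
            places.append(place)
    return places


print(works_in("ex:Alice"))
```

A SPARQL query expresses the same kind of pattern declaratively, with variables standing in for the `None` wildcards.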

I say theoretically because in reality it's a bit of a pain. I may be an intelligent agent capable of finding linked bits of data through the web, but how exactly would you do that in Python?

It is possible to use rdflib to find information, but it's very long-winded. It's much easier to use SPARQLWrapper, and in fact in the simple example below I've used a SPARQLWrapperWrapper to make asking for lots of similarly sourced data, in this case from DBpedia, even easier.

from SPARQLWrapper import SPARQLWrapper, JSON


class SparqlEndpoint(object):
    """Thin wrapper that prepends common PREFIX declarations to each query."""

    def __init__(self, endpoint, prefixes=None):
        self.sparql = SPARQLWrapper(endpoint)
        # Ask for JSON so that convert() returns a plain dictionary.
        self.sparql.setReturnFormat(JSON)
        self.prefixes = {
            "dbpedia-owl": "http://dbpedia.org/ontology/",
            "owl": "http://www.w3.org/2002/07/owl#",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
            "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "foaf": "http://xmlns.com/foaf/0.1/",
            "dc": "http://purl.org/dc/elements/1.1/",
            "dbpedia2": "http://dbpedia.org/property/",
            "dbpedia": "http://dbpedia.org/",
            "skos": "http://www.w3.org/2004/02/skos/core#",
        }
        # Caller-supplied prefixes extend or override the defaults.
        self.prefixes.update(prefixes or {})

    def query(self, q):
        # Put one PREFIX line per namespace in front of the query body.
        lines = ["PREFIX %s: <%s>" % (k, r) for k, r in self.prefixes.items()]
        lines.append(q)
        query = "\n".join(lines)
        print(query)  # show the full query that is about to be sent
        self.sparql.setQuery(query)
        results = self.sparql.query().convert()
        return results["results"]["bindings"]


class DBpediaEndpoint(SparqlEndpoint):
    def __init__(self, prefixes=None):
        endpoint = "http://dbpedia.org/sparql"
        super(DBpediaEndpoint, self).__init__(endpoint, prefixes)
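
The prefix handling is the part of the wrapper worth seeing on its own. Below is a minimal sketch, independent of SPARQLWrapper, of how PREFIX declarations can be prepended to a query body. `build_query` is a hypothetical helper name, not part of any library, and the sorting is only there to make the output predictable.

```python
def build_query(prefixes, body):
    """Prepend one PREFIX line per namespace to a SPARQL query body."""
    # sorted() keeps the header order stable regardless of dict ordering.
    lines = [
        "PREFIX %s: <%s>" % (name, uri)
        for name, uri in sorted(prefixes.items())
    ]
    lines.append(body.strip())
    return "\n".join(lines)


prefixes = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}
full_query = build_query(
    prefixes,
    "SELECT ?o WHERE { ?s dbpedia-owl:abstract ?o } LIMIT 1",
)
print(full_query)
```

Declaring a prefix that a query never uses is harmless, which is why a wrapper can send the whole set every time.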

To use the wrapper, save it as sparql.py, then import DBpediaEndpoint and feed it some SPARQL:

#!/usr/bin/env python
import sys

from sparql import DBpediaEndpoint


def main():
    s = DBpediaEndpoint()
    resource_uri = "http://dbpedia.org/resource/Foobar"
    # Fetch the English-language abstract for the resource.
    results = s.query("""
        SELECT ?o
        WHERE { <%s> dbpedia-owl:abstract ?o .
        FILTER(langMatches(lang(?o), "EN")) }
    """ % resource_uri)
    abstract = results[0]["o"]["value"]
    print(abstract)


if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:  # Ctrl-C
        raise
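
The `results[0]["o"]["value"]` lookup follows the shape of the SPARQL JSON results format, in which each binding maps a variable name to a dictionary carrying a `type` and a `value`. Here is a small sketch using a hand-written response. The abstract text is a placeholder, not real DBpedia output, and `first_value` is a hypothetical helper that also covers the case where nothing comes back.

```python
# Hand-written stand-in for a converted JSON response (placeholder data).
sample = {
    "head": {"vars": ["o"]},
    "results": {
        "bindings": [
            {
                "o": {
                    "type": "literal",
                    "xml:lang": "en",
                    "value": "Placeholder abstract text.",
                }
            }
        ]
    },
}


def first_value(response, var, default=None):
    """Return the value bound to `var` in the first row, or `default`."""
    bindings = response["results"]["bindings"]
    if not bindings:
        return default
    return bindings[0][var]["value"]


print(first_value(sample, "o"))

# An empty result set would make a bare results[0] raise IndexError.
empty = {"head": {"vars": ["o"]}, "results": {"bindings": []}}
print(first_value(empty, "o", default="(no abstract found)"))
```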

Your homework: how do you identify the resource_uri in the first place?

That's for another evening.


Published at DZone with permission of Col Wilson, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
