
Get Real Data from the Semantic Web


Semantic Web this, Semantic Web that. What actual use is the Semantic Web in the real world? I mean, how can you actually use it?

If you haven't heard the term "Semantic Web" over the last couple of years, then you must have been in... well, somewhere without this interweb they're all talking about.

Basically, by using metadata (see RDF), disparate bits of data floating around the web can be joined up. In other words, they stop being disparate. Better than that, theoretically you can query the connections between the data and get lots of lovely information back. This last bit is done via SPARQL, and yes, the QL does stand for Query Language.
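To make the "joined up" idea a little more concrete, here is a toy sketch in plain Python. It is not RDF or SPARQL: tuples stand in for triples, a hypothetical match() helper stands in for a single query pattern, and every URI and value is made up for the illustration.

```python
# Toy illustration only: plain tuples stand in for RDF triples, and
# match() stands in for a single SPARQL triple pattern.
# All URIs and values below are invented for the example.

# Two "sources" that describe the same resource via a shared URI.
source_a = [
    ("http://example.org/resource/Foobar", "http://example.org/label", "Foobar"),
]
source_b = [
    ("http://example.org/resource/Foobar", "http://example.org/abstract",
     "A placeholder name."),
    ("http://example.org/resource/Baz", "http://example.org/abstract",
     "Another placeholder."),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Joining up" the data is just pooling the triples: the shared URI links them.
graph = source_a + source_b

# Step 1: find the subject whose label is "Foobar" (known only to source_a).
subject = match(graph, predicate="http://example.org/label", obj="Foobar")[0][0]
# Step 2: follow that subject to its abstract (known only to source_b).
abstract = match(graph, subject=subject,
                 predicate="http://example.org/abstract")[0][2]
print(abstract)
```

The point is only that a shared identifier lets a query hop from one source's facts to another's; real RDF stores and SPARQL engines do this at web scale and with far richer patterns.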

I say theoretically because in reality it's a bit of a pain. I may be an intelligent agent capable of finding linked bits of data through the web, but how exactly would you do that in Python?

It is possible to use rdflib to find information, but it's very long-winded. It's much easier to use SPARQLWrapper and, in fact, in the simple example below I've used a SPARQLWrapperWrapper to make asking for lots of similarly sourced data, in this case from DBpedia, even easier.

from SPARQLWrapper import SPARQLWrapper, JSON


class SparqlEndpoint(object):
    def __init__(self, endpoint, prefixes=None):
        self.sparql = SPARQLWrapper(endpoint)
        # Ask for JSON so that convert() hands back a plain dict.
        self.sparql.setReturnFormat(JSON)
        self.prefixes = {
            "dbpedia-owl": "http://dbpedia.org/ontology/",
            "owl": "http://www.w3.org/2002/07/owl#",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
            "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "foaf": "http://xmlns.com/foaf/0.1/",
            "dc": "http://purl.org/dc/elements/1.1/",
            "dbpedia2": "http://dbpedia.org/property/",
            "dbpedia": "http://dbpedia.org/",
            "skos": "http://www.w3.org/2004/02/skos/core#",
        }
        # Caller-supplied prefixes extend or override the defaults.
        self.prefixes.update(prefixes or {})

    def query(self, q):
        # Prepend a PREFIX declaration for every known namespace.
        lines = ["PREFIX %s: <%s>" % (k, r) for k, r in self.prefixes.items()]
        lines.append(q)
        query = "\n".join(lines)
        print(query)  # echo the full query, handy when debugging
        self.sparql.setQuery(query)
        results = self.sparql.query().convert()
        return results["results"]["bindings"]


class DBpediaEndpoint(SparqlEndpoint):
    def __init__(self, prefixes=None):
        endpoint = "http://dbpedia.org/sparql"
        super(DBpediaEndpoint, self).__init__(endpoint, prefixes)
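The only real trick in that class is gluing a block of PREFIX declarations onto the front of whatever query you hand it. Here is that step on its own as a small sketch that needs no network and no SPARQLWrapper; build_query is a hypothetical helper name, not part of any library.

```python
def build_query(prefixes, body):
    """Prepend one PREFIX line per namespace to a SPARQL query body."""
    # sorted() only makes the output order predictable; SPARQL does not
    # care about the order of PREFIX declarations.
    lines = ["PREFIX %s: <%s>" % (name, uri)
             for name, uri in sorted(prefixes.items())]
    lines.append(body)
    return "\n".join(lines)


prefixes = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}
full_query = build_query(
    prefixes,
    "SELECT ?o WHERE { ?s dbpedia-owl:abstract ?o } LIMIT 1",
)
print(full_query)
```

Declaring the prefixes once means the query body can use short names such as dbpedia-owl:abstract instead of spelling out the full URI every time.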

To use this, save the class above as sparql.py (the module name the import below expects), then import DBpediaEndpoint and feed it some SPARQL:

#!/usr/bin/env python
from sparql import DBpediaEndpoint


def main():
    s = DBpediaEndpoint()
    resource_uri = "http://dbpedia.org/resource/Foobar"
    # Ask for the English-language abstract of the resource.
    results = s.query("""
        SELECT ?o
        WHERE { <%s> dbpedia-owl:abstract ?o .
        FILTER(langMatches(lang(?o), "EN")) }
    """ % resource_uri)
    # Each binding maps a variable name to a dict holding its "value".
    abstract = results[0]["o"]["value"]
    print(abstract)


if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt as e:  # Ctrl-C
        raise e
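It helps to know what comes back. The bindings follow the standard SPARQL JSON results layout: a list of rows, each mapping a variable name to a small dict with a "value" key. The sketch below uses a hand-written sample in that shape (the abstract text is invented, not real DBpedia output) and a hypothetical first_value helper that copes with an empty result instead of raising IndexError.

```python
import json

# Hand-written sample shaped like a SPARQL JSON response; the values are
# invented and are not real DBpedia output.
sample_response = json.loads("""
{
  "head": {"vars": ["o"]},
  "results": {
    "bindings": [
      {"o": {"type": "literal", "xml:lang": "en",
             "value": "An example abstract."}}
    ]
  }
}
""")


def first_value(response, var, default=None):
    """Return the value bound to `var` in the first row, else `default`."""
    bindings = response["results"]["bindings"]
    # No rows at all, or the variable is unbound in the first row.
    if not bindings or var not in bindings[0]:
        return default
    return bindings[0][var]["value"]


abstract = first_value(sample_response, "o")
print(abstract)

# An empty result set falls back to the default rather than raising.
empty = {"head": {"vars": ["o"]}, "results": {"bindings": []}}
print(first_value(empty, "o", default="(no abstract found)"))
```

With the wrapper class, which returns only the bindings list, the same check boils down to testing whether the list is empty before indexing into it.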

Your homework: how do you identify the resource_uri in the first place?

That's for another evening.
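As a head start on that homework, one possible approach (and only one) is to ask the endpoint which resources carry a given label. The sketch below just builds the query text; label_lookup_query is a hypothetical helper, and it assumes the labels you are after are stored as rdfs:label literals with a language tag, which you would need to check against the data you are querying.

```python
def label_lookup_query(label, lang="en", limit=5):
    """Build a SPARQL query for resources whose rdfs:label equals `label`."""
    # Escape backslashes and double quotes so the label stays a valid
    # SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    # The full rdfs:label URI is spelled out so no PREFIX line is needed.
    return (
        "SELECT DISTINCT ?s\n"
        'WHERE { ?s <http://www.w3.org/2000/01/rdf-schema#label> "%s"@%s }\n'
        "LIMIT %d" % (escaped, lang, limit)
    )


query = label_lookup_query("Foobar")
print(query)
```

Each ?s that comes back is a candidate resource_uri. An exact label match is fussy about capitalisation and spelling, so treat this as a starting point, not a search engine.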


Published at DZone with permission of

