Clerezza: An Apache Project for the Semantic Web
More businesses are realizing the value of Semantic Web development for Semantic SEO and marketing, enterprise BI, and richer data systems. Three months ago, a new project entered the Apache Incubator that can remove many of the difficulties associated with developing for the Semantic Web. Clerezza is an OSGi-based application that helps developers build applications that integrate perfectly with the Semantic Web. The project also provides a set of bundles in machine-understandable RDF/JSON formats for building RESTful Semantic Web applications and web services. It uses technologies like jQuery, Jena, Apache Felix, Jetty, and Jersey. Clerezza can also be used as a platform, providing compile and runtime requirements for building semantic applications. DZone interviewed Reto Bachmann-Gmür, the "father" of Clerezza, who worked on the technology before it was donated by Trialox.
DZone: When did Clerezza begin and when did it enter the incubator?
Reto Bachmann-Gmür: Development was started by the startup company trialox.org at the end of 2008. Trialox.org was founded by a software company and a web company in collaboration with the University of Zurich to build an open source, modular, semantic CMS. Clerezza is the foundation of this CMS; it was accepted for Apache incubation in November 2009. The original idea for the incubation came up after Clerezza was presented at an IKS workshop in Rome: IKS was looking for a foundation for prototyping, and several Apache committers were present at the workshop.
DZone: What were the motivations behind creating this project?
Reto: The project has two aspects: web application development and RDF storage and manipulation. We believe that combining these two aspects in an integrated platform makes it easy to create powerful data-centric web applications.
Most existing web development frameworks tend to hide away core concepts of the Web (REST, URIs, representations) to emulate a desktop application environment. Because of this, the applications built with them do not benefit from core features of the Web: scalability, device independence (and thus accessibility), and collaboration (with persistent "deep" URIs). Clerezza, by contrast, is designed from the ground up to leverage the power of the web stack. It does not map these concepts to desktop paradigms or to traditional RDBMS database models, as the RDF (linked data) model is much more flexible and seamlessly extends web concepts.
DZone: How's the community shaping up around this project so far?
Reto: The proposal generated a lot of interest, and a few people not previously involved have started contributing. Especially noteworthy is that Tommaso Teofili started implementing the integration with Apache UIMA. It is essential that we have a website featuring tutorials as soon as possible. (There's a bit of a delay here, as we want to use Clerezza to produce this website.)
DZone: Who are its key supporters? (and their respective companies)
Reto: Several major contributors (Manuel and I) work for Trialox; Bertrand works for Day, Tommaso for Sourcesense, and Hasan for the University of Zurich.
DZone: How far along is the project on the initial list of goals?
Reto: The main hindrance is the lack of accessible documentation. Experience has shown that new developers can start developing on top of Clerezza, but until now some individual coaching has been necessary for this.
DZone: What burdens does it spare for developers?
Reto: With traditional architectures one would typically have to create additional templates or scripts to create RDF representations.
DZone: What tedious database-related tasks in traditional web development are eliminated by the RDF model used in the back end?
Reto:
- In RDF one can basically add any property to any resource; with an RDBMS this typically requires a change of the database schema.
- With an RDBMS there is no intrinsic mapping between database entities and URIs, i.e. the application has to provide this.
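The two points above can be illustrated with a minimal sketch in plain Java. It is a toy in-memory triple store, not the Clerezza or Jena API: the class `TripleSketch`, its methods, and the `example.org` URIs are hypothetical names chosen for illustration. Each resource is identified by its URI, and an unplanned property is added as just another statement, with no schema to migrate.

```java
import java.util.ArrayList;
import java.util.List;

/** A toy in-memory triple store; illustrative only, not the Clerezza API. */
public class TripleSketch {

    /** One statement: subject and predicate are URIs, object is a URI or a literal. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    /** Adds a statement; no schema has to declare the predicate beforehand. */
    public void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Returns all objects stated for the given subject and predicate. */
    public List<String> objects(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                result.add(t.object);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TripleSketch graph = new TripleSketch();
        // The resource is identified by its URI (hypothetical example.org address).
        String article = "http://example.org/articles/42";
        graph.add(article, "http://purl.org/dc/terms/title", "Clerezza interview");
        // A property nobody planned for: just another triple, no schema change.
        graph.add(article, "http://example.org/vocab/mood", "optimistic");
        System.out.println(graph.objects(article, "http://example.org/vocab/mood"));
    }
}
```

In a relational design, the second `add` call would typically correspond to an `ALTER TABLE` or a new join table, and the link between a row and its public URI would live in application code.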
DZone: Since the JAX-RS implementation is based on wymiwyg WRHAPI, can it only run on Jetty, or could it run on Tomcat too?
Reto: Currently the WRHAPI implementation maps to the default OSGi HTTP service; the Jetty-based implementation can be used as an alternative (e.g. to have Clerezza listen on multiple ports). While it is generally easy to write new backends for WRHAPI, it seems that Tomcat is less suited to run in an OSGi container. What might be useful here is integrating the whole OSGi container in a web application, so Clerezza could be deployed as a WAR archive to any JEE web container.
DZone: What part does OSGi play in Clerezza?
Reto: Clerezza is fully based on OSGi. OSGi is a very lightweight approach to offer the modularization and dynamism missing in standard Java. By using OSGi services it can also interoperate with Spring-DS or Peaberry applications.
DZone: Is there any special advantage for using it in Semantic Web applications?
Reto: The issues addressed by OSGi do not overlap with those addressed by RDF. With REST the overlap is small: OSGi provides fine-grained modularization, typically within a single virtual machine, while REST interfaces are generally used for interaction between different and distributed agents. Thanks to JAX-RS the same interface can be exposed for faster local consumption as an OSGi service as well as for cross-platform access via REST.
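The idea of one interface serving both local and remote callers can be sketched with the JDK alone. This is not how Clerezza does it: the sketch swaps JAX-RS and the OSGi service registry for a plain Java interface and the JDK's built-in `com.sun.net.httpserver.HttpServer`, and the names `DualExposureSketch`, `Greeter`, `expose`, and `callRemote` are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DualExposureSketch {

    /** The shared contract; in Clerezza this role is played by a JAX-RS resource. */
    public interface Greeter {
        String greet(String name);
    }

    /** One implementation, reused for both local and remote callers. */
    public static final class SimpleGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    /** Publishes the given Greeter over HTTP on a free local port. */
    public static HttpServer expose(Greeter greeter) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/greet", exchange -> {
            // The raw query string is used as the name (no decoding; sketch only).
            String name = exchange.getRequestURI().getQuery();
            byte[] body = greeter.greet(name).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Calls the HTTP endpoint the way any remote client would. */
    public static String callRemote(int port, String name) throws Exception {
        URL url = new URL("http://localhost:" + port + "/greet?" + name);
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        Greeter greeter = new SimpleGreeter();
        HttpServer server = expose(greeter);
        try {
            // Local consumption: a direct method call inside the same JVM.
            System.out.println(greeter.greet("local"));
            // Remote consumption: the same object reached over HTTP.
            System.out.println(callRemote(server.getAddress().getPort(), "remote"));
        } finally {
            server.stop(0);
        }
    }
}
```

The local call avoids serialization and networking entirely, which is the "faster local consumption" mentioned above, while the HTTP route keeps the same behavior reachable from other platforms.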
DZone: I've heard that things like RDF and the Semantic Web are currently of interest in mainly 'academic' circles and not of interest to average developers. Is that true?
Reto: RDF and the semantic web offer a wide range of possibilities, including some quite freaky artificial intelligence stuff. This audience and connotation has probably scared away many developers who could benefit from the flexibility of RDF. Recently however, these technologies are becoming increasingly popular with developers and industry, mainly with the label "linked data".
DZone: Will Clerezza be usable for the average developer?
Reto: Our aim is to make Clerezza a platform on which it is very easy to develop applications. By supporting scripting languages it should also serve the needs of non-Java developers. For developing in Java, we found that Clerezza is easier to grasp for relatively inexperienced developers than for mainstream enterprise developers. It seems that once you have gotten used to the rigidity of SQL and JEE, you get dizzy at the flexibility of Clerezza.
DZone: Where did the name, Clerezza, come from?
Reto: Clerezza means clarity in Rumantsch, which is exactly what we want to provide - clarity with the connotation of the solid mountains, the pure water and the fresh air of the region where Heidi rejects the overload of useless books and the vanity of Frankfurt and finds a way to regain her focus on what really matters to her.
DZone: Is there anything else about Clerezza (its current status or future plans) that you'd like to share?
Reto: To get real benefit out of the information we have access to, both as individuals and as a society, we most urgently need to address the challenge of information overload. In my opinion the most promising approaches are collaborative filtering and recommendation as well as semantic web technology. Clerezza should be the best foundation to build such applications. Join us!