Over the past few weeks I’ve been doing some R&D into the advantages of using NoSQL databases to implement Entity services (also known as Data Services).
Entity service is a classification of service coined in the Service Technology series of books from Thomas Erl. It’s used to describe services that are highly agnostic and reusable because they deal primarily with the persistence of information modelled as business data ‘entities’. The ultimate benefit of having a thin layer of these entity services is the ease with which you can re-use them to support more complex service compositions.
This approach is further described in the Entity Abstraction SOA pattern.
Entity service layers are therefore a popular architectural choice in SOA, and implementing them has meant big business for vendors like Oracle and IBM, both of whom offer software to support this very task. There is even a separate standard for technologies in this area called Service Data Objects (or SDO for short).
This is all well and good, but these applications come with dedicated servers and specialised IDEs, and it’s all a bit ‘heavyweight’. These specialised solutions can be terribly expensive if all you really want are some simple CRUD-F operations (Create, Read, Update, Delete, Find) on a service that manages the persistence of a simple canonical data type like a Product or a Customer.
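To make the CRUD-F idea concrete, here is a minimal sketch of what such a service might look like as a plain Java interface. Everything in it is an assumption for illustration: the `Product` fields, the `ProductEntityService` name and the in-memory map standing in for a real datastore are hypothetical and are not the contract developed later in this series.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.function.Predicate;

// Hypothetical canonical data type; the fields are illustrative only.
final class Product {
    final String id;
    final String name;
    final long priceInPence;

    Product(String id, String name, long priceInPence) {
        this.id = id;
        this.name = name;
        this.priceInPence = priceInPence;
    }
}

// The five CRUD-F operations expressed as a technology-neutral interface.
interface ProductEntityService {
    Product create(String name, long priceInPence);
    Optional<Product> read(String id);
    Product update(Product product);
    boolean delete(String id);
    List<Product> find(Predicate<Product> criteria);
}

// In-memory stand-in for the persistence layer (assumption: a real
// implementation would delegate to a database instead of a Map).
final class InMemoryProductService implements ProductEntityService {
    private final Map<String, Product> store = new LinkedHashMap<>();

    @Override
    public Product create(String name, long priceInPence) {
        Product product = new Product(UUID.randomUUID().toString(), name, priceInPence);
        store.put(product.id, product);
        return product;
    }

    @Override
    public Optional<Product> read(String id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public Product update(Product product) {
        if (!store.containsKey(product.id)) {
            throw new IllegalArgumentException("Unknown product: " + product.id);
        }
        store.put(product.id, product);
        return product;
    }

    @Override
    public boolean delete(String id) {
        return store.remove(id) != null;
    }

    @Override
    public List<Product> find(Predicate<Product> criteria) {
        List<Product> matches = new ArrayList<>();
        for (Product product : store.values()) {
            if (criteria.test(product)) {
                matches.add(product);
            }
        }
        return matches;
    }
}
```

A consumer only ever sees the five operations, which is what makes a service like this easy to re-use in larger compositions, whatever sits behind it.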
So the usual and basic implementation method would be to break out the Java and use a normal relational database with something like JPA (Java Persistence API) to help you with the object/relational mapping and persistence. This is a good choice and it can simplify the code a great deal, but there are still challenges. In web services where XML is being used as the payload, there is still the matter of converting between JAXB-generated Java objects and the Java objects used to persist data via JPA. You can use something like HyperJaxB to annotate JAXB objects with JPA annotations, making the resulting data objects dual purpose, but you still have some issues with versioning and get none of the scalability advantages of NoSQL. Besides, I’ve used this method before in an earlier blog, so where’s the fun in doing it again?
A relatively new and enticing alternative is to use a NoSQL database for persistent storage. NoSQL databases have proved incredibly popular over the last few years, due mainly to their ability to achieve huge scalability and strong resilience. Lots of very high profile and high throughput websites use NoSQL datastores to manage and persist their data, including Google, Twitter, Foursquare, Facebook and eBay.
The term NoSQL is used to describe “a class of database management system identified by its non-adherence to the widely used relational database management system (RDBMS) model” – Wikipedia.
NoSQL datastores do not follow the conventional wisdom of a relational, table-based approach, opting instead for a schema-less data structure that’s often ‘document centric’ and capable of supporting very large volumes of data in highly distributed environments.
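As a rough illustration of what ‘schema-less’ means in practice, the sketch below models two documents as plain Java maps. Both would sit in the same database even though the second carries a field the first has never heard of. The class name, field names and values are all made up for the example; `_id` follows the CouchDB naming convention for a document identifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemalessSketch {

    // Builds two hypothetical product documents with different shapes.
    static List<Map<String, Object>> sampleDocuments() {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("_id", "product-1");
        first.put("type", "Product");
        first.put("name", "Widget");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("_id", "product-2");
        second.put("type", "Product");
        second.put("name", "Gadget");
        // A field the first document lacks; no table needs altering to add it.
        second.put("tags", Arrays.asList("new", "featured"));

        List<Map<String, Object>> documents = new ArrayList<>();
        documents.add(first);
        documents.add(second);
        return documents;
    }
}
```

In a relational table the extra `tags` field would mean a schema change (or a join table); in a document store each document simply carries whatever structure it has.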
Choosing a NoSQL Database.
There are lots of different NoSQL implementations, so I won’t go into detail here other than to say that my requirements were simple. I wanted something…
- available via 3rd party PaaS providers like Amazon and Jelastic
- that uses a document store approach (as opposed to key/value or graph)
- open source and freely available
- with a good Java API
- with good developer documentation
- that can be installed locally
- which I could administer myself (easier the better since I don’t want to be a DBA)
In the end my database choices came down to the two market leaders: MongoDB and CouchDB. Mongo has a great Java API, it’s popular with the Java community and it has good developer documentation. However, its admin features are rather unfriendly, with just a command line to keep you company. CouchDB, on the other hand, is much friendlier thanks to its ‘Futon’ UI. CouchDB has most of the technical benefits of Mongo (certainly in this R&D setting) but it lacks an out of the box Java API (REST is the default interface). Luckily, the Java community has stepped in with a number of native Java drivers for CouchDB, the best for me being the Ektorp library which is very simple to use but also very effective.
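For a feel of what ‘REST is the default interface’ means, here is a minimal sketch of the raw HTTP call that stores a document in CouchDB: a PUT of a JSON body to `/{database}/{document id}`. It uses the `java.net.http` client that ships with newer JDKs (Java 11 onwards) rather than Ektorp, purely so that the example has no dependencies. The `products` database, the document id and the JSON are hypothetical, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class CouchDbRestSketch {

    // Builds (but does not send) the PUT request that would create or
    // update a document. All arguments are supplied by the caller; nothing
    // here is specific to the service built in this series.
    static HttpRequest putDocument(String baseUrl, String database, String docId, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + database + "/" + docId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Assuming a local CouchDB listening on its default port of 5984, `putDocument("http://localhost:5984", "products", "product-1", "{\"name\":\"Widget\"}")` produces the request, and `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` would send it. A library like Ektorp saves you from hand-rolling the URLs and JSON yourself.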
My goals for this R&D exercise are to:
- implement a viable entity service using a contract-first approach (Web Service bound to SOAP, fully WS-I compliant contract and with predefined data structures).
- discover if using a NoSQL database rather than JPA for data persistence and retrieval can increase developer productivity and reduce the overall effort of entity service implementation.
- use the following SOA patterns: Service Facade (separates business logic), Contract/Schema Centralisation (canonical contract hosted via a simple service repository), Decoupled Contract, Concurrent Contract (SOAP & REST (maybe)), Message Metadata (headers) and Service Agent (for validation).
Essentially I want to build the entity service using as little Java code as possible while preserving the contract-first approach. A contract-first approach is vital for good SOA development because it allows for a looser coupling between the consumer and the service and doesn’t corrupt the relationship with lots of technology-specific dependencies like database table definitions and data types.
The main technologies I’ll be using for this development will be Java (JEE), JAX-WS, JAXB, CouchDB & Ektorp and Glassfish v3. As usual I’ll also be using Maven and Jenkins. All are production ready applications and frameworks, but because they’re open source the total cost so far is £0.00.
In the next article in this series I’ll be telling you how I got started on the development of the service, beginning with the web service contract or ‘WSDL’.
It seems I’m on trend for once, with a number of interesting NoSQL articles coming to light in the last few days…
InfoQ asks ‘What is CouchDB’, which is an article that I could have done with about a month ago. It’s a fairly comprehensive ‘getting started’ guide and contains more detail than I’ll go into regarding coding with CouchDB. Therefore, I’d advise anyone looking for a more step by step Java coding guide to check out the article straight away.
The InfoQ article also references two other blog posts that could be of interest to architects. The first is a comparison of a number of different NoSQL databases (including Cassandra, Tom!), and the second is a handy NoSQL selection guide.