Why I Think Riak is a Great NoSQL Option
- Basics: Riak is essentially a key/value store that implements the design of Amazon's Dynamo. You can choose the level of replication you want per bucket, and tune how many replicas must respond on a per-operation basis, for writes and reads alike.
- Links: Typically you don't have relationships between entities in a key/value store, but Riak provides links. One entity can point to another, and this link can be walked. In other words, many of the relationship features that you'd get through foreign key constraints in SQL, and that you'd need to implement yourself in a NoSQL DB, can be handled via Links in Riak.
- Still under development: The book I am following to learn Riak tested a version from Dec/2011, and I can see that many things in Riak are still being changed and developed. This is something to keep in mind, as code that works with one version may not work with a later one, and you'll need to spend time working out what changed. A good example was "precommit" hooks: I couldn't easily find a good example of how to use them against the current version (hence my post on it).
- MapReduce support: Although I only tested artificial examples from a book, which may not be very realistic, this is still an amazing feature. Rather than pulling the data from remote nodes to perform the computation locally, we can push the code to the Riak nodes and have them perform the computation.
- Secondary indexes: I've worked with a NoSQL database that doesn't provide secondary indexes, and that can be a real pain point, so I really appreciate this support in Riak. It is only supported by the LevelDB backend, though, and I am not sure what the performance impact of secondary-index queries is compared with using primary keys only.
- Precommit/Postcommit hooks: You can set scripts to run before or after a write to the DB. With other DBs you would need to do this yourself in application code, whereas Riak can run your code on the DB server.
- Search support: This feature really surprised me. Through a custom precommit script or one of the already available indexer scripts, you can create inverted indexes for your data. This can be set on a per-bucket basis rather than as a global setting. Since Riak integrates Solr, you have all the Lucene + Solr power to perform flexible searches on your inverted indexes.
- HTTP and Protocol Buffers: I've played with the RESTful API, but Riak also supports Protocol Buffers (Protobuf) to reduce the overhead of these remote calls. HTTP and data serialization can be a performance issue (as in some cloud-based DBs), so this support is definitely welcome.
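
To make the first bullet a bit more concrete, here is a toy, in-memory sketch of Dynamo-style tunable replication in Python. It is not Riak's client API and talks to no server: the ToyCluster class, its methods, and the single version counter are all hypothetical stand-ins (a real Dynamo-style store tracks versions with vector clocks and spreads keys with consistent hashing). The names n, r and w follow the usual Dynamo convention: n replicas per key, w acknowledgements required for a write, r replies required for a read.

```python
class ToyCluster:
    """In-memory stand-in for a cluster. Hypothetical; NOT Riak's API."""

    def __init__(self, node_count=5):
        # Each node is a dict mapping key -> (version, value).
        self.nodes = [{} for _ in range(node_count)]
        self.down = set()   # indexes of nodes that are currently unreachable
        self.clock = 0      # simple counter standing in for vector clocks

    def replicas(self, key, n):
        # Pick n consecutive nodes, starting at a position derived from the key.
        start = sum(key.encode("utf-8")) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(n)]

    def put(self, key, value, n=3, w=2):
        targets = [i for i in self.replicas(key, n) if i not in self.down]
        if len(targets) < w:
            raise RuntimeError(f"only {len(targets)} of {n} replicas reachable, need w={w}")
        self.clock += 1
        for i in targets:
            self.nodes[i][key] = (self.clock, value)
        return len(targets)  # number of replicas that stored the value

    def get(self, key, n=3, r=2):
        reachable = [i for i in self.replicas(key, n) if i not in self.down]
        if len(reachable) < r:
            raise RuntimeError(f"only {len(reachable)} of {n} replicas reachable, need r={r}")
        # Worst case for the reader: only the first r replicas answer.
        replies = [self.nodes[i].get(key) for i in reachable[:r]]
        found = [reply for reply in replies if reply is not None]
        if not found:
            return None
        # Return the value carrying the highest version among the replies.
        return max(found, key=lambda reply: reply[0])[1]


cluster = ToyCluster()
first, second, third = cluster.replicas("user:1", 3)

cluster.put("user:1", "v1")          # all three replicas store v1
cluster.down.add(first)              # one replica becomes unreachable
cluster.put("user:1", "v2", w=2)     # still succeeds: two replicas acknowledge
cluster.down.clear()                 # the stale replica comes back

stale = cluster.get("user:1", r=1)   # only the stale replica answers
fresh = cluster.get("user:1", r=2)   # r + w > n, so a fresh replica is included
print(stale, fresh)                  # prints: v1 v2
```

The point of the sketch is the trade-off: with n=3 and w=2, a read with r=1 is cheaper but can return an old value, while r=2 always overlaps with at least one replica that saw the latest write.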
Overall, I can say that it was really great to learn more about Riak, and I think it's a great NoSQL option to consider.
Update 07/29/2012: I came across this other great post on an actual one-year experience with Riak. It is definitely good follow-up reading if you're interested in more details.