Open Source NoSQL Databases

Mitch Pronschinske

Feb. 23, 10 · Interview

For almost a year now, the idea of "NoSQL" has been spreading due to the demand for relational database alternatives. Perhaps the biggest motivation behind NoSQL is scalability. Relational databases don't lend themselves well to the kind of horizontal scalability that's required for large-scale social networking or cloud applications, and ORMs can abstract away the impedance mismatch only so much. In other cases, companies simply don't need many of the complex features and rigid schemas that relational databases provide. Most people are not suggesting that we all ditch the RDBMS; in fact, many companies don't really need to switch. Relational databases will probably remain necessary for many applications for years to come. In essence, NoSQL is a movement that aims to reexamine the way we structure data and to draw attention to innovation, in hopes of finding solutions to the next generation's data persistence problems.

Here are some of the better-known open source data stores/models labeled as "NoSQL":

CouchDB - Document Store

Maps keys to data
Provides a very simple RESTful JSON API and is written in Erlang
You can upload functions (views) that index your data, and then query those views
Provides an innovative replication strategy - nodes can reconnect, sync, and reconcile differences after being disconnected for long periods of time
Enables new distributed types of applications and data
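
To make the "upload functions to index data" idea concrete, here is a minimal in-memory sketch. It is not CouchDB code: CouchDB views are normally written in JavaScript and queried over HTTP, while this toy uses plain Python, and the names `docs`, `posts_by_author`, `build_view`, and `query_view` are hypothetical and exist only for illustration.

```python
# Toy in-memory "document store": document id -> JSON-like dict.
docs = {
    "post-1": {"type": "post", "author": "alice", "title": "Hello"},
    "post-2": {"type": "post", "author": "bob", "title": "NoSQL"},
    "user-1": {"type": "user", "name": "alice"},
    "post-3": {"type": "post", "author": "alice", "title": "Views"},
}


def posts_by_author(doc):
    """Hypothetical map function: emit one (key, value) pair per post."""
    if doc.get("type") == "post":
        yield doc["author"], doc["title"]


def build_view(documents, map_fn):
    """Run map_fn over every document and return index rows sorted by key."""
    rows = []
    for doc_id, doc in documents.items():
        for key, value in map_fn(doc):
            rows.append({"id": doc_id, "key": key, "value": value})
    return sorted(rows, key=lambda row: (row["key"], row["id"]))


def query_view(view, key):
    """Return the values of all index rows whose key matches."""
    return [row["value"] for row in view if row["key"] == key]


view = build_view(docs, posts_by_author)
print(query_view(view, "alice"))
```

The point of the design is that the index is defined by a function you supply rather than by a fixed schema, so documents of different shapes can live side by side.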

MongoDB - Document Store

Free-form key-value-like data store with good performance
Powerful, expansive query model
Usability rivals that of Redis
Good for complex data storage needs.
Production-quality sharding capabilities

Neo4j - GraphDB

Disk-based
Has a restricted, single-threaded model for graph traversal
Has optional layers to expose Neo4j as an RDF store
Can handle graphs of several billion nodes, relationships, or properties on a single machine
Released under a dual license - free for non-commercial use

Apache HBase - Wide Column Store/Column Families

Built on top of Hadoop, which has functionality similar to Google's GFS and MapReduce systems
Hadoop's HDFS provides a mechanism that reliably stores and organizes large amounts of data
Random access performance is on par with MySQL
Has a high performance Thrift gateway
Offers source and sink modules for Cascading

Redis - Key Value/Tuple Store

Provides a rich API and performs its operations in memory, writing to disk only periodically
It's extremely fast
Lets you append a value to the end of a list that's already stored under a key
Has atomic operations, making it a best-of-breed tally server
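
The list-append and atomic-counter behavior described above can be sketched with a small in-memory stand-in. This is not the Redis client API; `TinyKeyValueStore` is a hypothetical class whose method names are merely modeled on the Redis `RPUSH` and `INCR` commands, and the lock is simply how this sketch makes each operation atomic.

```python
import threading


class TinyKeyValueStore:
    """Hypothetical in-memory stand-in; not a real Redis client."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def rpush(self, key, value):
        """Append value to the list stored under key; return the new length."""
        with self._lock:
            items = self._data.setdefault(key, [])
            items.append(value)
            return len(items)

    def incr(self, key):
        """Atomically add 1 to the counter stored under key."""
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

    def get(self, key):
        with self._lock:
            return self._data.get(key)


store = TinyKeyValueStore()
store.rpush("recent:logins", "alice")
store.rpush("recent:logins", "bob")


def record_hits(count):
    for _ in range(count):
        store.incr("hits:home")


# Four concurrent writers tally page hits; no increments are lost.
threads = [threading.Thread(target=record_hits, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(store.get("recent:logins"), store.get("hits:home"))
```

Because each increment is a single atomic step, callers never need a read-modify-write cycle of their own, which is what makes this style of store a good fit for tallies.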

Memcached - Key Value/Tuple Store

High-performance, distributed memory object caching
Free and open source
Generic and agnostic to the objects/strings it caches
It's all in-memory data
Simple yet elegant design enables easy development and deployment
Language neutral caching scheme.
Most of the large properties on the web are using it now, except for Microsoft

Project Voldemort - Eventually Consistent Key Value Store

Used by LinkedIn
Handles server failure transparently
Pluggable serialization supports rich keys and values including lists and tuples with named fields
Supports common serialization frameworks including Protocol Buffers, Thrift, and Java Serialization
Data items are versioned
Supports pluggable data placement strategies
Memory caching and the storage system are combined
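
Versioned data items are what let an eventually consistent store tell an outdated value from a genuine conflict. One common way to do this is a vector clock; the article does not say which scheme Voldemort uses, so treat the following as an illustrative assumption. The helper names `increment` and `compare` and the node names are hypothetical.

```python
def increment(clock, node):
    """Return a copy of the clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent' for clocks a and b."""
    nodes = set(a) | set(b)
    a_behind = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    b_behind = any(b.get(n, 0) < a.get(n, 0) for n in nodes)
    if a_behind and b_behind:
        return "concurrent"  # neither version supersedes the other
    if a_behind:
        return "before"      # a is an ancestor of b and can be discarded
    if b_behind:
        return "after"
    return "equal"


v1 = increment({}, "node-a")   # first write, handled by node-a
v2 = increment(v1, "node-a")   # a later write on node-a
v3 = increment(v1, "node-b")   # a write on node-b that only saw v1

print(compare(v1, v2))
print(compare(v2, v3))
```

When two versions come back as concurrent, the store cannot pick a winner by itself, so both versions are typically handed to the application to reconcile.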

Tokyo Cabinet and Tokyo Tyrant - Key Value/Tuple Store

Supports hashtable mode, b-tree mode, and table mode
It's fast and straightforward
Good for small to medium-sized amounts of data that require rapid updating and can be easily modeled in terms of keys and values

Cassandra - Wide Column Store/Column Families

First developed by Facebook
SuperColumns can turn a simple key-value architecture into an architecture that handles sorted lists, based on an index specified by the user.
Can scale from one node to several thousand nodes clustered in different data centers.
Can be tuned for more consistency or availability
Smooth node replacement if one goes down
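
The consistency/availability tuning mentioned above is usually reasoned about with a simple overlap rule: with N replicas, a read that contacts R replicas is guaranteed to see a write acknowledged by W replicas when R + W > N. The sketch below only does that arithmetic; it does not talk to Cassandra, the function names are hypothetical, and the labels are modeled on Cassandra's ONE, QUORUM, and ALL consistency levels.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n, w, r):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


N = 3  # assumed replication factor for this example

# Label -> (replicas that must acknowledge a write, replicas consulted on a read)
settings = {
    "write ONE / read ONE": (1, 1),
    "write QUORUM / read QUORUM": (quorum(N), quorum(N)),
    "write ALL / read ONE": (N, 1),
}

for label, (w, r) in settings.items():
    print(label, reads_see_latest_write(N, w, r))
```

Lower values of W and R keep the cluster responsive when nodes are down, at the cost of possibly reading stale data; higher values trade that availability for stronger consistency.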

____

Other well-known NoSQL-style data stores are closed source, including Google Bigtable and Amazon SimpleDB. GigaSpaces is a popular space-based grid solution that has NoSQL qualities.

Check out this informative post on NoSQL patterns.
