Making Sense of Datomic, the Revolutionary Non-NoSQL Database

by Jakub Holý · Jun. 18, 13 · Interview

I have finally managed to understand one of the most unusual databases of today, Datomic, and would like to share it with you. Thanks to Stuart Halloway and his workshop!

Why? Why?!?

As we shall see shortly, Datomic is very different from traditional RDBMSs as well as from the various NoSQL databases. It isn’t even a database; it is a database on top of a database. I couldn’t wrap my head around that until now. The key to understanding Datomic and its unique design and advantages is actually simple.

The mainstream databases (and languages) were designed around the following constraints of the 1970s:

  • Memory is expensive
  • Storage is expensive
  • It is necessary to use dedicated, expensive machines

Datomic is essentially an exploration of what database we would have designed if we hadn’t had these constraints. What design would we choose given gigabytes of RAM, networks with bandwidth and speed matching or exceeding hard disk access, and the ability to spin up and kill servers at a whim?

But Datomic isn’t an academic project. It is pragmatic: it wants to fit into our existing environments and make it easy for us to start using its futuristic capabilities now. And it is not as fresh and green as it might seem. Rich Hickey, the mastermind behind Clojure and Datomic, has reportedly thought about both these projects for years, and the designs have been really well thought through.

The weird architecture of Datomic

  1. Datomic is a database on top of another database (or rather storage): in-memory, a file system, a traditional RDBMS, or Amazon DynamoDB.
  2. You do not send your query to the server and get back the result. Instead, you get back all the data you need to execute the query and run the query – and all subsequent queries – locally. Thus, “joins” are pretty cheap and you can do plenty of otherwise impossible things (combine data from multiple databases and local data structures, run any code on them, …). Each application using Datomic – a “peer” – will have the data it needs, based on its unique needs and usage patterns, close to itself.
  3. All writes go through one component, called the transactor, which essentially serializes the writes, thus ensuring ACID. It might sound like a bottleneck, but it isn’t for most practical purposes [1] given the design and typical application needs. (Reportedly, Datomic could handle all transactions for all credit cards in the world. Listen to the experiences of Room Key with their rather write-heavy load in the Relevance podcast with Kurt Zimmer (podcast episode 033).)
  4. Datomic works quite similarly to a version control system such as Git. It never overwrites data; there are no updates. You only mark the data as no longer valid and add new data, which produces a new version of the database (think of a Git hash or an SVN revision number). You can then query the latest state of the database or the state as of a particular version. (Of course the whole database isn’t copied whenever you add a fact to it. Datomic is smart and efficient.)
  5. It is not a single, monolithic server; the storage, transactor, and peers are physically separate pieces.
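
To make points 3 and 4 more concrete, here is a minimal toy model in Python. It is only a sketch of the idea and not Datomic's actual API or storage format: the names `Fact`, `ToyTransactor`, and `as_of` are made up for illustration, and it assumes that each attribute holds a single value per entity.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: Any
    tx: int       # version (transaction number) that recorded this fact
    added: bool   # True = asserted, False = marked as no longer valid


class ToyTransactor:
    """Hypothetical single writer: applies transactions one at a time."""

    def __init__(self) -> None:
        self.log: List[Fact] = []  # append-only, nothing is overwritten
        self.version = 0

    def transact(self, changes: List[Tuple[str, str, Any, bool]]) -> int:
        # Every transaction produces a new version of the database.
        self.version += 1
        for entity, attribute, value, added in changes:
            self.log.append(Fact(entity, attribute, value, self.version, added))
        return self.version


def as_of(log: List[Fact], version: int) -> Dict[Tuple[str, str], Any]:
    """Replay the log up to `version` to see the database at that point."""
    state: Dict[Tuple[str, str], Any] = {}
    for fact in log:
        if fact.tx > version:
            break  # the log is ordered by version, so we can stop here
        key = (fact.entity, fact.attribute)
        if fact.added:
            state[key] = fact.value
        elif key in state and state[key] == fact.value:
            del state[key]
    return state


transactor = ToyTransactor()
v1 = transactor.transact([("alice", "email", "alice@old.example", True)])
# An "update" is a retraction of the old fact plus an assertion of the new one.
v2 = transactor.transact([
    ("alice", "email", "alice@old.example", False),
    ("alice", "email", "alice@new.example", True),
])

old_state = as_of(transactor.log, v1)
new_state = as_of(transactor.log, v2)
```

Here `old_state` still shows the old e-mail address even after the second transaction, because the log keeps all three facts. A real system would of course not replay the whole log for every query.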

What has made this possible?

  • Network access as fast as or faster than disk access => can fetch all the data over the network
  • Plenty of memory => can store a substantial subset of the data on each peer according to its actual needs
  • Storage is huge and cheap => we can easily store historical data
  • Experience with efficient, immutable, “persistent” data structures used in modern FP languages => cheap creation of new “database values”
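
The last point may be the least familiar one, so here is a small sketch of a persistent data structure in Python: an immutable binary search tree that uses path copying. It only illustrates structural sharing in general; it is not how Datomic stores its indexes, and `Node`, `insert`, and `lookup` are illustrative names.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Optional[Node], key: str, value: str) -> Node:
    """Return a new tree that contains key; the old tree is left untouched."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        # Copy only this node; the right subtree is shared with the old tree.
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        # Copy only this node; the left subtree is shared with the old tree.
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def lookup(node: Optional[Node], key: str) -> Optional[str]:
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


db1: Optional[Node] = None
for k, v in [("m", "1"), ("c", "2"), ("t", "3")]:
    db1 = insert(db1, k, v)

# Adding a key creates a new "database value"; db1 remains valid and unchanged.
db2 = insert(db1, "x", "4")
```

After this, `db1` and `db2` are two separate values, yet `db2.left` is the very same object as `db1.left`. Only the nodes on the path from the root to the new key were copied, which is why creating a new version is cheap.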

The unique value proposition and capabilities of Datomic

We have now learned about and hopefully understood the unique design of Datomic. But what does it give us, and what distinguishes it from other databases?

The architecture, together with a few other design decisions, provides the following key characteristics:

  • Programmability – data, schema, query input/output, and transaction metadata are all just elementary data structures that you have fully available at the peer and can thus combine and process in powerful ways unimaginable before
  • Persistence/accountability – you never lose history and can annotate transactions with metadata about who/why etc.; there is support for finding out how things were and how they have been changing, and for performing what-if analysis
  • Elastic scalability – since a lot of the load has been pushed to the peers
  • Flexibility – no rigid schema, easy to navigate, combine, and cache data based on each peer’s unique needs, extensibility via data functions
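
The first two characteristics can be sketched together in a few lines of Python. This is a rough illustration under my own assumptions, not Datomic's query language: facts are plain tuples, the metadata of a transaction is stored as facts about the transaction itself, and `local_teams`, `attributes_of`, and `explain` are hypothetical names.

```python
# Facts as plain (entity, attribute, value, tx) tuples available at the peer.
facts = [
    ("bob", "email", "bob@example.com", "tx1"),
    ("tx1", "author", "importer", "tx1"),
    ("alice", "email", "alice@new.example", "tx2"),
    ("tx2", "author", "support-desk", "tx2"),
    ("tx2", "reason", "customer request", "tx2"),
]

# A local data structure that never was in the database.
local_teams = {"support-desk": "Customer Support", "importer": "Data Team"}


def attributes_of(facts, entity):
    """Collect all attributes recorded for an entity (or a transaction)."""
    return {a: v for e, a, v, _ in facts if e == entity}


def explain(facts, entity, attribute, teams):
    """Join a fact with its transaction metadata and with local data."""
    results = []
    for e, a, v, tx in facts:
        if e == entity and a == attribute:
            meta = attributes_of(facts, tx)
            author = meta.get("author")
            results.append({
                "value": v,
                "author": author,
                "team": teams.get(author),
                "reason": meta.get("reason"),
            })
    return results


alice_history = explain(facts, "alice", "email", local_teams)
```

Because everything is ordinary data on the peer, answering “who set this value and why” and enriching the answer with local information is just ordinary code.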

Closing notes

Datomic has similar goals to relational databases (especially ACID) and could be used in similar use cases. Performance-wise, if writes are more important than reads, if you need to continuously write a lot of data each second, or if you have billions of “rows”, then you might prefer another solution. Thanks to the design and the recommended architecture for heavily loaded installations, i.e. with memcached in front of the storage, the performance of the backend isn’t so important (as the peers have the data they need locally or get it from memcached), so the backend should be selected based more on its usage-related characteristics.
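
The read path described above might look roughly like the following Python sketch. It assumes that stored data is immutable, so cached copies never need to be invalidated. Plain dictionaries stand in for memcached and for the peer's local cache, and `CountingStorage` and `Peer` are names invented for this example.

```python
class CountingStorage:
    """Stand-in for the backend storage; counts how often it is hit."""

    def __init__(self, segments):
        self.segments = dict(segments)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.segments[key]


class Peer:
    """Reads go to the local cache, then the shared cache, then storage."""

    def __init__(self, shared_cache, storage):
        self.local = {}
        self.shared_cache = shared_cache  # plays the role of memcached
        self.storage = storage

    def read(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.shared_cache:
            value = self.shared_cache[key]
        else:
            value = self.storage.get(key)
            self.shared_cache[key] = value
        # Immutable data can be kept locally without any invalidation logic.
        self.local[key] = value
        return value


storage = CountingStorage({"segment-1": "some immutable data"})
shared = {}
peer_a = Peer(shared, storage)
peer_b = Peer(shared, storage)

peer_a.read("segment-1")  # goes all the way to the storage
peer_b.read("segment-1")  # served by the shared cache
peer_a.read("segment-1")  # served by peer_a's local cache
```

In this run the storage is read only once, which is the reason why the raw speed of the backend matters less than it would in a traditional setup.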

Summary

The design of Datomic – peers fetching data and running queries locally, a single coordinator of writes (the transactor), building on existing databases/storage tools (and keeping all the history) – seemed very strange and perhaps inefficient to me until I realized that traditional databases are designed around constraints that no longer exist. Datomic now makes sense to me and seems like a tool with intriguing capabilities and great potential. I hope you see it the same way now :-) .

I have left out some interesting topics, such as what data structures can be stored in Datomic and the data model and query model used. To learn about these and more about Datomic, head to Datomic for Five Year Olds and Datomic’s home page.

Bonus links

  • Data functions for optimistic and pessimistic locking in Datomic (forum answer)
  • highscalability.com: VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process – a description of the architecture of VoltDB, which has a few things in common with Datomic (single-threaded writes, “stored procedures” as units of transaction etc.)
  • VoltDB – Mike Stonebraker’s incredibly scalable, SQL, ACID database that also breaks with the constraints of the 70s and leverages huge RAM, single-threaded access etc.

[1] Harizopoulos, S., Abadi, D. J., Madden, S., & Stonebraker, M. (2008, June). OLTP through the looking glass, and what we found there. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (pp. 981-992). ACM. – This paper shows that traditional RDBMSs spend nearly 30% of their time on locking and latching, which could be eliminated with single-threaded access, as is also done in VoltDB. See also the VoltDB whitepaper.

Published at DZone with permission of Jakub Holý, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
