Back To The Future with Datomic
At the beginning of March, Rich Hickey and his team released Datomic. Datomic is a novel distributed database system designed to enable scalable, flexible and intelligent applications, running on next-generation cloud architectures. Its launch was met with considerable buzz and some skepticism, mainly related to its rather disruptive architectural proposal. Instead of recapitulating the pros and cons of that architecture, I will focus on the other innovations it introduces, namely its powerful data model (based upon the concept of datoms) and its expressive query language (based upon Datalog). The remainder of this article describes how to store facts and query them through Datalog expressions and rules. Additionally, I will show how Datomic introduces an explicit notion of time, which allows queries to be executed against both previous and future states of the database. As an example, I will use a very simple data model that describes genealogical information. As always, the complete source code can be found on the Datablend public GitHub repository.
1. The Datomic data model
Datomic stores facts (i.e. your data points) as datoms. A datom represents the addition (or retraction) of a relation between an entity, an attribute, a value, and a transaction. The datom concept is closely related to the concept of an RDF triple, where each triple is a statement about a particular resource in the form of a subject-predicate-object expression. Datomic adds the notion of time by explicitly tagging a datom with a transaction identifier (i.e. the exact point in time at which the fact was persisted into the Datomic database). This allows Datomic to promote data immutability: updates do not change your existing facts; they merely create new datoms that are tagged with a more recent transaction. Hence, the system keeps track of all the facts, forever.
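To make the immutability idea concrete, here is a minimal plain-Java sketch of an append-only fact log. The `Datom` record and `DatomLog` class are hypothetical names invented for this illustration and are not part of the Datomic API; the sketch also ignores retractions and assumes Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of a datom for illustration; not a Datomic class.
record Datom(long entity, String attribute, Object value, long tx, boolean added) {}

// An append-only log: an "update" never overwrites, it only adds a newer datom.
class DatomLog {
    private final List<Datom> datoms = new ArrayList<>();

    // Record an assertion tagged with the given transaction id.
    void add(long entity, String attribute, Object value, long tx) {
        datoms.add(new Datom(entity, attribute, value, tx, true));
    }

    // The current value of a single-valued attribute is the assertion with the highest tx.
    Optional<Object> current(long entity, String attribute) {
        Datom latest = null;
        for (Datom d : datoms) {
            if (d.entity() == entity && d.attribute().equals(attribute) && d.added()
                    && (latest == null || d.tx() > latest.tx())) {
                latest = d;
            }
        }
        return Optional.ofNullable(latest).map(Datom::value);
    }

    // Number of datoms ever recorded; it only grows.
    int size() {
        return datoms.size();
    }
}
```

Adding a second value for the same entity and attribute leaves the first datom in place, so the older state remains available for inspection.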
Datomic does not enforce an explicit entity schema; it’s up to the user to decide which attributes to store for a particular entity. Attributes are part of the Datomic meta model, which specifies the characteristics (i.e. attributes) of the attributes themselves. Our genealogical example data model stores information about persons and their ancestors. For this, we will require two attributes: name and parent. An attribute is basically an entity, expressed in terms of the built-in system attributes such as cardinality, value type and attribute description.
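import java.util.ArrayList;
import java.util.List;
import datomic.Connection;
import datomic.Peer;
import datomic.Util;
// Open a connection to the database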
String uri = "datomic:mem://test";
Peer.createDatabase(uri);
Connection conn = Peer.connect(uri);
// Declare attribute schema
List tx = new ArrayList();
tx.add(Util.map(":db/id", Peer.tempid(":db.part/db"),
":db/ident", ":person/name",
":db/valueType", ":db.type/string",
":db/cardinality", ":db.cardinality/one",
":db/doc", "A person's name",
":db.install/_attribute", ":db.part/db"));
tx.add(Util.map(":db/id", Peer.tempid(":db.part/db"),
":db/ident", ":person/parent",
":db/valueType", ":db.type/ref",
":db/cardinality", ":db.cardinality/many",
":db/doc", "A person's parent",
":db.install/_attribute", ":db.part/db"));
// Store it
conn.transact(tx).get();
All entities in a Datomic database need an internal key, called the entity id. In our case, we generate a temporary id through the tempid utility method. All entities are stored within a specific database partition that groups together logically related entities. Attribute definitions need to reside in the :db.part/db partition, a dedicated system partition used exclusively for storing system entities and schema definitions. :person/name is a single-valued attribute of value type string. :person/parent is a multi-valued attribute of value type ref. The value of a reference attribute points to (the id of) another entity stored within the Datomic database. Once our attribute schema is persisted, we can start populating our database with concrete person entities.
// Define person entities
List tx = new ArrayList();
Object edmond = Peer.tempid(":db.part/user");
tx.add(Util.map(":db/id", edmond,
":person/name", "Edmond Suvee"));
Object gilbert = Peer.tempid(":db.part/user");
tx.add(Util.map(":db/id", gilbert,
":person/name", "Gilbert Suvee",
":person/parent", edmond));
Object davy = Peer.tempid(":db.part/user");
tx.add(Util.map(":db/id", davy,
":person/name", "Davy Suvee",
":person/parent", gilbert));
// Store them
conn.transact(tx).get();
We create three concrete persons: myself, my dad Gilbert Suvee and my grandfather Edmond Suvee. As with the attribute definitions, we again use the tempid utility method to obtain temporary ids for our newly created entities. This time, however, we store our persons within the :db.part/user database partition, which is the default partition for storing application entities. Each person is given a name (via the :person/name attribute) and, where known, a parent (via the :person/parent attribute). When calling the transact method, each entity is translated into a set of individual datoms that together describe the entity. Once persisted, Datomic ensures that temporary ids are replaced with their final counterparts.
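The translation from an entity map into individual facts, followed by the replacement of temporary ids, can be pictured with the plain-Java sketch below. It is only a model of the idea: `Fact`, `EntityExpansion`, the string temp ids and the numeric final ids are all assumptions made for this example, and Datomic's internal handling will differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical entity-attribute-value tuple for this sketch; not a Datomic class.
record Fact(Object entity, String attribute, Object value) {}

class EntityExpansion {
    // Turn one entity map into one fact per attribute, keyed by the map's ":db/id".
    static List<Fact> expand(Map<String, Object> entityMap) {
        Object id = entityMap.get(":db/id");
        List<Fact> facts = new ArrayList<>();
        for (Map.Entry<String, Object> entry : entityMap.entrySet()) {
            if (!entry.getKey().equals(":db/id")) {
                facts.add(new Fact(id, entry.getKey(), entry.getValue()));
            }
        }
        return facts;
    }

    // Replace temporary ids with final ids, both in entity position and in reference values.
    static List<Fact> resolve(List<Fact> facts, Map<Object, Object> finalIds) {
        List<Fact> resolved = new ArrayList<>();
        for (Fact fact : facts) {
            Object entity = finalIds.getOrDefault(fact.entity(), fact.entity());
            Object value = finalIds.getOrDefault(fact.value(), fact.value());
            resolved.add(new Fact(entity, fact.attribute(), value));
        }
        return resolved;
    }
}
```

In this model, the entity map for Gilbert expands into two facts (his name and his parent reference), and resolving the temp ids rewrites both the entity position and the reference to Edmond.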
2. The Datomic query language
Datomic’s query model is an extended form of Datalog. Datalog is a deductive query system which will feel quite familiar to people who have experience with SPARQL and/or Prolog. The declarative query language makes use of a pattern matching mechanism to find all combinations of values (i.e. facts) that satisfy a particular set of conditions expressed as clauses. Let’s have a look at a few example queries:
// Find all persons
System.out.println(Peer.q("[:find ?name " +
":where [?person :person/name ?name] ]", conn.db()));
// Find the parents of all persons
System.out.println(Peer.q("[:find ?name ?parentname " +
":where [?person :person/name ?name] " +
"[?person :person/parent ?parent] " +
"[?parent :person/name ?parentname] ]" , conn.db()));
// Find the grandparent of all persons
System.out.println(Peer.q("[:find ?name ?grandparentname " +
":where [?person :person/name ?name] " +
"[?person :person/parent ?parent] " +
"[?parent :person/parent ?grandparent] " +
"[?grandparent :person/name ?grandparentname] ]" , conn.db()));
Result query 1: [["Davy Suvee"], ["Edmond Suvee"], ["Gilbert Suvee"]]
Result query 2: [["Gilbert Suvee" "Edmond Suvee"], ["Davy Suvee" "Gilbert Suvee"]]
Result query 3: [["Davy Suvee" "Edmond Suvee"]]
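For readers new to Datalog, the second query can be read as a join: a variable that appears in several clauses must take the same value in each of them. The sketch below evaluates that query by hand with nested loops over an in-memory list of facts. `Triple` and `ParentQuery` are hypothetical names for this illustration, the entity ids are made up, and a real query engine would use indexes rather than nested loops.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical entity-attribute-value tuple for this sketch; not a Datomic class.
record Triple(long entity, String attribute, Object value) {}

class ParentQuery {
    // Hand-written evaluation of:
    //   [?person :person/name ?name]
    //   [?person :person/parent ?parent]
    //   [?parent :person/name ?parentname]
    static Set<List<String>> run(List<Triple> facts) {
        Set<List<String>> results = new LinkedHashSet<>();
        for (Triple name : facts) {
            if (!name.attribute().equals(":person/name")) continue;
            for (Triple parent : facts) {
                if (!parent.attribute().equals(":person/parent")) continue;
                // Shared variable ?person: both clauses must bind the same entity.
                if (parent.entity() != name.entity()) continue;
                for (Triple parentName : facts) {
                    if (!parentName.attribute().equals(":person/name")) continue;
                    // Shared variable ?parent: the reference must point at this entity.
                    if (!parent.value().equals(parentName.entity())) continue;
                    results.add(List.of((String) name.value(), (String) parentName.value()));
                }
            }
        }
        return results;
    }
}
```

Fed with the three persons from section 1, this yields the same two name pairs as the result of query 2.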
// Define a grandparent rule once and pass it to the query through the % input
String grandparentrule = "[ [ (grandparent ?person ?grandparent) [?person :person/parent ?parent] " +
"[?parent :person/parent ?grandparent] ] ]";
System.out.println(Peer.q("[:find ?name ?grandparentname " +
":in $ % " +
":where [?person :person/name ?name] " +
"(grandparent ?person ?grandparent) " +
"[?grandparent :person/name ?grandparentname] ]" , conn.db(), grandparentrule));
// Define a recursive ancestor rule: a parent is an ancestor, and so is any ancestor of a parent
String ancestorrule = "[ [ (ancestor ?person ?ancestor) [?person :person/parent ?ancestor] ] " +
"[ (ancestor ?person ?ancestor) [?person :person/parent ?parent] " +
"(ancestor ?parent ?ancestor) ] ]";
System.out.println(Peer.q("[:find ?name ?ancestorname " +
":in $ % " +
":where [?person :person/name ?name] " +
"(ancestor ?person ?ancestor) " +
"[?ancestor :person/name ?ancestorname] ]" , conn.db(), ancestorrule));
Result query 4: [["Gilbert Suvee" "Edmond Suvee"], ["Davy Suvee" "Edmond Suvee"], ["Davy Suvee" "Gilbert Suvee"]]
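The recursive ancestor rule has two branches: a direct parent is an ancestor, and any ancestor of a parent is an ancestor too. One way to picture how such a rule is evaluated is a fixpoint loop that keeps applying the recursive branch until no new facts appear. The sketch below does this in plain Java over a map of names; `AncestorRule` is a hypothetical helper for this illustration and says nothing about how Datomic evaluates rules internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AncestorRule {
    // Input: person -> direct parents. Output: person -> all ancestors.
    static Map<String, Set<String>> ancestors(Map<String, Set<String>> parents) {
        Map<String, Set<String>> result = new HashMap<>();
        // Base branch: every direct parent is an ancestor.
        for (Map.Entry<String, Set<String>> entry : parents.entrySet()) {
            result.put(entry.getKey(), new TreeSet<>(entry.getValue()));
        }
        // Recursive branch: inherit the ancestors of each parent until nothing changes.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, Set<String>> entry : parents.entrySet()) {
                Set<String> known = result.get(entry.getKey());
                for (String parent : entry.getValue()) {
                    Set<String> inherited = result.getOrDefault(parent, Set.of());
                    if (known.addAll(inherited)) {
                        changed = true;
                    }
                }
            }
        }
        return result;
    }
}
```

For the three persons in this article, the loop derives that Davy has both Gilbert and Edmond as ancestors, which matches the pairs in the query result above.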
3. Back To The Future I
As already mentioned in section 1, Datomic does not perform in-place updates. Instead, all facts are stored and tagged with a transaction, such that the most up-to-date value of a particular entity attribute can be retrieved. By doing so, Datomic allows you to travel back in time and perform queries against previous states of the database. Using the asOf method, one can retrieve a version of the database that only contains the facts that were part of the database at that particular moment in time. Using a checkpoint that predates the storage of my own person entity yields parent-query results that no longer contain results related to myself.
System.out.println(Peer.q("[:find ?name ?parentname " +
":where [?person :person/name ?name] " +
"[?person :person/parent ?parent] " +
"[?parent :person/name ?parentname] ]", conn.db().asOf(getCheckPoint(checkpoint))));
Result query 2: [["Gilbert Suvee" "Edmond Suvee"]]
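Conceptually, an asOf view is a filter over transaction-tagged facts: everything recorded after the chosen point is hidden, and nothing is deleted. The plain-Java sketch below models that idea. `TimedFact` and `AsOfView` are hypothetical names, the transaction numbers are invented, and the example assumes that my own entity was stored in a later transaction than those of my father and grandfather.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical fact type for this sketch; tx stands in for the transaction identifier.
record TimedFact(long entity, String attribute, Object value, long tx) {}

class AsOfView {
    // Keep only the facts recorded at or before the given transaction.
    static List<TimedFact> asOf(List<TimedFact> allFacts, long tx) {
        return allFacts.stream()
                .filter(fact -> fact.tx() <= tx)
                .collect(Collectors.toList());
    }
}
```

Running the parent query against such a filtered list would only see Gilbert and Edmond, because the facts about Davy carry a later transaction number.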
4. Back To The Future II
Datomic also allows you to predict the future. Well, sort of … Similar to the asOf method, one can use the with method to retrieve a version of the database that is extended with a list of not-yet-transacted datoms. This allows you to run queries against future states of the database and to observe the implications if these new facts were to be added.
List tx = new ArrayList();
tx.add(Util.map(":db/id", Peer.tempid(":db.part/user"),
":person/name", "FutureChild Suvee",
":person/parent", Peer.q("[:find ?person :where [?person :person/name \"Davy Suvee\"] ]", conn.db()).iterator().next().get(0)));
System.out.println(Peer.q("[:find ?name ?ancestorname " +
":in $ % " +
":where [?person :person/name ?name] " +
"(ancestor ?person ?ancestor) " +
"[?ancestor :person/name ?ancestorname] ]" , conn.db().with(tx), ancestorrule));
Result query 4: [["FutureChild Suvee" "Edmond Suvee"], ["FutureChild Suvee" "Gilbert Suvee"],
["Gilbert Suvee" "Edmond Suvee"], ["Davy Suvee" "Edmond Suvee"],
["Davy Suvee" "Gilbert Suvee"], ["FutureChild Suvee" "Davy Suvee"]]
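The key property of a speculative view is that the stored facts stay untouched while the query sees the stored and the proposed facts together. A minimal plain-Java model of that property is sketched below. `SpeculativeView` is a hypothetical helper written for this illustration, and facts are represented as simple strings for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SpeculativeView {
    // Return the stored facts plus the proposed ones, leaving the stored list untouched.
    static <T> List<T> with(List<T> stored, List<T> proposed) {
        List<T> view = new ArrayList<>(stored);
        view.addAll(proposed);
        return Collections.unmodifiableList(view);
    }
}
```

Any query run against the returned list observes the proposed facts, while a query against the original list still sees only what was actually stored.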
5. Conclusion
The use of datoms and Datalog allows you to express simple, yet powerful queries. This article introduces only a fraction of the features offered by Datomic. To get better acquainted with the various Datomic gotchas, I implemented the Tinkerpop Blueprints API on top of Datomic. By doing so, you basically get a distributed, temporal graph database, which is, as far as I know, unique within the graph database ecosystem. The source code of this Blueprints implementation can currently be found on the Datablend public GitHub repository and will soon be merged into the Tinkerpop project.
Published at DZone with permission of Davy Suvee, DZone MVB. See the original article here.