Aggregation framework as stored-procedure in Azure DocumentDB


The bad news: DocumentDB does not include aggregation capability.

The good news: DocumentDB includes stored procedures, and documentdb-lumenize uses them to add aggregation capability that goes well beyond what you are used to with SQL.

As Michael Stonebraker, genius creator of not one but three wildly successful databases (Postgres, Vertica, and VoltDB), said, "It's better to move the code to the data than the other way around." VoltDB was designed to execute ACID transactional stored procedures written in Java and running in the same memory space as the data, with huge performance and consistency benefits. Microsoft Azure's DocumentDB takes a similar approach, except that it's a NoSQL data store using JSON, and you write your stored procedures in JavaScript (or CoffeeScript in my case).

Using this power, I've ported the aggregation engine of the Lumenize library, which I created while working on my PhD at Carnegie Mellon, to run inside DocumentDB. This instantly upgrades DocumentDB with declarative aggregation capability (including full OLAP cubes) that is more powerful than what even the most advanced databases offer.

A simple groupBy example

Let's assume this is the only data in your collection.

  {id: 1, value: 10}
  {id: 1, value: 100}
  {id: 2, value: 20}
  {id: 3, value: 30}

Now, let's call the cube with the following:

{cubeConfig: {groupBy: 'id', field: "value", f: "sum"}}

After you call the cube stored procedure, you should expect this to be in the savedCube.cellsAsCSVStyleArray parameter of the response. Note that the _count metric is always calculated, even when not specified.

  [ 'id', '_count', 'value_sum' ],
  [   1,         2,         110 ],
  [   2,         1,          20 ],
  [   3,         1,          30 ]
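To make the expected result concrete, here is a minimal plain-JavaScript sketch of the same sum-by-group computation run locally on the four documents above. It is not the documentdb-lumenize implementation and does not call DocumentDB; the helper name groupBySum is an assumption made up for this illustration.

```javascript
// Hypothetical helper, NOT part of documentdb-lumenize: it reproduces the
// sum-by-group result shown above with plain JavaScript.
function groupBySum(docs, groupByField, valueField) {
  // group key -> running {count, sum}; a Map preserves insertion order
  const cells = new Map();
  for (const doc of docs) {
    const key = doc[groupByField];
    const cell = cells.get(key) || { count: 0, sum: 0 };
    cell.count += 1;
    cell.sum += doc[valueField];
    cells.set(key, cell);
  }
  // Emit a CSV-style array: a header row followed by one row per group.
  const rows = [[groupByField, '_count', valueField + '_sum']];
  for (const [key, cell] of cells) {
    rows.push([key, cell.count, cell.sum]);
  }
  return rows;
}

// The same four documents used in the example above.
const docs = [
  { id: 1, value: 10 },
  { id: 1, value: 100 },
  { id: 2, value: 20 },
  { id: 3, value: 30 },
];

const rows = groupBySum(docs, 'id', 'value');
console.log(rows);
```

The rows come out in the same header-plus-data layout as the table above, with _count included alongside the requested sum.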

Full OLAP cube capability

A groupBy is just a one-dimensional OLAP cube, and the example above uses a bit of syntactic sugar to configure that one-dimensional cube quickly. The underlying engine, however, is a fast, light, flexible, declaratively configured OLAP cube with powerful hierarchical rollup support. It can be used from Node.js projects, .NET projects, or any other platform using DocumentDB's REST API.
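To illustrate how a groupBy generalizes to more dimensions, here is a small plain-JavaScript sketch of a two-dimensional cube with rollup totals. It is only a conceptual illustration: the function buildCube, the field names, and the sample facts are all invented for this sketch, it does not use the documentdb-lumenize configuration format, and it shows only flat "all values" totals rather than the hierarchical rollups the real engine supports.

```javascript
// Hypothetical sketch, NOT the documentdb-lumenize engine: a tiny in-memory
// cube. A null coordinate means "all values" for that dimension (a rollup).
function buildCube(facts, dimensions, metricField) {
  const cells = new Map();
  const combos = 1 << dimensions.length; // 2^n value-or-rollup combinations
  for (const fact of facts) {
    // Each fact contributes to every combination of "actual value" or
    // "rolled up" (null) across the dimensions.
    for (let mask = 0; mask < combos; mask++) {
      const coords = dimensions.map((d, i) => (mask & (1 << i) ? null : fact[d]));
      const key = JSON.stringify(coords);
      const cell = cells.get(key) || { coords, _count: 0, sum: 0 };
      cell._count += 1;
      cell.sum += fact[metricField];
      cells.set(key, cell);
    }
  }
  return cells;
}

// Made-up sample facts, used only for this illustration.
const facts = [
  { team: 'A', priority: 'high', points: 5 },
  { team: 'A', priority: 'low', points: 3 },
  { team: 'B', priority: 'high', points: 8 },
];

const cube = buildCube(facts, ['team', 'priority'], 'points');
const getCell = (team, priority) => cube.get(JSON.stringify([team, priority]));

console.log(getCell('A', 'high').sum); // one specific cell
console.log(getCell('A', null).sum);   // team A across all priorities
console.log(getCell(null, 'high').sum); // high priority across all teams
console.log(getCell(null, null).sum);  // grand total
```

With a single dimension and no rollup cells, this reduces to the groupBy shown earlier, which is why a groupBy can be described as a one-dimensional cube.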

You can read all about this capability as well as all the details on how to use it on the GitHub page for documentdb-lumenize.

This is still a work in progress, so please give me your feedback.


Published at DZone with permission of Larry Maccherone. See the original article here.
