Last time, we got a general overview of DynamoDB. This time, let's look at some advanced features.
All the features described so far are core DynamoDB features. But DynamoDB also has some additional, more advanced features that you can use to build complex applications. In this section, I’ll provide a short list of what these features are and what you can use them for.
Conditional Updates

A common problem in distributed systems is different actors stepping on each other's toes while performing operations in parallel. A common example is two processes trying to update the same item in a database: the second update can silently overwrite data written by the first.
To solve this problem, DynamoDB allows you to specify a condition for an update. If the condition is satisfied, the new data is written; otherwise, DynamoDB returns an error.
A common way to use this feature, which is how DynamoDBMapper implements it, is to maintain a version field on an item and increment it on every update. If the version has not changed since it was read, a process can write its new data; otherwise, it has to re-read the item and try the operation again.
This technique is also called optimistic locking and is similar to the compare-and-swap operation that is used to implement lock-free data structures.
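The version-check pattern can be sketched as follows. This is a minimal in-memory simulation of the conditional-write semantics, not the real DynamoDB API; the class, table, and field names are all illustrative:

```python
class ConditionalWriteFailed(Exception):
    """Raised when the expected version no longer matches, playing the
    role of DynamoDB's conditional-check failure."""

class Table:
    """Toy stand-in for a DynamoDB table with a conditional put."""

    def __init__(self):
        self.items = {}  # key -> item dict; each item carries a "version"

    def put_if_version(self, key, item, expected_version):
        current = self.items.get(key)
        current_version = current["version"] if current else 0
        if current_version != expected_version:
            raise ConditionalWriteFailed  # another writer got there first
        self.items[key] = {**item, "version": expected_version + 1}

def update_with_retry(table, key, mutate, max_attempts=5):
    """Optimistic-locking loop: read, modify, conditionally write, retry."""
    for _ in range(max_attempts):
        current = table.items.get(key, {"version": 0})
        new_item = mutate(dict(current))
        try:
            table.put_if_version(key, new_item, current["version"])
            return new_item
        except ConditionalWriteFailed:
            continue  # lost the race: re-read and try again
    raise RuntimeError("gave up after repeated write conflicts")
```

With the real service, the same effect is achieved by attaching a condition on the version attribute to the write request; the retry loop around it looks just like the one above.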
Transactions

While DynamoDB does not have built-in support for transactions, AWS has implemented a Java library that provides transactions on top of existing DynamoDB features. A detailed design is described here, but in a nutshell: when you perform an operation through the transaction library, it stores the list of operations in a separate table and, on commit, applies the stored operations.
Since the library is open source, nothing prevents developers from implementing a similar solution for other languages, but at the moment DynamoDB transaction support seems to be implemented only for Java.
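In very rough terms, the store-then-apply idea can be sketched like this. It is an in-memory toy illustrating the concept, not the actual library's protocol (which also handles locking, rollback, and recovery):

```python
import uuid

class ToyTransactionManager:
    """Sketch of write-ahead transactions: operations are buffered in a
    separate 'transactions table' and applied to the data table on commit."""

    def __init__(self):
        self.tx_table = {}    # tx_id -> list of pending operations
        self.data_table = {}  # stands in for the real table

    def begin(self):
        tx_id = str(uuid.uuid4())
        self.tx_table[tx_id] = []
        return tx_id

    def put(self, tx_id, key, item):
        # Record the operation; nothing touches the data table yet.
        self.tx_table[tx_id].append(("put", key, item))

    def commit(self, tx_id):
        # Replay the stored operations, then drop the transaction record.
        for op, key, item in self.tx_table.pop(tx_id):
            if op == "put":
                self.data_table[key] = item
```

Until `commit` runs, readers of the data table see none of the transaction's writes, which is the property the separate operations table buys you.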
Time to Live
Not all data that you store in DynamoDB should live forever. Older data can be moved to cheaper storage, such as S3, or simply removed. To delete old data automatically, DynamoDB implements a time-to-live feature: you designate an attribute that stores the timestamp at which an item expires, and DynamoDB tracks expired items and removes them at no extra cost.
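The TTL attribute is just a regular numeric attribute holding a Unix epoch timestamp in seconds. A small sketch (the attribute and item names are illustrative, and the item is shown as a plain dict rather than a real API call):

```python
import time

# Illustrative name: you register whichever attribute you like as the
# TTL attribute in the table's settings.
TTL_ATTRIBUTE = "expires_at"

def with_ttl(item, ttl_seconds):
    """Return a copy of an item carrying an expiry timestamp as a Unix
    epoch second; DynamoDB deletes the item some time after it passes."""
    return {**item, TTL_ATTRIBUTE: int(time.time()) + ttl_seconds}

# E.g., a session item that should disappear about an hour after writing:
session = with_ttl({"pk": "session#42", "user": "alice"}, ttl_seconds=3600)
```

Note that deletion is not instantaneous: DynamoDB removes expired items in the background, so reads shortly after expiry may still see them.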
DynamoDB Streams

Another powerful DynamoDB feature is DynamoDB Streams. When enabled, it lets you read an immutable, ordered stream of updates to a DynamoDB table. A record is written to the stream after each update, which allows you to react to changes in your data. This is a crucial feature if you need to implement one of the following use cases:
- Cross-region replication: You may want to store a copy of your data in a separate region, to keep it close to your users or to have a backup. To implement this, you can read a DynamoDB stream and replay the update operations in a second database.
- Aggregated table: The DynamoDB model might not suit some of the queries you need to perform. For example, DynamoDB does not support grouping data out of the box. To implement it, you can read the update stream and maintain an aggregated table that fits the DynamoDB model and allows efficient queries.
- Keeping data in sync: In many cases, you need to maintain a copy of your data in a different datastore, such as a cache or CloudSearch. DynamoDB Streams is an immense help here, since a record appears in the stream only after the data has actually been stored in DynamoDB.
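As an illustration, the aggregated-table case can be sketched as a function that folds stream records into per-group counts. The record shape below is simplified: real DynamoDB Streams records do carry `eventName` and `dynamodb` fields with `NewImage`/`OldImage`, but the attribute values in them are wrapped in type descriptors, which this sketch omits:

```python
from collections import Counter

def apply_stream_records(records, counts_by_group):
    """Fold stream-style records into an aggregated table of per-group
    item counts — a group-by that DynamoDB cannot do natively.
    (Simplified: a real consumer would also handle MODIFY events where
    an item moves between groups.)"""
    for record in records:
        event = record["eventName"]  # INSERT / MODIFY / REMOVE
        images = record["dynamodb"]
        if event == "INSERT":
            counts_by_group[images["NewImage"]["group"]] += 1
        elif event == "REMOVE":
            counts_by_group[images["OldImage"]["group"]] -= 1
    return counts_by_group
```

Because the stream is ordered and every record corresponds to an acknowledged write, replaying it like this keeps the aggregated table consistent with the source table.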
The DynamoDB Streams implementation is very similar to that of another AWS service, Kinesis. They have similar APIs, and to read data from either system you can use the Kinesis Client Library (KCL), which provides a high-level interface for consuming a Kinesis or DynamoDB stream. Keep in mind that the DynamoDB Streams API and the Kinesis API are slightly different, so you need an adapter to use KCL with DynamoDB Streams.
DynamoDB Accelerator (DAX)
As with every other database, it can be beneficial to put a cache in front of DynamoDB. Unfortunately, maintaining cache consistency is tricky. If you use a caching solution like ElastiCache, it may be a good idea to use a DynamoDB stream to maintain the copy of your data in the cache, but DynamoDB has recently introduced a feature built for exactly this: DynamoDB Accelerator (DAX).
DAX is a write-through caching layer for DynamoDB. It exposes exactly the same API as DynamoDB, and once it is enabled, you read and write to DynamoDB through it. DAX keeps track of what is written to DynamoDB and caches data only after the write has been acknowledged by DynamoDB.
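The write-through behavior can be sketched as follows. This is an in-memory stand-in for both the cache and the table; DAX itself is a managed service you talk to through a DynamoDB-compatible client, not code you write:

```python
class Store:
    """Stands in for the DynamoDB table."""
    def __init__(self):
        self.items = {}
    def put(self, key, item):
        self.items[key] = item
    def get(self, key):
        return self.items.get(key)

class WriteThroughCache:
    """Sketch of a write-through cache: writes go to the backing store
    first and are cached only once the store acknowledges them."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def put(self, key, item):
        self.store.put(key, item)  # if this raises, nothing is cached
        self.cache[key] = item     # cache only acknowledged writes

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: the fast path
        item = self.store.get(key)
        if item is not None:
            self.cache[key] = item  # populate on read miss
        return item
```

The key design property is ordering: because the cache is updated only after the store acknowledges the write, a cache hit can never return data the backing table rejected.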
One of the key benefits of using DAX is its sub-millisecond latency, which may be especially important if you have strict SLAs.
DynamoDB is a great database. It has a rich feature set, predictable low latency, and almost no operational load. The key to using it efficiently is to understand its data model and to check whether your data fits it. Remember: to achieve stellar performance, use queries as much as possible and avoid scan operations.