Over a million developers have joined DZone.

The Guts 'n' Glory of Database Internals: Early Lock Release

DZone's Guide to

The Guts 'n' Glory of Database Internals: Early Lock Release

Learn a nifty trick that leads to (and become aware of the problems that might arise from) parallelizing your writes — the early lock release.

· Database Zone ·
Free Resource

Built by the engineers behind Netezza and the technology behind Amazon Redshift, AnzoGraph is a native, Massively Parallel Processing (MPP) distributed Graph OLAP (GOLAP) database that executes queries more than 100x faster than other vendors.  

After talking about transaction merging, there is another kind of database trick that you can use to optimize your performance even further. It is called early lock release. With early lock release, we rely on the fact that the client will only know when the transaction has been committed after we told him. And that we don’t have to tell it right away.

That sounds strange, how can not telling the client that its transaction has been committed improve performance? Well, it improves throughput, but it can also improve the performance. But before we get into the details, let's see the code, then talk about what it means:

def MergeTransactionThreadProc():
    while true:
        buffer = Buffer()
        result = DequeueOperation()
        if result.Success is false:
        buffer.Write(result.Operation, result.Notification)
        max = time.time() + 1
        while time.time() < max:
            result = DequeueOperation()
            if result.Success is false:
            buffer.Write(result.Operation, result.Notification)

        asyncCallback = buffer.NotifyAllOperationsAboutSuccessfulJournalSync
        journal.SyncBufferAsync(buffer, asyncCallback)

The code is nearly identical to the one in the previous post, but unlike the previous post, here we play a little game. Instead of flushing the journal synchronously, we are going to do this in an async manner. And what's more important, we don’t it to complete. Why is that important?

It is important because notifying the caller that the transaction has been completed has now moved into the async callback, and we can start processing additional operations to write them to the journal file at the same time that the I/O for the previous operation completes.

As far as the client is concerned, it doesn’t matter how it works, it just needs to get the confirmation that it has been successfully committed. But from the system resources' points of view, it means that we can parallelize a key aspect of the code, and we can proceed with the next set of transactions  before the previous one even hit the disk.

There are some issues to consider. Typically, you have only a single pending write operation because if the previous journal write had an error, you need to abort all future transactions that relied on its in-memory state (effectively, roll back that transaction and any speculative transactions we executed assuming we can commit that transaction to disk). Another issue is that error handling is more complex — if a single part of the merge transaction failed, it will fail in unrelated operations, so you need to unmerge the transaction, run them individually, and report on each operation individually.

Download AnzoGraph now and find out for yourself why it is acknowledged as the most complete all-in-one data warehouse for BI style and graph analytics.  

memory ,early lock release ,async ,database ,merge ,state ,transaction

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}