Now that we have covered transaction merging, there is another database trick that you can use to optimize your performance even further. It is called early lock release. With early lock release, we rely on the fact that the client only knows that its transaction has been committed once we tell it so, and that we don’t have to tell it right away.
That sounds strange. How can delaying the commit notification to the client improve performance? Well, it improves throughput, and it can also reduce the latency of individual transactions. But before we get into the details, let's see the code, then talk about what it means:
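    import time

    def MergeTransactionThreadProc():
        while True:
            buffer = Buffer()
            result = DequeueOperation()
            if not result.Success:
                # Nothing is queued, so wait until more operations arrive.
                WaitForAdditionalOperations()
                continue
            buffer.Write(result.Operation, result.Notification)

            # Keep merging operations for up to one second,
            # or until the queue runs dry.
            deadline = time.time() + 1
            while time.time() < deadline:
                result = DequeueOperation()
                if not result.Success:
                    break
                buffer.Write(result.Operation, result.Notification)

            # Start the journal sync but do not wait for it. The callers
            # are notified from the callback once the sync completes.
            asyncCallback = buffer.NotifyAllOperationsAboutSuccessfulJournalSync
            journal.SyncBufferAsync(buffer, asyncCallback)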
The code is nearly identical to the one in the previous post, but here we play a little game. Instead of flushing the journal synchronously, we are going to do it in an async manner. And what's more important, we don’t wait for it to complete. Why is that important?
It is important because notifying the caller that the transaction has been completed has now moved into the async callback, and we can start processing additional operations to write them to the journal file at the same time that the I/O for the previous operation completes.
As far as the client is concerned, it doesn’t matter how it works, it just needs to get the confirmation that its transaction has been successfully committed. But from the point of view of system resources, it means that we can parallelize a key aspect of the code, and we can proceed with the next set of transactions before the previous one has even hit the disk.
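To make the overlap concrete, here is a minimal runnable sketch of the same idea. It is not the real implementation: the `Journal` class, its `sync_async` method, the `notify_all` helper, and the timings are all made up for illustration, and the "disk flush" is just a short sleep on a background thread. Each operation carries a `Future` that stands in for the client notification.

```python
import queue
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor, wait


class Journal:
    """Hypothetical stand-in for a journal; the flush is simulated."""

    def __init__(self):
        self._io = ThreadPoolExecutor(max_workers=1)
        self.synced = []  # batches that have "hit the disk", in order

    def _sync(self, batch):
        time.sleep(0.05)  # pretend this is the disk flush
        self.synced.append([op for op, _ in batch])

    def sync_async(self, batch):
        # Returns immediately; the flush runs on the I/O thread.
        return self._io.submit(self._sync, batch)

    def close(self):
        self._io.shutdown(wait=True)


def notify_all(batch, sync_future):
    # Runs only after the flush finished: now the clients get told.
    error = sync_future.exception()
    for _, done in batch:
        if error is None:
            done.set_result(True)
        else:
            done.set_exception(error)


def merge_transaction_thread(ops, journal, stop):
    pending = None  # at most one journal write in flight
    while not stop.is_set():
        try:
            batch = [ops.get(timeout=0.05)]
        except queue.Empty:
            continue
        # Merge whatever else arrives within a short window.
        deadline = time.monotonic() + 0.02
        while time.monotonic() < deadline:
            try:
                batch.append(ops.get_nowait())
            except queue.Empty:
                break
        # This batch was built while the previous flush was running.
        # Only now do we wait for that flush before starting ours.
        if pending is not None:
            wait([pending])
        pending = journal.sync_async(batch)
        pending.add_done_callback(lambda f, b=batch: notify_all(b, f))


def run_demo(n=5):
    ops, journal, stop = queue.Queue(), Journal(), threading.Event()
    worker = threading.Thread(
        target=merge_transaction_thread, args=(ops, journal, stop)
    )
    worker.start()
    notifications = []
    for i in range(n):
        done = Future()  # stands in for the client notification
        ops.put((f"op-{i}", done))
        notifications.append(done)
    results = [f.result(timeout=5) for f in notifications]
    stop.set()
    worker.join()
    journal.close()
    return results, journal.synced
```

Calling `run_demo()` returns once every simulated client has been notified. The merge thread never blocks on the flush of the batch it just submitted; it only waits for the previous flush right before it starts the next one.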
There are some issues to consider. Typically, you have only a single pending write operation, because if the previous journal write had an error, you need to abort all future transactions that relied on its in-memory state (effectively, roll back that transaction and any speculative transactions we executed on the assumption that it would be committed to disk). Another issue is that error handling is more complex. If a single operation in a merged transaction fails, it also fails the unrelated operations that were merged with it, so you need to unmerge the transaction, run the operations individually, and report on each one individually.
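The unmerge step can be sketched roughly like this. The names `commit_merged` and `demo_commit` are hypothetical, and the sketch assumes that the commit function is all-or-nothing, so a failed merged commit leaves nothing behind that needs to be undone before retrying.

```python
def commit_merged(operations, commit_atomically):
    """Try to commit all operations as one merged transaction.

    If the merged commit fails, fall back to committing each operation
    on its own so that only the offending ones are reported as failed.
    Returns a list of (operation, error) pairs; error is None on success.
    """
    try:
        commit_atomically(operations)
        return [(op, None) for op in operations]
    except Exception:
        pass  # assumed to have rolled back; retry one by one below

    outcomes = []
    for op in operations:
        try:
            commit_atomically([op])
            outcomes.append((op, None))
        except Exception as error:
            outcomes.append((op, error))
    return outcomes


def demo_commit(operations):
    # Hypothetical commit function: rejects any batch containing "bad".
    if any(op == "bad" for op in operations):
        raise ValueError("cannot apply 'bad'")
```

With `commit_merged(["a", "bad", "b"], demo_commit)`, the merged attempt fails, the operations are retried one at a time, and only `"bad"` is reported with an error. The cost is that the good operations pay for a second commit attempt, which is acceptable as long as failures are rare.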