Practical PHP Patterns: Optimistic Offline Lock
In this series we are now entering the realm of concurrency, which adds complexity to an application because many different threads of execution access the state storage at the same time.
There is no native multithreading support in PHP (every script runs in its own isolated process), but concurrency can still easily become an issue: multiple clients from all over the world continuously make requests to PHP applications, and they can easily overwrite each other's changes.
A classic example of a race condition in the PHP world is two different clients filling in an editing form for the same entity. Both will submit the form once it is complete, and the first one will have their changes overwritten by the second request. There are many other common situations where concurrent users can provoke errors in the system. Just think of different people choosing the same username and being told via Ajax that it is available; the slower of them will be surprised when submitting the registration form.
Background
Before explaining the patterns that have emerged to solve concurrency problems, we need some definitions. First of all, we need the notion of a transaction: we define a transaction as a change of state of the application. Editing an entity is a transaction; adding or deleting one is a transaction too.
The first kind of transaction we are interested in is the database transaction, which is carried out entirely within one PHP script. It is usually enforced automatically by mechanisms supported at the database level.
The other kind is the business transaction, which spans multiple HTTP requests and makes use of one to N database transactions. It comprises checking out data, populating a form or another kind of rich user interface, modifying or adding data (a human-based action), and sending it back to the server.
There is no automatic enforcement for business transactions, since they are defined by the business rules of the domain. This problem does not originate from the nature of PHP, but from the separation between client and server on which the web is based.
Optimistic lock
The Optimistic Offline Lock pattern is a way of ensuring the integrity of data by preventing different clients from committing conflicting changes.
As the name suggests, it assumes that the chances of conflict are low. Indeed, when this is the case the optimistic lock does not slow down the user interaction at all.
The goal of this pattern is to detect a conflicting change and, instead of applying it, roll back the business transaction and present an error to the user. It accomplishes this goal by validating that no one else has modified a record in the data source before allowing the modification to be committed.
Version control systems such as Subversion and Git work in an optimistic way by default: anyone may check out a source file and work on it, ending their little fork later with a commit or push. The pain comes while merging, so you are supposed to integrate often. We also borrow terminology from source control systems, so in this article you'll encounter terms like commit, checkout, and merge.
Implementation
The most common implementation of Optimistic Offline Lock is a numerical version field on the record to protect from concurrency issues; to aid rollback notifications, additional fields are useful for signaling the conflict, such as the id of the last user who modified the record.
The pattern's inner workings are not complex: the data is submitted along with the version value kept on the client, and that value must match the one currently stored in the database. Only then is the version incremented and the changeset committed.
Encountering a different version value in the database record means someone else has modified the data between our checkout and our commit, so their changes must be preserved. For example, we can show a diff to the user, as a VCS does; in any case, we should interrupt the transaction.
Relational databases and ORMs can support this pattern simply with an additional column on the table where the root object is stored.
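The version check described above can be folded into a single UPDATE statement. The following is a minimal sketch, assuming PDO with the SQLite driver and a hypothetical articles table; the table, the column names and the updateTitle() function are illustrative and not part of any library.

```php
<?php
declare(strict_types=1);

// Hypothetical schema for illustration: articles(id, title, version).
function createDemoDatabase(): PDO
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL, version INTEGER NOT NULL)');
    $pdo->exec("INSERT INTO articles (id, title, version) VALUES (1, 'Draft', 1)");
    return $pdo;
}

/**
 * Commits the new title only if the row still carries the version the
 * client checked out. Returns false when someone else committed first.
 */
function updateTitle(PDO $pdo, int $id, string $newTitle, int $checkedOutVersion): bool
{
    $stmt = $pdo->prepare(
        'UPDATE articles SET title = :title, version = version + 1
         WHERE id = :id AND version = :version'
    );
    $stmt->bindValue(':title', $newTitle, PDO::PARAM_STR);
    $stmt->bindValue(':id', $id, PDO::PARAM_INT);
    $stmt->bindValue(':version', $checkedOutVersion, PDO::PARAM_INT);
    $stmt->execute();

    // Zero affected rows: the version no longer matches (or the row is gone).
    return $stmt->rowCount() === 1;
}

$pdo = createDemoDatabase();

// Two clients have both checked out version 1 of the same record.
$firstCommit  = updateTitle($pdo, 1, 'Edited by Alice', 1); // version matches, commit goes through
$secondCommit = updateTitle($pdo, 1, 'Edited by Bob', 1);   // stale version, commit is rejected

var_dump($firstCommit, $secondCommit);
```

Doing the comparison and the increment in the same statement avoids a separate SELECT followed by an UPDATE, which would leave a small window for another request to slip in between the two.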
Alternatives
An alternative implementation puts all the fields in the WHERE clause of the UPDATE statement, comparing each one with the value that was checked out. Variants compare only the sensitive fields, or only the modified ones, so that transactions affecting different fields succeed when the business logic allows it (see the domain considerations below).
This solution is handy when you can't add a version field, but it may have a performance impact.
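As a rough sketch of the "only the modified fields" variant, the function below builds the WHERE clause from the values the client originally checked out. It assumes PDO with the SQLite driver; updateIfUnchanged() and the users table are hypothetical names chosen for this example.

```php
<?php
declare(strict_types=1);

/**
 * Applies $changes only if each modified column still holds the value from
 * $original (the data the client checked out). Table and column names are
 * interpolated into the SQL, so they must come from trusted code.
 * Note: "column = NULL" never matches in SQL, so NULL originals would need
 * a null-safe comparison; that case is left out of this sketch.
 */
function updateIfUnchanged(PDO $pdo, string $table, int $id, array $original, array $changes): bool
{
    if ($changes === []) {
        throw new InvalidArgumentException('Nothing to update.');
    }

    $set = [];
    $where = ['id = :id'];
    $params = [':id' => $id];
    foreach ($changes as $column => $newValue) {
        $set[] = "$column = :new_$column";
        $where[] = "$column = :old_$column";
        $params[":new_$column"] = $newValue;
        $params[":old_$column"] = $original[$column];
    }

    $sql = sprintf('UPDATE %s SET %s WHERE %s', $table, implode(', ', $set), implode(' AND ', $where));
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);

    return $stmt->rowCount() === 1;
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, phone TEXT)');
$pdo->exec("INSERT INTO users (id, email, phone) VALUES (1, 'a@example.com', '555-0100')");

// Every client below checked out the same original data.
$original = ['email' => 'a@example.com', 'phone' => '555-0100'];

$emailChanged  = updateIfUnchanged($pdo, 'users', 1, $original, ['email' => 'b@example.com']);
$phoneChanged  = updateIfUnchanged($pdo, 'users', 1, $original, ['phone' => '555-0199']); // different field, still allowed
$emailConflict = updateIfUnchanged($pdo, 'users', 1, $original, ['email' => 'c@example.com']); // email already changed

var_dump($emailChanged, $phoneChanged, $emailConflict);
```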
Another alternative is to check conditions instead of version fields, which suits other use cases.
For example, we can check that a record exists before deleting it. This is already done indirectly, to a certain extent, by ORMs and other abstraction layers when they provide you with an object on which you can call a delete() method.
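A condition check of this kind might look like the sketch below, which treats a DELETE that affects no rows as a conflict. It assumes PDO with the SQLite driver; the comments table, RecordGoneException and deleteComment() are hypothetical names for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical exception type, so callers can tell a conflict from a database error.
class RecordGoneException extends RuntimeException
{
}

/**
 * Deletes a comment and signals a conflict if it was already gone,
 * for example because another user deleted it in the meantime.
 */
function deleteComment(PDO $pdo, int $id): void
{
    $stmt = $pdo->prepare('DELETE FROM comments WHERE id = :id');
    $stmt->bindValue(':id', $id, PDO::PARAM_INT);
    $stmt->execute();

    if ($stmt->rowCount() === 0) {
        throw new RecordGoneException("Comment $id no longer exists.");
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT NOT NULL)');
$pdo->exec("INSERT INTO comments (id, body) VALUES (1, 'First!')");

deleteComment($pdo, 1); // the first request deletes the record

$conflictDetected = false;
try {
    deleteComment($pdo, 1); // a second, concurrent request finds nothing to delete
} catch (RecordGoneException $e) {
    $conflictDetected = true;
}

var_dump($conflictDetected);
```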
An extension of this pattern is a check, available at any time during the editing of the data, that the current edit will (probably) commit. This check verifies that the checked-out data is still current.
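Such a check could be exposed, for instance, through an Ajax endpoint that the editing form polls. The sketch below assumes a version column like the one described earlier and PDO with the SQLite driver; isCheckoutCurrent() and the articles table are hypothetical. A positive answer is only a hint, since another commit may still arrive before ours.

```php
<?php
declare(strict_types=1);

/**
 * Tells whether the version the client checked out is still the one stored.
 * Returns false if the record changed or no longer exists.
 */
function isCheckoutCurrent(PDO $pdo, int $id, int $checkedOutVersion): bool
{
    $stmt = $pdo->prepare('SELECT version FROM articles WHERE id = :id');
    $stmt->bindValue(':id', $id, PDO::PARAM_INT);
    $stmt->execute();
    $current = $stmt->fetchColumn(); // false when the row does not exist

    return $current !== false && (int) $current === $checkedOutVersion;
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL, version INTEGER NOT NULL)');
$pdo->exec("INSERT INTO articles (id, title, version) VALUES (1, 'Draft', 1)");

$beforeOtherCommit = isCheckoutCurrent($pdo, 1, 1);

// Someone else commits while our user is still editing.
$pdo->exec("UPDATE articles SET title = 'Changed', version = version + 1 WHERE id = 1");

$afterOtherCommit = isCheckoutCurrent($pdo, 1, 1);

var_dump($beforeOtherCommit, $afterOtherCommit);
```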
The domain
An important point is that it is the job of the business domain logic to decide when a conflict occurs: some concurrent changesets may be acceptable, while others may not be allowed even if they modify different fields.
In his book, Fowler gives the example of adding elements to a collection concurrently. We can't know whether this is right just by seeing that the object is a collection, because it is the abstraction it represents that must be kept valid: sometimes it is right to add elements, sometimes the transaction should be stopped.
Merging strategies, which resolve the conflicts, are also subject to domain considerations. Some are valuable and should be pursued, while others are costly, and a rollback followed by manual editing of the data by the user is fine.
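One possible merging strategy is a field-by-field three-way merge, sketched below in plain PHP. Whether changes to different fields may really be combined is a domain decision, so this is only an illustration; mergeChangesets() is a hypothetical helper, and it assumes that the three arrays share the same keys.

```php
<?php
declare(strict_types=1);

/**
 * Three-way merge of flat records.
 *  $base   the data as it was at checkout
 *  $theirs the data currently stored (someone else's commit)
 *  $mine   the data the current user wants to commit
 * Fields changed on both sides to different values are reported as
 * conflicts, and the stored value is kept for them.
 */
function mergeChangesets(array $base, array $theirs, array $mine): array
{
    $merged = $theirs; // start from what is currently stored
    $conflicts = [];

    foreach ($mine as $field => $myValue) {
        $iChanged = $myValue !== $base[$field];
        $theyChanged = $theirs[$field] !== $base[$field];

        if (!$iChanged) {
            continue; // nothing of mine to apply to this field
        }
        if ($theyChanged && $theirs[$field] !== $myValue) {
            $conflicts[] = $field; // both sides edited the same field differently
            continue;
        }
        $merged[$field] = $myValue;
    }

    return ['merged' => $merged, 'conflicts' => $conflicts];
}

$base   = ['title' => 'Draft', 'body' => 'Hello'];
$theirs = ['title' => 'Final', 'body' => 'Hello'];

// Different fields were edited: the merge succeeds.
$clean = mergeChangesets($base, $theirs, ['title' => 'Draft', 'body' => 'Hello world']);

// The same field was edited on both sides: a conflict is reported.
$clash = mergeChangesets($base, $theirs, ['title' => 'Mine', 'body' => 'Hello']);

var_dump($clean, $clash);
```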
Advantages
The greatest advantage of this pattern is that it can support real time concurrency, like the check out of multiple items by multiple users simultaneously, as long as there is a merging strategy in place. It can also easily prevent race conditions.
This pattern is also easy to implement, and thus it is the default choice to solve concurrency issues.
Disadvantages
When the probability of conflict is high because there are many concurrent transactions on the same data, this pattern produces too many rollbacks. Such use cases call for a pessimistic pattern instead.
Examples
Doctrine 2 uses database transactions natively: it only commits the changes made in a PHP script when the EntityManager::flush() method is called, and it automatically rolls back if an error is detected.
Besides that, Doctrine 2 also has automatic Optimistic Offline Locking support, via the addition of a version field to the entity to lock.
<?php
class User
{
    // ...
    /** @Version @Column(type="integer") */
    private $version;
    // ...
}
The version field must be passed back by the user and not read again from the database; for example, it should populate a hidden field when you're using a form.
If a non-current object is committed, an OptimisticLockException is thrown. Moreover, EntityManager::find() has optional arguments that let you check that the version you have is current when reloading an object during an intermediate request, in order to warn the user about stale data.
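A form handler built on these features might look like the following sketch. It needs Doctrine ORM installed and a configured EntityManager to actually run; renameUser() and the setName() method on the User entity are assumptions made for the example, while the lock mode and version arguments of find() and the OptimisticLockException class come from Doctrine.

```php
<?php
use Doctrine\DBAL\LockMode;
use Doctrine\ORM\EntityManager;
use Doctrine\ORM\OptimisticLockException;

/**
 * Hypothetical form handler. $postedVersion is the value of the hidden
 * form field, not a value read again from the database.
 * Returns true on commit, false when the user's copy was stale.
 */
function renameUser(EntityManager $em, int $id, int $postedVersion, string $newName): bool
{
    try {
        // Throws OptimisticLockException if the stored version differs from $postedVersion.
        $user = $em->find(User::class, $id, LockMode::OPTIMISTIC, $postedVersion);
        if ($user === null) {
            return false; // the record was deleted in the meantime
        }

        $user->setName($newName); // setName() is assumed to exist on the entity
        $em->flush();             // may also throw if another commit slipped in

        return true;
    } catch (OptimisticLockException $e) {
        // Stale data: show the form again together with a warning.
        return false;
    }
}
```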