Practical PHP Patterns: Identity Field
The Identity Field pattern is common practice in ORM implementations, both those based on Active Record and those based on Data Mapper. An implementation saves a database identity column (or columns), such as a primary key, as a field of an object, in order to maintain identity between an in-memory object and a database row (original definition from Patterns of Enterprise Application Architecture).
In general, objects in a graph don't need an Identity Field, since they already have an implicit identity provided by the programming language. Operators are available to check the identity of objects, based on the equality of the pointer or handle held by a variable. In PHP (and in PHPUnit, with assertSame() and assertEquals()) the term for objects with the same identity is in fact same, as opposed to the less specific equal. Two variables hold the same object when the === comparison between them is true, that is, when they refer to one instance; they hold equal objects when the == comparison is true, that is, when the instances are of the same class and have equal property values. So being same is a stronger condition than being equal.
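The difference between same and equal can be seen in a few lines. This is a minimal sketch; the Money class is hypothetical and exists only for illustration.

```php
<?php
// Hypothetical class used only to illustrate identity vs. equality.
class Money
{
    public $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }
}

$a = new Money(10);
$b = new Money(10); // separate instance with the same state
$c = $a;            // second handle to the same instance

var_dump($a == $b);  // bool(true): same class, equal property values
var_dump($a === $b); // bool(false): two different instances
var_dump($a === $c); // bool(true): one instance, two variables
```

An ORM cannot rely on === once an object leaves memory, which is why the identity has to be carried explicitly in a field.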
This pattern is necessary to link the object and relational representations of data. When you reconstitute an object, you have to store in it a field that identifies its representation in the database, usually a row (thus the row's primary key is stored in the object). In a similar way, when an object is passed back to the mapper, whether it comes from serialization, a cache, or another mapper, its Identity Field lets the mapper recognize it as already present in the database or, in the case of a natural key, possibly as a missing row that has to be created.
The converse identity transport, from the relational to the object model, is ensured by an Identity Map, which prevents two equal objects from being reconstituted when they should actually be the same object. The Identity Map hands back a reference to the object already built during a previous reconstitution.
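As a rough sketch of how the two patterns cooperate, the mapper below keys its Identity Map by the Identity Field. The User and UserMapper classes are hypothetical, and a plain array stands in for the database table; a real ORM does considerably more.

```php
<?php
// Hypothetical entity: $id is the Identity Field.
class User
{
    private $id;
    private $name;

    public function __construct($id, $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId() { return $this->id; }
    public function getName() { return $this->name; }
}

// Hypothetical mapper; $rows simulates a table indexed by primary key.
class UserMapper
{
    private $rows;
    private $identityMap = array();

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function find($id)
    {
        // Already reconstituted: hand back the same instance.
        if (isset($this->identityMap[$id])) {
            return $this->identityMap[$id];
        }
        if (!isset($this->rows[$id])) {
            return null;
        }
        $user = new User($id, $this->rows[$id]['name']);
        $this->identityMap[$id] = $user;
        return $user;
    }
}

$mapper = new UserMapper(array(42 => array('name' => 'Alice')));
$first = $mapper->find(42);
$second = $mapper->find(42);
var_dump($first === $second); // bool(true)
```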
Identity Field is a very simple idea, but simple ideas often lead to complex executions.
Choosing the Identity Field
Choosing a field to represent the identity of an object presents the same issues that database design faces when looking for a meaningful primary key. Thus this is more a database problem than an ORM-related one.
An identity field (or its equivalent primary key) can be natural or generated. Natural keys have a domain-related meaning and are passed in by the client code, while generated keys are handled by the data persistence layer. Autoincremented ids and UUIDs are common examples of generated identity fields.
Usually the Identity Field is stored as a table-unique key, and takes advantage of the primary key constraints implemented by all relational databases. Another distinction in key type is between simple and compound keys: the latter are built from a combination of more than one column of the relational table and are often complex to support. I would suggest avoiding compound keys as much as possible, particularly when relationships between objects are involved: the ORM will have to carry this multiple-field key into different objects and queries, and it may not support such a mapping.
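To make the distinction concrete, here is a small sketch that contrasts a natural key with a generated one. The generateUuidV4() helper and the sample email address are assumptions made for illustration, and random_bytes() requires PHP 7 or later.

```php
<?php
// Hypothetical helper: builds a random (version 4) UUID string.
function generateUuidV4()
{
    $bytes = random_bytes(16);
    // Set the version (4) and variant (RFC 4122) bits.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
    // 32 hex characters, grouped as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

// Natural key: meaningful in the domain, supplied by client code.
$naturalKey = 'alice@example.com';

// Generated key: no domain meaning, produced by the persistence side.
$generatedKey = generateUuidV4();

echo $naturalKey, "\n", $generatedKey, "\n";
```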
Last but not least, an Identity Field behaves like a Value Object, even when it is a scalar value (in PHP, strings and integers are not objects). Identity Fields are mostly strings and integers. Theoretically you can use any type of immutable Value Object, such as a date, but you have to make sure that your ORM supports it both in its mapping rules (an Identity Field is still a field that has to be transported to the database and recreated from a database column) and as an Identity Field.
Once you have chosen a primary key/Identity Field, you must also create a private field in the object, which will be accessed via reflection for persistence purposes, like every other private field. If the object does not need an Identity Field in its semantics (it is an amount of money, a date, a time interval), reconsider its mapping as an Entity (@Entity in Doctrine 2 and other JPA-inspired ORMs), since it is probably a Value Object.
Moreover, special metadata has to be added to the mapping to identify this particular field as the key. In Doctrine 2 the @Id annotation is used, borrowed from the JPA specification, plus a @GeneratedValue one to indicate non-natural keys. The code at the end of this article shows both examples.
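The following sketch shows how a persistence layer might write a private Identity Field through reflection. The Article class and the value 15 are hypothetical, and this is not how any particular ORM is implemented internally.

```php
<?php
// Hypothetical entity: client code has no way to set the id.
class Article
{
    private $id;    // Identity Field, populated by the persistence layer
    private $title;

    public function __construct($title)
    {
        $this->title = $title;
    }

    public function getId() { return $this->id; }
}

$article = new Article('Identity Field');

$property = new ReflectionProperty('Article', 'id');
if (PHP_VERSION_ID < 80100) {
    // Needed to reach private properties before PHP 8.1.
    $property->setAccessible(true);
}
// Simulates the mapper assigning the key after an INSERT.
$property->setValue($article, 15);

var_dump($article->getId()); // int(15)
```

Keeping the field private without a setter means the identity cannot be changed by accident from domain code.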
What if I specify my natural key in an object and it is already present in the database?
The transaction will usually be rejected when the Unit of Work is committed, and an exception will be thrown. Unfortunately the only way to catch these errors in advance is to hit the database with a SELECT query, which costs performance.
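A sketch of the "insert and deal with the exception" approach follows. InMemoryTable and DuplicateKeyException are hypothetical stand-ins for a table with a primary key constraint and for whatever exception your ORM or driver actually throws.

```php
<?php
// Hypothetical exception type; real ORMs and drivers define their own.
class DuplicateKeyException extends RuntimeException {}

// Hypothetical in-memory stand-in for a table with a primary key.
class InMemoryTable
{
    private $rows = array();

    public function insert($key, array $row)
    {
        if (array_key_exists($key, $this->rows)) {
            throw new DuplicateKeyException("Duplicate key: $key");
        }
        $this->rows[$key] = $row;
    }
}

$table = new InMemoryTable();
$table->insert('alice@example.com', array('name' => 'Alice'));

$rejected = false;
try {
    // Same natural key again: the constraint rejects it.
    $table->insert('alice@example.com', array('name' => 'Another Alice'));
} catch (DuplicateKeyException $e) {
    $rejected = true; // react to the conflict instead of querying in advance
}
var_dump($rejected); // bool(true)
```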
What if I need my generated key in an object that is not already in the database?
If there is a sequence or a key table (or a MAX() aggregate function) available you can calculate it, though often only in a database-specific, non-portable way. But you should keep an eye on concurrency, because some other process can steal the key before you insert your object. In fact, I prefer to use natural keys when I have an object that is already relevant before being persisted, and to deal with the exception in case I get a duplicate. Otherwise the database has mixed responsibilities: providing a key and storing the object.
The sample code for this pattern is taken from the Doctrine 2 tests, and it shows how annotations provide additional metadata for mapping and specify an Identity Field. The actual code that deals with Identity Fields is scattered throughout the whole ORM.
/**
 * Single-column key.
 * @Id @Column(type="integer")
 */

/** @Column(type="string", length=50) */

/** Non-generated Identity Field. */
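Only docblock fragments of the listing appear above, so here is a minimal sketch of how such annotations could sit on entity classes. The class and property names (ExampleUser, ExamplePhonenumber, $id, $name, $phonenumber) are hypothetical and not taken from the Doctrine 2 test suite; the annotations live in comments, so the file parses without Doctrine installed.

```php
<?php
/**
 * @Entity
 */
class ExampleUser
{
    /**
     * Generated, single-column key.
     * @Id @Column(type="integer")
     * @GeneratedValue
     */
    private $id;

    /**
     * Ordinary mapped column, not part of the identity.
     * @Column(type="string", length=50)
     */
    private $name;
}

/**
 * @Entity
 */
class ExamplePhonenumber
{
    /**
     * Non-generated (natural) Identity Field.
     * @Id @Column(type="string", length=50)
     */
    private $phonenumber;
}
```

The first class leaves key generation to the persistence layer, while the second expects client code to supply the phone number before the object is persisted.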