Practical PHP Patterns: Serialized LOB
The Serialized LOB pattern consists of persisting an object graph as a single [binary] large object, instead of breaking it into homogeneous pieces stored in separate relational tables. Although this pattern resides in the object-relational category, it is not limited to persistence in relational databases.
Usually an object graph persisted with this pattern is a subset of the whole application graph, since persisting a large graph raises performance and consistency issues. Note also that large object here means a big chunk of unstructured data, not an object in the programming sense. It can be a string or a sequence of bytes, or any structure that is easy to serialize and carries no behavior, unlike a real object (in fact Fowler advocates XML, but I'm sure that if he were to write his book today he would choose JSON).
Motivation
Why store a small object graph as a single value instead of mapping it to tables? Because when there are many small related objects, their mapping can become very complicated. This solution simplifies the persistence of the graph by serializing it and putting it in a single field, instead of introducing many different tables and foreign keys; still, you keep the freedom to model your object graph in the way you feel best reflects the domain. Once more, no one said that Active Record is the only philosophy in object-relational mapping, and this pattern is also oblivious to relational databases.
In PHP implementations, serialization involves every object and variable reachable through field references from the object you want to store. The result is a single string, usually not human readable. Serialization is supported natively by the PHP language, but you can save the state of the graph in another equivalent format like XML or JSON. In this case, there is an overhead in development and machine time to convert the object to and from the custom representation, but you gain in interoperability, since XML and JSON snapshots can be understood easily by other programs.
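The trade-off between native serialization and JSON can be seen in a minimal sketch. The `Customer` and `Address` classes are hypothetical and exist only for illustration; the point is that `serialize()` follows private fields and references, while `json_encode()` on a plain object only sees public properties.

```php
<?php
// Hypothetical classes, used only for illustration.
class Address
{
    private $city;

    public function __construct(string $city)
    {
        $this->city = $city;
    }

    public function city(): string
    {
        return $this->city;
    }
}

class Customer
{
    public $name;
    private $address;

    public function __construct(string $name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }

    public function address(): Address
    {
        return $this->address;
    }
}

$customer = new Customer('Alice', new Address('Milan'));

// Native serialization follows every reference, private fields included.
$lob = serialize($customer);
$copy = unserialize($lob);

// json_encode() on a plain object only sees its public properties,
// so the private $address is silently left out of the snapshot.
$json = json_encode($customer);
```

Here `$copy` is a full `Customer` with its `Address`, whereas `$json` contains only the `name` field. A JSON-based LOB therefore needs an explicit conversion step for private state.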
Note that the whole graph will be persisted unless you define __sleep() to specify that only certain fields should be considered: when implementing Serialized LOB you must keep an eye on outgoing references. If you use serialize() directly on an object, its __sleep() method must return an array of the names of the fields to store; otherwise all of them are included. For example, Doctrine 2 lazy-loading proxies always define __sleep() to exclude a stale EntityManager when they are serialized, say into a cache.
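As a rough illustration of cutting an outgoing reference, the sketch below uses two hypothetical classes, `Invoice` and `Logger`. `Logger` stands in for any collaborator (a connection, an EntityManager) that should not end up in the LOB.

```php
<?php
// Hypothetical classes: Logger stands in for any service that must not be stored.
class Logger
{
    public $lines = [];
}

class Invoice
{
    private $number;
    private $total;
    private $logger; // outgoing reference we do not want in the LOB

    public function __construct(string $number, float $total, Logger $logger)
    {
        $this->number = $number;
        $this->total = $total;
        $this->logger = $logger;
    }

    // Only the fields named here are written by serialize().
    public function __sleep(): array
    {
        return ['number', 'total'];
    }

    public function number(): string
    {
        return $this->number;
    }

    public function hasLogger(): bool
    {
        return $this->logger !== null;
    }
}

$lob = serialize(new Invoice('2010-001', 99.5, new Logger()));
$restored = unserialize($lob);
```

After the round trip, `$restored` has its number and total but no logger, so the application has to inject the collaborator again (for instance in `__wakeup()` or from the repository that loaded the object).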
Another issue to pay attention to is the duplication of objects reachable from different graphs: an object referenced by two separately serialized graphs ends up stored twice, and the copies can drift apart. For these reasons, this pattern works well for isolated objects or isolated subgraphs, when you persist the roots of these graphs. Speaking in a DDD-like language, these graphs would be Aggregates without outgoing links towards other Aggregate Roots.
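The duplication problem can be reproduced in a few lines. The `Book` and `Author` classes are hypothetical; the sketch assumes each `Book` is stored as its own LOB while both share one `Author` instance in memory.

```php
<?php
// Hypothetical classes, used only to show the duplication effect.
class Author
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

class Book
{
    public $title;
    public $author;

    public function __construct(string $title, Author $author)
    {
        $this->title = $title;
        $this->author = $author;
    }
}

$author = new Author('Fowler');
$first = new Book('PoEAA', $author);
$second = new Book('Refactoring', $author);

// Each root goes into its own LOB, so the shared Author is written twice.
$a = unserialize(serialize($first));
$b = unserialize(serialize($second));

// The two restored authors are now independent copies.
$a->author->name = 'Martin Fowler';
$driftedName = $b->author->name; // not affected by the change above

// Serialized inside one LOB, the shared reference keeps its identity.
list($c, $d) = unserialize(serialize([$first, $second]));
$sameAuthor = ($c->author === $d->author);
```

This is why the pattern is safer when the serialized subgraph has a single root and no links to objects that are also persisted elsewhere.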
Problems
There are of course issues when using this simple mapping to an unstructured form, and they are related to object search. Querying the storage for an object that satisfies certain criteria would be very slow if it had to crawl each LOB and see what is contained in its internals.
The classical find() methods that work on the primary key can be implemented using a map structure (equivalent to an associative array), whose keys are the primary keys of the objects and whose values are their serializations. In fact, caches work like this, as they index a generic object or string by a unique key, and they are probably the most widespread implementation of this pattern, although they are not used as the sole storage but only as a mirror.
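A minimal sketch of such a map, kept in memory for simplicity. The `SerializedLobMap` and `Note` names are hypothetical, and a real implementation would sit on top of a cache or a two-column table rather than a PHP array.

```php
<?php
// Hypothetical value object used only for this sketch.
class Note
{
    private $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function text(): string
    {
        return $this->text;
    }
}

// Hypothetical in-memory store: primary key => serialized LOB.
class SerializedLobMap
{
    private $lobs = [];

    public function save(string $key, $object): void
    {
        $this->lobs[$key] = serialize($object);
    }

    // Returns a fresh copy of the stored object, or null when the key is unknown.
    public function find(string $key)
    {
        return isset($this->lobs[$key]) ? unserialize($this->lobs[$key]) : null;
    }
}

$store = new SerializedLobMap();
$store->save('note-1', new Note('Buy milk'));

$found = $store->find('note-1');
$missing = $store->find('note-2');
```

Lookups by key are a single array access plus one `unserialize()` call; anything that is not a key lookup needs the extra machinery discussed next.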
Querying the storage is more complicated, and when it is a requirement you basically have to set up an external index (to be updated whenever you edit an object) like Solr or Lucene (or Zend_Search_Lucene). With this index you can search for elements and obtain the keys of the objects, but the index still saves only the fields relevant to the queries, and in a form suitable for search. MySQL's full text index is a half-baked solution that does exactly this, by analyzing the field on which it is defined and saving a processed form of it for rapid querying.
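The idea of an external index can be sketched without any search engine. The `IndexedLobStore` class below is hypothetical and indexes a single field, `color`, in a plain array; Solr or Lucene would take the place of that array in practice.

```php
<?php
// Hypothetical store that keeps a side index next to the opaque LOBs.
class IndexedLobStore
{
    private $lobs = [];       // key => serialized LOB
    private $colorIndex = []; // color => [key => true]
    private $colorOfKey = []; // key => color currently indexed

    public function save(string $key, array $data): void
    {
        // Drop the stale index entry when an existing object is edited.
        if (isset($this->colorOfKey[$key])) {
            unset($this->colorIndex[$this->colorOfKey[$key]][$key]);
        }
        $this->lobs[$key] = serialize($data);
        $this->colorIndex[$data['color']][$key] = true;
        $this->colorOfKey[$key] = $data['color'];
    }

    public function find(string $key)
    {
        return isset($this->lobs[$key]) ? unserialize($this->lobs[$key]) : null;
    }

    // The query touches only the index and never opens a LOB.
    public function keysByColor(string $color): array
    {
        return array_keys($this->colorIndex[$color] ?? []);
    }
}

$store = new IndexedLobStore();
$store->save('car-1', ['color' => 'red', 'model' => 'F399']);
$store->save('car-2', ['color' => 'blue', 'model' => 'Panda']);
$store->save('car-1', ['color' => 'blue', 'model' => 'F399']); // edit: repainted

$red = $store->keysByColor('red');
$blue = $store->keysByColor('blue');
```

The index holds only the field needed by the query and the keys it leads to; the full objects are still fetched by key with `find()`.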
Examples
Let's dig into some code. This example uses Sebastian Bergmann's Object Freezer library, which takes an object graph and returns a multidimensional array containing all its data. It accomplishes this introspection by accessing private properties and related objects via reflection, similarly to what Doctrine 2 does. Of course Doctrine 2 would put the data in a relational database, while Object Freezer simply returns it as an array.
Once you have the array-based snapshot of an object, you can serialize it and put it wherever you want or convert it to XML or JSON. While serializing takes care of private properties, other conversions would be less easy without Object Freezer.
Another interesting trait of this approach is that you can store the LOB outside of a relational database (where the table would contain just two columns, so it's not a great schema), for example in a NoSQL database like CouchDB, or in Solr alongside other fields that are processed for full text search and then discarded. The data could be passed either in an intelligible format (JSON with various fields to index), or as a black box (string).
<?php
require_once 'Object/Freezer.php';

class Car {
    private $_engine;
    private $_color;

    public function __construct(Engine $engine, $color)
    {
        $this->_engine = $engine;
        $this->_color = $color;
    }
}

class Engine {
    private $_model;

    public function __construct($model)
    {
        $this->_model = $model;
    }
}

$ferrari = new Car(new Engine("SomeTurboTechnoBabble"), "F399");
$freezer = new Object_Freezer();

// native serialization is not very readable or comprehensible outside of PHP apps
$state = serialize($freezer->freeze($ferrari));
var_dump($state);
echo "\n";

// JSON has good interoperability instead;
// even if you do not use it for storage, this line can be useful
var_dump(json_encode($freezer->freeze($ferrari)));
echo "\n";

// rebuild the object graph from the stored snapshot
$newFerrari = $freezer->thaw(unserialize($state));
var_dump($newFerrari);
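To see why a plain json_encode() is not enough for private state, here is a minimal sketch of an array snapshot written without the library. It is not how Object Freezer is implemented: the `snapshot()` function is hypothetical, it uses a bound closure rather than the reflection API, and it ignores cycles, arrays of objects and private fields declared in parent classes.

```php
<?php
class Engine
{
    private $_model;

    public function __construct($model)
    {
        $this->_model = $model;
    }
}

class Car
{
    private $_engine;
    private $_color;

    public function __construct(Engine $engine, $color)
    {
        $this->_engine = $engine;
        $this->_color = $color;
    }
}

// Hypothetical helper: turns an object graph into nested arrays.
function snapshot($object): array
{
    // The closure runs in the scope of the object's class,
    // so get_object_vars() also returns its private fields.
    $read = Closure::bind(function () {
        return get_object_vars($this);
    }, $object, get_class($object));

    $data = ['__class' => get_class($object)];
    foreach ($read() as $name => $value) {
        $data[$name] = is_object($value) ? snapshot($value) : $value;
    }

    return $data;
}

$json = json_encode(snapshot(new Car(new Engine('V12'), 'red')));
$decoded = json_decode($json, true);
```

The resulting JSON carries the class names and the private fields of both objects, which is the kind of information a thaw step would need in order to rebuild the graph.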