Practical PHP Refactoring: Replace Conditional with Polymorphism
Join the DZone community and get the full member experience.
Join For FreeIn the scenario of today, an if chooses to execute different behavior depending on the type of an object. We should define "type" very looosely; for example, it may be:
- the class of the object or one of the interfaces it implements (instanceof).
- The value of one of the object's fields (usually enumerative).
- The result of a query method, such as isXXX() or getTotalValue().
Why eliminating this conditional? And by replacing it with what?
We can replace easily this kind of conditionals with polymorphism: we move the relevant logic in the object whose type is queried. This object becomes an instance of a subclass of the original one: the different legs of the conditional become different subclasses. In case the subclasses already exist, it's even easier as it's just a movement of code at the method scale.
Typically polymorphism allows to remove the conditional or to put it at the instantiation time of the new object. It's an expression of the Tell, Don't Ask principle, where instead of asking repeatedly (in different places of client code) the object what to do, you simply tell the object to do something and pass in some references if necessary.
The end result is that the duplication of the conditional is eliminated as it is splitted into the various classes of the hierarchy. The addition or removal of a new type impacts just a subclass, which has to be created anew or thrown away. You don't have to grep all your application looking for if()s involving a field.
Note that this refactoring works not only for ifs and elses, but also for selects.
The missing step
A prerequisite for this refactoring is having an existing inheritance structure where the dependency of client code is on the parent, but different instances of the child can be passed in.
If there is not already a hierarchy, you have two choices as preliminary refactorings:
- inheritance: Replace Type code with Subclasses. The current class gains new children; it is a less invasive approach but also but less clear. Inheritance is a one shot strategy as you won't be able to use it for other axis of change.
- composition: Replace type code with State or Strategy. A new class is extracted, which gains the bew children. More flexible as you can extract many collaborators like this, and forces to name the new concept and its new hierarchy.
The steps in the rest of the article presume you already have this hierarchy, so let these two refactorings guide you into extracting the classes (even if they come up empty initially). This refactoring then only focuses on breaking up the conditional code. The previous refactorings allow you to tackle the hierarchy and the instantiation code instead.
Steps
- Extract a method containing the conditional.
- Move this method to the top of the inheritance hierarchy.
- Copy the original method into each subclass, and eliminate all machinery to leave just one leg. Some data of the original class may become protected instead of private.
- Remove the copied leg and repeat with the next subclass until the method in the superclass is empty and the overrides do not contain references to classes different from the current one.
Example
In the initial state of the example, the Renderer class has a kind of switch where it generates different HTML depending on the class of the object to render: Brand pages can have custom URLs, while user profile must follow the standard naming scheme and use their nick.
<?php class ReplaceConditionalWithPolymorphism extends PHPUnit_Framework_TestCase { public function testAUserProfileShouldHaveAStandardURL() { $renderer = new Renderer(new User('giorgio')); $this->assertEquals('<a href="/giorgio">giorgio</a>', $renderer->__toString()); } public function testABrandPageShouldHaveACustomURL() { $renderer = new Renderer(new Brand('Coca Cola', 'coke')); $this->assertEquals('<a href="/coke">Coca Cola</a>', $renderer->__toString()); } } class User { private $name; public function __construct($name) { $this->name = $name; } public function getName() { return $this->name; } } class Brand { private $name; private $url; public function __construct($name, $url) { $this->name = $name; $this->url = $url; } public function getName() { return $this->name; } public function getURL() { return $this->url; } } class Renderer { private $domainObject; public function __construct($domainObject) { $this->domainObject = $domainObject; } public function __toString() { if ($this->domainObject instanceof User) { return '<a href="/' . $this->domainObject->getName() . '">' . $this->domainObject->getName() . '</a>'; } if ($this->domainObject instanceof Brand) { return '<a href="/' . $this->domainObject->getURL() . '">' . $this->domainObject->getName() . '</a>'; } } }
There are two issues with this design:
- if we add a new domain object (e.g. Group, with its own page) we have to open up the hood and insert new code in an already existing class. It's generally easier to deal with a change by adding brand new classes, since we can't break existing code if we do not touch it.
- Not only Renderer contains these different cases, but probably many objects composing Brand and User.
The first move is to isolate the variability into a single hierarchy of classes.Since there is no common ancestor, we create a common superclass in Brand and User. getURL() would be better named as getSlug() actually (an url-friendly label).
abstract class Addressable { public function render($template) { if ($this instanceof User) { return sprintf($template, $this->getName(), $this->getName()); } if ($this instanceof Brand) { return sprintf($template, $this->getUrl(), $this->getName()); } } }
We only hid the mess, not resolved it: the base class is still a single point that knows everything about the subclasses. At least there would be no duplication if someone wants to use name or URL in other HTML fragments.
We copy the method into the various subclasses. In this case we augment duplication, but that's a temporary move since we will eliminate most of the code in the methods shortly.
class User extends Addressable { private $name; public function __construct($name) { $this->name = $name; } public function getName() { return $this->name; } public function render($template) { if ($this instanceof User) { return sprintf($template, $this->getName(), $this->getName()); } if ($this instanceof Brand) { return sprintf($template, $this->getUrl(), $this->getName()); } } } class Brand extends Addressable { private $name; private $url; public function __construct($name, $url) { $this->name = $name; $this->url = $url; } public function getName() { return $this->name; } public function getURL() { return $this->url; } public function render($template) { if ($this instanceof User) { return sprintf($template, $this->getName(), $this->getName()); } if ($this instanceof Brand) { return sprintf($template, $this->getUrl(), $this->getName()); } } }
In the next step, we make the original method abstract: since each concrete class has its own copy, it will never be executed.
abstract class Addressable { public abstract function render($template); }
We eliminate impossible cases: now dynamic dispatch is doing the work of choosing which method to execute (instead of a chain of ifs). Brand and User only refer to their own methods and do not know anything about each other's presence.
abstract class Addressable { public abstract function render($template); } class User extends Addressable { private $name; public function __construct($name) { $this->name = $name; } public function getName() { return $this->name; } public function render($template) { return sprintf($template, $this->getName(), $this->getName()); } } class Brand extends Addressable { private $name; private $url; public function __construct($name, $url) { $this->name = $name; $this->url = $url; } public function getName() { return $this->name; } public function getURL() { return $this->url; } public function render($template) { return sprintf($template, $this->getUrl(), $this->getName()); } }
The last step, although not strictly part of this refactoring, is to eliminate the getters since no one calls them from outside the class in this particular example.
class User extends Addressable { private $name; public function __construct($name) { $this->name = $name; } public function render($template) { return sprintf($template, $this->name, $this->name); } } class Brand extends Addressable { private $name; private $url; public function __construct($name, $url) { $this->name = $name; $this->url = $url; } public function render($template) { return sprintf($template, $this->url, $this->name); } }
We have now a solution where new classes can be added freely, and new Renderer can use all Addressable classes. It's a Bridge pattern, or just good factoring.
Opinions expressed by DZone contributors are their own.
Comments