Practical PHP Refactoring: Substitute Algorithm
Join the DZone community and get the full member experience.
Join For FreeSmall-scale refactoring is usually composed of small changes, which keep the system working all the time; you extract some lines into a new method, or add a few variables to substitute comments with their descriptive names.
As we move towards a larger scale, we find that sometimes little jumps are needed, such as changing an entire algorithm instead of only moving code around.
Why should I change my algorithm if the tests already pass?
- Because you have discovered a better one with respect to non-functional requirements: it is faster, occupies less space, or more reliable in case of stochastic algorithms (e.g. your collection has increased from an average of 10 elements to a sizeable number: Quicksort may help you).
- Or because now your library, or your ORM, or your framework implements the same functionality for you; so you want to bind to another debugged implementation to avoid maintain this piece of code (e.g. your database can sort your User object by name for you).
- Or because the requirement has been superceeded, and a less strict version of the algorithm is required (e.g. you just need to sort according to one field, and not two).
- Or because the algorithm should change for a new requirement, and the old one cannot be modified accordingly (e.g. you need a stable sorting algorithm).
Constraints
To avoid desperate cases of shotgun surgery, impose some constraints on this refactoring; it's always easy to start messing around.
Keep the change as localized as possible: you can take possible cautionary steps to isolate the algorithm in a method: Extract Method, and Inline Method to keep all the algorithm in one place can help you. Ideally, even in the complex cases this refactoring should not become more difficult than changing a Strategy object.
In fact, the Open/Closed Principle is a guideline to follow: if you're changing the algorithm because of a new (or mutated) requirement, the next change should only require you to add classes and reconfigure the construction of the object graph.
Thus a preliminary refactoring of the original algorithm may help, especially if the old and the new are composed of similar steps.
To check if the new algorithm will cover the already existing cases, locate the right level of tests to run:
- unit tests of the class if you're working on private or self-contained public methods;
- functional tests involving multiple objects, if the algorithm is divided into multiple classes;
- even end-to-end tests if the algorithm involves wildly different pieces of the system (e.g. the sorting of a client-side view list after an Ajax insertion).
Steps
Step 0 is to isolate the algorithm as we defined in Constraints.
- Prepare the new algorithm in a design that is compatible with the old one at its terminals (that's electrical engineering terminology): input parameters, output value, changes to the object state. You can refactor the protocol later, but for now it's easier to stick to the old one, for example continuing to require a parameter which would now be unused.
- Switch the client code calls from the old method to the new one.
- Run the identified tests.
In case the tests now fail, you're under version control and can go back to the original one; or you can identify which test cases are causing the failure and debug the new algorithm.
Example
In the initial state, the algorithm is part of the display method that should produce a bit of HTML. It is a reuse of the classical sort() PHP method that acts on keys instead of values.
<?php class PhonebookTest extends PHPUnit_Framework_TestCase { public function testPrintsNameAndNumberOrderedByName() { $phonebook = new Phonebook(); $phonebook->add('Giorgio', '031...'); $phonebook->add('Adam', '021...'); $phonebook->add('Zend', '045...'); $html = $phonebook->toHtml(); $this->assertEquals( "<ul>\n" . "<li>Adam: 021...</li>\n" . "<li>Giorgio: 031...</li>\n" . "<li>Zend: 045...</li>\n" . "</ul>\n", $html ); } } class Phonebook { private $numbers = array(); public function add($name, $number) { $this->numbers[$name] = $number; } /** * I thought about writing a sorting algorithm, but... * I opted to use a different, bundled sorting algorithm :) */ public function toHtml() { $names = array_keys($this->numbers); sort($names); foreach ($names as $name) { $numbers[$name] = $this->numbers[$name]; } $this->numbers = $numbers; $html = "<ul>\n"; foreach ($this->numbers as $name => $number) { $html .= "<li>$name: $number</li>\n"; } $html .= "</ul>\n"; return $html; } }
After a while, we discover PHP already handles this functionality with a native function, and we want to substitute the long and error-prone algorithm with the native one (just like if we introduced a new library).
The first step is to isolate the algorithm in its own method:
class Phonebook { private $numbers = array(); public function add($name, $number) { $this->numbers[$name] = $number; } public function toHtml() { $this->sortNumbersByName(); $html = "<ul>\n"; foreach ($this->numbers as $name => $number) { $html .= "<li>$name: $number</li>\n"; } $html .= "</ul>\n"; return $html; } /** * Let's start by isolating the algorithm. */ private function sortNumbersByName() { $names = array_keys($this->numbers); sort($names); foreach ($names as $name) { $numbers[$name] = $this->numbers[$name]; } $this->numbers = $numbers; } }
Now we can substitute the old code with the new algorithm, incidentally just a single call. If there is no more bookkeeping code to add in the extracted method, we can even inline it again; however the temporary extraction has forced us to define the input (only $this) and output (state of $this) relationship of this little algorithm.
private function sortNumbersByName() { ksort($this->numbers); }
Opinions expressed by DZone contributors are their own.
Comments