One of the superpowers of Selenium 2 is the ability to repeat this check on almost any browser, thanks to a common protocol that the Firefox, Chromium and other drivers implement. One of the common operations, besides clicking and entering text, is taking a screenshot of the current state of the page.
The Selenium 2 API exposes a /session/:id/screenshot method, which returns a PNG image (base64-encoded inside the JSON response). This method is usually exposed by client drivers as-is, as in my own PHPUnit_Extensions_Selenium2TestCase::currentScreenshot() method. These screenshots contain the browser's main frame, without any title or menu bar, let alone borders.
This method can be used to run regression tests on the graphical appearance of a page, by comparing the current screenshot with a previous one and failing the test if there are significant differences.
The screenshot functionality is mandated by the JSON Wire Protocol, Selenium's contract with browser drivers. However, the existence of a single URL and protocol is not a guarantee of support: the screenshot functionality was not implemented in the Safari driver last time I checked. Moreover, setting up some of the drivers may require time and dedicated machines, so you may prefer to delegate that responsibility to an external service like SauceLabs, provided the staging environment is on a publicly visible IP address.
The screenshot is created with the dimensions of the contents of the browser window, as if every scroll bar were expanded. This lets you check even what does not fit inside the window, but you have to pay attention to the layout, since it changes with the size of the window.
Selenium lets you set the current size of the window through its /window API. This is exposed by PHPUnit_Selenium as:
$this->currentWindow()->size(array(
    'width' => 800,
    'height' => 600,
));
but is available also in other drivers.
The exact same test must run when generating the reference images, so that the application is brought into the same state during generation and during automated checking.
I govern this with a STORE_SCREENSHOTS environment variable, which can be set to 1 to store the reference images or to 0 to check against them:
STORE_SCREENSHOTS=1 phpunit ...
STORE_SCREENSHOTS=0 phpunit ...
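One way to wire this switch into a test is sketched below. The helper name checkScreenshot(), the $compare callback and the reference path are assumptions made for illustration; none of them is part of PHPUnit_Selenium.

```php
<?php
/**
 * Stores the screenshot as the new reference when STORE_SCREENSHOTS=1,
 * otherwise hands both images to a comparison callback.
 * Hypothetical helper: the name and signature are illustrative only.
 *
 * @param string   $png           binary PNG, e.g. from currentScreenshot()
 * @param string   $referencePath where the reference image is kept
 * @param callable $compare       function (string $reference, string $actual): bool
 * @return bool true if the image was stored or matches the reference
 */
function checkScreenshot($png, $referencePath, callable $compare)
{
    if (getenv('STORE_SCREENSHOTS') === '1') {
        // Generation run: overwrite the reference and never fail.
        file_put_contents($referencePath, $png);
        return true;
    }
    if (!is_file($referencePath)) {
        // No reference has been generated yet for this page and browser.
        return false;
    }
    return $compare(file_get_contents($referencePath), $png);
}
```

In a test method this could be called as $this->assertTrue(checkScreenshot($this->currentScreenshot(), 'reference.png', $compare)), so that the same test body serves both the generation and the checking run.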
A byte-by-byte or pixel-by-pixel comparison of reference and actual images is usually not reliable enough for automated tests. For example, some drivers, like the Android one, render colors a bit differently depending on external conditions (which I don't know). Image differences generated from these failures show a few pixels that deviate from their expected color (such as #00669c instead of #006699). This behavior may be due to antialiasing or other "optimizations" of the browser engine.
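To get a feeling for how small such a deviation is, the two colors can be compared channel by channel. This is only an illustrative sketch; channelDifferences() is a hypothetical helper, not something provided by Selenium or PHPUnit.

```php
<?php
/**
 * Returns the absolute difference of each RGB channel between two
 * hex colors such as "#00669c" and "#006699".
 */
function channelDifferences($hexA, $hexB)
{
    $toRgb = function ($hex) {
        // Split "rrggbb" into three two-digit pairs and convert each to an integer.
        return array_map('hexdec', str_split(ltrim($hex, '#'), 2));
    };
    list($r1, $g1, $b1) = $toRgb($hexA);
    list($r2, $g2, $b2) = $toRgb($hexB);

    return array(
        'r' => abs($r1 - $r2),
        'g' => abs($g1 - $g2),
        'b' => abs($b1 - $b2),
    );
}

// Red and green are identical; blue differs by 3 (0x9c = 156, 0x99 = 153).
print_r(channelDifferences('#00669c', '#006699'));
```

A difference of 3 out of 255 on a single channel is invisible to a human eye, yet it is enough to make a strict comparison fail.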
Thus a more abstract comparison should be made if you don't want your tests to give you false positives each day.
Inside PHP code, exec() calls let you leverage some command line utilities. First of all, the conversion to PNM (a bitmap format) with ImageMagick:
file_put_contents('result.png', $this->currentScreenshot());
exec('convert result.png result.pnm');
Then a Mean Absolute Error comparison:
exec('imgcmp -f reference.pnm -F result.pnm -m mae | paste -sd+ | bc', $comparison);
$comparison[0] now contains a float (in the form of a string, since exec() fills $comparison with the lines of output) that goes from around 0.0001 in the case of false positives to 0.02 or 0.2 in the case of genuinely different images. On Ubuntu, imgcmp is contained in the libjasper-runtime package.
I recommend using per-browser reference images: having a pixel-perfect layout which is equal on each browser and version is probably not worth your time.
You should find the threshold for declaring a test failed empirically, depending on your application. The two cases I had to tell apart scored 0.000016 and 0.244, so I set the threshold to 0.001.
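The decision itself can be isolated in a small function, sketched here under the assumption that the output of the imgcmp pipeline has been captured by exec() into an array of lines. imagesMatch() is a hypothetical name, and the default threshold is just the value that worked for my application.

```php
<?php
/**
 * Decides whether an image comparison passes.
 *
 * @param array $outputLines lines captured by exec(), e.g. array('0.000016')
 * @param float $threshold   maximum accepted error (0.001 in this article)
 * @return bool true if the error is below the threshold
 */
function imagesMatch(array $outputLines, $threshold = 0.001)
{
    if (count($outputLines) === 0 || !is_numeric(trim($outputLines[0]))) {
        // The command failed or printed nothing usable:
        // treat that as a failure rather than as a match.
        return false;
    }

    return (float) trim($outputLines[0]) < $threshold;
}

var_dump(imagesMatch(array('0.000016'))); // bool(true)
var_dump(imagesMatch(array('0.244')));    // bool(false)
```

Note that bc may print values below 1 without the leading zero (.244); PHP still parses such strings as numbers, so no extra handling is needed.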
Caveat: multiple takes
This is not usually necessary, but if your application lacks a steady state and things move around, or you're not sure animations have completed yet, you can resort to multiple takes and keep the best screenshots.
For reference images, the process works with a quorum: take 5 screenshots 1 second apart, and pick one that is repeated at least 3 times.
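A minimal sketch of the quorum follows. It assumes that "repeated" means byte-identical, which is reasonable for consecutive takes on the same driver; pickByQuorum() is a hypothetical helper, and a fuzzier notion of equality could be substituted for the hash.

```php
<?php
/**
 * Returns the first screenshot that occurs at least $quorum times
 * in $screenshots (binary PNG strings), or null if none does.
 */
function pickByQuorum(array $screenshots, $quorum = 3)
{
    // Count identical images by hashing their bytes.
    $counts = array_count_values(array_map('md5', $screenshots));

    foreach ($screenshots as $png) {
        if ($counts[md5($png)] >= $quorum) {
            return $png;
        }
    }

    return null; // no steady state was reached in these takes
}
```

The input array would be filled by calling currentScreenshot() 5 times with a sleep(1) between the calls; a null result means that the reference could not be generated reliably and the page needs a closer look.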
For the actual images, don't compare just the first screenshot with the reference; instead take up to 5 screenshots, 1 second apart, and compare each one with the reference until one is good. As soon as a screenshot is declared good, you can make an early exit; if all 5 takes fail, declare a failure.
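The retry loop could look like the sketch below. Both callbacks are assumptions: $take would wrap currentScreenshot(), and $isGood would wrap whatever comparison with the reference you have chosen. matchesWithinTakes() is a hypothetical name.

```php
<?php
/**
 * Takes up to $maxTakes screenshots, $delay seconds apart, and returns
 * true as soon as one of them is accepted by $isGood.
 *
 * @param callable $take   returns a new screenshot as a binary string
 * @param callable $isGood function (string $png): bool
 * @return bool false if every take was rejected
 */
function matchesWithinTakes(callable $take, callable $isGood, $maxTakes = 5, $delay = 1)
{
    for ($i = 1; $i <= $maxTakes; $i++) {
        if ($isGood($take())) {
            return true; // early exit on the first good screenshot
        }
        if ($i < $maxTakes) {
            sleep($delay); // give animations time to settle before the next take
        }
    }

    return false; // every take differed from the reference
}
```

In the worst case this costs about 4 extra seconds per failing check, which is why it pays off only on pages that do not settle immediately.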
Selenium tests catch issues that prevent real users from accessing your application, with a special focus on rendering problems and cross-browser issues. When taking screenshots with them, take this advice into account to avoid false positives and always-red test suites.