Benchmarks: PHP Solr Response Data Handling


Solr supports multiple output formats. Some are for general use (xml, json) and some are even language-specific. If you’re using PHP, these are the most logical response writer formats:

  • xml
  • json
  • phps (serialized php)
  • php (php code to execute)

On top of that PHP offers multiple ways to parse XML. I’m benchmarking these options to determine the most efficient decoding to implement in the next major version of Solarium, but the results should be useful for any PHP Solr implementation.
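To make the four options concrete, here is a minimal sketch of how each format would be decoded in PHP. The payloads below are stand-ins for what Solr returns for each `wt=` value; the field names are illustrative, not taken from the benchmark data.

```php
<?php
// wt=json: decode with json_decode()
$json = '{"response":{"numFound":1,"docs":[{"id":"1"}]}}';
$fromJson = json_decode($json, true);

// wt=phps: decode with unserialize() (here we fake Solr's output with serialize())
$phps = serialize(array('response' => array('numFound' => 1, 'docs' => array(array('id' => '1')))));
$fromPhps = unserialize($phps);

// wt=xml: parse with SimpleXML (one of several XML parsers PHP offers)
$xml = '<response><result numFound="1"><doc><str name="id">1</str></doc></result></response>';
$fromXml = simplexml_load_string($xml);

// wt=php: the response body is PHP source code, so eval() executes it.
// Executing remote data is risky, which is why this writer is rarely recommended.
$php = "array('response' => array('numFound' => 1))";
$fromPhp = eval("return $php;");
```

Each call yields the same logical result structure; the benchmark below measures how fast each path gets there.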

Before I get to the results, some info on how I tested.
Because this is only the first of several benchmarks I want to do over the next few months, I needed a good benchmarking tool. I couldn’t really find one that suited, so I created a tool myself. It’s inspired by PHPUnit, but instead of test cases you write benchmark cases. It includes concepts like annotations and data providers. If you know PHPUnit you will probably quickly understand this example:

class SolrParserBenchmark extends Phperf\Benchmark
{
    /**
     * @return array
     */
    public function solrJsonDataProvider()
    {
        return array(
            'small-results' => array(file_get_contents(__DIR__ . '/data/results.json')),
            'big-results' => array(file_get_contents(__DIR__ . '/data/text-results.json')),
        );
    }

    /**
     * Json decode
     *
     * Test data is similar to Solr output with wt=json
     *
     * @dataprovider solrJsonDataProvider
     * @repeat 50
     * @param string $data
     * @return string
     */
    public function benchmarkJsonDecode($data)
    {
        return json_decode($data);
    }
}

The tool is still very much a prototype, but I intend to improve it for use by others as soon as I find the time. The current code, including this Solr benchmark, can be found on GitHub for those interested.


Each decoding method is tested for a small result (approx. 10KB) and a big result (approx. 800KB). The tests are repeated 50 times and the result is the average time.
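The timing approach described above can be sketched in a few lines: run a decode callback a fixed number of times and average the wall-clock time. The function name and signature here are my own illustration, not the actual API of the benchmark tool.

```php
<?php
/**
 * Hypothetical sketch: average wall-clock time of $fn($data) over $repeat runs.
 *
 * @param callable $fn     decoder to measure, e.g. 'json_decode' or 'unserialize'
 * @param string   $data   raw response payload to decode
 * @param int      $repeat number of repetitions to average over
 * @return float average seconds per run
 */
function benchmarkAverage(callable $fn, $data, $repeat = 50)
{
    $start = microtime(true);
    for ($i = 0; $i < $repeat; $i++) {
        $fn($data);
    }
    return (microtime(true) - $start) / $repeat;
}

$avg = benchmarkAverage('json_decode', '{"responseHeader":{"status":0}}');
```

Averaging over 50 runs smooths out one-off spikes from the OS scheduler or opcode cache warm-up.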

View the results here


  • The differences between decoding methods are quite big, in percentage terms.
  • The size of the result data has a big impact: json_decode is second fastest with the small dataset but slowest with the big one.
  • There is a clear winner for both small and big datasets: unserialize.
  • While the percentage differences are big, even the worst performer with the big dataset (bigger than most real-world cases) only takes about 16 thousandths of a second (16 ms).

Based on these results the next Solarium version will probably use unserialize (phps response format).
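In practice the winning combination looks like this: request the response with wt=phps and decode it with unserialize(). The URL below is a placeholder, and for this sketch the HTTP call is replaced by a faked serialized payload of the kind Solr would return.

```php
<?php
// Hypothetical sketch: fetch with wt=phps, decode with unserialize().
// Real code would do something like:
//   $raw = file_get_contents('http://localhost:8983/solr/mycore/select?q=*:*&wt=phps');
// Here we fake the serialized payload so the example is self-contained:
$raw = serialize(array(
    'responseHeader' => array('status' => 0),
    'response'       => array('numFound' => 2, 'docs' => array()),
));

$result = unserialize($raw);
$numFound = $result['response']['numFound'];
```

Note that unserialize() should only be used on trusted input, so this approach assumes the Solr server is under your control.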

The tests were done on a 2 GHz i7 MBP running PHP 5.3.8. As always with benchmarks, don’t just trust my tests, but be sure to test in your own environment! Your results might vary…



Published at DZone with permission of
