The Internals of PHP Associative Arrays
You may understand arrays, and you may have used PHP arrays a thousand times before. But did you know that, because of the way PHP uses hashtables to store arrays internally, inserting certain sets of keys can take thousands of times longer than inserting other sets with the same number of elements?
To understand why this happens, take a look at this code by Nikita Popov (who has actually written a PHP parser in PHP):
<?php echo '<pre>';

$size = pow(2, 16); // 16 is just an example, could also be 15 or 17

$startTime = microtime(true);

$array = array();
for ($key = 0, $maxKey = ($size - 1) * $size; $key <= $maxKey; $key += $size) {
    $array[$key] = 0;
}

$endTime = microtime(true);

echo 'Inserting ', $size, ' evil elements took ', $endTime - $startTime, ' seconds', "\n";

$startTime = microtime(true);

$array = array();
for ($key = 0, $maxKey = $size - 1; $key <= $maxKey; ++$key) {
    $array[$key] = 0;
}

$endTime = microtime(true);

echo 'Inserting ', $size, ' good elements took ', $endTime - $startTime, ' seconds', "\n";
The crucial lines are the two for loops (lines 8 and 19 of the listing, as you may already have guessed). Inserting the 2^16 evil elements takes 33 seconds on Nikita's machine, versus 0.014 seconds for the good elements.
What's the difference between good and evil elements? Notice the $maxKey calculation and the loop increment. The number of elements inserted is the same in both cases (65536), but the maximum key is 65535*65536 in the first case and 65535 in the second. The first key is 0 in both cases, but the second evil key is 65536, while the second good key is 1. And so forth: every evil key is a multiple of 65536, while the good keys are simply 0, 1, 2, and so on.
So what's special about these 'evil' values?
Nikita's explanation is a little complicated, but completely clear; it takes a few steps, but touches on some pretty low-level PHP issues, so it's an intriguing read.
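The heart of the problem can be sketched in a few lines. This is a simplified model, not PHP's actual C implementation: it assumes that an integer key serves as its own hash and that the bucket is chosen by masking the hash with the table size minus one, where the table size is a power of two. The helper names bucketIndex and bucketsUsed are hypothetical and exist only for this illustration, and the sketch assumes a 64-bit PHP build so that the largest evil key still fits in an integer.

```php
<?php
// Simplified model (an assumption, not PHP's real source code):
// integer keys hash to themselves, and the bucket is hash & (tableSize - 1).
function bucketIndex(int $key, int $tableSize): int
{
    // The mask only works when $tableSize is a power of two.
    return $key & ($tableSize - 1);
}

// Count how many distinct buckets a set of keys would land in.
function bucketsUsed(array $keys, int $tableSize): int
{
    $used = array();
    foreach ($keys as $key) {
        $used[bucketIndex($key, $tableSize)] = true;
    }
    return count($used);
}

$size = 1 << 16; // same element count as in the benchmark (2^16)

$evilKeys = array();
$goodKeys = array();
for ($i = 0; $i < $size; ++$i) {
    $evilKeys[] = $i * $size; // 0, 65536, 131072, ...
    $goodKeys[] = $i;         // 0, 1, 2, ...
}

$evilBuckets = bucketsUsed($evilKeys, $size);
$goodBuckets = bucketsUsed($goodKeys, $size);

echo 'Evil keys occupy ', $evilBuckets, " bucket(s)\n";
echo 'Good keys occupy ', $goodBuckets, " bucket(s)\n";
```

Under this model every evil key lands in bucket 0, while the good keys spread over all 65536 buckets. If colliding keys are chained inside a bucket, each evil insert has to walk past all the keys inserted before it, so the total work grows roughly with the square of the number of elements instead of linearly.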
Later in the article, Nikita discusses how this peculiarity of PHP's hashtable implementation makes, for example, a denial-of-service (DoS) attack possible (though this will disappear soon). The comments are interesting too, so check out the full article and peek under the hood of those ubiquitous PHP arrays.