
Retrieving URLs in Parallel With CURL and PHP


We recently added support for querying several Solr servers in parallel, and as part of that work we wrote a simple class that fetches a set of URLs at the same time. The cURL library (available in PHP through the curl extension) provides a multi-handle interface that does the nitty-gritty work for you, as long as you keep track of the resources. The code below is based on examples in the documentation, with a few tweaks of my own.

The code below is licensed under an MIT license. You can also download the file (gzipped).



    class Footo_Content_Retrieve_HTTP_CURLParallel
    {
        /**
         * Fetch a collection of URLs in parallel using cURL. The results are
         * returned as an associative array, with the URLs as the key and the
         * content of the URLs as the value.
         *
         * @param array<string> $addresses An array of URLs to fetch.
         * @return array<string> The content of each URL that we've been asked to fetch.
         **/
        public function retrieve($addresses)
        {
            $multiHandle = curl_multi_init();
            $handles = array();
            $results = array();
     
            foreach($addresses as $url)
            {
                $handle = curl_init($url);
                $handles[$url] = $handle;
     
                curl_setopt_array($handle, array(
                    CURLOPT_HEADER => false,
                    CURLOPT_RETURNTRANSFER => true,
                ));
     
                curl_multi_add_handle($multiHandle, $handle);
            }
     
            //execute the handles
            $result = CURLM_CALL_MULTI_PERFORM;
            $running = false;
     
            // set up and make any requests..
            while ($result == CURLM_CALL_MULTI_PERFORM)
            {
                $result = curl_multi_exec($multiHandle, $running);
            }
     
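            // wait for activity on any socket, then let cURL process it
            while($running && ($result == CURLM_OK))
            {
                // curl_multi_select() can return -1 when there is nothing to
                // wait on; pause briefly instead of spinning, then carry on
                if (curl_multi_select($multiHandle) == -1)
                {
                    usleep(100);
                }
     
                $result = CURLM_CALL_MULTI_PERFORM;
     
                // while we need to process sockets
                while ($result == CURLM_CALL_MULTI_PERFORM)
                {
                    $result = curl_multi_exec($multiHandle, $running);
                }
            }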
     
            // clean up
            foreach($handles as $url => $handle)
            {
                $results[$url] = curl_multi_getcontent($handle);
     
                curl_multi_remove_handle($multiHandle, $handle);
                curl_close($handle);
            }
     
            curl_multi_close($multiHandle);
     
            return $results;
        }
    }
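
The class above returns only the body for each URL, so a failed transfer is hard to tell apart from an empty response. As a rough sketch of one way to surface failures, the function below also reads the per-transfer result codes with `curl_multi_info_read()`. The name `fetch_all()`, the `$timeoutSeconds` parameter and the `body`/`error` return shape are assumptions made for this illustration and are not part of the original class.

```php
<?php
/**
 * Fetch URLs in parallel and report a per-URL error message.
 * fetch_all() and its return shape are illustrative names, not from the article.
 *
 * @param string[] $urls           URLs to fetch (duplicates are collapsed).
 * @param int      $timeoutSeconds Upper bound for each transfer.
 * @return array   url => array('body' => string|null, 'error' => string|null)
 */
function fetch_all(array $urls, $timeoutSeconds = 10)
{
    $multi = curl_multi_init();
    $handles = array();

    foreach (array_unique($urls) as $url) {
        $handle = curl_init($url);
        curl_setopt_array($handle, array(
            CURLOPT_HEADER => false,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT => $timeoutSeconds,
        ));
        curl_multi_add_handle($multi, $handle);
        $handles[$url] = $handle;
    }

    // Drive the transfers until none are running or cURL reports a failure.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($status === CURLM_OK && $running > 0) {
            // Wait for socket activity; on -1 back off briefly instead of spinning.
            if (curl_multi_select($multi, 1.0) === -1) {
                usleep(1000);
            }
        }
    } while ($running > 0 && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    // Each finished transfer leaves a message carrying its cURL result code.
    $errors = array();
    while (($info = curl_multi_info_read($multi)) !== false) {
        if ($info['result'] !== CURLE_OK) {
            $url = array_search($info['handle'], $handles, true);
            $errors[$url] = curl_strerror($info['result']);
        }
    }

    $results = array();
    foreach ($handles as $url => $handle) {
        $failed = isset($errors[$url]);
        $results[$url] = array(
            'body'  => $failed ? null : curl_multi_getcontent($handle),
            'error' => $failed ? $errors[$url] : null,
        );
        // Detach the handle; it is released once $handles goes out of scope.
        curl_multi_remove_handle($multi, $handle);
    }
    curl_multi_close($multi);

    return $results;
}

// Usage sketch (the host names are placeholders):
// $results = fetch_all(array(
//     'http://solr1.example.com:8983/solr/select?q=php',
//     'http://solr2.example.com:8983/solr/select?q=php',
// ), 5);
```

Collapsing duplicate URLs keeps the URL-keyed array from silently overwriting one handle with another, and the timeout keeps one slow server from stalling the whole batch.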

 


Source: http://e-mats.org/2010/01/retrieving-urls-in-parallel-with-curl-and-php/
