Retrieving URLs in Parallel With CURL and PHP

We recently added support for querying Solr servers in parallel, and as part of that work we wrote a simple class that queries several servers at the same time. The cURL library (available through a PHP extension) provides an abstraction layer, the curl_multi_* functions, that does the nitty-gritty work for you, as long as you keep track of the handles. The code below is based on examples in the documentation, with a few tweaks of my own.

The code below is licensed under an MIT license. You can also download the file (gzipped).



    class Footo_Content_Retrieve_HTTP_CURLParallel
    {
        /**
         * Fetch a collection of URLs in parallel using cURL. The results are
         * returned as an associative array, with the URLs as the key and the
         * content of the URLs as the value.
         *
         * @param array<string> $addresses An array of URLs to fetch.
         * @return array<string> The content of each URL that we've been asked to fetch.
         **/
        public function retrieve($addresses)
        {
            $multiHandle = curl_multi_init();
            $handles = array();
            $results = array();
     
            foreach($addresses as $url)
            {
                $handle = curl_init($url);
                $handles[$url] = $handle;
     
                curl_setopt_array($handle, array(
                    CURLOPT_HEADER => false,
                    CURLOPT_RETURNTRANSFER => true,
                ));
     
                curl_multi_add_handle($multiHandle, $handle);
            }
     
            //execute the handles
            $result = CURLM_CALL_MULTI_PERFORM;
            $running = false;
     
            // set up and make any requests..
            while ($result == CURLM_CALL_MULTI_PERFORM)
            {
                $result = curl_multi_exec($multiHandle, $running);
            }
     
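            // wait for activity on any socket, then let cURL process it
            while ($running && ($result == CURLM_OK))
            {
                // curl_multi_select() can return -1 immediately on some
                // platforms; pause briefly and still call curl_multi_exec(),
                // otherwise this loop spins without making progress
                if (curl_multi_select($multiHandle) === -1)
                {
                    usleep(100);
                }
     
                // process the sockets until cURL has nothing more to do for now
                do
                {
                    $result = curl_multi_exec($multiHandle, $running);
                }
                while ($result == CURLM_CALL_MULTI_PERFORM);
            }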
     
            // clean up
            foreach($handles as $url => $handle)
            {
                $results[$url] = curl_multi_getcontent($handle);
     
                curl_multi_remove_handle($multiHandle, $handle);
                curl_close($handle);
            }
     
            curl_multi_close($multiHandle);
     
            return $results;
        }
    }
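
The class above returns whatever content each handle holds, so a failed transfer is hard to tell apart from an empty response. As a rough sketch of one way to surface failures, the function below drains `curl_multi_info_read()` after the transfers finish and records an error message per URL. The function name `fetchAllWithStatus()`, the `content` and `error` result keys, and the 10 second timeout are assumptions made for illustration; they are not part of the original class.

```php
<?php
/**
 * Sketch: fetch URLs in parallel and report per-URL failures.
 * fetchAllWithStatus() and the result keys are hypothetical names.
 *
 * @param string[] $addresses      URLs to fetch.
 * @param int      $timeoutSeconds Upper bound per transfer (assumed value).
 * @return array URL => array('content' => ?string, 'error' => ?string)
 */
function fetchAllWithStatus(array $addresses, int $timeoutSeconds = 10): array
{
    $multiHandle = curl_multi_init();
    $handles = array();
    $results = array();

    // URLs are used as array keys, so keep only one handle per distinct URL
    foreach (array_unique($addresses) as $url) {
        $handle = curl_init($url);
        curl_setopt_array($handle, array(
            CURLOPT_HEADER => false,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT => $timeoutSeconds,
        ));
        curl_multi_add_handle($multiHandle, $handle);
        $handles[$url] = $handle;
        $results[$url] = array('content' => null, 'error' => null);
    }

    // Drive all transfers until none are running or the multi handle fails
    $running = 0;
    do {
        $status = curl_multi_exec($multiHandle, $running);
        if ($status === CURLM_OK && $running > 0
            && curl_multi_select($multiHandle, 1.0) === -1) {
            usleep(1000); // select failed; pause instead of spinning
        }
    } while ($running > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    // Each finished transfer leaves a message carrying its CURLE_* result code
    while ($info = curl_multi_info_read($multiHandle)) {
        $url = array_search($info['handle'], $handles, true);
        if ($url !== false && $info['result'] !== CURLE_OK) {
            $results[$url]['error'] = curl_strerror($info['result']);
        }
    }

    // Collect the content of successful transfers and clean up
    foreach ($handles as $url => $handle) {
        if ($results[$url]['error'] === null) {
            $results[$url]['content'] = curl_multi_getcontent($handle);
        }
        curl_multi_remove_handle($multiHandle, $handle);
        curl_close($handle);
    }
    curl_multi_close($multiHandle);

    return $results;
}
```

A caller would pass the same kind of URL list as for `retrieve()` and then check each entry: when `error` is null, `content` holds the response body; otherwise `error` holds cURL's description of what went wrong. A transfer that was still unfinished when the loop stopped would show neither an error nor complete content, so treat this as a starting point rather than a complete error-handling strategy.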

 


Source: http://e-mats.org/2010/01/retrieving-urls-in-parallel-with-curl-and-php/
