An S3 File Bucket Downloader Written in Ruby
Today I wanted to download files from a website that, as I discovered, stores all of its files in S3. When I requested the website root, I realized that the response was simply the result of an S3 ListBucket API call. For instance:
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>foo.com</Name>
  <Prefix/>
  <Marker/>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>true</IsTruncated>
  <Contents>
    <Key>file/1</Key>
    <LastModified>2011-06-09T06:29:02.000Z</LastModified>
    <ETag>"5cb3930839817ff4a5c1ddf08e3fea1e"</ETag>
    <Size>1440231</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>file/2</Key>
    <LastModified>2011-06-09T06:29:18.000Z</LastModified>
    <ETag>"96fdc94d14b6d9817f80ac1e9e2049b4"</ETag>
    <Size>1310</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>
To download the files more quickly, I wrote the following Ruby program, which fetches every file that appears in that listing. I hope it is useful to others:
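require 'net/http'
require 'rexml/document'
require 'uri'
require 'fileutils'

baseurl = 'foo.com'

# Fetch the bucket listing (the XML served at the site root) as a string
xml_data = Net::HTTP.get_response(URI.parse("http://" + baseurl)).body

# Parse the listing so the object keys can be extracted
doc = REXML::Document.new(xml_data)

# Make sure the output directory exists before writing into it
FileUtils.mkdir_p("images")

# Reuse a single HTTP connection for all downloads
Net::HTTP.start(baseurl) do |http|
  doc.elements.each('ListBucketResult/Contents/Key') do |ele|
    puts "Downloading " + ele.text
    resp = http.get("/" + ele.text)
    # Flatten the key into a file name; the .jpg suffix assumes the objects are JPEG images
    File.open("images/" + ele.text.gsub("/", "_") + ".jpg", "wb") { |file| file.write(resp.body) }
  end
end

puts "Done"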
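One limitation is worth noting: the sample listing reports IsTruncated as true with MaxKeys set to 1000, so a single request returns only the first page of keys. The following is a minimal sketch of how the remaining pages could be collected, assuming the bucket accepts the standard `marker` query parameter of the ListBucket call and that the last key of each page can serve as the next marker. The helper names `parse_listing`, `list_all_keys`, and `http_fetcher` are hypothetical and not part of the original program.

```ruby
require 'net/http'
require 'rexml/document'
require 'uri'

# Parse one ListBucketResult page into [keys, truncated?].
def parse_listing(xml)
  doc = REXML::Document.new(xml)
  keys = []
  doc.elements.each('ListBucketResult/Contents/Key') { |ele| keys << ele.text }
  truncated = doc.elements['ListBucketResult/IsTruncated']&.text == 'true'
  [keys, truncated]
end

# Collect the keys from every page. `fetch_page` is any callable that takes
# a marker (nil for the first page) and returns the listing XML as a string.
def list_all_keys(fetch_page)
  all_keys = []
  marker = nil
  loop do
    keys, truncated = parse_listing(fetch_page.call(marker))
    all_keys.concat(keys)
    # Stop when the listing is complete (or a page is unexpectedly empty)
    break unless truncated && !keys.empty?
    marker = keys.last # assumption: the last key is a valid marker for the next page
  end
  all_keys
end

# Build a fetcher that requests http://<host>/?marker=<marker> over plain HTTP.
def http_fetcher(host)
  lambda do |marker|
    uri = URI::HTTP.build(host: host, path: '/')
    uri.query = URI.encode_www_form(marker: marker) if marker
    Net::HTTP.get(uri)
  end
end

# Example usage (performs real HTTP requests, so it is left commented out):
# keys = list_all_keys(http_fetcher('foo.com'))
# puts "Found #{keys.size} keys"
```

The resulting list of keys could then be fed to the same download loop used above. Passing the fetcher in as a callable keeps the paging logic independent of the network, which makes it easy to try against canned XML.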
Published at DZone with permission of Rodrigo De Castro, DZone MVB. See the original article here.