An S3 File Bucket Downloader Written in Ruby

by Rodrigo De Castro · Jun. 07, 12 · Interview
Today I wanted to download files from a website that, as I found out, stores all of its files in S3. When I accessed the website root, I realized that the page was just the response of an S3 ListBucket API call. For instance:

    <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">  
       <Name>foo.com</Name>  
       <Prefix/>  
       <Marker/>  
       <MaxKeys>1000</MaxKeys>  
       <IsTruncated>true</IsTruncated>  
       <Contents>  
          <Key>file/1</Key>  
          <LastModified>2011-06-09T06:29:02.000Z</LastModified>  
          <ETag>"5cb3930839817ff4a5c1ddf08e3fea1e"</ETag>  
          <Size>1440231</Size>  
          <StorageClass>STANDARD</StorageClass>  
       </Contents>  
       <Contents>  
          <Key>file/2</Key>  
          <LastModified>2011-06-09T06:29:18.000Z</LastModified>  
          <ETag>"96fdc94d14b6d9817f80ac1e9e2049b4"</ETag>  
          <Size>1310</Size>  
          <StorageClass>STANDARD</StorageClass>  
       </Contents>  
    </ListBucketResult>  
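
As a minimal sketch of how a listing like this can be read, the snippet below parses a trimmed copy of the response with REXML and reports each key, its size, and whether the listing was truncated. The helper name `summarize_listing` and the constant `SAMPLE_LISTING` are made up for this illustration.

```ruby
require 'rexml/document'

# Trimmed copy of the listing shown above (illustrative sample data).
SAMPLE_LISTING = <<~XML
  <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Name>foo.com</Name>
    <MaxKeys>1000</MaxKeys>
    <IsTruncated>true</IsTruncated>
    <Contents>
      <Key>file/1</Key>
      <Size>1440231</Size>
    </Contents>
    <Contents>
      <Key>file/2</Key>
      <Size>1310</Size>
    </Contents>
  </ListBucketResult>
XML

# summarize_listing is a hypothetical helper, not part of any library.
# It returns the listed keys with their sizes and the IsTruncated flag.
def summarize_listing(xml)
  doc = REXML::Document.new(xml)
  entries = []
  doc.elements.each('ListBucketResult/Contents') do |contents|
    entries << {
      key: contents.elements['Key'].text,
      size: contents.elements['Size'].text.to_i
    }
  end
  truncated = doc.elements['ListBucketResult/IsTruncated'].text == 'true'
  { entries: entries, truncated: truncated }
end

summary = summarize_listing(SAMPLE_LISTING)
summary[:entries].each { |e| puts "#{e[:key]} (#{e[:size]} bytes)" }
puts "More keys remain on the server" if summary[:truncated]
```

When `IsTruncated` is `true`, the response holds only part of the bucket (at most `MaxKeys` entries), so a single request does not necessarily list every file.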

To download the files more quickly, I wrote the following Ruby program, which fetches every file listed in that response. I hope it is useful to others:

    require 'net/http'
    require 'rexml/document'
    require 'fileutils'

    baseurl = 'foo.com'

    # Fetch the bucket listing (the ListBucketResult XML) as a string
    xml_data = Net::HTTP.get_response(URI.parse("http://" + baseurl)).body

    # Parse the listing and download every key it contains
    doc = REXML::Document.new(xml_data)
    FileUtils.mkdir_p("images")  # make sure the output directory exists
    Net::HTTP.start(baseurl) do |http|
      doc.elements.each('ListBucketResult/Contents/Key') do |ele|
        puts "Downloading " + ele.text
        resp = http.get("/" + ele.text)
        # Flatten the key into one file name; the .jpg extension assumes the files are JPEG images
        File.open("images/" + ele.text.gsub("/", "_") + ".jpg", "wb") { |file|
          file.write(resp.body)
        }
      end
    end
    puts "Done"
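
The sample listing has `IsTruncated` set to `true`, and the program above reads only the first page of results. A possible extension, sketched here under the assumption that the bucket accepts the S3 `marker` query parameter on listing requests, keeps requesting pages that start after the last key seen. The helper `each_bucket_key` and the canned `PAGES` hash are hypothetical; the hash stands in for the network so the sketch runs offline.

```ruby
require 'rexml/document'
require 'uri'

# each_bucket_key is a hypothetical helper. `fetch_page` is any callable that
# takes a request path and returns the listing XML as a string.
def each_bucket_key(fetch_page)
  marker = nil
  loop do
    path = marker ? "/?marker=" + URI.encode_www_form_component(marker) : "/"
    doc = REXML::Document.new(fetch_page.call(path))
    keys = []
    doc.elements.each('ListBucketResult/Contents/Key') { |ele| keys << ele.text }
    keys.each { |key| yield key }
    truncated = doc.elements['ListBucketResult/IsTruncated']
    # Stop when the server reports a complete listing (or returns no keys)
    break unless truncated && truncated.text == 'true' && !keys.empty?
    marker = keys.last  # the next page starts after the last key seen
  end
end

# Stand-in for the network: two canned pages keyed by request path.
NS = 'xmlns="http://s3.amazonaws.com/doc/2006-03-01/"'
PAGES = {
  "/" => "<ListBucketResult #{NS}><IsTruncated>true</IsTruncated>" \
         "<Contents><Key>file/1</Key></Contents></ListBucketResult>",
  "/?marker=file%2F1" => "<ListBucketResult #{NS}><IsTruncated>false</IsTruncated>" \
                         "<Contents><Key>file/2</Key></Contents></ListBucketResult>"
}

each_bucket_key(->(path) { PAGES.fetch(path) }) { |key| puts key }
```

Against a real bucket, the callable could be something like `->(path) { http.get(path).body }` inside a `Net::HTTP.start(baseurl)` block, with the download step placed in the block passed to `each_bucket_key`.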

Published at DZone with permission of Rodrigo De Castro, DZone MVB. See the original article here.
