How to Push Assets to S3 with Rake: Versioning and Cache Expiration

A while ago I wrote about how we package and push Rails assets to Amazon S3. We version assets with the Git hash: varying the asset URL on every deploy lets us set an indefinite cache expiration and works well with a CDN. That post included a Rake task that deleted any old assets and replaced them with newer ones. It’s time for a revision with some new features.
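
The idea in one line: the URL, not the file contents, is the cache key. Here’s a hypothetical helper to illustrate the scheme; the Git hash constant and CDN host below are made up for the example:

    # Hypothetical illustration of the versioning scheme: every deploy changes
    # the Git hash, so every deploy yields different asset URLs, and a CDN or
    # browser can cache each URL forever without revalidating.
    GIT_HASH = `git rev-parse --short HEAD`.chomp
    CDN_HOST = "https://d1234abcd.cloudfront.net" # hypothetical CloudFront distribution

    def versioned_asset_url(path)
      "#{CDN_HOST}/assets/#{GIT_HASH}/#{path}"
    end

    versioned_asset_url("application.css")
    # => "https://d1234abcd.cloudfront.net/assets/abc1234/application.css"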

The first problem we solved is how long it takes to sync contents between a local folder and S3. The old task fetched the entire bucket file list, which grew quite a bit over time. The S3 API supports a prefix option, which restricts the listing to keys under a given path — in our case, assets/.

 

    s3i.incrementally_list_bucket(to, prefix: "assets/") do |response|
      response[:contents].each do |existing_object|
        ...
      end
    end

 

The second issue is asset rollback. We deploy assets to S3, then code to Heroku, and the asset deployment deletes the old assets. That leaves a small window in which old code runs against new assets, which is obviously not okay. In practice we’re saved by CloudFront, which keeps a cache for extended periods of time. A better solution is to keep two copies of the assets online: current and previous. The code identifies the copy to preserve by comparing the :last_modified field of the S3 objects.

Here’s the task with some shortcuts; the complete task is available as a gist.
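
The task below references s3i and logger without defining them — those are the shortcuts. The method names incrementally_list_bucket, put, and delete match RightAws::S3Interface from the right_aws gem, so a minimal sketch of those helpers might look like this; the environment variable names are assumptions:

    require 'right_aws'
    require 'logger'

    # Hypothetical definitions of the shortcuts the task relies on.
    def s3i
      @s3i ||= RightAws::S3Interface.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
    end

    def logger
      @logger ||= Logger.new(STDOUT)
    end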

 

 

    # Uploads compiled assets to S3 under assets/<git hash> and deletes stale
    # assets, keeping the most recent previous set around for rollback.
    task :uploadToS3, [ :to ] => :environment do |t, args|
      from = File.join(Rails.root, 'public/assets')
      to = args[:to]
      hash = `git rev-parse --short HEAD`.chomp # backticks return a String (possibly empty), never nil

      # List only the keys under assets/, remembering every object and the
      # first-seen last-modified timestamp for each asset hash.
      logger.info("[#{Time.now}] fetching keys from #{to}")
      existing_objects_hash = {}
      existing_assets_hash = {}
      s3i.incrementally_list_bucket(to, prefix: "assets/") do |response|
        response[:contents].each do |existing_object|
          existing_objects_hash[existing_object[:key]] = existing_object
          previous_asset_hash = existing_object[:key].split('/')[1]
          existing_assets_hash[previous_asset_hash] ||= DateTime.parse(existing_object[:last_modified])
        end
      end

      # Pick the most recently modified existing asset hash; that copy stays
      # online next to the new one so a code rollback still finds its assets.
      logger.info("[#{Time.now}] #{existing_assets_hash.count} existing asset(s)")
      previous_hash = nil
      existing_assets_hash.each_pair do |asset_hash, last_modified|
        logger.info(" #{asset_hash} => #{last_modified}")
        previous_hash = asset_hash unless (previous_hash and existing_assets_hash[previous_hash] > last_modified)
      end
      logger.info("[#{Time.now}] keeping #{previous_hash}") if previous_hash

      # Upload everything under public/assets to assets/<git hash>/..., marking
      # each object public-read and setting its content type.
      logger.info("[#{Time.now}] copying from #{from} to s3:#{to} @ #{hash}")
      Dir.glob(from + "/**/*").each do |entry|
        next if File::directory?(entry)
        File.open(entry) do |entry_file|
          content_options = {}
          content_options['x-amz-acl'] = 'public-read'
          content_options['content-type'] = MIME::Types.type_for(entry)[0].to_s
          key = 'assets/'
          key += (hash + '/') unless hash.empty? # a failed git call yields "", not nil
          key += entry.slice(from.length + 1, entry.length - from.length - 1)
          existing_objects_hash.delete(key)
          logger.info("[#{Time.now}]  uploading #{key}")
          s3i.put(to, key, entry_file, content_options)
        end
      end

      # Whatever is left in existing_objects_hash was not re-uploaded; delete
      # it unless it belongs to the previous asset set we are keeping.
      existing_objects_hash.keys.each do |key|
        next if previous_hash and key.start_with?("assets/#{previous_hash}/")
        logger.info("[#{Time.now}]  deleting #{key}")
        s3i.delete(to, key)
      end
    end
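
Running the task is a matter of precompiling and passing the bucket name as a task argument. The bucket name below is hypothetical (in zsh, quote the brackets):

    bundle exec rake assets:precompile
    bundle exec rake uploadToS3[my-assets-bucket]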

 

Since we’re versioning assets with a Git hash in the URL, another improvement is to set the cache expiration much further out. A URL that changes with every deploy never needs revalidating, so we can cache for a full year, the longest lifetime HTTP/1.1 recommends.

 

 

    content_options['cache-control'] = "public, max-age=#{365*24*60*60}"
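
To confirm the header actually lands on uploaded objects, a HEAD request against any asset is enough. A quick sketch, with a made-up object URL:

    require 'net/http'

    # Substitute your own bucket and asset key; this URL is hypothetical.
    uri = URI("https://my-assets-bucket.s3.amazonaws.com/assets/abc1234/application.css")
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
      http.head(uri.path)
    end
    puts response['cache-control'] # => "public, max-age=31536000"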

 


 

Published at DZone with permission of Daniel Doubrovkine.
