DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Simplify Authorization in Ruby on Rails With the Power of Pundit Gem
  • The Importance of Data Compression in Oracle Databases
  • Two-Pass Huffman in Blocks of 2 Symbols: Golang Implementation
  • Buh-Bye, Webpack and Node.js; Hello, Rails and Import Maps

Trending

  • Native SQL in Java Without JDBC Boilerplate — Meet Ujorm3
  • Combining Temporal and Kafka for Resilient Distributed Systems
  • Agentic AI Has an Observability Blind Spot Nobody Is Talking About
  • Advanced Error Handling and Retry Patterns in Enterprise REST Integrations
  1. DZone
  2. Coding
  3. Frameworks
  4. Gzip a File in Ruby

Gzip a File in Ruby

Learn how to make your files as small as possible to increase effieciency using the zlib module built into Ruby, and some controls you can adjust.

By 
Phil Nash user avatar
Phil Nash
·
Updated Jan. 14, 21 · Tutorial
Likes (4)
Comment
Save
Tweet
Share
8.7K Views

Join the DZone community and get the full member experience.

Join For Free

At the start of the year, I looked into how to better compress the output of a Jekyll site. I’ll write up the results to that soon. For now, here’s how to gzip a file using Ruby.

Enter zlib

Contained within the Ruby standard library is the Zlib module which gives access to the underlying zlib library. It is used to read and write files in gzip format. Here’s a small program to read in a file, compress it and save it as a gzip file.

require "zlib"

def compress_file(file_name)
  zipped = "#{file_name}.gz"

  Zlib::GzipWriter.open(zipped) do |gz|
    gz.write IO.binread(file_name)
  end
end

You can use any IO or IO-like object with Zlib::GzipWriter.

We can enhance this by adding the file's original name into the compressed output. Also, it can be beneficial to set the file’s modified time to the same as the original, particularly if you are using the file with nginx’s gzip_static module.


require "zlib"

def compress_file(file_name)
  zipped = "#{file_name}.gz"

  Zlib::GzipWriter.open(zipped) do |gz|
    gz.mtime = File.mtime(file_name)
    gz.orig_name = file_name
    gz.write IO.binread(file_name)
  end
end


Just to see how well this is working, we can print some statistics from the compression.


require "zlib"

def print_stats(file_name, zipped)
  original_size = File.size(file_name)
  zipped_size = File.size(zipped)
  percentage_difference = ((original_size - zipped_size).to_f/original_size)*100
  puts "#{file_name}: #{original_size}"
  puts "#{zipped}: #{zipped_size}"
  puts "difference - #{'%.2f' % percentage_difference}%"
end

def compress_file(file_name)
  zipped = "#{file_name}.gz"

  Zlib::GzipWriter.open(zipped) do |gz|
    gz.mtime = File.mtime(file_name)
    gz.orig_name = file_name
    gz.write IO.binread(file_name)
  end

  print_stats(file_name, zipped)
end


Now you can pick a file and compress it with Ruby and gzip. For this quick test, I chose the uncompressed, unminified jQuery version 3.3.1. The results were:


./jquery.js: 271751
./jquery.js.gz: 80669
difference - 70.32%


Pretty good, but we can squeeze more out of it if we try a little harder.

Compression Levels

Zlib::GzipWriter takes a second argument to open , which is the level of compression zlib applies. 0 is no compression and 9 is the best possible compression and there’s a tradeoff as you progress from 0 to 9 between speed and compression. The default compression is a good compromise between speed and compression, but if time isn’t a worry for you then you might as well make the file as small as possible, especially if you want to serve it over the web. To set the compression level to 9 you can just use the integer, but Zlib has a convenient constant for it: Zlib::BEST_COMPRESSION.

Changing the line with Zlib::GzipWriter in it to:


  Zlib::GzipWriter.open(zipped, Zlib::BEST_COMPRESSION) do |gz|


and running the file against jQuery again gives you:


./jquery.js: 271751
./jquery.js.gz: 80268
difference - 70.46%


A difference of 0.14 percentage points! Ok, not a huge win, but if the time to generate the file doesn’t matter then you might as well. And the difference is greater on even larger files.

Streaming gzip

There’s one last thing you might want to add. If you are compressing really large files, loading them entirely into memory isn’t the most efficient way to work. Gzip is a streaming format though, so you can write chunks at a time to it. In this case, you just need to read the file you are compressing incrementally and write the chunks to the GzipWriter. Here’s what it would look like to read the file in chunks of 16kb:

def compress_file(file_name)
  zipped = "#{file_name}.gz"

  Zlib::GzipWriter.open(zipped, Zlib::BEST_COMPRESSION) do |gz|
    gz.mtime = File.mtime(file_name)
    gz.orig_name = file_name
    File.open(file_name) do |file|
      while chunk = file.read(16*1024) do
        gz.write(chunk)
      end
    end
  end
end


gzip in Ruby

So that is how to gzip a file using Ruby and zlib, as well as how to add extra information and control the compression level to balance speed against final filesize. All of this went into a gem I created recently to use maximum gzip compression on the output of a Jekyll site. The gem is called jekyll-gzip and I’ll be writing more about it, as well as other tools that are better than the zlib implementation of gzip, soon.


Header icons: Diamond by Edward Boatman and vice by Daniel Luft from the Noun Project.


Gzip a file in Ruby was originally published at philna.sh on Feb 25, 2018.

Data compression Ruby (programming language)

Published at DZone with permission of Phil Nash. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Simplify Authorization in Ruby on Rails With the Power of Pundit Gem
  • The Importance of Data Compression in Oracle Databases
  • Two-Pass Huffman in Blocks of 2 Symbols: Golang Implementation
  • Buh-Bye, Webpack and Node.js; Hello, Rails and Import Maps

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook