LZ* compression algorithms

by Oliver Hookins · May. 26, 14 · Interview
While implementing a small, naive log aggregation tool, I had a moment to consider which type of compression to use on the log files at rest. Beyond compression efficiency and how much space the files take up, the main implication of this choice is how the files can be used once they are stored. If you are using gzip/gunzip with awk or other simple command-line tools, or scripts written in common languages like Python or Ruby, gzip compression poses very little problem, since in most cases you are processing one file at a time.

If you want to use a distributed computation system like Hadoop, on the other hand, this can be a problem. Gzip files can’t be split, so only a single mapper can work on a given file at a time. If your files are small this may not matter, but if they are large it can. Other tools such as Impala will outright refuse to work on gzipped data, which may further limit your options. I started looking into alternative compression algorithms that these tools do support, and one name that kept coming up was LZO. Looking it up on Wikipedia doesn’t offer much insight into what it actually is. Since I was implementing my aggregation tool in Golang, I checked the standard library and found only LZW compression.

Are LZO and LZW the same? Are they related? Is one better than the other? I found very little help in the Google results (but perhaps that is just me). In the end I implemented gzip compression in my program, but little did I know that even gzip belongs to the same family as these LZ-prefixed algorithms: it is built on DEFLATE, which combines LZ77 with Huffman coding.

I just started watching a new series of videos on the Google Developers YouTube channel called Compressor Head. Episode 2 covers the LZ compression family: how these algorithms work, in easy-to-understand terms, and which programs we know today descend from the fundamental algorithms LZ77 and LZ78. I highly recommend watching them!


Published at DZone with permission of Oliver Hookins, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
