Crunch Time: 10 Best Compression Algorithms

Take a look at these compression algorithms that reduce the file size of your data to make them more convenient and efficient.

By Leah Fainchtein Buenavida · May 28, 2020 · Analysis

Data compression is the process of reducing file size while retaining the same data or a close approximation of it. This is accomplished by eliminating unnecessary data or by reformatting data for greater efficiency.

When compressing data, you can use either lossy or lossless methods. Lossy methods permanently discard data, while lossless methods preserve all of the original data. Which type you use depends on how much fidelity you need from your files.
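
As a quick illustration of the difference, the following Python sketch mimics a lossy step by rounding values away, then uses the standard-library zlib module to show that a lossless round trip restores every byte exactly (the sample values are made up for illustration):

import zlib

samples = [0.12, 0.47, 0.51, 0.93]

# Lossy: rounding shrinks the representation, but the original values
# cannot be recovered from the rounded ones.
lossy = [round(s, 1) for s in samples]
print(lossy == samples)                  # False - precision was permanently discarded

# Lossless: compressing and decompressing restores the data exactly.
raw = b"abbadabba" * 1000
packed = zlib.compress(raw)
print(zlib.decompress(packed) == raw)    # True - every byte is preserved
print(len(raw), "->", len(packed))       # the repetitive input shrinks dramatically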

In this article, you will discover six different types of lossless data compression algorithms, and four image and video compression algorithms based on deep learning.

6 Lossless Data Compression Algorithms

Lossless compression algorithms are typically used for archival or other high fidelity purposes. These algorithms enable you to reduce file size while ensuring that files can be fully restored to their original state if need be. 

There is a variety of algorithms you can choose from when you need to perform lossless compression. Below are six commonly used ones.

1. LZ77

LZ77, released in 1977, is the basis of many other lossless compression algorithms. It uses a “sliding window” method, in which LZ77 maintains a dictionary of triples representing:

  • Offset—the distance from the beginning of the file to the start of the matched phrase.

  • Run-length—the number of characters that make up a phrase.

  • Deviating characters—markers indicating a new phrase: a signal that the phrase matches an earlier phrase, together with the character that differs.

As a file is parsed, the dictionary is dynamically updated to reflect the compressed data contents and size. For example, a file containing the string "abbadabba" is compressed to the dictionary entry "abb(0,1,'d')(0,3,'a')". You can see a breakdown of this process below.

Position   Symbol   Output
0          a        a
1          b        b
2          b        b
3          a        (0,1,'d')
4          d
5          a        (0,3,'a')
6          b
7          b
8          a

In this example, the substitutions are slightly larger than the input, but with realistic input (which is much longer) they are typically considerably smaller than the data they replace.
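
To make the triple notation concrete, here is a minimal Python sketch (an illustration of the notation used in this example, not a production LZ77 codec) that expands the dictionary entry above back into the original string:

def lz77_decode(tokens):
    # Expand literals and (offset, run-length, deviating character) triples.
    # The offset is measured from the beginning of the output, matching the
    # convention used in the example above.
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            offset, length, deviating = token
            for k in range(length):
                out.append(out[offset + k])   # copy the matched phrase
            out.append(deviating)             # then the character that differs
        else:
            out.append(token)                 # plain literal
    return "".join(out)

# The dictionary entry from the example: abb(0,1,'d')(0,3,'a')
print(lz77_decode(["a", "b", "b", (0, 1, "d"), (0, 3, "a")]))   # abbadabba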

2. LZR

LZR, released in 1981 by Michael Rodeh, modifies LZ77. It is designed as a linear-time alternative to LZ77, but its pointers can reference any offset already seen in the file, which can require a significant amount of memory and often makes LZ77 the better option.

3. LZSS

Lempel-Ziv-Storer-Szymanski (LZSS), released in 1982, is an algorithm that improves on LZ77. It does this by including a method that detects whether a substitution decreases file size. If it doesn’t, the input is left in the original form. LZSS also eliminates the use of deviating characters, only using offset-length pairs. This method is commonly used for archive formats, such as RAR, or for compression of network data. 
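
A toy Python sketch of the idea follows, assuming a minimum match length of three characters as the "does the substitution actually save space?" test (real implementations such as RAR use more elaborate encodings):

MIN_MATCH = 3   # a back-reference shorter than this would not save space

def lzss_encode(data, window=4096):
    # Emit (distance, length) pairs when a long enough match exists in the
    # sliding window; otherwise leave the input character as a literal.
    i, out = 0, []
    while i < len(data):
        best_dist, best_len = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_dist, best_len = i - j, length
        if best_len >= MIN_MATCH:
            out.append((best_dist, best_len))   # substitution shrinks the output
            i += best_len
        else:
            out.append(data[i])                 # substitution not worth it
            i += 1
    return out

print(lzss_encode("abbadabbaabbadabba"))
# ['a', 'b', 'b', 'a', 'd', (5, 4), (9, 9)]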

4. DEFLATE

DEFLATE, released in 1993 by Phil Katz, combines an LZ77 or LZSS preprocessing stage with Huffman coding. Huffman coding, developed in 1952, is an entropy encoding method that assigns shorter codes to more frequent characters.
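
Python's standard-library zlib module is a DEFLATE implementation, so a quick sketch of the algorithm in use looks like this (compression levels 1-9 trade speed for how hard the LZ77 stage searches for matches):

import zlib

data = b"DEFLATE pairs LZ77-style matching with Huffman coding. " * 200

fast = zlib.compress(data, 1)    # level 1: fastest, weakest matching
best = zlib.compress(data, 9)    # level 9: slowest, best compression
print(len(data), "->", len(fast), "(level 1) /", len(best), "(level 9)")

# As with any lossless method, the round trip is exact.
assert zlib.decompress(best) == data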

5. LZMA

Lempel-Ziv Markov chain Algorithm (LZMA), released in 1998, is a modification of LZ77 designed for the 7-Zip archiver and its .7z format. It uses a chain compression method that applies the modified LZ77 algorithm at the bit rather than byte level. The output from this compression is then processed using range coding (a form of arithmetic coding) for further compression. Depending on the exact implementation, other compression steps may also be performed.
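
Python's standard-library lzma module exposes the original LZMA stream through the legacy .lzma container (FORMAT_ALONE), so a minimal sketch looks like this:

import lzma

data = b"LZMA applies LZ77-style matching at the bit level. " * 200

# FORMAT_ALONE is the legacy .lzma container used by the original LZMA;
# presets 0-9 trade speed for compression ratio.
packed = lzma.compress(data, format=lzma.FORMAT_ALONE, preset=6)
print(len(data), "->", len(packed))

assert lzma.decompress(packed) == data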

6. LZMA2

LZMA2, released in 2009, is a refinement of LZMA. It improves the performance of LZMA with greater multithreading capabilities and improved handling of incompressible data. 
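
In Python's lzma module, the default .xz container (FORMAT_XZ) wraps the LZMA2 filter, so a sketch only needs to make the filter chain explicit. Note that the module itself is single-threaded; the multithreading gains mentioned above require a tool such as the xz command-line utility (xz -T0):

import lzma

data = b"LZMA2 adds better handling of incompressible chunks. " * 200

packed = lzma.compress(
    data,
    format=lzma.FORMAT_XZ,                            # .xz container
    filters=[{"id": lzma.FILTER_LZMA2, "preset": 9}],
)
print(len(data), "->", len(packed))

assert lzma.decompress(packed) == data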

4 Image and Video Compression Algorithms Based on Deep Learning

In addition to the static algorithms introduced above, there are several deep learning-based algorithms you can use.

1. Multi-Layer Perceptron (MLP)-Based Compression

An MLP is a neural network that uses multiple layers of neurons to take in, process, and output data. It can be applied to dimensionality reduction and data compression tasks. The first MLP-based algorithm was developed in 1988 and integrated the existing processes of:

  • Binary coding—standard two-symbol coding.

  • Quantization—constraint of input from a continuous set to a discrete set.

  • Spatial domain transformation—pixel-by-pixel changes to data.

MLP algorithms used outputs from the above processes in a decomposition neural network to determine optimal binary code combinations. Later, this method was refined with predictive techniques which enabled accurate approximation of data based on neighboring data via backpropagation. 
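
As a rough NumPy sketch of two of the building blocks listed above (uniform quantization and binary coding only, not the 1988 MLP algorithm itself, with made-up values for illustration):

import numpy as np

pixels = np.array([0.03, 0.21, 0.48, 0.77, 0.95])    # continuous intensities in [0, 1]

levels = 4                                            # a 2-bit quantizer
indices = np.clip((pixels * levels).astype(int), 0, levels - 1)
codes = [format(i, "02b") for i in indices]           # binary coding of each level
approx = (indices + 0.5) / levels                     # the decoder's approximation

print(indices.tolist())   # [0, 0, 1, 3, 3]
print(codes)              # ['00', '00', '01', '11', '11']
print(approx.tolist())    # [0.125, 0.125, 0.375, 0.875, 0.875] - close, not exact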

2. DeepCoder - Deep Neural Network Based Video Compression

DeepCoder is a Convolutional Neural Network (CNN)-based framework that presents an alternative to traditional video compression techniques. The model uses separate CNNs for the predictive and residual signals. It encodes the feature maps into a binary stream using scalar quantization and a traditional entropy coder, Huffman encoding. The model is reported to provide performance superior to the well-known H.264/AVC video coding standard. See the full paper.

3. Convolutional Neural Network (CNN)-Based Compression

CNNs are layered neural networks, often used for image recognition and feature detection. When applied to compression, these networks use convolution operations to calculate the correlation between neighboring pixels.

CNNs show better compression results than the MLP-based algorithms, with improved super-resolution performance and artifact reduction. Additionally, CNN-based compression improves the quality of JPEG images by increasing the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). CNN-based compression can also achieve the performance of the High-Efficiency Video Coding (HEVC) standard by using entropy estimation.
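
For reference, PSNR (the metric mentioned above) is straightforward to compute as 10 * log10(MAX^2 / MSE), where higher values mean the reconstruction is closer to the original. The NumPy sketch below uses synthetic images only to show the formula:

import numpy as np

def psnr(original, reconstructed, max_value=255.0):
    # Peak signal-to-noise ratio in dB; higher means less distortion.
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Synthetic 8-bit image and a slightly distorted "decompressed" version.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noisy = np.clip(img.astype(int) + rng.integers(-3, 4, size=img.shape), 0, 255)
print(round(psnr(img, noisy), 2), "dB")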

4. Generative Adversarial Network (GAN)-Based Compression

GANs are a form of neural network that use two networks in competition to produce more accurate analyses and predictions. GAN-based compression algorithms were first developed in 2017. These algorithms can compress files up to two and a half times smaller than other commonly used methods, such as JPEG or WebP. 

You can use GAN-based methods for real-time compression through the use of parallel processing. This method works by compressing images based on their most relevant features. When decoded, images are then reconstructed from predictions made from these features. In comparison with CNN-based compression, GAN-based compression can produce higher-quality images because the adversarial loss pushes reconstructions toward realistic-looking output.

Conclusion

Compression algorithms can help you optimize file size. Different algorithms provide different results. This article reviewed six static algorithms for lossless compression, and four algorithms for deep learning usage. However, if you can’t find the right algorithm here, you can take a look at this guide and refine your search. There is no shortage of algorithms, but you need to be specific when looking for the right algorithm for your project.

Published at DZone with permission of Leah Fainchtein Buenavida. See the original article here.

Opinions expressed by DZone contributors are their own.
