Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

ETL Tools: A Modern List

DZone's Guide to

ETL Tools: A Modern List

We take a look at how ETL tools have evolved over the years to incorporate the cloud and open source. See which tools work best in which situations.

· Database Zone ·
Free Resource

New whitepaper: Database DevOps – 6 Tips for Achieving Continuous Delivery. Discover 6 tips for continuous delivery with Database DevOps in this new whitepaper from Redgate. In 9 pages, it covers version control for databases and configurations, branching and testing, automation, using NuGet packages, and advice for how to start a pioneering Database DevOps project. Also includes further research on the industry-wide state of Database DevOps, how application and database development compare, plus practical steps for bringing DevOps to your database. Read it now free.

Extract, Transform, and Load (ETL) tools enable organizations to make their data accessible, meaningful, and usable across disparate data systems. When it comes to choosing the right ETL tool, there are many options to chose from. So, where should you start?

We've prepared a list that is simple to digest, organized into four categories to better help you find the best solution for your needs.

Incumbent Batch ETL Tools

Until recently, most of the world’s ETL tools were on-prem and based on batch processing. Historically, most organizations used to utilize their free compute and database resources, during off-hours, to perform nightly batches of ETL jobs and data consolidation. This is why, for example, you used to see your bank account updated only a day after you made any financial transaction.

Cloud Native ETL Tools

With IT moving to the cloud, more and more cloud-based ETL services started to emerge. Some of them keep the same basic batch model of the legacy platforms, while others start to offer real-time support, intelligent schema detection, and more.

Open Source ETL Tools

Similarly to other areas of software infrastructure, ETL has had its own surge of open source tools and projects. Most of them were created as a modern management layer for scheduled workflows and batch processes. For example, Apache Airflow was developed by the engineering team at AirBnB, and Apache NiFi by the US National Security Agency (NSA).

Real-Time ETL Tools

Doing your ETL in batches makes sense only if you do not need your data in real-time. It might be good enough for salary reporting or tax calculations. However, most modern applications require a real-time access to data from different sources. When you upload a picture to your Facebook account, you want your friends to see it immediately, not a day after.

This shift to real-time led to a profound change in architecture: from a model based on batch processing to a model based on distributed message queues and stream processing. Apache Kafka has emerged as the leading distributed message queue for modern data applications, and companies like Alooma and others are building modern ETL solutions on top of it, either as a SaaS platform or an on-prem solution.

Conclusion

Now that you know how ETL tooling has advanced over the years and which options work best in which scenarios, let's take a more visual look at how those tools have evolved:

Image title

This post contains some representative examples for each category to help you make the choice that meets your needs. For a complete, uncategorized, list we recommend this post.

New whitepaper: Database DevOps – 6 Tips for Achieving Continuous Delivery. Discover 6 tips for continuous delivery with Database DevOps in this new whitepaper from Redgate. In 9 pages, it covers version control for databases and configurations, branching and testing, automation, using NuGet packages, and advice for how to start a pioneering Database DevOps project. Also includes further research on the industry-wide state of Database DevOps, how application and database development compare, plus practical steps for bringing DevOps to your database. Read it now free.

Topics:
etl ,database ,data transformation ,data architecture

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}