Microsoft Azure Data Lake
In this article, see how a team created a central controlled data repository with the MS Azure echo system.
Join the DZone community and get the full member experience.Join For Free
2020 is different in every way, but one thing is constant for the past many years i.e. data and its role in molding our current technology. Recently, I was part of the team to create a central controlled data repository containing clear, consistent, and clean data. While exploring the technologies we landed on MS Azure echo system.
MS Azure echo system for developing data lakes/data warehouse is becoming mature and providing good support when it comes to the enterprise-level solutions. Starting from Azure Data Factory, it gave a good ELT/ETL processing with code-free services. This is very helpful to create pipelines for data ingestion, control flow, and moving data from source to destination. These pipelines have the capability to run 24/7 and ingest petabytes of data. Without the support of a data factory data movement between different enterprise systems requires a lot of effort and at times will be very expensive to develop and maintain. Additionally, there are more than 90 built-in connectors in Azure Data Factory which will help to connect with most of the sources like S3, Redshift, BigQuery, HDFS, Salesforce, and enterprise data warehouse to name a few.
Following is the high-level flow that would be a good fit for most of the applications.
Will discuss some more component of MS Azure in the next series, with some examples.
Opinions expressed by DZone contributors are their own.