Microsoft Azure Data Lake

In this article, see how a team created a centrally controlled data repository with the MS Azure ecosystem.

By Zeeshan Anwar · Updated Nov. 19, 2020 · Opinion

2020 is different in every way, but one thing has stayed constant over the past many years: data and its role in molding our current technology. Recently, I was part of a team that created a centrally controlled data repository containing clear, consistent, and clean data. While exploring technologies, we landed on the MS Azure ecosystem.

The MS Azure ecosystem for developing data lakes and data warehouses is maturing and provides good support for enterprise-level solutions. Starting with Azure Data Factory, it offers code-free ELT/ETL processing services. This is very helpful for creating pipelines for data ingestion, control flow, and moving data from source to destination. These pipelines can run 24/7 and ingest petabytes of data. Without Data Factory, data movement between different enterprise systems requires a lot of effort and at times is very expensive to develop and maintain. Additionally, Azure Data Factory ships with more than 90 built-in connectors, which help connect to most sources, such as S3, Redshift, BigQuery, HDFS, Salesforce, and enterprise data warehouses, to name a few.
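As a rough illustration only, the sketch below uses the azure-mgmt-datafactory Python SDK to define and publish a simple copy pipeline that moves files from a landing Blob dataset into the data lake's raw zone. The subscription, resource group, factory, and dataset names are placeholders made up for this example, and model details can differ slightly between SDK versions; the same pipeline can equally be built in the code-free Data Factory designer.

# A minimal sketch (not production code): publish a copy pipeline with the
# azure-mgmt-datafactory Python SDK. All resource names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink,
    BlobSource,
    CopyActivity,
    DatasetReference,
    PipelineResource,
)

subscription_id = "<subscription-id>"      # placeholder
resource_group = "rg-data-platform"        # placeholder
factory_name = "adf-central-repository"    # placeholder

client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Copy from a blob-backed landing dataset into a blob-backed data lake dataset;
# both datasets (and their linked services) are assumed to already exist in the factory.
copy_activity = CopyActivity(
    name="CopyLandingToRawZone",
    inputs=[DatasetReference(reference_name="LandingDataset", type="DatasetReference")],
    outputs=[DatasetReference(reference_name="RawZoneDataset", type="DatasetReference")],
    source=BlobSource(),
    sink=BlobSink(),
)

# Publish the pipeline; it can then run on a schedule or be triggered on demand.
client.pipelines.create_or_update(
    resource_group,
    factory_name,
    "IngestLandingFiles",
    PipelineResource(activities=[copy_activity]),
)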

Next comes Azure Functions, a serverless computing service in MS Azure that helps run small pieces of code without worrying about building the underlying infrastructure, which greatly eases software development. Azure Functions are pay-per-use, which keeps costs down, and they support multiple languages such as C#, JavaScript, and Python. These functions serve multiple use cases, such as creating APIs, webhooks, and microservices. Microsoft provides many built-in templates, such as HTTP request/response, connecting to a data lake (Blob storage), Event Hubs, and Queue storage. These templates are easy to use and can get someone started within minutes.
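To make the "small pieces of code" point concrete, here is a hedged sketch of a blob-triggered Python function in the style of Microsoft's Blob storage template. It assumes the usual function.json binding, with the container name (raw-zone) and connection setting (DataLakeConnection) being placeholders of my own.

# A minimal sketch of a blob-triggered Azure Function (Python).
# function.json is assumed to bind "newfile" to a container such as "raw-zone"
# through a storage connection setting named "DataLakeConnection" (placeholders).
import logging

import azure.functions as func


def main(newfile: func.InputStream):
    # Runs whenever a new file lands in the monitored container; a real function
    # might validate, cleanse, or route the data from here.
    logging.info("New file in data lake: %s (%s bytes)", newfile.name, newfile.length)

The HTTP request/response template mentioned above follows the same pattern, with func.HttpRequest and func.HttpResponse taking the place of the blob stream.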

The following is a high-level flow that would be a good fit for most applications.

Data Flow: [figure showing the high-level data flow]
I will discuss more components of MS Azure in the next part of this series, with some examples.


Opinions expressed by DZone contributors are their own.
