DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Emerging Data Architectures: The Future of Data Management
  • Data Architectures in the AI Era: Key Strategies and Insights
  • Are Your ELT Tools Ready for Medallion Data Architecture?
  • Real-Time Data Architecture Frameworks

Trending

  • Microsoft Azure Synapse Analytics: Scaling Hurdles and Limitations
  • It’s Not About Control — It’s About Collaboration Between Architecture and Security
  • Recurrent Workflows With Cloud Native Dapr Jobs
  • A Modern Stack for Building Scalable Systems
  1. DZone
  2. Data Engineering
  3. Big Data
  4. Data Governance: Data Architecture (Part 2)

Data Governance: Data Architecture (Part 2)

This article discusses data governance and data architecture, and the top three data architecture types that are most frequently used.

By 
Sukanya Konatam user avatar
Sukanya Konatam
·
Jul. 04, 23 · Analysis
Likes (2)
Comment
Save
Tweet
Share
4.2K Views

Join the DZone community and get the full member experience.

Join For Free

Data governance is a framework created by the collaboration of people with various roles and responsibilities working towards establishing the processes, policies, standards, and metrics to achieve the organization’s goals. These goals can range from providing trusted data for businesses to developing accurate analytics for evaluating business performance, complying with regulatory compliances, protecting the data, ensuring data privacy, and enabling the data management life cycle.

The important areas of data governance are described below. 

Important areas of Data Governance

What Is Data Architecture?

Data Architecture is a fundamental pillar in an organization, and it presents the blueprint of an integrated view comprising the different data governance disciplines shown in the above diagram.

Data Architecture depicts the business's overall strategy in the form of a design. It highlights the requirements for the strategy, such as data sources and various data points. Data Architecture helps to demonstrate how data integration should occur from different data sets that come from different source systems.

Data Architecture shows the integrated view of different levels of data abstractions such as raw data, curated data, mastered data, and aggregated data for business analytics.

The data architecture also explains how data is being managed. This can include how the data is acquired, stored, secured, processed, archived, and deleted.

Data Architecture Types

There are multiple types of data architectures, and the appropriate model can be chosen based on several parameters such as cost, performance, reliability, and availability. The most prominent data architecture models being used in the industry are:

  1. Centralized Data Architecture
  2. Distributed Architecture
  3. Data Lake Architecture
  4. Lakehouse Architecture
  5. Event-Driven Architecture
  6. Federated Architecture
  7. Microservice Architecture
  8. Hybrid Architecture
  9. Data Fabric Architecture
  10. Data Mesh Architecture

We will cover the top three architectures which are prominently used.

Centralized Data Architecture

In this type of architecture, the data is stored in a centralized place, such as a centralized data warehouse, where all the source systems are integrated. From there, data access is provided per the established policies and access controls. In the below picture, the centralized data model is populated with data from different locations in an integrated fashion.
Centralized Data Architecture

Some of the advantages of using this architecture model are:

  • Data is available at a single location, making accessing and coordinating data easier.
  • This method will have less data redundancy as compared to other methods since all the data is stored in a single place.
  • The cost of implementation is economical in comparison with other architectures.
  • Data consistency, transformation, security, etc., is much easier to implement

Some of the disadvantages are:

  • If there are many data consumers, this model can lead to data traffic that may result in performance issues.
  • If a system failure occurs in the centralized system, then the entire system will have an impact
  • There may be regulatory or specific restrictions on sharing the data with other locations

Distributed Architecture

In the distributed architecture, the centralized data will be stored in multiple locations, and the systems closer to that location will perform well. Data replications are generally performed to maintain data consistency and accuracy in this type of architecture. In the picture below, the standard data model has been replicated in three locations, and the data is ingested and consumed close to the respective locations. The data from these three locations have been synchronized to keep them available in all the locations.

Distributed Architecture

The advantages and disadvantages are similar to the Centralized Data Architecture model because both these models rely heavily on a centralized approach.

Some of the advantages are:

  • Data is available at a single location, making accessing and coordinating data easier.
  • This method will have less data redundancy compared to other methods since all the data is stored in a single place.
  • The cost of implementation is economical in comparison with other architectures.
  • Data consistency, transformation, security, etc., is much easier to implement

Disadvantages:

  • There is higher data traffic when considering the centralized database.
  • If a system failure occurs in the centralized system, then the entire system will have the impact
  • We cannot use this model If there are any regulatory or specific restrictions on sharing the data with other countries

Data Lake Architecture

In the data lake architecture, all the data sources are stored in a location in their original raw data format. This model typically helps data scientists and analysts explore the hidden data points, and it allows for flexibility in designing different ML models and analytics. This architecture is generally deployed in the cloud storage services such as S3, Azure BLOB, and Cloud storage facilitated by cloud service providers such as AWS, Azure, GCP, and others.

Data Lake Architecture

Advantages:

  • Data lake architecture allows for storing different data types, such as structured, semi-structured, and unstructured data
  • This model allows the storage of original raw data to perform data explorations and machine learning 
  • This architecture is scalable and can store a very large set of data 
  • The data storage and computing are de-coupled in this architecture due to the ease of scaling these two components.
  • The scalability allows for it to be a cost-effective solution architecture 
  • This architecture supports advanced analytics, such as machine learning

Disadvantages:

  • If there is no proper governance, this model may lead to data silos
  • Data lakes are prone to data security and privacy threats
  • Data ingestion, curation, and enforcing the schema is complex within this model

Conclusion

The key takeaways from this article are:

  • A robust data governance framework is needed when integrating data from various source systems
  • A centralized data architecture utilizes a centralized place to integrate different source systems
  • A distributed data architecture model utilizes a centralized mechanism in different locations. It differs from a centralized data architecture model because there are several locations where the data is stored in this model.
  • A data lake architecture model allows for data to be stored in its original raw format and aids in data exploration.
Architecture Data architecture Data governance

Opinions expressed by DZone contributors are their own.

Related

  • Emerging Data Architectures: The Future of Data Management
  • Data Architectures in the AI Era: Key Strategies and Insights
  • Are Your ELT Tools Ready for Medallion Data Architecture?
  • Real-Time Data Architecture Frameworks

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!