
Introducing the Data Vault Alliance

A discussion of the concept of data vaults and how they relate to other tools and concepts of big data.

By Daniel Linstedt · May 22, 2019 · Analysis

Recently, at the WorldWide Data Vault Consortium (WWDVC) event in Stowe, Vermont, I launched the Data Vault Alliance, a new global community which seeks to unite Data Vault experts, vendors, and practitioners and share best practices for Data Vault 2.0 with organizations worldwide. One of the primary reasons that I founded the Alliance was to provide IT practitioners with the right education, tools, and resources to be successful with Data Vault 2.0 and reap its benefits for the organizations they serve as quickly as possible.

But what is a Data Vault, and why should organizations consider it? My colleague Michael Olschimke, CEO at Scalefree, discussed this in a recent webinar with WhereScape. Here’s a quick explanation he shared:

A Quick Introduction

At a basic level, a Data Vault consists of three key categories of information:

  • Hubs – unique lists of business keys.
  • Links – unique lists of relationships.
  • Satellites – descriptive data over time.

The hub sits at the heart of the methodology. Links connect hubs to other hubs, and satellites attach to a hub (or a link) to describe it. This satellite information is where all the “color” of useful data is held, including historical tracking of values over time. Examples of satellite information could include customer data, location data, or individual information streams from different business units.

[Figure: data-vault-flow.png. Source: Scalefree blog]

Together, combinations of these categories form the “network” of a Data Vault, a way of connecting pieces of information in a flexible, repeatable manner that enables a consistent development stream. At its core, the Data Vault 2.0 methodology helps businesses fuse many different data streams from varying sources in a way that delivers actionable, usable information to end users.
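
To make the three categories concrete, here is a minimal sketch in Python. It is an illustration only, not part of any Data Vault tool: the class names, the `hash_key` helper, the use of SHA-256 hash keys, and the sample customer and store values are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
import hashlib


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys (an assumed convention)."""
    joined = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Hub:
    business_key: str  # one entry in the unique list of business keys

    @property
    def key(self) -> str:
        return hash_key(self.business_key)


@dataclass(frozen=True)
class Link:
    hub_keys: tuple  # keys of the hubs this relationship connects

    @property
    def key(self) -> str:
        return hash_key(*self.hub_keys)


@dataclass(frozen=True)
class Satellite:
    parent_key: str      # key of the hub (or link) being described
    load_date: datetime  # when this version of the attributes was loaded
    attributes: tuple    # descriptive data as (name, value) pairs


# Hypothetical sample data: one customer, one store, and the relationship between them.
customer = Hub("CUST-001")
store = Hub("STORE-42")
visit = Link((customer.key, store.key))

# Satellites keep every version, so history is preserved rather than overwritten.
history = [
    Satellite(customer.key, datetime(2019, 1, 1), (("city", "Stowe"),)),
    Satellite(customer.key, datetime(2019, 5, 1), (("city", "Durham"),)),
]
current = max(history, key=lambda s: s.load_date)
print(dict(current.attributes))
```

The point of the sketch is the separation of concerns: keys live in hubs, relationships in links, and anything that changes over time in satellites, so a new attribute or source adds rows or structures without rewriting existing ones.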

How Does a Data Vault Process Information?

The usual workflow for a Data Vault environment follows four stages:

  • Data is loaded from the source into either a relational data system or a data lake.
  • Data is then broken down into individual “Lego brick” components, and then rebuilt in a more targeted manner using simple ETL driven by metadata.
  • Once information is regrouped, enterprise business rules can be applied to turn these individual data fragments into useful information.
  • Lastly, an overarching schema is applied, whether that is a star schema, a snowflake schema, or something else entirely. This creates the basis for overlaying a dashboard tool that presents insights back to users.
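
The four stages above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions, not a production loader: the sample rows, the `mapping` metadata, the function names, and the "total spend per customer" business rule are all invented for the example.

```python
from collections import defaultdict

# Stage 1: raw rows as they might land from a source system (made-up data).
staged_rows = [
    {"customer_id": "C1", "name": "Ada", "amount": "120.50", "loaded": "2019-05-01"},
    {"customer_id": "C1", "name": "Ada L.", "amount": "30.00", "loaded": "2019-05-02"},
    {"customer_id": "C2", "name": "Bob", "amount": "75.25", "loaded": "2019-05-02"},
]

# Hypothetical metadata describing how source columns map onto vault structures.
mapping = {
    "business_key": "customer_id",
    "descriptive": ["name", "amount"],
    "load_date": "loaded",
}


def load_raw_vault(rows, meta):
    """Stage 2: split rows into a hub (unique keys) and a satellite (history)."""
    hub = set()
    satellite = []
    for row in rows:
        key = row[meta["business_key"]]
        hub.add(key)
        record = {"key": key, "load_date": row[meta["load_date"]]}
        record.update({col: row[col] for col in meta["descriptive"]})
        satellite.append(record)
    return hub, satellite


def apply_business_rules(satellite):
    """Stage 3: an example rule, total spend and most recent name per customer."""
    totals = defaultdict(float)
    latest_name = {}
    for rec in sorted(satellite, key=lambda r: r["load_date"]):
        totals[rec["key"]] += float(rec["amount"])
        latest_name[rec["key"]] = rec["name"]
    return totals, latest_name


def build_star(hub, totals, latest_name):
    """Stage 4: shape the result into a dimension table and a fact table."""
    keys = sorted(hub)
    dim_customer = [{"customer_key": k, "name": latest_name[k]} for k in keys]
    fact_spend = [{"customer_key": k, "total_amount": round(totals[k], 2)} for k in keys]
    return dim_customer, fact_spend


hub, satellite = load_raw_vault(staged_rows, mapping)
totals, latest_name = apply_business_rules(satellite)
dim_customer, fact_spend = build_star(hub, totals, latest_name)
print(dim_customer)
print(fact_spend)
```

Note that the business rule is applied only in stage 3: the hub and satellite keep the raw values, so a different rule can be applied later without reloading the source.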

At the end of this process, a fully formed Data Vault provides a solution that is almost “self-service business intelligence,” as well as a raw data stream where power users can create and write back solutions to their own user area, without affecting the core IT data warehouse. The question is, how do we get there?

Data Vault automation can play a critical role here. Because this workflow is a constant, repeatable process in the Data Vault, it is well suited to automation, which helps organizations realize the benefits faster. WhereScape® Data Vault Express™ offers exactly this capability, allowing businesses to achieve scalability and data consistency and to reap the benefits of Data Vault 2.0 sooner.
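
As a rough idea of why a repeatable pattern lends itself to automation, the sketch below generates hub-loading SQL from a list of metadata entries. It does not represent how WhereScape or any other product works: the template, the table and column names, and the `generate_hub_loads` function are hypothetical, and the generated SQL is generic and untested against any particular database.

```python
from string import Template

# Hypothetical template: insert business keys that the hub has not seen yet.
HUB_LOAD = Template(
    "INSERT INTO $hub ($key_col, load_date, record_source)\n"
    "SELECT DISTINCT s.$key_col, CURRENT_TIMESTAMP, '$source'\n"
    "FROM $staging s\n"
    "WHERE NOT EXISTS (SELECT 1 FROM $hub h WHERE h.$key_col = s.$key_col);"
)

# Made-up metadata: one entry per hub, all following the same pattern.
hub_metadata = [
    {"hub": "hub_customer", "key_col": "customer_id", "staging": "stg_orders", "source": "orders"},
    {"hub": "hub_store", "key_col": "store_id", "staging": "stg_orders", "source": "orders"},
]


def generate_hub_loads(metadata):
    """Render one load statement per metadata entry from the shared template."""
    return [HUB_LOAD.substitute(entry) for entry in metadata]


statements = generate_hub_loads(hub_metadata)
print(statements[0])
```

Adding a new hub then means adding one metadata entry rather than writing another load script by hand, which is the kind of consistency the paragraph above describes.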

For those wishing to learn more about Data Vault 2.0 and deepen their expertise in its modeling, methodology, and architecture, the Data Vault Alliance provides access to world-class training, professional development certifications, organizational assessment tools, directories of authorized Data Vault 2.0 vendors, and mentoring opportunities. You can view this video to learn more about the Data Vault Alliance. I encourage you to take a look at this new online community today.

And for those of you attending the WWDVC, I hope the knowledge and experience our presenters share at this week’s event provide you with many practical ideas to take back and implement within your organizations. It’s an exciting time for the Data Vault community, and if you aren’t yet applying Data Vault 2.0, now is the perfect time for you to learn more and evaluate if it is right for your organization.
