Living in a Data Vault

We're not talking about a dark, scary vault; more a high, bright ceiling. As data vaults and warehouses continue to become mainstream, this seems an apt comparison.

· Big Data Zone ·
Living in a vault, especially of the underground variety, doesn’t sound too pleasant to me. However, a vault can also mean a roof in the form of an arch, as in a cathedral. To that point, living in a Data Vault demands some over-arching thought, uniting business people and IT once again in a process of ongoing decision-making support, in an environment where business change will always be continuous and rapid.

Such a constant “running while rebuilding” process is well known to data warehouse teams, but the combination of agility and governance offered by the Data Vault model and methodology raises the stakes considerably. Getting business people on board is the first and vital step. Business users must see that the agility they experienced in the initial delivery of a Data Vault (when implemented via data warehouse automation software such as WhereScape® Data Vault Express) can carry through to ongoing operation.

The collaborative effort between business users and IT in the discovery and design phase of the initial Data Vault delivery must continue and intensify. Business users may have had some patience with the initial build of the Data Vault because of its novelty, but now that the Data Vault is mainstream, they will expect ever-faster and higher-quality iterations. The metadata stored in Data Vaults is the basis for impact analysis, rapid development, and speedy deployment of Data Vault updates. Ensuring updates are delivered successfully the first time, in terms of both data delivered and business needs met, will keep business users from reverting to the omnipresent spreadsheet.
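To make the idea of metadata-driven impact analysis concrete, here is a minimal sketch in Python. It assumes the lineage metadata can be read as a simple mapping from each object to the objects that consume it. The table names and the `impacted_objects` helper are hypothetical and do not come from any particular automation product.

```python
from collections import deque

# Hypothetical lineage metadata: each object maps to the objects that read from it.
# All names here are invented for illustration.
DOWNSTREAM = {
    "stage_customer": ["hub_customer", "sat_customer_details"],
    "hub_customer": ["link_customer_order", "dim_customer"],
    "sat_customer_details": ["dim_customer"],
    "link_customer_order": ["fact_orders"],
    "dim_customer": ["sales_report"],
    "fact_orders": ["sales_report"],
    "sales_report": [],
}


def impacted_objects(changed, downstream):
    """Return every object reachable from `changed`, in breadth-first order."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in downstream.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


if __name__ == "__main__":
    # List what would need review if the satellite's structure changed.
    print(impacted_objects("sat_customer_details", DOWNSTREAM))
```

A traversal like this is only as good as the lineage it reads, which is one reason generated metadata is preferable to hand-maintained documentation.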

For development teams, automating the entire process from design to delivery is the mandatory foundation for reducing the effort of 24/7 operation. It frees developer time and resources to deliver the new and changed functions and data that the business demands. The error handling, logging, and auditing code generated in the development phase, together with the built-in scheduler and job management capabilities of a Data Vault, allows developers to focus on what they do best: developing solutions.

Agility is vital in this extended stage of the Data Vault lifecycle. This flexibility comes at the cost of a relatively complex engineering structure in the Data Vault model as well as in the numerous components and processes needed to build and populate it. While advanced Data Vaults offer a methodology to address this complexity, data warehouse automation, and the metadata it creates and uses, is the only effective solution for developers under pressure from rapidly evolving business demands.

This metadata (or, as I prefer to call it, context-setting information) offers a common and consistent vocabulary and business semantics. These are mandatory for effective collaboration between business and IT as the business evolves, sometimes dramatically, and the content and function of the Data Vault must follow. Manual documentation of business needs or data deliverables cannot be trusted in such circumstances. Metadata, in effect, is the automated and flexible documentation of all requirements and deliverables.
