Over a million developers have joined DZone.

Data Virtualization Primer: The Concepts

· Big Data Zone

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Before we move on to Data Virtualization (DV) Architecture and jump into our first demo for the Primer, let's talk about the concepts and examine how and why we want to add a Data Abstraction Layer.

This is the second in our Data Virtualization Primer Basics Series.  I cover the concepts in the presentation below which are also at http://teiid.jboss.org/basics/.  We will also highlight some of the concepts in this article.

We have some main concepts that we should highlight which are:

  • Source Models
  • View Models
  • Translators
  • Resource Adaptors
  • Virtual Databases
  • Modeling and Execution Environments

Source Models represent the structure and characteristics of physical data sources and the source model must be associated with a translator and a resource adaptor.  View Models represent the structure and characteristics you want to expose to your consumers.  These view models are used to define a layer of abstraction above the physical layer.  This enables information to be presented to consumers in business terms rather than as it is physically stored.  The views are defined using transformations between models.  The business views can be in a variety of forms: relational, XML or Web Services.

A Translator provides a abstraction layer between the DV query engine and physical data source, that knows how to convert DV issued query commands into source specific commands and execute them using the Resource Adaptor.   DV provides pre-built translators like Oracle, DB2, MySQL, Postgres, etc.   The resource adaptor provides the connectivity to the physical data source.  This provides the way to natively issue commands and gather results.  A resource adaptor can be a Relational data source, web service, text file, main frame connection, etc.

A Virtual Database (VDB) is a container for components used to integrate data from multiple data sources, so they can be accessed in a integrated manner through a single, uniform API.  The VDB contains the models.  There are 2 different types of VDBs.  The first is a dynamic VDB is defined using a simple XML file.  The second is a VDB through the DV designer in eclipse which is part of the integration stack and this VDB is in Java Archive (JAR) format.  The VDB is deployed to the Data Virtualization server and then the data services can be accessed through JDBC, ODBC, REST, SOAP, OData, etc.

The two main high level components are the Modeling and Execution Environments.  The Modeling Environment is used to define the abstraction layers.  The Execution Environment is used to actualize the abstract structures from the underlying data, and expose them through standard APIs. The DV query engine is a required part of the execution environment, to optimally federate data from multiple disparate sources.

Now that we highlighted the concepts, the last topic to cover is why the data abstraction, the data services, are good for SOA and Microservices.  Below are some of the reasons why the data services are important in these architectures:

  • Expose all data through a single uniform interface
  • Provide a single point of access to all business services in the system
  • Expose data using the same paradigm as business services - as "data services"
  • Expose legacy data sources as data services
  • Provide a uniform means of exposing/accessing metadata
  • Provide a searchable interface to data and metadata
  • Expose data relationships and semantics
  • Provide uniform access controls to information

Stayed tuned for the next Data Virtualization Primer topic!

Hortonworks Sandbox is a personal, portable Apache Hadoop® environment that comes with dozens of interactive Hadoop and it's ecosystem tutorials and the most exciting developments from the latest HDP distribution, brought to you in partnership with Hortonworks.

bigdata,jboss,teiid,big data,microservices,dataservices,data visualization

Published at DZone with permission of Kenneth Peeples, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}