CIO request: “I want to build a data-as-a-service offering that makes my data available to the rest of the organization.”
Underutilization and the complexity of managing growing data sprawl have motivated several trends during the last several years. Data-as-a-Service (DaaS) represents an opportunity for improving IT efficiency and performance through centralization of resources. DaaS strategies have increased dramatically in the last few years with the maturation of technologies such as data virtualization, data integration, SOA, BPM and Platform-as-a-service.
These questions are accelerating the Data-as-a-Service (DaaS) trend: How to deliver the right data to the right place at the right time? How to “virtualize” the data often trapped inside applications? How to support changing business requirements (analytics, reporting, and performance management) in spite of ever-changing data volumes and complexity?
Enterprise DaaS strategy and infrastructure is a core focus area for business unit and enterprise CIOs:
- Enterprise Data Warehouse (EDW) strategies are increasingly moving to cross-enterprise Data-as-a-Service (DaaS) strategies.
- The growth of structured and unstructured data is forcing the evolution to DaaS.
- As data in application silos becomes a centralized corporate/enterprise asset, DaaS infrastructure becomes critical.
- To do any form of enterprise analytics you need DaaS in place first.
In the early years of this market, most DaaS was focused primarily on the financial services, telecom, and government sectors. However, in the past 24 months, we have seen a significant increase in adoption in the healthcare, insurance, retail, manufacturing, eCommerce, and media/entertainment sectors.
Data as a Service (DaaS) Use Cases
Data as a Service (DaaS) is based on the concept that transaction, product, and customer data can be provided on demand to the user regardless of the geographic or organizational separation of provider and consumer. Additionally, the emergence of PaaS and service-oriented architecture (SOA) has made the actual platform on which the data resides irrelevant as well.
Data as a Service (DaaS) has many use cases:
- providing a single version of the truth;
- enabling real-time business intelligence (BI);
- high-performance, scalable transaction processing;
- exposing big-data analytics;
- federating views across multiple domains;
- improving security and access;
- integrating with cloud and partner data and social media;
- delivering information to mobile apps;
- enterprise-wide search.
Organizations are looking to solve tough data and process integration challenges as they once again begin to invest in new business capabilities.
What is Data-as-a-Service (DaaS)?
Data as a Service (DaaS) is built on the notion that data-related services (aggregating, quality-checking, cleansing, and enriching data) can happen in a centralized place, with the resulting data offered to different systems, applications, or mobile users, irrespective of where they are. As such, DaaS solutions provide the following advantages:
- Agility (and Time to Market) – Customers can move quickly because data access is consolidated and they don’t need extensive knowledge of the underlying data. If customers require a slightly different data structure or have location-specific requirements, the implementation is easy because the changes are minimal.
- Cost-effectiveness – Providers can build the base with the data experts and outsource the presentation layer, which makes for very cost-effective report and dashboard user interfaces and makes change requests at the presentation layer much more feasible.
- Data quality – Access to the data is controlled via data services, which tends to improve data quality because there is a single point for updates. Once those services are tested thoroughly, they only need to be regression tested if they remain unchanged for the next deployment.
- Cloud-like efficiency, high availability, and elastic capacity – These benefits derive from the virtualization foundation: one gets efficiency from the high utilization that comes with sharing physical servers, availability from clustering across multiple physical servers, and elastic capacity from the ability to dynamically resize clusters and/or migrate live cluster nodes to different physical servers.
Agility (and time to market) is probably a more important driver for DaaS than cost, while data quality is a metric needed to show value to the technology team.
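The data quality point above, that routing every update through one data service gives a single place to enforce rules, can be illustrated with a minimal Python sketch. The `CustomerDataService` class, its method names, and the email rule are hypothetical and chosen only for illustration; a real service would sit behind a web service interface and a real data store.

```python
class CustomerDataService:
    """Hypothetical data service: the only path for reads and updates."""

    def __init__(self):
        # In-memory stand-in for the underlying store (an assumption).
        self._emails = {}  # customer_id -> normalized email

    def update_email(self, customer_id, email):
        # Cleansing and validation live in one place, so every consuming
        # application gets the same rule without reimplementing it.
        cleaned = email.strip().lower()
        if "@" not in cleaned:
            raise ValueError(f"invalid email: {email!r}")
        self._emails[customer_id] = cleaned

    def get_email(self, customer_id):
        # Consumers read through the service, never from the store directly.
        return self._emails.get(customer_id)
```

Because consumers depend only on `update_email` and `get_email`, the store behind the service can change without touching them, which is the agility argument made above.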
Data-as-a-Service (DaaS) Elements
Client need: “I want to build a data-as-a-service offering that makes my data available to the rest of the organization.”
Components to enable this are as follows:
1) Data acquisition – Data can come from any source: data warehouses, emails, portals, third-party data sources.
2) Data stewardship and standardization – Boil the data down to a standard, manually or automatically.
3) Data aggregation – Stick-build a data warehouse for the acquired data. This has a strong service- and technology-driven quality control mechanism, which is different from “let’s write 100 ETL programs.”
4) Data servicing – Deliver the data via web services, extracts, reports, etc. Make it easy for the end user to consume, either machine to machine or directly via a reporting universe. It’s probably a while before we move up market to reporting, but machine-to-machine consumption is in our wheelhouse.
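The four components above can be sketched as a tiny pipeline. This is a minimal Python illustration under stated assumptions, not a reference design: the `CustomerRecord` shape, the function names, and the hard-coded sample rows are all hypothetical, and the in-memory dictionary stands in for the data warehouse.

```python
from dataclasses import asdict, dataclass


@dataclass
class CustomerRecord:
    """Standardized record shape; the fields are hypothetical."""
    customer_id: str
    name: str
    country: str


def acquire():
    """Step 1: pull raw rows from several sources (hard-coded here)."""
    warehouse_rows = [{"id": "42", "name": " Ada Lovelace ", "country": "uk"}]
    portal_rows = [{"id": "7", "name": "grace hopper", "country": "US"}]
    return warehouse_rows + portal_rows


def standardize(raw):
    """Step 2: boil one raw row down to the standard shape."""
    return CustomerRecord(
        customer_id=raw["id"].strip(),
        name=raw["name"].strip().title(),
        country=raw["country"].strip().upper(),
    )


def aggregate(records):
    """Step 3: collect standardized records in one store keyed by ID."""
    return {record.customer_id: record for record in records}


def serve(store, customer_id):
    """Step 4: machine-to-machine lookup that returns a plain dict."""
    record = store.get(customer_id)
    return None if record is None else asdict(record)
```

A consumer would then call something like `serve(aggregate(standardize(r) for r in acquire()), "42")` and receive a clean dictionary without knowing which source the row came from.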
Domain knowledge, application knowledge, people/talent, processes, and technology platforms are key requirements of a DaaS strategy.
Obviously, market leaders want to position themselves as the experts in the underlying data so that everyone else in the organization does not have to be. Domain expertise becomes really important here.
1) Platform as a Service (PaaS) is being applied to Enterprise Data
2) Data virtualization is a precursor to DaaS. Vendors include Composite Software, Denodo Technologies, IBM, Informatica, Microsoft, Oracle, and Red Hat. Other vendors who fill pieces of the DaaS puzzle include Endeca Technologies, Gigaspaces, Ipedo, Memcached, Pentaho, Quest Software, Talend, and Terracotta.
3) A variety of technologies comprise the DaaS category including distributed data caching, search engines, elastic caches, information lifecycle management (ILM) solutions, data replication, data quality, data transformation, content management, and data modeling.
Published at DZone with permission of Ravi Kalakota, DZone MVB. See the original article here.