On-Demand, Service-Based Big Data Integration for Collaboration [Slides]

This research paper and slideshow tackle the challenges that Big Data integration faces today and propose an architecture to address them.

Today, I presented my paper "Óbidos" at the VLDB DMAH workshop in Munich. The abstract and the presentation of the paper are given below.


Biomedical research requires distributed access, analysis, and sharing of data from various disparate sources on an Internet scale. Due to the volume and variety of Big Data, materialized data integration is often infeasible or too expensive, considering the costs of bandwidth, storage, maintenance, and management. Óbidos (On-demand Big Data Integration, Distribution, and Orchestration System) provides a novel on-demand integration approach for heterogeneous distributed data. Instead of integrating data from the data sources to build a complete data warehouse as the initial step, Óbidos employs a hybrid approach of virtual and materialized data integrations. By allocating unique identifiers as pointers to virtually integrated data sets, Óbidos supports efficient data sharing among data consumers. We designed Óbidos as a generic service-based data integration system and implemented and evaluated a prototype for multimodal medical data.
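
To make the idea of identifiers acting as pointers to virtually integrated data sets more concrete, here is a minimal Python sketch. It is not the Óbidos implementation: the class `VirtualSetRegistry`, its methods, and the shape of the source callables are hypothetical names chosen for illustration. The sketch only assumes that a data source can be queried on demand, and it shows one possible way to defer loading until a consumer actually asks for the data.

```python
import uuid
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a "source" is any callable that returns records for a query.
Source = Callable[[str], List[dict]]


class VirtualSetRegistry:
    """Toy registry: identifiers point to virtual data sets, loaded lazily."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self.sources = sources
        # set_id -> list of (source name, query) pairs; no data is copied here.
        self.pointers: Dict[str, List[Tuple[str, str]]] = {}
        # set_id -> records, filled only when a set is first materialized.
        self.cache: Dict[str, List[dict]] = {}

    def create(self, selections: List[Tuple[str, str]]) -> str:
        """Register a virtual data set and return its unique identifier."""
        set_id = str(uuid.uuid4())
        self.pointers[set_id] = list(selections)
        return set_id

    def materialize(self, set_id: str) -> List[dict]:
        """Fetch the records behind an identifier, only on first access."""
        if set_id not in self.cache:
            records: List[dict] = []
            for source_name, query in self.pointers[set_id]:
                records.extend(self.sources[source_name](query))
            self.cache[set_id] = records
        return self.cache[set_id]
```

In this sketch, a consumer who wants to share a selection passes along only the string returned by `create`. Nothing is fetched from the sources until someone calls `materialize` on that identifier, and later calls reuse the cached copy.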

Please find the full text of the paper here and the presentation below:

I did most of the work on this paper during my internship at Emory University. It is also my first paper to be accepted from UCLouvain (Belgium).

big data, big data integration, distributed systems, distributed data