
On-Demand, Service-Based Big Data Integration for Collaboration [Slides]

This research paper and slideshow tackle the challenges that Big Data integration faces today and propose an architecture that could address them.

Today, I presented my paper "Óbidos" at the VLDB DMAH workshop in Munich. The abstract and the presentation of the paper are given below.


Biomedical research requires distributed access, analysis, and sharing of data from various disparate sources on an Internet scale. Due to the volume and variety of Big Data, materialized data integration is often infeasible or too expensive, considering the costs of bandwidth, storage, maintenance, and management. Óbidos (On-demand Big Data Integration, Distribution, and Orchestration System) provides a novel on-demand integration approach for heterogeneous distributed data. Instead of integrating data from the data sources to build a complete data warehouse as the initial step, Óbidos employs a hybrid approach of virtual and materialized data integrations. By allocating unique identifiers as pointers to virtually integrated data sets, Óbidos supports efficient data sharing among data consumers. We designed Óbidos as a generic service-based data integration system and implemented and evaluated a prototype for multimodal medical data.
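To make the hybrid idea in the abstract more concrete, here is a minimal Python sketch. It is not the Óbidos implementation or its API: the class `HybridIntegrator`, its methods, and the in-memory "sources" are all hypothetical names chosen for illustration. The sketch assumes that a virtually integrated data set is just a list of pointers to items in remote sources, that it is identified by a UUID, and that data is only fetched (materialized) when a consumer asks for it.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not the Óbidos API.
Fetcher = Callable[[str], bytes]   # fetches one item from a source by key
Ref = Tuple[str, str]              # (source name, item key)


@dataclass
class HybridIntegrator:
    sources: Dict[str, Fetcher]
    virtual_sets: Dict[str, List[Ref]] = field(default_factory=dict)
    cache: Dict[Ref, bytes] = field(default_factory=dict)
    fetch_count: int = 0

    def create_virtual_set(self, refs: List[Ref]) -> str:
        """Virtual integration: store pointers only, copy no data."""
        for source, _ in refs:
            if source not in self.sources:
                raise KeyError(f"unknown source: {source}")
        set_id = str(uuid.uuid4())  # unique identifier that can be shared
        self.virtual_sets[set_id] = list(refs)
        return set_id

    def materialize(self, set_id: str) -> Dict[Ref, bytes]:
        """Materialized integration on demand: fetch only what is missing."""
        result = {}
        for ref in self.virtual_sets[set_id]:
            if ref not in self.cache:
                source, key = ref
                self.cache[ref] = self.sources[source](key)
                self.fetch_count += 1
            result[ref] = self.cache[ref]
        return result


if __name__ == "__main__":
    # Two toy in-memory "sources" standing in for remote repositories.
    store = {
        "imaging": {"scan-1": b"pixels"},
        "clinical": {"patient-7": b"record"},
    }
    integrator = HybridIntegrator(
        sources={name: (lambda key, d=data: d[key]) for name, data in store.items()}
    )
    shared_id = integrator.create_virtual_set(
        [("imaging", "scan-1"), ("clinical", "patient-7")]
    )
    print("share this identifier:", shared_id)
    print("items fetched so far:", integrator.fetch_count)
    integrator.materialize(shared_id)
    print("items fetched after materializing:", integrator.fetch_count)
```

In this sketch, sharing a data set means passing the identifier to another consumer rather than copying the data, and a second call to `materialize` with the same identifier is served from the local cache instead of contacting the sources again.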

Please find the full text of the paper here and the presentation below:

I did most of the work on this paper during my internship at Emory University. It is also the first paper I have had accepted from UCLouvain, Belgium.

big data, big data integration, distributed systems, distributed data
