
A Thought on Data Gravity

For those with lots of data, and I'm talking about petabytes, the location of that data is a major factor in cloud location decisions. This is when data gravity becomes an issue.

Data gravity, as explained on Techopedia:

Data is something that continues to accumulate over time, and could be considered to become more dense, or have a greater mass. As density or mass accumulates, the data's gravitational pull increases. Services and applications have their own mass and, therefore, have their own gravity. But data is much bigger and denser than the two. So, as data continues to build mass, services and applications are more likely to be drawn to the data, rather than vice versa. This is much like an apple falling to earth, which is often provided as a typical example of gravity. Because the earth has more mass, the apple falls to the earth, rather than the other way around.

Hosting petabytes of data with a cloud provider can be expensive. There's a point where it becomes more cost-effective to host it on premises using something like Ceph.
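To make "there's a point" a little more concrete, here is a rough break-even sketch. Every number in it is a made-up placeholder rather than a real price from any provider or hardware vendor, and the function name `break_even_months` is purely illustrative. It also assumes the data size stays constant, which it rarely does.

```python
import math


def break_even_months(data_tb, cloud_price_per_tb_month,
                      onprem_upfront, onprem_monthly):
    """Return the first month in which cumulative on-premises cost is
    no higher than cumulative cloud cost, or None if that never happens.

    All inputs are assumptions supplied by the caller; data size is
    treated as constant over time.
    """
    cloud_monthly = data_tb * cloud_price_per_tb_month
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        # On-premises running costs alone are at least as high as the
        # cloud bill, so the upfront spend is never recovered.
        return None
    return math.ceil(onprem_upfront / monthly_saving)


# Hypothetical inputs, for illustration only.
months = break_even_months(
    data_tb=2000,                  # 2 PB expressed in TB
    cloud_price_per_tb_month=20.0,  # placeholder price
    onprem_upfront=600_000.0,       # placeholder hardware spend
    onprem_monthly=15_000.0,        # placeholder power, space, staff
)
print(months)
```

With small data sets the function returns `None`, because the monthly cloud bill never exceeds the on-premises running costs. The break-even only appears once the data is large enough, which is the point being made here.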

Moving data out of cloud providers is also expensive and time-consuming when dealing with petabytes of it.
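A quick back-of-the-envelope calculation shows why. The sketch below estimates how long a transfer takes over a given link and what it might cost in egress fees. The link speed, utilization and per-GB price are assumptions picked for illustration, not quotes from any provider, and the function names are hypothetical.

```python
PETABYTE = 10**15        # bytes, decimal definition
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_second, utilization=1.0):
    """Days needed to move data_bytes over a link, assuming the given
    fraction of the link's capacity is actually achieved."""
    seconds = data_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_DAY


def egress_cost(data_bytes, price_per_gb):
    """Egress fee for data_bytes at a flat, assumed price per GB."""
    return data_bytes / 1e9 * price_per_gb


# 1 PB over a dedicated 10 Gbit/s link at full line rate.
print(f"{transfer_days(PETABYTE, 10e9):.1f} days")       # 9.3 days

# The same transfer if only half the link is usable in practice.
print(f"{transfer_days(PETABYTE, 10e9, 0.5):.1f} days")  # 18.5 days

# Egress fee at a placeholder price of 0.05 per GB.
print(f"{egress_cost(PETABYTE, 0.05):,.0f}")
```

Even in the best case, a single petabyte ties up a 10 Gbit/s link for more than a week, and both the time and the fee scale linearly with the amount of data.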

This can create a form of lock-in to a platform.

Working with enterprises, some of which hold large amounts of data alongside the compute capacity they need, has made me consider where private clouds can be useful.

Whether you're a startup or an enterprise, it's useful to be aware of the impact of data gravity.

Topics:
cloud, data, data gravity

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
