Putting the Data Back Into the Data Center

We should be challenging our compute-centric view of a data center; instead, we should be considering the flow and processing of data.

For the past two decades, data centers have been more about compute than data, but the machine learning and IoT revolutions are changing that focus for the 2020 Data Center (AKA DC2020). My experience at IBM Think 2018 suggests that we should be challenging our compute-centric view of a data center; instead, we should be considering the flow and processing of data. Since data is not localized, that reinforces our concept of DC2020 as a distributed and integrated environment.

We have defined data centers by the compute infrastructure stored there. Cloud (especially when equated with virtualized machines) has been an infrastructure as a service (IaaS) story. Even big data "lakes" are primarily compute clusters with distributed storage. This model dominates because data sources are locked in application silos, so control of the compute translates directly to control of the data.

What if control of data is being decoupled from applications? Data is becoming its own thing with new technologies like machine learning, IoT, blockchain, and other distributed sourcing.

In a data-centric model, we are more concerned with the movement of and access to data than with building applications to control it. Think of event-driven (serverless) and microservice platforms that effectively operate on data-in-flight. As function as a service progresses, it will become impossible to know all the ways that data is manipulated, because applications no longer have boundaries.

This data-centric, distributed architecture model will be even more pronounced as processing moves out of data centers and to the edge. IT infrastructure at the edge will be used for handling latency-critical data and for aggregating data before it is centralized. These operations will not look like traditional application stacks: they will be data processing microservices and functions.
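As a rough illustration of the two edge roles described above, here is a minimal Python sketch of a function that acts on latency-critical readings locally and forwards only a summary for centralization. The sensor record shape, the `handle_at_edge` name, and the temperature threshold are all hypothetical, chosen only to make the idea concrete.

```python
from statistics import mean

# Hypothetical threshold: readings at or above it need an immediate local response.
ALERT_THRESHOLD_C = 80.0


def handle_at_edge(readings, alert_threshold=ALERT_THRESHOLD_C):
    """Process one window of sensor readings at the edge.

    Returns (alerts, summary):
      alerts  - readings that must be acted on locally, without a round trip
      summary - a compact aggregate that is sent on to the central data center
    """
    alerts = [r for r in readings if r["temp_c"] >= alert_threshold]
    temps = [r["temp_c"] for r in readings]
    if not temps:
        return alerts, {"count": 0}
    summary = {
        "count": len(temps),
        "min": min(temps),
        "max": max(temps),
        "mean": mean(temps),
    }
    return alerts, summary


# Example window of (made-up) readings from two sensors.
window = [
    {"sensor": "a", "temp_c": 21.5},
    {"sensor": "b", "temp_c": 85.0},
    {"sensor": "a", "temp_c": 22.5},
]
alerts, summary = handle_at_edge(window)
```

Nothing here resembles an application stack: it is a small function applied to data as it passes through, which is the shape of workload the paragraph above anticipates.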

This data-centric approach relegates infrastructure services to a subordinate role. We should not care about servers or machines except as they support platforms driving data flows.

I am not abandoning the goal of making infrastructure simple and easy; we need to do that more than ever! However, it's easy to underestimate the coming transformation of application architectures based on advanced data processing and sharing technologies. The amount and sources of data have already grown beyond our comprehension, in part because we still think of applications in a client-server mindset.

We're only at the start of really embedding connected sensors and devices into our environment. As devices from many sources and vendors proliferate, they also need to coordinate. That means we're reaching a point where devices will start talking to each other locally instead of via our centralized systems. It's part of the coming data avalanche.

Current management systems will not survive explosive growth. We're entering a phase where control and management paradigms cannot keep up.

As an industry, we are rethinking management automation, moving from imperative ("start this") to intent-focused ("maintain this") systems. This is the simplest way to express the difference between OpenStack and Kubernetes. That change is required to create autonomous infrastructure designs; however, it also means that we need to start thinking of infrastructure as something that follows data instead of leading it.
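The gap between "start this" and "maintain this" can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how OpenStack or Kubernetes is implemented: the `Cluster` class, the `reconcile` function, and the `ingest` service name are hypothetical stand-ins for real infrastructure and a real control loop.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for real infrastructure."""
    running: dict = field(default_factory=dict)  # service name -> instance count

    def start(self, service: str) -> None:
        # "Start this": a one-shot command with no memory of the goal.
        self.running[service] = self.running.get(service, 0) + 1

    def stop(self, service: str) -> None:
        if self.running.get(service, 0) > 0:
            self.running[service] -= 1


def reconcile(cluster: Cluster, desired: dict) -> list:
    """"Maintain this": compare desired state with observed state and close the gap."""
    actions = []
    for service, want in desired.items():
        have = cluster.running.get(service, 0)
        while have < want:
            cluster.start(service)
            actions.append(f"start {service}")
            have += 1
        while have > want:
            cluster.stop(service)
            actions.append(f"stop {service}")
            have -= 1
    return actions


cluster = Cluster()
desired = {"ingest": 3}  # the stated intent: keep three instances running

first_pass = reconcile(cluster, desired)   # starts three instances
cluster.stop("ingest")                     # simulate an instance failing
second_pass = reconcile(cluster, desired)  # starts one to restore the intent
third_pass = reconcile(cluster, desired)   # nothing to do
```

The operator states the intent once and the loop repeats the comparison; with one-shot commands, someone would have to notice the failed instance and issue another `start` by hand.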

Topics:
big data, data center, data processing

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
