
What's the Big Deal with Unstructured Data?



We hear the term “unstructured data” often. It’s described as the enormous challenge of big data and cited as the reason traditional relational databases don’t meet big data's needs. But that conversation doesn’t adequately describe the challenge organizations face with unstructured data.

To get your head around unstructured data, you have to consider the history of data itself. When we started digitizing our world in the 20th century, we first went after the low-hanging fruit of transactional data: accounting. It was a quick win to transfer spreadsheets of information that already sat in neat columns and rows.

Decades later, we’re digitizing everything in sight and sharing it across the enterprise and with our partners and personal connections. Despite everything we’ve accomplished, there is still an enormous amount of enterprise information sitting in text documents and presentations, graphics, email, audio, video, web pages and various office software. Keep this in mind: it isn’t that unstructured data lacks any structure; it’s that unstructured data doesn’t fit the enterprise relational data model.

Even worse, much of our enterprise process exists as unstructured data in the heads of workers, with no systematic approach for capture, management, communication, measurement and improvement. When the work activities themselves are unstructured, the day-to-day behavior of workers lacks cohesiveness and efficiency. But I digress. Let’s get back to data itself.

Why Haven’t We Fixed This?

What keeps us from successfully managing unstructured data? A few things:

  • There's a lack of tools that easily manage unstructured data. Tools need to provide efficient text parsing and analytics, taxonomy and metadata management.
  • Integrating unstructured data with existing information systems is difficult. The two are often seen as apples and oranges when it comes to analytics and decision making.
  • There's a shortage of skills in existing staff.
  • There's a missing sense of urgency for managing unstructured data.

Despite our best efforts to corral the unstructured beast, this kind of data continues to grow. That is a real problem for organizations that want to automate and improve their ability to understand their businesses, anticipate what’s coming and act quickly on risk and opportunity. Tools are certainly maturing and providing the beginnings of a solution. The challenge, however, will be finding the urgency: getting our organizations to see the value in moving data out of its various hiding places and into a place where it can be used and valued.




Published at DZone with permission of Christopher Taylor, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
