Building “The House of Data”

This article was originally written by Asami Novak for the New Relic blog.

Who knew that working at a media company could be harder than being a rocket scientist? But that’s what Colin Coleman, senior director of Analytics Product Strategy and Data Governance at Turner Broadcasting System, claims—and he spent almost a decade working at the NASA Ames Research Center.


The comparison came up as Coleman sat down with Mike Dauber of Battery Ventures at the Gigaom Structure Data conference in NYC this week to talk about Turner’s experiences transitioning to a big data environment, and the challenges it faces in collecting and making sense of growing amounts of new audience and advertiser-related data.

As Coleman explained, the real challenge at TBS, a division of Time Warner, is finding a way to address both its business-to-consumer (B2C) audience and its business-to-business (B2B) customers. On the consumer side, Turner needs to target an incredibly diverse demographic, from preteens who watch the Cartoon Network to more mature viewers tuning into CNN or Turner Classic Movies. The company needs to stitch together data from more than 100 branded channels, each running across multiple platforms. Meanwhile, it also has to meet the needs of advertisers looking to gain insights about Turner’s consumer audiences, especially as the company’s engagement models expand to include digital and mobile, among others.

“The physics of engagement are shifting and we’re trying to solve it all at the same time,” said Coleman. The solution, for Turner, was to “build the house of data,” and for the time being, that means using Hadoop for what the company internally calls “hadumping.” But the technical changes at the company also bring the need for an organizational shift. Critically, it’s not just the data aspects coming into play, but governance and security issues as well.

Although Coleman admits Turner’s big data transition is still a work in progress, the ultimate goal is to better understand the physics of engagement, and to link consumer engagement patterns with the ad side of the house in real time. “Getting the insight—that’s the Holy Grail,” he said.

You can learn more about Turner’s data story in this Gigaom article and the video below:


Published at DZone with permission of
