
Introducing Federated Analytics




Federated analytics is a term I coined to describe a specific capability of a data analytics platform: joining multiple distributed data sources and performing analytics on them as if they were a single data source.

Consider a case where you have HTTP access logs, a customer details spreadsheet, and a live stream coming from an API gateway or an ESB. One possibility is to combine the data in these three sources and understand, in real time, which of your customers are accessing which of your services, and from what location. If you consider the possible combinations alone (based on the fields available in each data source), the numbers are daunting even with three data sources. What if there were tens or hundreds? With federated analytics, understanding your data, and even uncovering hidden trends, becomes much easier and more accessible for an organization of any size.
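To make the three-source scenario concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not how any particular analytics platform implements federation: the sample data, the field names (`customer_id`, `client_ip`, `service`, `location`), the simplified log format, and the helper functions are all hypothetical. A CSV string stands in for the spreadsheet and a plain list stands in for the live stream.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data standing in for the three sources.
ACCESS_LOG = """\
203.0.113.5 GET /orders
198.51.100.7 GET /billing
203.0.113.5 GET /billing
"""

CUSTOMER_SHEET = """\
customer_id,name,location
c1,Acme,Colombo
c2,Globex,London
"""

GATEWAY_EVENTS = [
    {"customer_id": "c1", "client_ip": "203.0.113.5", "service": "orders-api"},
    {"customer_id": "c2", "client_ip": "198.51.100.7", "service": "billing-api"},
    {"customer_id": "c1", "client_ip": "203.0.113.5", "service": "orders-api"},
    {"customer_id": "c9", "client_ip": "192.0.2.1", "service": "orders-api"},
]


def parse_access_log(text):
    """Map each client IP to the list of paths it requested."""
    paths_by_ip = defaultdict(list)
    for line in text.strip().splitlines():
        ip, _method, path = line.split()
        paths_by_ip[ip].append(path)
    return paths_by_ip


def parse_customers(text):
    """Index customer rows by customer_id."""
    return {row["customer_id"]: row for row in csv.DictReader(io.StringIO(text))}


def federate(events, customers, paths_by_ip):
    """Yield one joined record per gateway event with a known customer."""
    for event in events:
        customer = customers.get(event["customer_id"])
        if customer is None:
            continue  # skip events that match no row in the customer sheet
        yield {
            "customer": customer["name"],
            "location": customer["location"],
            "service": event["service"],
            "paths": paths_by_ip.get(event["client_ip"], []),
        }


paths_by_ip = parse_access_log(ACCESS_LOG)
customers = parse_customers(CUSTOMER_SHEET)
joined = list(federate(GATEWAY_EVENTS, customers, paths_by_ip))

# Query the joined view as if it were a single source:
# how often does each customer use each service, and from where?
usage = Counter((r["customer"], r["service"], r["location"]) for r in joined)
for (customer, service, location), count in usage.items():
    print(customer, service, location, count)
```

In this sketch the join runs over a fixed list of events. A real deployment would apply the same lookups to each event as it arrives from the gateway, and the join keys would depend on which fields the actual sources share.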



Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
