
“Lambda Architecture” for Real-Time Hashtag Analysis


I wrote an article on the Datasalt blog showing how to use Trident, Hadoop and Splout SQL together to build a toy example of a "lambda architecture". It covers the basics of Trident, a higher-level API on top of Storm, and Splout SQL, a fast read-only SQL database for Hadoop.

The example architecture is hosted on GitHub. It simulates counting the number of appearances of hashtags in tweets, by date. The ultimate goal is to solve this simple problem in a fully scalable way and to provide a remote, low-latency service for querying how the counts of a hashtag evolve over time, combining both consolidated (batch) and real-time statistics for it.
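
To make the idea concrete, here is a minimal sketch of the query-time merge at the heart of a lambda architecture. It is plain in-memory Python and does not use Trident, Hadoop or Splout SQL; the function names (`count_hashtags`, `query`) and the sample tweets are invented for illustration and are not taken from the example project. Both layers are assumed to produce a view keyed by (hashtag, date), and each tweet is assumed to be processed by exactly one layer.

```python
from collections import Counter
from datetime import date


def count_hashtags(tweets):
    """Count (hashtag, day) occurrences in an iterable of (day, text) tweets."""
    counts = Counter()
    for day, text in tweets:
        for word in text.split():
            if word.startswith("#") and len(word) > 1:
                counts[(word.lower(), day)] += 1
    return counts


def query(hashtag, batch_view, realtime_view):
    """Merge both views into a date-sorted series of counts for one hashtag."""
    merged = Counter()
    for view in (batch_view, realtime_view):
        for (tag, day), n in view.items():
            if tag == hashtag:
                merged[day] += n
    return sorted(merged.items())


# Hypothetical sample data: older tweets already consolidated by the batch
# layer, recent tweets only seen by the real-time layer so far.
batch_tweets = [
    (date(2013, 1, 1), "learning #storm and #hadoop"),
    (date(2013, 1, 1), "#Storm rocks"),
    (date(2013, 1, 2), "more #storm"),
]
realtime_tweets = [
    (date(2013, 1, 2), "#storm again"),
    (date(2013, 1, 3), "new day #storm"),
]

batch_view = count_hashtags(batch_tweets)
realtime_view = count_hashtags(realtime_tweets)
series = query("#storm", batch_view, realtime_view)
```

In the sketch, `series` holds one (date, count) pair per day, with the counts from the two views summed where a day appears in both. In a real deployment the batch view would be recomputed periodically and the real-time view would only need to cover tweets that arrived since the last batch run.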


You can read the full article here.
