
Webscale means having time-based architectures


In an interesting piece yesterday, GigaOM reported that Netflix has an architecture built around timelines. This struck home for a guy who spends a great deal of time talking to people skeptical about the need for zero-latency, real-time systems. The truth is that some things have to go at the highest speeds and others don’t. The challenge is creating systems that use resources wisely to get it right in each case.

Immediate response

If you went only by the hype, you’d think Hadoop was the answer to processing web data, but it isn’t that simple. Hadoop is still (despite even newer hype) a batch-processing concept, one in which data can “get stale and applications probably don’t include the newest user input.” Before that sounds too negative: some data can afford to be stale without a business penalty, and it can be processed and moved in a more traditional, offline architecture.

For real-time, Netflix needs to use the absolute latest inputs and has a solution:

Netflix uses online processing for receiving information from users in real time and serving up responses right away, such as looking at a new rating or some other customer action to change the set of movies shown to the customer. Real-time processing works best when algorithms are relatively simple and when data is on the smaller side. The data feeding in to computations must also be available right away.

This takes very high levels of integration and applications that read in-memory data at lightning speed. It means a very particular, state-of-the-art architecture that Gartner’s Massimo Pezzini calls “The Next Generation Architecture: In-memory computing.”

Middle ground

Between batch and real-time lies a middle ground that Netflix calls ‘nearline’. This is also a reality for business today. It involves NoSQL and SQL databases and handles work that is more complex, and less time-sensitive, than real-time processing.
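To make the three tiers concrete, here is a minimal sketch of how a job might be routed to online, nearline, or offline processing. It is not Netflix's implementation: the `Job` fields, the `choose_tier` function, and every threshold are hypothetical, chosen only to illustrate the trade-off described above (small, immediately available data for online work; tolerable staleness for batch work).

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ONLINE = "online"      # respond right away from in-memory data
    NEARLINE = "nearline"  # respond soon, results kept in a database
    OFFLINE = "offline"    # batch job, results may be stale


@dataclass
class Job:
    name: str
    max_staleness_s: float  # how old a result may be before it hurts the business
    input_mb: float         # size of the data the computation reads
    data_in_memory: bool    # is the input available right away?


# Hypothetical thresholds, for illustration only.
ONLINE_MAX_STALENESS_S = 1.0
ONLINE_MAX_INPUT_MB = 10.0
NEARLINE_MAX_STALENESS_S = 300.0


def choose_tier(job: Job) -> Tier:
    """Pick the cheapest tier that still meets the job's freshness need."""
    # Online only pays off when the data is small and already at hand.
    if (job.max_staleness_s <= ONLINE_MAX_STALENESS_S
            and job.input_mb <= ONLINE_MAX_INPUT_MB
            and job.data_in_memory):
        return Tier.ONLINE
    # Urgent jobs that are too big for online fall back to nearline.
    if job.max_staleness_s <= NEARLINE_MAX_STALENESS_S:
        return Tier.NEARLINE
    # Everything else can wait for the next batch run.
    return Tier.OFFLINE


if __name__ == "__main__":
    jobs = [
        Job("react_to_new_rating", 0.5, 1.0, True),
        Job("refresh_recently_watched", 60.0, 500.0, False),
        Job("retrain_weekly_model", 86_400.0, 50_000.0, False),
    ]
    for job in jobs:
        print(job.name, "->", choose_tier(job).value)
```

The point of the sketch is the ordering of the checks: the most expensive tier is used only when every one of its preconditions holds, and anything that can tolerate staleness drops to a cheaper tier.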

The Netflix story is available on their blog and is a great example of how companies at the front of the ‘time wars’ are managing their requirements in time-based ways rather than with monolithic, expensive and risky models. In a world of many shades of gray, this just makes sense.

[Figure: Netflix architecture]

Republished with permission 



