
Apache Spark for Big Data Processing

A video from SpringOne2GX 2015 on using Spark with Spring XD, as well as several concrete examples of analyzing data with Spark.



Recorded at SpringOne2GX 2015

Presenters: Ludwine Probst and Ilayaperumal Gopinathan

Big Data Track

Slides: http://www.slideshare.net/SpringCentral/apache-spark-for-big-data-processing

Today, we live in the world of Big Data. Hadoop and MapReduce dominate large-scale data processing. However, the MapReduce model shows its limits for several kinds of workloads, especially the highly iterative algorithms frequently encountered in Machine Learning.

Spark is an in-memory data processing framework that, unlike Hadoop MapReduce, supports interactive and near-real-time analysis of large datasets. Furthermore, Spark has a more flexible programming model and, for many workloads, performs better than Hadoop MapReduce.
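To make the point about iterative algorithms concrete, here is a minimal sketch in plain Python. It does not use the Spark or Hadoop APIs; `CountingSource`, `fit_mean_reloading`, and `fit_mean_cached` are hypothetical names invented for this illustration. It runs the same gradient descent twice, once re-reading the dataset at every iteration (roughly how a chain of batch jobs behaves) and once loading it a single time and iterating over the in-memory copy, and it counts the full scans each approach needs.

```python
# Plain-Python stand-in (not the Spark API): counts how often the dataset is
# read from "storage" when an iterative algorithm runs with and without caching.

class CountingSource:
    """Hypothetical data source that records how many times it is fully read."""

    def __init__(self, points):
        self._points = list(points)
        self.reads = 0

    def load(self):
        self.reads += 1          # each call models one full scan of stored data
        return list(self._points)


def fit_mean_reloading(source, steps, lr=0.5):
    """Gradient descent on squared error, re-reading the data at every step."""
    estimate = 0.0
    for _ in range(steps):
        data = source.load()     # a fresh scan per iteration
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= lr * gradient
    return estimate


def fit_mean_cached(source, steps, lr=0.5):
    """Same algorithm, but the data is loaded once and kept in memory."""
    data = source.load()         # single scan; later iterations reuse this copy
    estimate = 0.0
    for _ in range(steps):
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= lr * gradient
    return estimate


points = [2.0, 4.0, 6.0, 8.0]
reloading_source = CountingSource(points)
cached_source = CountingSource(points)

reloaded = fit_mean_reloading(reloading_source, steps=20)
cached = fit_mean_cached(cached_source, steps=20)

# Both runs produce the same estimate; only the number of scans differs.
print(reloading_source.reads, cached_source.reads)  # prints: 20 1
```

The two runs converge to the same value (close to the mean of the points, 5.0), but the reloading version scans the data once per iteration. On a real cluster each of those scans would typically involve disk and network I/O, which is the overhead that keeping the working set in memory is meant to avoid.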

In this talk, we give an overview of Spark and tour its ecosystem, in particular Spark Streaming and MLlib, with a concrete example. We also show how you can use Spark with Spring XD, allowing you to take advantage of the strengths of each platform.



Published at DZone with permission of Pieter Humphrey, DZone MVB. See the original article here.

