
Apache Spark v1.0 Solidifies Its Place Among Big Data Tools



The Apache Software Foundation announced today the release of Apache Spark v1.0, an open source engine for large-scale data processing and advanced analytics. Spark allows developers to write applications in Java, Scala, or Python.

The release of version 1.0 signifies a step forward towards greater stability and community involvement.

According to the press release, Spark's flexibility in large-scale data processing has earned it the nickname "Hadoop Swiss Army Knife." Chief among its advantages is speed: the release says Spark processes data 10 to 100 times faster than Hadoop's MapReduce. Additionally,

Apache Spark is well-suited for machine learning, interactive queries, and stream processing. It is 100% compatible with Hadoop's Distributed File System (HDFS), HBase, Cassandra, as well as any Hadoop storage system, making existing data immediately usable in Spark. In addition, Spark supports SQL queries, streaming data, and complex analytics such as machine learning and graph algorithms out-of-the-box.

New in v1.0, Apache Spark offers strong API stability guarantees (backward-compatibility throughout the 1.X series), a new Spark SQL component for accessing structured data, as well as richer integration with other Apache projects (Hadoop YARN, Hive, and Mesos).

In a blog post at Cloudera, Sean Owen writes, "Spark has a number of features that make it a compelling crossover platform for investigative as well as operational analytics." It will be interesting to see how data scientists and other users integrate Spark into their workflows.

For more information, check out the Spark website and the Apache Software Foundation's official release announcement.
