
How to Combine Spark with Heroku for Fast, Simple App Deployment


The Spark framework promises to make distributed data processing more efficient. The Heroku service simplifies testing and deploying your apps. Put the two together, and you get app development at Internet speed.

It didn't take long for Spark, a framework for distributed data processing, to capture the attention of app developers. Spark was originally conceived as a way to offer "a simple API for distributed data processing in general-purpose programming languages (Java, Python, Scala)," as Reynold Xin explains in a March 26, 2015, post on Opensource.com.

To date, Spark is known primarily as a faster, simpler, and more efficient alternative to Hadoop's MapReduce. In a March 17, 2015, article in InfoWorld, Platfora executive Peter Schlampp writes that MapReduce can be difficult to use even for data scientists. By contrast, Spark features a tool for accelerated queries, a machine learning library, a graph processing engine, a streaming analytics engine, and other advanced analytics "right out of the box."

Xin credits the Spark API's functional transformations on distributed collections of data, or Resilient Distributed Datasets (RDDs), as the reason the framework can express tasks in dozens of lines of code that previously required thousands of lines. In an attempt to deliver the power of distributed processing to a wider audience, a new DataFrame API has been created for Spark as an extension to the original RDD API.

DataFrames are able to scale from kilobytes of data on a single PC to petabytes in a large cluster. They support many different data formats and storage systems, and they feature APIs for Java, Python, and Scala, with an API for R in development via SparkR. The Spark SQL Catalyst optimizer delivers state-of-the-art code generation and optimization.

young = users[users.age < 21]

This example uses a DataFrame to select, from the demographic data of a large group of users, only the users younger than 21. Source: Opensource.com

Teaming Spark and Heroku to simplify and speed up Java app development

In an attempt to take some of the sting out of developing and testing Java apps, Arjun Surendra created a tutorial that demonstrates how to use Heroku to develop and deploy a Java app to the Internet in under five minutes. Surendra uses the NetBeans IDE, but the technique works with any IDE or text editor. It requires JDK 7 in addition to Spark.

You start by creating a simple Java Maven project that includes App.java, home.ftl.html, login.ftl.html, log4j.properties, pom.xml, and system.properties. Finally, you add the Procfile.

web: java $JAVA_OPTS -jar target/hellowworld-1.0

Developing and deploying a Java app to the Internet via Heroku uses a straightforward Procfile. Source: Arjun Surendra
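A web process started from a Procfile has to listen on the port that Heroku assigns through the PORT environment variable. The following is a minimal sketch of such an entry point, not the tutorial's App.java, which is not reproduced in this article. To stay free of dependencies it swaps Spark for the JDK's built-in HTTP server. The class name HerokuWebSketch, the helper resolvePort, the local fallback port 8080, and the greeting text are all hypothetical names chosen for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for an app entry point. It uses the JDK's built-in
// HTTP server instead of Spark (an assumption made to avoid dependencies).
final class HerokuWebSketch {

    // Heroku passes the listening port in the PORT environment variable.
    // Fall back to a local default when the value is missing or not a number.
    static int resolvePort(String envValue, int fallback) {
        if (envValue == null || envValue.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Integer.parseInt(envValue.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Starts a server that answers every request with a fixed greeting.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", new HttpHandler() {
            @Override
            public void handle(HttpExchange exchange) throws IOException {
                byte[] body = "Hello from Heroku".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // 8080 is an arbitrary fallback for running on a local machine.
        int port = resolvePort(System.getenv("PORT"), 8080);
        start(port);
        System.out.println("Listening on port " + port);
    }
}
```

Keeping the port lookup in a separate method means the same build runs unchanged on a local machine, where PORT is normally unset, and on Heroku, where the platform supplies it.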

Once you're ready to deploy the app to Heroku, install the Heroku Toolbelt, which includes the Git version control system that deployment requires. You then create the Heroku app, enter your Heroku credentials, and push the code to the Heroku-owned Git repository, where it is built and deployed automatically.

The Heroku Dev Center provides information on using Heroku with Spring MVC and Hibernate, with the Play Framework, and with Node.js. Another useful resource is the Java on Heroku Forum, which includes a post describing how to troubleshoot application errors related to the Procfile, system.properties, pom.xml, and other project components.

For efficient, centralized management of your apps, databases, IT operations, and business services in real time, sign up for the Happy Apps application-management service. Happy Apps lets you set up rules so you are alerted via SMS and email whenever incidents or specific events occur. You can group and monitor multiple apps, databases, web servers, and app servers. In addition to an overall status, you can view the status of each individual group member.

Happy Apps is the only app-management service to support SSH and agent-based connectivity to all your apps on public, private, and hybrid clouds. The service provides dependency maps for determining the impact your IT systems will have on other apps. All checks performed on your apps are collected in easy-to-read reports that can be analyzed to identify repeating patterns and performance glitches over time. Visit the Happy Apps site to sign up for a free trial.

