Running Your First Apache Spark App [Code Snippets]

Take a look at a quick tip for getting your Apache Spark application up and running — especially if you make this common mistake.

The Spark Getting Started guide is pretty good, but it’s not immediately obvious that an app written against the Spark API is not run as a standalone executable. If you try to run it that way, you’ll get an error like this:

17/11/07 19:15:20 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at kh.textanalysis.spark.SparkWordCount.workCount(SparkWordCount.java:16)
at kh.textanalysis.spark.SparkWordCount.main(SparkWordCount.java:10)
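The master URL from that stack trace can also be supplied directly in code, which is handy for running the example from an IDE without spark-submit. A minimal sketch (assuming the spark-sql dependency is on the classpath; the class and app names are illustrative):

```java
import org.apache.spark.sql.SparkSession;

public class SparkWordCount {
    public static void main(String[] args) {
        // "local[1]" runs Spark in-process with one worker thread, so the
        // SparkContext no longer fails with "A master URL must be set".
        SparkSession spark = SparkSession.builder()
                .appName("SparkWordCount")
                .master("local[1]")
                .getOrCreate();

        // ... word-count logic would go here ...

        spark.stop();
    }
}
```

Hard-coding the master is fine for local testing; for anything else, leave it out and pass --master to spark-submit instead.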

Instead, if you’re using Maven, package the app with mvn package and start a local master node:
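Concretely, that step might look like the following (a sketch: run mvn package from your project root, and start-master.sh from the root of your Spark distribution):

```shell
# Build the application jar into target/
mvn package

# Launch a standalone Spark master on this machine
./sbin/start-master.sh
```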


Then submit it to your Spark node for processing (the jar path on the last line is a placeholder for whatever artifact mvn package produced):

./bin/spark-submit \
  --class "MyApp" \
  --master local[1] \
  target/my-app-1.0.jar

And that's it! 
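For completeness, here is what a submittable word-count app along these lines might look like (a sketch: MyApp and the input path input.txt are placeholders, and the master URL is deliberately left to spark-submit’s --master flag):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;
import static org.apache.spark.sql.functions.split;

public class MyApp {
    public static void main(String[] args) {
        // No .master() here: spark-submit supplies it via --master
        SparkSession spark = SparkSession.builder()
                .appName("MyApp")
                .getOrCreate();

        // Split each line on whitespace, flatten to one word per row, and count
        Dataset<Row> counts = spark.read().textFile("input.txt")
                .select(explode(split(col("value"), "\\s+")).as("word"))
                .groupBy("word")
                .count();

        counts.show();
        spark.stop();
    }
}
```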


Published at DZone with permission of
