Setting up a Standalone Apache Spark 1.5.1 Cluster on Ubuntu

Zone Leader Tim Spans provides a tutorial for setting up a standalone cluster of Spark 1.5.1 on Ubuntu.

Step 1: Install All The Things!

sudo apt-get install git -y
sudo apt-add-repository ppa:webupd8team/java -y
sudo apt-get update -y
sudo apt-get install oracle-java8-installer -y
sudo apt-get install oracle-java8-set-default -y
sudo apt-get install maven gradle -y
# sbt is not in the default Ubuntu repositories; it is installed further down, after its repository is added
sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
sudo tar -xvf spark*.tgz
sudo chmod 755 spark*
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk
sudo apt-get install -y autoconf libtool
sudo apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev maven libapr1-dev libsvn-dev

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

# Add the repository
echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
 sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
sudo apt-get -y install mesos

I also installed Apache Mesos so that the standalone Spark cluster can later be upgraded to run on Mesos.
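After Step 1 it can help to confirm that the main tools actually landed on the PATH before moving on. The following is a minimal sketch; the helper names `have_cmd` and `report_tools` are made up for illustration, and the tool list is an assumption based on the packages installed above.

```shell
# Report which of the Step 1 tools are reachable on PATH.
# have_cmd and report_tools are hypothetical helper names.

have_cmd() {
  # Succeeds if the named command can be found on PATH.
  command -v "$1" >/dev/null 2>&1
}

report_tools() {
  # Prints one line per tool: "ok <name>" or "missing <name>".
  for tool in "$@"; do
    if have_cmd "$tool"; then
      echo "ok $tool"
    else
      echo "missing $tool"
    fi
  done
}

# mesos-master may live in /usr/sbin, which is not always on a normal user's PATH.
report_tools git java mvn gradle mesos-master
```

A "missing" line only means the command was not found on the current PATH; it does not say why the install failed.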


For the standalone Spark cluster, I used spark-1.5.1-bin-hadoop2.6.

#!/usr/bin/env bash
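Spark's standalone scripts read optional settings from conf/spark-env.sh, which is itself a bash script. The settings used for this cluster are not listed here, so the following is only a sketch: the variable names are the standard ones for Spark 1.5.x, but every value (IP address, core count, memory size) is a hypothetical placeholder.

```shell
# Hypothetical conf/spark-env.sh for a standalone cluster.
# All values below are placeholders, not the author's settings.

# Address and port the master binds to (workers connect to spark://<ip>:<port>).
export SPARK_MASTER_IP="192.168.1.10"
export SPARK_MASTER_PORT="7077"

# Port of the master's web UI.
export SPARK_MASTER_WEBUI_PORT="8080"

# Resources each worker offers to Spark applications.
export SPARK_WORKER_CORES="2"
export SPARK_WORKER_MEMORY="2g"
```

The same file would normally be copied to every node so that master and workers agree on these values.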


To Start A Node

# Run from the Spark directory; the master must already be running (sbin/start-master.sh)
sbin/start-slave.sh spark://masterIP:7077
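Workers register with the master through a URL of the form spark://HOST:PORT, where 7077 is the default master port. As a rough sketch of wrapping that step, the snippet below builds the URL and starts a worker only when the script is actually present. `spark_master_url`, `SPARK_HOME`, and `MASTER_HOST` are assumptions made for illustration; the default path simply mirrors the tarball name used above.

```shell
# Sketch: build the master URL and start a worker if Spark is present.
# spark_master_url is a hypothetical helper; SPARK_HOME and MASTER_HOST
# are assumed variables, adjust them to your own layout.

spark_master_url() {
  # Prints spark://<host>:<port>; the port defaults to 7077.
  host="$1"
  port="${2:-7077}"
  echo "spark://${host}:${port}"
}

SPARK_HOME="${SPARK_HOME:-$HOME/spark-1.5.1-bin-hadoop2.6}"
MASTER_HOST="${MASTER_HOST:-masterIP}"

if [ -x "$SPARK_HOME/sbin/start-slave.sh" ]; then
  "$SPARK_HOME/sbin/start-slave.sh" "$(spark_master_url "$MASTER_HOST")"
else
  echo "start-slave.sh not found under $SPARK_HOME; skipping" >&2
fi
```

Once a worker has registered, it should show up in the master's web UI, which listens on port 8080 by default.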


Installing Other Tools and Servers on Ubuntu

# MongoDB 3.0
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
sudo apt-get update
# Install the latest 3.0.x release...
sudo apt-get install -y mongodb-org
# ...or pin a specific release instead (only one of the two install commands is needed)
sudo apt-get install -y mongodb-org=3.0.4 mongodb-org-server=3.0.4 mongodb-org-shell=3.0.4 mongodb-org-mongos=3.0.4 mongodb-org-tools=3.0.4
sudo service mongod start
# Check the log to confirm that the server started
sudo tail -5000 /var/log/mongodb/mongod.log


# PostgreSQL
sudo apt-get update
sudo apt-get install -y postgresql postgresql-contrib


# Redis, built from source (tcl is needed by the test suite)
sudo apt-get install build-essential
sudo apt-get install tcl8.5
sudo wget http://download.redis.io/releases/redis-stable.tar.gz
sudo tar xzf redis-stable.tar.gz
cd redis-stable
make
make test
sudo make install
cd utils
sudo ./install_server.sh
sudo service redis_6379 start
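With MongoDB, PostgreSQL, and Redis installed, a quick way to see whether each server is accepting connections is to probe its TCP port. This is a minimal sketch that assumes the default ports (27017, 5432, 6379) and that bash and `timeout` from coreutils are available; `port_open` and `check_service` are hypothetical helper names.

```shell
# Probe the default ports of the services installed above.
# port_open and check_service are hypothetical helpers.

port_open() {
  # Succeeds if a TCP connection to host:port opens within 2 seconds.
  # Relies on bash's /dev/tcp redirection, so bash is invoked explicitly.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

check_service() {
  # Prints "up <name> (<port>)" or "down <name> (<port>)" for localhost.
  name="$1"
  port="$2"
  if port_open 127.0.0.1 "$port"; then
    echo "up $name ($port)"
  else
    echo "down $name ($port)"
  fi
}

check_service mongodb 27017
check_service postgresql 5432
check_service redis 6379
```

An open port only shows that something is listening there; the service logs remain the better place to confirm a clean start.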



# Scala 2.11.7
sudo wget http://downloads.typesafe.com/scala/2.11.7/scala-2.11.7.deb
sudo dpkg -i scala-2.11.7.deb


# sbt, from the sbt Debian repository
echo "deb http://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-get update
sudo apt-get install sbt
# Gradle via GVM (the GVM installer needs unzip)
sudo apt-get install unzip
curl -s get.gvmtool.net | bash
source "$HOME/.gvm/bin/gvm-init.sh"
gvm install gradle
