Setting up a Standalone Apache Spark 1.5.1 Cluster on Ubuntu

Zone Leader Tim Spans provides a tutorial for setting up a standalone cluster of Spark 1.5.1 on Ubuntu.

Step 1: Install All The Things!

# Version control
sudo apt-get install git -y
# Oracle Java 8 from the WebUpd8 PPA
sudo apt-add-repository ppa:webupd8team/java -y
sudo apt-get update -y
sudo apt-get install oracle-java8-installer -y
sudo apt-get install oracle-java8-set-default -y
# Build tools. The sbt package is only found once the sbt repository
# (added near the end of this guide) is in place.
sudo apt-get install maven gradle -y
sudo apt-get install sbt -y
# Download and unpack the prebuilt Spark 1.5.1 distribution for Hadoop 2.6
sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
sudo tar -xvf spark*.tgz
sudo chmod 755 spark*
# Build dependencies for Mesos
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk
sudo apt-get install -y autoconf libtool
sudo apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev maven libapr1-dev libsvn-dev

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

# Add the repository
echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
 sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
sudo apt-get -y install mesos
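
After a long run of installs it helps to confirm that the tools actually ended up on the PATH. The following is a minimal sketch rather than part of the original setup; `missing_tools` is a hypothetical helper name, and the list of commands to check is up to you.

```shell
# Sketch: print every command from the argument list that is NOT on the PATH.
# missing_tools is a hypothetical helper, not something installed by the steps above.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the name
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, `missing_tools git java mvn gradle sbt` prints nothing when all five commands are available, and otherwise lists the ones that still need attention.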

The Mesos packages above are optional: I installed Apache Mesos so that I can later move from the standalone Spark cluster to a Mesos-managed one.


For the standalone Spark cluster, I used the spark-1.5.1-bin-hadoop2.6 distribution downloaded in Step 1.

#!/usr/bin/env bash
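
The rest of the file that begins with this shebang is not shown. As a rough sketch only, assuming it was meant to be conf/spark-env.sh, the helper below writes a minimal version of that file. `write_spark_env` is a hypothetical function name and every value passed to it is a placeholder, not the author's configuration. `SPARK_MASTER_IP`, `SPARK_WORKER_CORES`, and `SPARK_WORKER_MEMORY` are settings read by the standalone scripts in Spark 1.5.

```shell
# Sketch: write a minimal conf/spark-env.sh (placeholder values, not the author's settings).
# $1 = Spark directory, $2 = master IP, $3 = cores per worker, $4 = memory per worker
write_spark_env() {
  mkdir -p "$1/conf"
  cat > "$1/conf/spark-env.sh" <<EOF
#!/usr/bin/env bash
export SPARK_MASTER_IP=$2
export SPARK_WORKER_CORES=$3
export SPARK_WORKER_MEMORY=$4
EOF
}
```

A call such as `write_spark_env ./spark-1.5.1-bin-hadoop2.6 192.168.1.10 2 2g` would produce a file that the sbin scripts pick up on their next start; the IP address and sizes here are made up for illustration.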


To Start A Node

sbin/start-slave.sh spark://masterIP:7077
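
A worker registers with a master that is already running, so the master has to be started first with sbin/start-master.sh. The helpers below are a sketch under stated assumptions: `SPARK_HOME` is assumed to point at the unpacked spark-1.5.1-bin-hadoop2.6 directory, and `master_url`, `start_master`, and `start_worker` are hypothetical names introduced here for illustration.

```shell
# Sketch: thin wrappers around the standalone scripts shipped in sbin/.
# SPARK_HOME is an assumption; adjust it to wherever the tarball was unpacked.
SPARK_HOME="${SPARK_HOME:-$HOME/spark-1.5.1-bin-hadoop2.6}"

# Build the URL that workers use to register with the master.
# $1 = master host or IP, $2 = port (7077 is the standalone default)
master_url() {
  printf 'spark://%s:%s\n' "$1" "${2:-7077}"
}

# Run once, on the master machine.
start_master() {
  "$SPARK_HOME/sbin/start-master.sh"
}

# Run on every worker machine, passing the master's host or IP.
start_worker() {
  "$SPARK_HOME/sbin/start-slave.sh" "$(master_url "$@")"
}
```

Running `start_master` on the master and `start_worker masterIP` on each node should bring the cluster up. By default the master's web UI listens on port 8080, where registered workers are listed.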


Installing Other Tools and Servers on Ubuntu

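# MongoDB 3.0: import the package signing key and add the repository
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
sudo apt-get update
# Install the latest 3.0.x packages
sudo apt-get install -y mongodb-org
# Alternatively, pin every package to a specific release. Use one of the two install commands, not both.
sudo apt-get install -y mongodb-org=3.0.4 mongodb-org-server=3.0.4 mongodb-org-shell=3.0.4 mongodb-org-mongos=3.0.4 mongodb-org-tools=3.0.4
# Start the server and inspect its log
sudo service mongod start
sudo tail -5000 /var/log/mongodb/mongod.log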


# PostgreSQL
sudo apt-get update
sudo apt-get install -y postgresql postgresql-contrib


# Redis: build from source, run the test suite, and install as a service
sudo apt-get install -y build-essential
sudo apt-get install -y tcl8.5
sudo wget http://download.redis.io/releases/redis-stable.tar.gz
sudo tar xzf redis-stable.tar.gz
cd redis-stable
make
make test
sudo make install
cd utils
sudo ./install_server.sh
sudo service redis_6379 start
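
With MongoDB, PostgreSQL, and Redis installed, a quick check that each one is listening can save debugging time later. This is a sketch, not part of the original setup: it assumes each server runs on its default port on localhost and that the `nc` (netcat) utility is available. `default_port` and `check_service` are hypothetical helper names.

```shell
# Sketch: default listening ports for the three data stores installed above.
default_port() {
  case "$1" in
    mongodb)    echo 27017 ;;
    postgresql) echo 5432 ;;
    redis)      echo 6379 ;;
    *)          return 1 ;;
  esac
}

# Report whether anything is listening locally on a service's default port.
# Assumes nc (netcat) is installed; -z only probes the port without sending data.
check_service() {
  port=$(default_port "$1") || { echo "$1: unknown service"; return 1; }
  if nc -z localhost "$port" >/dev/null 2>&1; then
    echo "$1: listening on $port"
  else
    echo "$1: nothing on $port"
  fi
}
```

Calling `check_service mongodb`, `check_service postgresql`, and `check_service redis` in turn gives a one-line status for each server. If a port was changed during installation, the mapping in `default_port` needs the same change.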



# Scala 2.11.7
sudo wget http://downloads.typesafe.com/scala/2.11.7/scala-2.11.7.deb
sudo dpkg -i scala-2.11.7.deb


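# sbt: add the sbt apt repository, then install
echo "deb http://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-get update
sudo apt-get install sbt
# Gradle via GVM (the GVM installer expects unzip to be present)
sudo apt-get install unzip
curl -s get.gvmtool.net | bash
# GVM installs into the current user's home directory (/root when running as root)
source "$HOME/.gvm/bin/gvm-init.sh"
gvm install gradle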
