Setting up a Standalone Apache Spark 1.5.1 Cluster on Ubuntu

Zone Leader Tim Spann provides a tutorial for setting up a standalone Spark 1.5.1 cluster on Ubuntu.

Step 1: Install All The Things!

# Version control
sudo apt-get install git -y

# Oracle Java 8 from the WebUpd8 PPA
sudo apt-add-repository ppa:webupd8team/java -y
sudo apt-get update -y
sudo apt-get install oracle-java8-installer -y
sudo apt-get install oracle-java8-set-default -y

# Build tools (the sbt install may fail unless the sbt repository added later in this article is already configured)
sudo apt-get install maven gradle -y
sudo apt-get install sbt -y

# Download and unpack the prebuilt Spark 1.5.1 package for Hadoop 2.6
sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
sudo tar -xvf spark*.tgz
sudo chmod 755 spark*

# Build dependencies for Mesos
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk
sudo apt-get install -y autoconf libtool
sudo apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev maven libapr1-dev libsvn-dev

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

# Add the repository
echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
 sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
sudo apt-get -y install mesos
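
After a long install run like the one above, it can help to confirm that the expected commands actually landed on the PATH. The following is a minimal sketch; check_tools is a hypothetical helper name, not something provided by Spark or Ubuntu, and the list of tools to check is an assumption based on the packages installed in this step.

```shell
#!/usr/bin/env bash
# check_tools (hypothetical helper): report which of the given commands
# are missing from PATH. Prints one "missing: NAME" line per absent tool
# and returns non-zero if at least one tool is missing.
check_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example (assumed tool list for Step 1):
#   check_tools git java mvn gradle sbt
```

If the function prints nothing and returns 0, every listed command was found.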

I also installed Apache Mesos so that the standalone Spark cluster can later be upgraded to run on Mesos.


For the standalone Spark cluster, I used the spark-1.5.1-bin-hadoop2.6 package.

#!/usr/bin/env bash
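
The shebang above appears to be the first line of a script whose body is not shown here. As a hedged sketch only, and not the author's actual file, a standalone cluster is commonly configured through conf/spark-env.sh with variables like the ones below. The variable names are the ones documented for Spark 1.x standalone mode; the address, core count, and memory size are placeholder assumptions to adjust for your machines.

```shell
#!/usr/bin/env bash
# Hypothetical conf/spark-env.sh values for a small standalone cluster.
# All values are assumptions for illustration.

# Address the master binds to (placeholder; use the master's real IP or hostname)
export SPARK_MASTER_IP="127.0.0.1"
# Port the master listens on for workers and applications (7077 is the default)
export SPARK_MASTER_PORT=7077
# Port for the master web UI (8080 is the default)
export SPARK_MASTER_WEBUI_PORT=8080
# Cores and memory each worker offers to Spark applications (assumed sizes)
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g
```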


To Start A Node

sbin/start-slave.sh spark://masterIP:7077
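
A worker can only register with a master that is already running, so the master is normally started first with sbin/start-master.sh. The sketch below assumes the commands are run from the unpacked Spark directory; spark_master_url is a hypothetical helper for building the spark:// URL that the worker script expects, and masterIP remains a placeholder for the master's real address.

```shell
#!/usr/bin/env bash
# spark_master_url (hypothetical helper): build the spark:// URL for a master.
# $1 = master host or IP, $2 = port (defaults to 7077, the standalone default).
spark_master_url() {
  host="$1"
  port="${2:-7077}"
  printf 'spark://%s:%s\n' "$host" "$port"
}

# On the master machine:
#   sbin/start-master.sh
#
# On each worker machine (masterIP is a placeholder):
#   sbin/start-slave.sh "$(spark_master_url masterIP)"
```

Once a worker has connected, it should appear in the master web UI, which listens on port 8080 by default.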


Installing Other Tools and Servers on Ubuntu

# MongoDB 3.0
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo "deb http://repo.mongodb.org/apt/ubuntu $(lsb_release -sc)/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
sudo apt-get update
# Install the latest 3.0.x packages
sudo apt-get install -y mongodb-org
# Alternatively, pin a specific release instead of the line above
sudo apt-get install -y mongodb-org=3.0.4 mongodb-org-server=3.0.4 mongodb-org-shell=3.0.4 mongodb-org-mongos=3.0.4 mongodb-org-tools=3.0.4
sudo service mongod start
# Check the server log for startup errors
sudo tail -5000 /var/log/mongodb/mongod.log


# PostgreSQL
sudo apt-get update
sudo apt-get install -y postgresql postgresql-contrib


# Redis, built from source (tcl is needed by the test suite)
sudo apt-get install -y build-essential
sudo apt-get install -y tcl8.5
sudo wget http://download.redis.io/releases/redis-stable.tar.gz
sudo tar xzf redis-stable.tar.gz
cd redis-stable
make
make test
sudo make install
cd utils
sudo ./install_server.sh
sudo service redis_6379 start
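
As a rough way to confirm that the three servers installed above are accepting connections, the sketch below probes each one on its stock default port. default_port and is_listening are hypothetical helper names; the sketch assumes the nc (netcat) utility is installed and that none of the default ports (27017 for MongoDB, 5432 for PostgreSQL, 6379 for Redis) were changed during setup.

```shell
#!/usr/bin/env bash
# default_port (hypothetical helper): print the stock default TCP port
# for a service name, or return non-zero for an unknown name.
default_port() {
  case "$1" in
    mongodb)    echo 27017 ;;
    postgresql) echo 5432 ;;
    redis)      echo 6379 ;;
    *)          echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# is_listening (hypothetical helper): return 0 if something accepts TCP
# connections on localhost at the service's default port.
# Assumes nc (netcat) is available.
is_listening() {
  port=$(default_port "$1") || return 1
  nc -z localhost "$port"
}

# Example:
#   for svc in mongodb postgresql redis; do
#     if is_listening "$svc"; then echo "$svc: up"; else echo "$svc: down"; fi
#   done
```

A successful probe only shows that a process is listening on the port; it does not verify that the service is healthy.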



# Scala 2.11.7
sudo wget http://downloads.typesafe.com/scala/2.11.7/scala-2.11.7.deb
sudo dpkg -i scala-2.11.7.deb


# sbt from its Debian repository
echo "deb http://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-get update
sudo apt-get install -y sbt
# Gradle via GVM (the GVM installer expects unzip to be present)
sudo apt-get install -y unzip
curl -s get.gvmtool.net | bash
source "$HOME/.gvm/bin/gvm-init.sh"
gvm install gradle

Published at DZone with permission of Tim Spann, DZone MVB. See the original article here.
