Docker Swarm Part I: Multi-Host Cassandra Cluster
In this first of three posts, see how to install and configure Docker Swarm in preparation for a Cassandra Cluster.
We wanted to do a series showing a number of different ways you can use Docker Swarm and Flocker together, and this is the first post in that series. The series will take you through using a Flocker + Swarm cluster in a number of different use cases, including the following:
- Setting up a Swarm Cluster with Consul for Service Discovery on a Flocker Cluster
- Creating a multi-service Twitter NodeJS app without overlay networking and transitioning its configuration to use overlay networking.
- Creating a multi-host Cassandra Cluster with overlay networking and Flocker Volumes
- Creating a single Redis server with a Flocker volume and testing the experimental Swarm rescheduling-on-node-failure feature.
This series is jam-packed with goodies and is meant to be read as a fun overview of how to use Flocker and Swarm together during your #SwarmWeek adventures.
This post will focus on a subset of the list above. Since it's the first part in the series, we will also install and configure Swarm.
Setting up a Swarm Cluster With Consul
This portion assumes you have three Docker hosts already created and running Flocker. We are using Ubuntu 14.04 in this example. Learn how to install Flocker here.
On one of your Docker hosts, we will start a Consul server. Docker uses Consul as a key/value store to hold cluster state, such as networking and manager/engine info. You could run Consul in a container, but I like to start it directly on the server so I don't have to worry about stopping and starting the Docker daemon and killing my Consul server with it.
$ consul agent -server -data-dir="/tmp/consul" -bootstrap -advertise=10.0.204.4 -http-port=8500 -client=0.0.0.0 &
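To sanity-check that the agent is up (a quick check, assuming the default ports used above), you can list the known cluster members from the same host:
$ consul members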
Once the Consul server is up and running, and before we enable Swarm to manage our cluster, we first need to prep our Docker daemon by adding some DOCKER_OPTS to the daemon on every node.
#file /etc/default/docker
#Use DOCKER_OPTS to modify the daemon startup options.
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=consul://<ip-of-consul-host>:8500/network --cluster-advertise=<this-nodes-private-ip>:2375"
Once the daemon has been restarted on every node with the new options, start a Primary Swarm Manager on one of your nodes:
$ docker run -d -p 4000:4000 --restart=always swarm --experimental manage -H :4000 --replication --advertise <manager0_ip>:4000 consul://<consul_ip>:<port>
On a second node, start a Secondary Swarm Manager Replica:
$ docker run -d -p 4000:4000 --restart=always swarm --experimental manage -H :4000 --replication --advertise <manager1_ip>:4000 consul://<consul_ip>:8500
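If you want to confirm the replica found the primary, you can check the logs of the Swarm container on either manager node (use docker ps to find the container ID; the exact log output varies by Swarm version):
$ docker logs <swarm-manager-container-id>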
Then, on every Docker host that will participate in the Swarm cluster run the following join
command.
$ docker run -d --restart=always swarm --experimental join --advertise=<node_ip>:2375 consul://<consul_ip>:8500
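At this point you can ask the discovery backend which engines have registered. One quick way to do this (assuming the same Consul address used above) is the swarm list subcommand:
$ docker run --rm swarm list consul://<consul_ip>:8500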
Now, on your Primary Swarm Manager node, you can run the docker info command to see your Swarm cluster.
$ docker -H :4000 info
Containers: 5
Running: 5
Paused: 0
Stopped: 0
Images: 18
Server Version: swarm/1.1.3
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
ip-10-0-57-22: 10.0.57.22:2375
└ Status: Healthy
└ Containers: 1
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 7.67 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.3 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-03-09T14:12:09Z
ip-10-0-195-84: 10.0.195.84:2375
└ Status: Healthy
└ Containers: 4
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 7.67 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-63-generic, operatingsystem=Ubuntu 14.04.3 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-03-09T14:11:52Z
Plugins:
Volume:
Network:
Kernel Version: 3.13.0-63-generic
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 15.34 GiB
Name: 6650c975f163
Debug mode (client): false
Debug mode (server): false
Experimental: true
That's it! You're ready to start deploying applications.
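Tip: rather than passing -H :4000 to every command, you can point your Docker client at the Swarm manager for the rest of your session (substitute your manager's address):
$ export DOCKER_HOST=tcp://<manager0_ip>:4000
$ docker info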
Warning: this configuration does not include setting up TLS. Learn more about Docker security before running a setup like this in production.
Creating a Multi-Host Cassandra Cluster
Now for our first application. We have Flocker and Docker Swarm set up to support overlay networking, so we can start on our first example, which will use this repository to create a multi-host Cassandra cluster.
What is Cassandra?
“Apache Cassandra™ is a massively scalable open source NoSQL database. Cassandra is perfect for managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud. Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure, along with a powerful dynamic data model designed for maximum flexibility and fast response times.”
Cassandra has automatic data distribution and built-in, customizable data replication to support transparent partitioning and redundant copies of its data. Learn more about Cassandra here.
Why Use Flocker With Cassandra?
Cassandra's documentation notes that in-memory approaches to data storage can give you "blazing speed"; however, being limited to small data sets may not be so desirable. Cassandra instead implements a "commit-log based persistence design" that lets you tune durability and performance to your needs. Allowing Cassandra to write to disk keeps your data safe, and you can use Flocker to do the same in containerized environments. To learn more, read Cassandra's documentation on what persistence is and why it matters.
Running a Multi-node Cassandra Cluster With Your Swarm Cluster
The first thing we want to do is create an overlay network for our cluster to use. Docker multi-host networking allows containers to easily span multiple machines while being able to access containers by name over the same isolated network. Let's create an overlay network in our setup.
Note: run these docker commands against your Swarm Manager!
$ docker network create --driver overlay --subnet=192.168.0.0/24 overlay-net
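You can verify that the network was created with the overlay driver and is visible across the cluster:
$ docker network ls
$ docker network inspect overlay-net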
Next, we need to create the persistent volume resources needed by our Cassandra cluster. We will create three volumes named testvol1, testvol2, and testvol3.
$ docker volume create -d flocker --name=testvol1 -o size=10G
$ docker volume create -d flocker --name=testvol2 -o size=10G
$ docker volume create -d flocker --name=testvol3 -o size=10G
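As with the network, you can confirm the volumes were created with the Flocker driver before moving on:
$ docker volume ls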
Once your network and volumes resources are in place, you can copy this Docker Compose file or pull it from the repository linked earlier.
Notice in the Docker Compose v2 file below that we reference our Cassandra containers by name in CASSANDRA_BROADCAST_ADDRESS and CASSANDRA_SEEDS instead of by IP address. This is because the containers are deployed on our overlay network overlay-net and can access each other by name! We also reference a Flocker volume for each Cassandra container to store state. This makes our Cassandra cluster very flexible: the Cassandra containers will always be able to connect to each other no matter where they are started, as long as they are part of the network.
version: '2'
services:
  cassandra-1:
    image: cassandra
    container_name: cassandra-1
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-1"
    ports:
      - 7000
    volumes:
      - "cassandra1:/var/lib/cassandra"
    restart: always
  cassandra-2:
    image: cassandra
    container_name: cassandra-2
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-2"
      CASSANDRA_SEEDS: "cassandra-1"
    ports:
      - 7000
    depends_on:
      - cassandra-1
    volumes:
      - "cassandra2:/var/lib/cassandra"
    restart: always
  cassandra-3:
    image: cassandra
    container_name: cassandra-3
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-3"
      CASSANDRA_SEEDS: "cassandra-1"
    ports:
      - 7000
    depends_on:
      - cassandra-2
    volumes:
      - "cassandra3:/var/lib/cassandra"
    restart: always
volumes:
  cassandra1:
    external:
      name: testvol1
  cassandra2:
    external:
      name: testvol2
  cassandra3:
    external:
      name: testvol3
networks:
  default:
    external:
      name: overlay-net
Next, we can instruct Docker Compose to start our Cassandra cluster.
$ docker-compose -f cassandra-multi.yml up -d
Pulling cassandra-1 (cassandra:latest)...
ip-10-0-195-84: Pulling cassandra:latest... : downloaded
ip-10-0-57-22: Pulling cassandra:latest... : downloaded
Creating cassandra-1
Creating cassandra-2
Creating cassandra-3
View the running containers. Notice that our Cassandra nodes are deployed to two different Docker hosts; this is because we are using Swarm to schedule our Cassandra containers.
Note: We enable restart: always to keep our Cassandra containers up, and because Swarm may deploy the containers faster than Cassandra can bootstrap, causing an "Other bootstrapping/leaving/moving nodes detected" error. When this happens, the restart policy retries the bootstrap, and you will see a "Detected previous bootstrap failure; retrying" message.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
75868663fc45 cassandra "/docker-entrypoint.s" 22 minutes ago Up 22 minutes 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp, 10.0.195.84:32773->7000/tcp ip-10-0-195-84/cassandra-2
cc5ee1fc0faa cassandra "/docker-entrypoint.s" 22 minutes ago Up 20 minutes 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp, 10.0.57.22:32775->7000/tcp ip-10-0-57-22/cassandra-3
0d8ea530863f cassandra "/docker-entrypoint.s" 22 minutes ago Up 22 minutes 7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp, 10.0.57.22:32773->7000/tcp ip-10-0-57-22/cassandra-1
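You can also ask Cassandra itself whether all three nodes have joined the ring. The cassandra image ships with nodetool, so a quick check (run through the Swarm manager, which will route it to the right host) looks like this:
$ docker exec -it cassandra-1 nodetool status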
Note that the Cassandra containers are using the Flocker volumes.
SSH into one of your Docker hosts that is running one of the Cassandra containers.
$ docker inspect -f '{{ .Mounts }}' cassandra-2
[{testvol2 /flocker/40948462-8d21-4165-b5d5-9c7d148016f3 /var/lib/cassandra flocker rw true rprivate}]
$ df -h
Filesystem Size Used Avail Use% Mounted on
.
.
/dev/xvdh 9.8G 24M 9.2G 1% /flocker/40948462-8d21-4165-b5d5-9c7d148016f3
/dev/xvdf 9.8G 53M 9.2G 1% /flocker/c2915fbb-7b85-4c58-9069-ce08ffb3e064
$ ls /flocker/40948462-8d21-4165-b5d5-9c7d148016f3/
commitlog data hints saved_caches
Next, let's connect to our Cassandra cluster and interact with it. We can run a one-off CLI container on the same network and connect to any of our cassandra-X nodes.
$ docker run -it --rm --net=overlay-net cassandra sh -c 'exec cqlsh "cassandra-1"'
Connected to Test Cluster at cassandra-1:9042.
[cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh>
cqlsh> SHOW VERSION;
[cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
cqlsh> SHOW HOST;
Connected to Test Cluster at cassandra-1:9042.
Improper SHOW command.
cqlsh> DESCRIBE CLUSTER;
Cluster: Test Cluster
Partitioner: Murmur3Partitioner
cqlsh> DESCRIBE TABLES;
Keyspace system_traces
----------------------
events sessions
Keyspace system_schema
----------------------
tables triggers views keyspaces dropped_columns
functions aggregates indexes types columns
Keyspace system_auth
--------------------
resource_role_permissons_index role_permissions role_members roles
Keyspace system
---------------
available_ranges peers paxos range_xfers
batches compaction_history batchlog local
"IndexInfo" sstable_activity size_estimates hints
views_builds_in_progress peer_events built_views
Keyspace system_distributed
---------------------------
repair_history parent_repair_history
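As a quick smoke test, you can create a keyspace that replicates across all three nodes and write some data to it. The demo keyspace and users table below are just hypothetical examples, not part of the repository:
cqlsh> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
cqlsh> CREATE TABLE demo.users (id uuid PRIMARY KEY, name text);
cqlsh> INSERT INTO demo.users (id, name) VALUES (uuid(), 'flocker');
cqlsh> SELECT * FROM demo.users;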
There you have it. You’ve deployed Cassandra on Docker Swarm and Flocker, with overlay networking, using Docker Compose.
We’d love to hear your feedback!