Docker Swarm Part I: Multi-Host Cassandra Cluster

In this first of three posts, see how to install and configure Docker Swarm in preparation for a Cassandra Cluster.

This is the first post in a series of examples showing how you can use Docker Swarm and Flocker together. The series takes you through a Flocker + Swarm cluster in several use cases, including the following:

  1. Setting up a Swarm Cluster with Consul for Service Discovery on a Flocker Cluster
  2. Creating a multi-service Twitter, NodeJS app without overlay networking and transitioning its configuration to use overlay networking.
  3. Creating a multi-host Cassandra Cluster with overlay networking and Flocker Volumes
  4. Creating a single Redis server with a Flocker volume and testing the experimental Swarm rescheduling on-node-failure feature.

This series is jam-packed with goodies and is meant to be read as a fun overview of how to use Flocker and Swarm together during your #SwarmWeek adventures.

This part of the series covers two items from the list above. Because it is the first part, we start by installing and configuring Swarm (item 1), then deploy the multi-host Cassandra cluster (item 3).

Setting up a Swarm Cluster With Consul

This portion assumes you already have three Docker hosts running Flocker. The examples use Ubuntu 14.04. Learn how to install Flocker here.

On one of your Docker hosts, we will start a Consul server. Docker uses Consul as a key/value store to hold cluster state, such as networking and manager/engine info. You can run Consul in a container, but I prefer to start it directly on the host so that stopping or restarting the Docker daemon does not take the Consul server down with it.

$ consul agent -server -data-dir="/tmp/consul" -bootstrap -advertise=<ip-of-consul-host> -http-port=8500 -client=<client-bind-address> &
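
Before moving on, it can help to confirm that the Consul server has elected a leader. The sketch below is one way to do that, assuming `curl` is installed and that Consul's HTTP API is reachable on port 8500 as configured above. The function names `consul_has_leader` and `wait_for_consul` are hypothetical helpers, not part of Consul.

```shell
# consul_has_leader BODY
# Succeeds when BODY (the response of GET /v1/status/leader) names a leader.
# Consul answers with a JSON string such as "10.0.0.5:8300", or "" when
# no leader has been elected yet.
consul_has_leader() {
  local body="$1"
  [ -n "$body" ] && [ "$body" != '""' ]
}

# wait_for_consul HOST [PORT] [RETRIES]
# Polls the status endpoint once per second until a leader is reported.
# Returns non-zero if no leader appears within RETRIES attempts.
wait_for_consul() {
  local host="$1" port="${2:-8500}" retries="${3:-30}" body
  while [ "$retries" -gt 0 ]; do
    body="$(curl -fsS "http://${host}:${port}/v1/status/leader" 2>/dev/null || true)"
    if consul_has_leader "$body"; then
      return 0
    fi
    retries=$((retries - 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_consul <ip-of-consul-host> && echo "Consul is ready"` blocks until the server answers or the retries run out.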

Once the consul server is up and running and before we enable Swarm to manage our cluster, we need to first prep our Docker daemon. The first thing we need to do is add some DOCKER_OPTS to the daemon on every node.   

#file /etc/default/docker
#Use DOCKER_OPTS to modify the daemon startup options.
DOCKER_OPTS="-H tcp://<listen-address>:2375 -H unix:///var/run/docker.sock --cluster-store=consul://<ip-of-consul-host>:8500/network --cluster-advertise=<this-nodes-private-ip>:2375"
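
Because the DOCKER_OPTS line differs only by IP address from node to node, a small helper can generate it. This is a minimal sketch: `docker_opts` is a hypothetical function, and binding the TCP listener to the node's own private IP is an assumption made here, not something the setup above specifies.

```shell
# docker_opts CONSUL_IP NODE_IP [LISTEN_ADDR]
# Prints a DOCKER_OPTS line for /etc/default/docker.
# LISTEN_ADDR defaults to NODE_IP (an assumption for this sketch); port 2375
# matches the --cluster-advertise port used in the article.
docker_opts() {
  local consul_ip="$1" node_ip="$2" listen_addr="${3:-$2}"
  printf 'DOCKER_OPTS="-H tcp://%s:2375 -H unix:///var/run/docker.sock --cluster-store=consul://%s:8500/network --cluster-advertise=%s:2375"\n' \
    "$listen_addr" "$consul_ip" "$node_ip"
}
```

On each node you could run something like `docker_opts <ip-of-consul-host> <this-nodes-private-ip> | sudo tee -a /etc/default/docker` and then restart the daemon.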

Make sure to restart the Docker daemon after you have made the above change to your Docker options. Then, on one of your nodes start a Primary Swarm Manager:

$ docker run -d -p 4000:4000 --restart=always swarm --experimental manage -H :4000 --replication --advertise <manager0_ip>:4000 consul://<consul_ip>:8500

On a second node, start a Secondary Swarm Manager Replica:

$ docker run -d -p 4000:4000 --restart=always swarm --experimental manage -H :4000 --replication --advertise <manager1_ip>:4000 consul://<consul_ip>:8500

Then, on every Docker host that will participate in the Swarm cluster, run the following join command.

$ docker run -d --restart=always swarm --experimental join --advertise=<node_ip>:2375 consul://<consul_ip>:8500
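
To check that every host actually registered itself, you can ask the discovery backend for its node list. The sketch below assumes the standalone `swarm list` command, which prints one `<ip>:<port>` entry per line. `count_swarm_nodes` is a hypothetical helper that only counts those lines.

```shell
# count_swarm_nodes
# Reads `swarm list` output on stdin and prints how many lines look like
# <address>:<port>. Prints 0 when nothing matches.
count_swarm_nodes() {
  grep -cE '^[^[:space:]]+:[0-9]+$' || true
}
```

A possible invocation is `docker run --rm swarm list consul://<consul_ip>:8500 | count_swarm_nodes`, which should print the number of hosts you joined.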

Now on your Primary Swarm Manager node, you can run the docker info command to see your Swarm cluster.   

$ docker -H <manager0_ip>:4000 info
Containers: 5
 Running: 5
 Paused: 0
 Stopped: 0
Images: 18
Server Version: swarm/1.1.3
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 7.67 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.3 LTS, storagedriver=aufs
  └ Error: (none)
  └ UpdatedAt: 2016-03-09T14:12:09Z
  └ Status: Healthy
  └ Containers: 4
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 7.67 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.13.0-63-generic, operatingsystem=Ubuntu 14.04.3 LTS, storagedriver=aufs
  └ Error: (none)
  └ UpdatedAt: 2016-03-09T14:11:52Z
Kernel Version: 3.13.0-63-generic
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 15.34 GiB
Name: 6650c975f163
Debug mode (client): false
Debug mode (server): false
Experimental: true
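
If you want to check this output from a script rather than by eye, the top-level fields are easy to extract. This is a sketch under the assumption that the output keeps the `Field: value` layout shown above. `swarm_info_field` and `check_swarm` are hypothetical helper names, and the manager address passed to `check_swarm` is whatever you use with `docker -H`.

```shell
# swarm_info_field FIELD
# Reads `docker info` output on stdin and prints the value of the first
# unindented "FIELD: value" line, for example Role or Nodes.
swarm_info_field() {
  local field="$1"
  sed -n "s/^${field}: //p" | head -n 1
}

# check_swarm MANAGER_ADDR EXPECTED_NODES
# Prints the manager role and node count, and succeeds only when the
# node count equals EXPECTED_NODES.
check_swarm() {
  local manager="$1" expected="$2" info role nodes
  info="$(docker -H "$manager" info 2>/dev/null)" || return 1
  role="$(printf '%s\n' "$info" | swarm_info_field Role)"
  nodes="$(printf '%s\n' "$info" | swarm_info_field Nodes)"
  echo "role=${role} nodes=${nodes}"
  [ "$nodes" = "$expected" ]
}
```

With the cluster above, `check_swarm <manager0_ip>:4000 2` would report the role and node count and fail if a node is missing.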

That's it! You're ready to start deploying applications.

Warning: this configuration does not include setting up TLS. Learn more about Docker Security.

Creating a Multi-Host Cassandra Cluster

Now for our first application. We have Flocker and Docker Swarm set up to support overlay networking so we can start on our first example. This example will use this repository for creating a Multi-Host Cassandra Cluster.

What is Cassandra?

“Apache Cassandra™ is a massively scalable open source NoSQL database. Cassandra is perfect for managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud. Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure, along with a powerful dynamic data model designed for maximum flexibility and fast response times.”

Cassandra has automatic data distribution and built-in, customizable data replication to support transparent partitioning and redundant copies of its data. Learn more about Cassandra here.

Why Use Flocker With Cassandra?

Cassandra states that in-memory approaches to data storage can give you “blazing speed”; however, the cost of being limited to small data sets may not be desirable. Cassandra implements a “commit-log based persistence design” that lets you tune the trade-off between security and performance to your needs. Allowing Cassandra to write to disk improves the security (durability) of your data, and Flocker helps you keep that on-disk state when Cassandra runs in containers. To learn more about Cassandra and persistence, read the article What persistence is and what does it matter.

Running a Multi-node Cassandra Cluster With Your Swarm Cluster

The first thing we want to do is create an overlay network for our cluster to use. Docker multi-host networking allows containers to easily span multiple machines while being able to access containers by name over the same isolated network. Let's create an overlay network in our setup.

Note: run these docker commands against your Swarm Manager!   

$ docker network create --driver overlay --subnet=<subnet-cidr> overlay-net
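
If you rerun your setup scripts, creating the network a second time will fail because the name is taken. The sketch below makes the step idempotent by inspecting first. `ensure_overlay_network` is a hypothetical helper, and it assumes your `docker` client is already pointed at the Swarm Manager (for example through DOCKER_HOST).

```shell
# ensure_overlay_network NAME SUBNET
# Creates the overlay network only when `docker network inspect` cannot
# find it; otherwise reports that it already exists.
ensure_overlay_network() {
  local name="$1" subnet="$2"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create --driver overlay --subnet="$subnet" "$name"
  fi
}
```

You would call it as `ensure_overlay_network overlay-net <subnet-cidr>`, using whichever subnet you chose for the cluster.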

Next, we need to create the persistent volume resources needed by our Cassandra cluster. We will create three volumes named testvol1, testvol2 and testvol3.

$ docker volume create -d flocker --name=testvol1 -o size=10G
$ docker volume create -d flocker --name=testvol2 -o size=10G
$ docker volume create -d flocker --name=testvol3 -o size=10G
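
The three commands differ only in the volume number, so a loop works just as well and scales if you add Cassandra nodes later. This is a small sketch. `create_flocker_volumes` is a hypothetical helper that issues the same `docker volume create` call shown above for each index.

```shell
# create_flocker_volumes COUNT SIZE [PREFIX]
# Creates PREFIX1..PREFIXCOUNT as Flocker volumes of the given size.
# PREFIX defaults to "testvol" to match the names used in this article.
create_flocker_volumes() {
  local count="$1" size="$2" prefix="${3:-testvol}" i=1
  while [ "$i" -le "$count" ]; do
    docker volume create -d flocker --name="${prefix}${i}" -o size="$size"
    i=$((i + 1))
  done
}
```

Running `create_flocker_volumes 3 10G` is equivalent to the three commands above.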

Once your network and volume resources are in place, you can copy this Docker Compose file or pull it from the repository linked earlier.

Notice in the Docker Compose v2 file below that we reference our Cassandra containers by name in CASSANDRA_BROADCAST_ADDRESS and CASSANDRA_SEEDS instead of by IP address. This works because the containers are deployed on our overlay network overlay-net and can reach each other by name. We also reference a Flocker volume for each Cassandra container to store state. This makes our Cassandra cluster very flexible: the Cassandra containers can always connect to each other, no matter where they are started, as long as they are part of the network.

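version: '2'
services:
  cassandra-1:
    image: cassandra
    container_name: cassandra-1
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-1"
    ports:
    - 7000
    volumes:
    - "cassandra1:/var/lib/cassandra"
    restart: always
  cassandra-2:
    image: cassandra
    container_name: cassandra-2
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-2"
      CASSANDRA_SEEDS: "cassandra-1"
    ports:
    - 7000
    depends_on:
    - cassandra-1
    volumes:
    - "cassandra2:/var/lib/cassandra"
    restart: always
  cassandra-3:
    image: cassandra
    container_name: cassandra-3
    environment:
      CASSANDRA_BROADCAST_ADDRESS: "cassandra-3"
      CASSANDRA_SEEDS: "cassandra-1"
    ports:
    - 7000
    depends_on:
    - cassandra-2
    volumes:
    - "cassandra3:/var/lib/cassandra"
    restart: always
volumes:
  cassandra1:
    external:
      name: testvol1
  cassandra2:
    external:
      name: testvol2
  cassandra3:
    external:
      name: testvol3
networks:
  default:
    external:
      name: overlay-net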

Next, we can instruct Docker Compose to start our Cassandra cluster.

$ docker-compose -f cassandra-multi.yml up -d
Pulling cassandra-1 (cassandra:latest)...
ip-10-0-195-84: Pulling cassandra:latest... : downloaded
ip-10-0-57-22: Pulling cassandra:latest... : downloaded
Creating cassandra-1
Creating cassandra-2
Creating cassandra-3

View the running containers. Notice that our Cassandra nodes are deployed to two different Docker hosts; this is because we are using Swarm to schedule our Cassandra containers.

Note: We set restart: always to keep our Cassandra containers up. Swarm may start the containers faster than Cassandra can bootstrap, which causes an “Other bootstrapping/leaving/moving nodes detected” error. When that happens, the restart lets the node retry the bootstrap, and you will see a “Detected previous bootstrap failure; retrying” message.

$ docker ps
CONTAINER ID        IMAGE                                    COMMAND                  CREATED             STATUS              PORTS                                                                 NAMES
75868663fc45        cassandra                                "/docker-entrypoint.s"   22 minutes ago      Up 22 minutes       7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp,>7000/tcp   ip-10-0-195-84/cassandra-2
cc5ee1fc0faa        cassandra                                "/docker-entrypoint.s"   22 minutes ago      Up 20 minutes       7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp,>7000/tcp    ip-10-0-57-22/cassandra-3
0d8ea530863f        cassandra                                "/docker-entrypoint.s"   22 minutes ago      Up 22 minutes       7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp,>7000/tcp    ip-10-0-57-22/cassandra-1
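
Because nodes may restart while bootstrapping, "Up" in `docker ps` does not yet mean the Cassandra ring is complete. One way to wait for it is to poll `nodetool status` and count nodes in the UN (Up/Normal) state. This is a sketch: `count_up_normal` and `wait_for_cassandra_ring` are hypothetical helpers, and the 10-second polling interval is an arbitrary choice.

```shell
# count_up_normal
# Reads `nodetool status` output on stdin and prints the number of nodes
# whose state column is UN (Up/Normal).
count_up_normal() {
  grep -cE '^UN[[:space:]]' || true
}

# wait_for_cassandra_ring CONTAINER EXPECTED [RETRIES]
# Polls nodetool inside CONTAINER until at least EXPECTED nodes are UN.
wait_for_cassandra_ring() {
  local container="$1" expected="$2" retries="${3:-30}" up
  while [ "$retries" -gt 0 ]; do
    up="$(docker exec "$container" nodetool status 2>/dev/null | count_up_normal)"
    if [ "$up" -ge "$expected" ]; then
      return 0
    fi
    retries=$((retries - 1))
    sleep 10
  done
  return 1
}
```

For this cluster, `wait_for_cassandra_ring cassandra-1 3` returns once all three nodes have joined.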

Note that the Cassandra containers are using the Flocker volumes.

SSH into one of your Docker hosts that is running a Cassandra cluster container.

$ docker inspect -f "{{ .Mounts }}" cassandra-2 | grep flocker
[{testvol2 /flocker/40948462-8d21-4165-b5d5-9c7d148016f3 /var/lib/cassandra flocker rw true rprivate}

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvdh       9.8G   24M  9.2G   1% /flocker/40948462-8d21-4165-b5d5-9c7d148016f3
/dev/xvdf       9.8G   53M  9.2G   1% /flocker/c2915fbb-7b85-4c58-9069-ce08ffb3e064

$ ls /flocker/40948462-8d21-4165-b5d5-9c7d148016f3/
commitlog  data  hints  saved_caches

Next, let's connect to our Cassandra cluster and interact with it. We can run a one-off CLI container on the same network and connect to any of our cassandra-X nodes.

$ docker run -it --rm --net=overlay-net cassandra sh -c 'exec cqlsh "cassandra-1"'
Connected to Test Cluster at cassandra-1:9042.
[cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh> SHOW VERSION;
[cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
cqlsh> SHOW HOST;
Connected to Test Cluster at cassandra-1:9042.
Improper SHOW command.
cqlsh> DESCRIBE CLUSTER;

Cluster: Test Cluster
Partitioner: Murmur3Partitioner

cqlsh> DESCRIBE TABLES;

Keyspace system_traces
events  sessions

Keyspace system_schema
tables     triggers    views    keyspaces  dropped_columns
functions  aggregates  indexes  types      columns

Keyspace system_auth
resource_role_permissons_index  role_permissions  role_members  roles

Keyspace system
available_ranges          peers               paxos           range_xfers
batches                   compaction_history  batchlog        local
"IndexInfo"               sstable_activity    size_estimates  hints
views_builds_in_progress  peer_events         built_views

Keyspace system_distributed
repair_history  parent_repair_history
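
To see the cluster hold data of your own, you could create a keyspace that is replicated to all three nodes. The sketch below only builds the CQL statement. `keyspace_cql` is a hypothetical helper, `demo` is an example keyspace name, and SimpleStrategy is chosen here purely for illustration.

```shell
# keyspace_cql NAME REPLICATION_FACTOR
# Prints a CREATE KEYSPACE statement using SimpleStrategy with the given
# replication factor.
keyspace_cql() {
  local name="$1" rf="$2"
  printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};" \
    "$name" "$rf"
}
```

The statement can then be passed to cqlsh non-interactively with its `-e` option, for example `docker run --rm --net=overlay-net cassandra cqlsh cassandra-1 -e "$(keyspace_cql demo 3)"`.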

There you have it. You’ve deployed Cassandra with Docker Swarm and Flocker with overlay networking using Docker Compose.

Happy Swarming! And be sure to check out Part II and Part III!

We’d love to hear your feedback!

Published at DZone with permission of Ryan Wallner, DZone MVB. See the original article here.
