At Nuxeo, we've been using Docker for two years now, which is quite a long time on the Docker time scale! We built our Nuxeo.io infrastructure on top of it and also use it in our CI chain.
I attended DockerCon EU 2015 in Barcelona and was very impressed. Many of the topics announced last year have now become reality, and they certainly caught my interest. One example is Docker Swarm, which is now able to schedule 50k containers in 30 minutes on AWS. Other examples are Docker Compose, now fully compatible with Swarm, and Docker Machine, which lets you set up a Docker Swarm cluster in minutes.
Additionally, network and volume plugins have become first-class citizens of Docker! They are now fully pluggable with solutions such as Weave or Flocker. This, I think, is a huge change in the Docker ecosystem because it allows us to build stateful, isolated applications on top of containers.
In this post, I will share some feedback on my experiments with these new capabilities.
Building an Open/Closed Container
As a developer, I often think about the open/closed principle. For Nuxeo Cloud, since we don't have a generic dedicated Nuxeo image, I wrote a new and more open version of our Dockerfile, which is now on GitHub. It lets you set some of the basic configuration properties. If that's not enough, there is a way to add your own piece of nuxeo.conf.
It should be integrated into the official Docker library soon, so you will be able to run a Nuxeo server on your Docker infrastructure:
$ docker run -d --name nuxeo nuxeo/nuxeo:LTS-2015
This is cool, but it runs with an embedded DB and an embedded Elasticsearch and is only meant for testing. We could link it to other DB containers, but that setup is rather complex and relies on environment variables, which is why I was never convinced that links were a good general solution. It looks like I was not the only one, since links are now deprecated in favor of the networking support.
Docker Compose and Networking Support
Docker Compose allows us to define an application as a set of containers by writing a simple YAML configuration file. For instance, the following configuration defines a Nuxeo application with four containers: a Postgres DB, an Elasticsearch node, a Redis server and, of course, a Nuxeo server.
redis:
  image: redis
postgres:
  # Building our own image lets us customize the Postgres setup a bit,
  # but it is based on the official Postgres image
  build: ./postgres/.
  environment:
    - POSTGRES_USER=nuxeo
    - POSTGRES_PASSWORD=nuxeo
elasticsearch:
  image: elasticsearch:1
nuxeo:
  image: nuxeo/nuxeo
  restart: always
  ports:
    - "8080:8080"
  environment:
    - NUXEO_DB_TYPE=postgresql
    - NUXEO_DB_HOST=nuxeocompose_postgres_1
    - NUXEO_ES_HOST=nuxeocompose_elasticsearch_1
    - NUXEO_REDIS_HOST=nuxeocompose_redis_1
We then start the application with the following command:
$ docker-compose --x-networking --x-network-driver=overlay up -d
This starts the four containers, creates a dedicated network for the application and populates the /etc/hosts file of each container with the addresses of the other services. So instead of links, we now have host names that resolve from all our containers. It is then very easy to configure the Nuxeo Platform with the dedicated environment variables. The created network is private: no other container can see it unless it is explicitly allowed to with Docker network commands.
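To make the /etc/hosts mechanism concrete, here is a minimal sketch of a lookup against a hosts-format file. The `lookup_host` helper is hypothetical and only there for illustration; it is not part of Docker or Compose.

```shell
# Hypothetical helper: print the IP address registered for a name in a
# hosts-format file (defaults to /etc/hosts). Returns non-zero if the
# name is not listed.
lookup_host() {
  name="$1"
  hosts_file="${2:-/etc/hosts}"
  awk -v name="$name" '
    $1 ~ /^#/ { next }                  # skip comment lines
    {
      # field 1 is the address, the remaining fields are host names
      for (i = 2; i <= NF; i++) {
        if ($i == name) { print $1; found = 1; exit }
      }
    }
    END { exit !found }
  ' "$hosts_file"
}
```

Assuming the Nuxeo container follows the same naming pattern as the other services (nuxeocompose_nuxeo_1, which is a guess), `docker exec nuxeocompose_nuxeo_1 cat /etc/hosts` should show the entries that this kind of lookup searches, for example the one for nuxeocompose_postgres_1.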
The network support in Compose is still experimental, which is why you have to enable it manually via the --x-networking command line option.
Compose also allows you to scale each part of your infrastructure. Of course, each part has to be designed for scaling, but the command itself is very simple. For instance, to add two more Elasticsearch nodes to our infrastructure, all we have to do is:
$ docker-compose --x-networking --x-network-driver=overlay scale elasticsearch=3
My Application in the Cloud
For now, we are just deploying our application on a single node, which means that even if we scale individual parts of our infrastructure, we will be limited by the host machine's resources. That's the type of problem that Docker Swarm can solve. Swarm is basically a tool that knows how to manage a cluster of Docker hosts and can schedule your containers across them.
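To give an idea of what "scheduling" means here: Swarm's spread strategy favors the node that currently runs the fewest containers. The following is a toy sketch of that ranking, not Swarm's actual code. The `pick_spread_node` function, the container counts and the alphabetical tie-break are all made up for the example, and real Swarm also takes filters and resource constraints into account.

```shell
# Toy "spread" decision: read "node-name container-count" lines on stdin
# and print the node running the fewest containers.
# Ties are broken alphabetically here, which is a simplification.
pick_spread_node() {
  sort -k2,2n -k1,1 | head -n 1 | cut -d' ' -f1
}

# Hypothetical cluster state, one node per line.
printf '%s\n' \
  'swarm-master 3' \
  'swarm-agent-00 1' \
  'swarm-agent-01 2' | pick_spread_node
```

With these sample numbers, the next container would land on swarm-agent-00.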
The clever thing about Swarm is that it exposes the exact same API as Docker, which means that launching a container on a cluster and launching it on a single host use the exact same command. Sending a command to Swarm is just the same as sending it to Docker.
From a network point of view, it works just like it does on a single host. You can have the same private overlay network as in our first sample, but the overlay now spans the hosts of your cluster. Since Docker Compose talks to the Docker API, it can talk to Swarm too. It means that you can send the same Compose command to Swarm to launch your application. The containers will be distributed across the cluster according to the scheduling strategy, which by default spreads them over the hosts. To sum up, here is a sample session to set up a cluster and deploy our application on it:
$ # Simplified version of the cluster's creation
$ docker-machine create --swarm --swarm-master swarm-master
$ docker-machine create --swarm swarm-agent-00
$ docker-machine create --swarm swarm-agent-01
$
$ eval $(docker-machine env --swarm swarm-master)
$ docker-compose --x-networking --x-network-driver=overlay up -d
The Docker Machine commands are simplified here. The full version is in the setup script of the sample GitHub project. In fact, it requires setting up a service discovery server (Consul, etcd or ZooKeeper) and then referencing it when creating each machine. This server allows Docker Swarm to set up the overlay network correctly and makes things as smooth as possible.
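As a rough idea of what referencing the discovery server looks like, here is a sketch that assumes a Consul server is already running. It is not the setup script from the sample project: the Consul address, the VirtualBox driver and the eth1 interface are assumptions to adapt, and the `run` and `create_node` helpers are hypothetical. By default it only prints the commands.

```shell
# Assumption: a Consul server is already reachable at this (made-up) address.
KV_URL="${KV_URL:-consul://192.0.2.10:8500}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create one Swarm node that registers itself in the key-value store.
# Extra arguments (such as --swarm-master) are passed before the machine name.
create_node() {
  run docker-machine create -d virtualbox \
    --swarm --swarm-discovery="$KV_URL" \
    --engine-opt="cluster-store=$KV_URL" \
    --engine-opt="cluster-advertise=eth1:2376" \
    "$@"
}

create_node --swarm-master swarm-master
create_node swarm-agent-00
create_node swarm-agent-01
```

The --swarm-discovery option tells Swarm where the nodes register, while the two --engine-opt values point each Docker daemon at the same store so that the overlay network can be shared between hosts.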
So, we've been able to deploy our multi-tier application on a cluster of hosts with only a few commands. There is, however, a problem with this setup: if a host goes down or if we decide to move a container from one host to another, we may lose our data, since we don't use volumes to persist it. Even if we had used volumes, they would be tied to the host on which the container is running. We still have to find a way to deal with that in order to have a completely stateful multi-host application. I will discuss that in my next blog post. Stay tuned!