Using Docker for Dev Environment Setup Automation

How long does it take to set up your current project locally? Two days? Two hours? With Docker, this can be done virtually in one command.

In this article, I show how Docker can help you set up a runtime environment and a database with a predefined dataset, tie them together, and run it all isolated from everything else on your machine.

Let’s start with the goals:

  1. I want to have isolated Java SDK, Scala SDK, and SBT (build tool).
  2. I still want to be able to edit and rebuild my project code easily from my IDE.
  3. I need a MongoDB instance running locally.
  4. Last but not least, I want to have some minimal data set in my MongoDB out of the box.
  5. Ah, right, all of the above must be downloaded and configured in a single command.

All these goals can be achieved by running and tying together just three Docker containers. Here is a high-level overview of these containers:

High-level design overview

It’s impressive that such a simple setup brings all listed benefits, isn’t it? Let’s dive in.

Step I: Development Container Configuration

The first two goals are covered by the Dev Container, so let’s start with that one.
Our minimal project structure should look like this:


The structure will become more sophisticated as the article progresses, but for now it’s sufficient. The Dockerfile is where the Dev Container’s image is described (if you aren’t yet familiar with images and containers, you should read this first). Let’s look inside:

FROM java:openjdk-8u72-jdk

# install SBT
RUN apt-get update \
    && apt-get install -y apt-transport-https \
    && echo "deb https://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list \
    && apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823 \
    && apt-get update \
    && apt-get -y install sbt

ENV SBT_OPTS="-Xmx1200M -Xss512K -XX:MaxMetaspaceSize=512M -XX:MetaspaceSize=300M"

# make SBT dependencies an image layer: no need to update them on every container rebuild
COPY project/build.properties /setup/project/build.properties
RUN cd /setup \
    && sbt test \
    && cd / \
    && rm -r /setup

EXPOSE 8080
VOLUME /app
WORKDIR /app

CMD ["sbt"]

  1. The official OpenJDK 1.8 image is specified as the base image in the first line of the Dockerfile. Note: official repositories are available for many other popular platforms. They are easily recognizable by their naming convention: the name never contains a slash, i.e., it is always just a repository name without an author name.

  2. Install SBT and set the SBT_OPTS environment variable.

  3. An SBT-specific trick (safe to skip if you don’t use SBT) that speeds up container startup. As you may know, nothing is eternal in a container: it can be (and normally is) destroyed every now and then. By making the required dependencies a part of the image, we download them just once, at image build time, instead of on every container start.

  4. Declare port 8080 as the one the application listens on inside the container. We’ll refer to it later to access the application.

  5. Declare a new volume under the /app folder and run the subsequent commands from there. We will use it in a moment to make all project files accessible from two worlds: from the host and from the container.

  6. The default command runs SBT in interactive mode on container startup. For build tools without an interactive mode (like Maven), this can be CMD ["/bin/bash"].

Now, we can already test the image:

docker build -t myfancyimage .
docker run --rm -it -v "$(pwd)":/app -p 9090:8080 myfancyimage

The first command builds an image from our Dockerfile and names it myfancyimage. The second command creates and starts a container from that image. It binds the current folder to the container’s volume ($(pwd):/app) and binds host port 9090 to the container’s exposed port 8080.

Step II: MongoDB Container Configuration

OK, now it’s time to bring in some data. We start by adding the Mongo Engine container, and later supply it with a sample data snapshot. As we’re about to run multiple containers linked together, it’s convenient to describe how to run them in a docker-compose configuration file. Let’s add docker-compose.yml to the project’s root with the following content:

version: '2'
services:
  dev-container:               # ❶
    build:
      context: .
      dockerfile: Dockerfile
    image: myfancyimage
    ports:
      - "9090:8080"
    volumes:
      - .:/app
    links:                     # ❸
      - mongo-engine
  mongo-engine:                # ❷
    image: mongo:3.2
    command: --directoryperdb

  1. The commands for building and running myfancyimage are translated into the dev-container service definition.

  2. Container mongo-engine with MongoDB 3.2 from official DockerHub repository.

  3. Link mongo-engine to dev-container: mongo-engine will start prior to dev-container and they will share a network. MongoDB is available to dev-container by the URL "mongodb://mongo-engine/".

Let’s try it:

docker-compose run --service-ports dev-container

It’s important to add the --service-ports flag to enable configured ports mapping.
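
Linking guarantees that mongo-engine is started before dev-container, but not that MongoDB is already accepting connections when the application first tries to connect. The following is a minimal sketch of a wait helper, assuming bash is available in the dev container and MongoDB listens on its default port 27017. The function name wait_for_port is hypothetical and not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the article): poll a TCP port until it
# accepts connections or the number of attempts is exhausted.

wait_for_port() {
  local host=$1 port=$2 retries=${3:-30}
  local i
  for ((i = 0; i < retries; i++)); do
    # /dev/tcp/<host>/<port> is a bash pseudo-device; the redirection
    # succeeds only if a TCP connection can be opened.
    if (echo > "/dev/tcp/${host}/${port}") >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

Inside the dev container it could be used as `wait_for_port mongo-engine 27017 && sbt run`, where the host name comes from the link defined in docker-compose.yml.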

Step III: Data Container Configuration

All right, here comes the hardest part: sample data distribution. Unfortunately, there’s no suitable mechanism for distributing Docker data volumes. A few Docker volume managers exist (e.g., Flocker, the Azure Volume driver), but these tools serve other goals.

Note: An alternative solution would be to restore data from DB dump programmatically or even generate it randomly. But this approach is not generic, i.e., it involves specific tools and scripts for each DB, and in general is more complicated.

The data distribution mechanism we’re seeking must support two operations:

  1. Replicate a fairly small dataset from a remote shared repository to the local environment.
  2. Publish a new or modified dataset to the remote repository.

One obvious approach is to distribute data via docker images. In this case, a remote repository is the same place we store our Docker images. It can be either DockerHub or a private Docker Registry instance. The solution described below can work with both.

Meeting the first requirement is easy: we need to run a container from a data image, mark the data folder as a volume, and link that volume (via the --volumes-from argument) to the Mongo Engine container.
The second requirement is more complicated. After making changes inside the volume, we cannot simply commit those changes back to the Docker image: a volume is technically not a part of the modifiable top layer of a container. In simpler words, the Docker daemon just doesn’t see any changes to commit.

Here’s the trick: if we can read the changed data but cannot commit it from the volume, then we need to copy it somewhere outside of all volumes first, so that the daemon detects the changes. Applying the trick has a not-so-obvious consequence: we cannot create a volume directly from the data folder but have to use another path for it, and then copy all data to the volume when the container starts. Otherwise, we’d have to alter the volume path depending on where the data is stored this time, and that is hard to automate.

The whole process of cloning and saving a dataset is displayed on the diagrams below:

Making data snapshot available to other containers as a volume on startup.

Committing changes to new image and pushing it to storage.

We’ll dig into scripts for taking and applying data snapshots a bit later. For now, let’s assume they are present in the data snapshot container’s /usr folder. Here is how the docker-compose.yml is updated with the data container definition:

version: '2'
services:
  dev-container:
    build:
      context: .
      dockerfile: Dockerfile
    image: myfancyimage
    ports:
      - "9090:8080"
    volumes:
      - .:/app
    links:
      - mongo-engine
  mongo-engine:
    image: mongo:3.2
    command: --directoryperdb
    volumes_from:                      # ❶
      - data-snapshot
    depends_on:
      - data-snapshot
  data-snapshot:
    image: rgorodischer/data-snapshot:scratch
    volumes:
      - /data/active                   # ❷
    command: /usr/apply_snapshot.sh    # ❸

  1. Link volumes from the data-snapshot container defined below.

  2. Make a volume from folder /data/active in the data-snapshot container.

  3. Run /usr/apply_snapshot.sh script on data-snapshot container startup.

Now, let’s see what the scripts are doing. apply_snapshot.sh is simply copying /data/snapshot folder contents to the volume folder /data/active (see the second diagram). Here’s its full listing:

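#!/bin/bash
set -e

DEST=/data/active

mkdir -p "$DEST"
# copy the snapshot only into an empty volume, so existing data is never overwritten
if [[ -z $(ls -A "$DEST") ]]; then
  cp -a /data/snapshot/. "$DEST"
else
  echo "$DEST is not empty."
fi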

The accompanying script take_snapshot.sh does the opposite: it replaces the contents of /data/snapshot with the contents of the /data/active folder. It also removes files with the .lock extension, which is the only MongoDB-specific action here (and more a precaution than a necessity). A listing of take_snapshot.sh is shown below:

#!/bin/bash
set -e

rm -rf /data/snapshot
mkdir -p /data/snapshot
cp -a /data/active/. /data/snapshot
# quote the pattern so the shell does not expand it before find sees it
find /data/snapshot -type f -name '*.lock' -delete

Dockerfile for data-snapshot container and its image can be found at GitHub and DockerHub respectively.
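
To see the round trip without Docker, the copy logic of both scripts can be tried on throwaway folders. The sketch below mirrors apply_snapshot.sh and take_snapshot.sh but takes the directories as arguments, which is an assumption made for illustration (the article’s scripts use the fixed paths /data/snapshot and /data/active). The file names in demo are made up.

```shell
#!/usr/bin/env bash
# Sketch only: same copy logic as apply_snapshot.sh and take_snapshot.sh,
# parameterized by directory so it can run outside a container.

apply_snapshot() {
  local snapshot=$1 active=$2
  mkdir -p "$active"
  # Populate the active folder only when it is empty.
  if [ -z "$(ls -A "$active")" ]; then
    cp -a "$snapshot/." "$active"
  else
    echo "$active is not empty."
  fi
}

take_snapshot() {
  local active=$1 snapshot=$2
  # Replace the snapshot with the current contents of the active folder.
  rm -rf "$snapshot"
  mkdir -p "$snapshot"
  cp -a "$active/." "$snapshot"
  find "$snapshot" -type f -name '*.lock' -delete
}

demo() {
  local root
  root=$(mktemp -d)
  mkdir -p "$root/snapshot"
  echo "seed" > "$root/snapshot/users.bson"      # hypothetical data file
  apply_snapshot "$root/snapshot" "$root/active"
  echo "new" > "$root/active/orders.bson"        # simulate a change made by the DB
  touch "$root/active/mongod.lock"               # lock file that must not be published
  take_snapshot "$root/active" "$root/snapshot"
  ls "$root/snapshot"
  rm -rf "$root"
}
```

Calling `demo` lists orders.bson and users.bson: the new file made it into the snapshot and the lock file was dropped.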

Taking a snapshot is directed externally from publish_snapshot.sh:

#!/bin/bash
set -e

# REGISTRY, REPOSITORY, DATA_CONTAINER and SNAPSHOT_TAKER must be set before this point;
# their original values are not shown in this listing.
: "${REGISTRY:?}" "${REPOSITORY:?}" "${DATA_CONTAINER:?}" "${SNAPSHOT_TAKER:?}"

data_container_id=$(docker ps -a | grep "$DATA_CONTAINER" | cut -d' ' -f 1)
if [[ -z $data_container_id ]]; then
  echo "Data container is not found."
  exit 1
fi

docker login "$REGISTRY"

echo -e "Taking the snapshot..\n"
docker run --name "$SNAPSHOT_TAKER" --volumes-from "$data_container_id" rgorodischer/data-snapshot:scratch /usr/take_snapshot.sh   # ❶

echo -en "Snapshot description:\n"
read -r msg

echo -en "Author:\n"
read -r author

echo -en "Snapshot tag (alphanumeric string without spaces):\n"
read -r tag

docker commit --author="$author" --message="$msg" "$SNAPSHOT_TAKER" "$REGISTRY/$REPOSITORY:$tag"   # ❷

docker push "$REGISTRY/$REPOSITORY:$tag"   # ❸

docker rm -f "$SNAPSHOT_TAKER" &> /dev/null   # ❹

  1. Run a temporary container from the data-snapshot:scratch image with the data-snapshot container’s volume linked, execute the /usr/take_snapshot.sh script on startup, and stop the container (it stops automatically because no other processes run there). I run the container from my image on DockerHub, but most likely you want to use your own copy.

  2. Commit changes to new local image tagged with $tag.

  3. Push new data snapshot image to your repository.

  4. Remove the temporary container.
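
The script accepts whatever tag is typed in, so a typo only surfaces when docker commit rejects it. Below is a minimal sketch of a check that could run right after the tag is read. The function name is_valid_tag is hypothetical and not part of the published script. It follows Docker’s tag rules (letters, digits, underscores, periods and hyphens, not starting with a period or hyphen, at most 128 characters), which are broader than the "alphanumeric" hint in the prompt.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of publish_snapshot.sh as published.

is_valid_tag() {
  local tag=$1
  [ -n "$tag" ] || return 1               # must not be empty
  [ "${#tag}" -le 128 ] || return 1       # at most 128 characters
  case $tag in
    [.-]*) return 1 ;;                    # must not start with '.' or '-'
    *[!A-Za-z0-9_.-]*) return 1 ;;        # only the allowed characters
  esac
  return 0
}
```

In the script it could be used as `is_valid_tag "$tag" || { echo "Invalid tag: $tag" >&2; exit 1; }` before the commit step.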

Now imagine you’ve just published a shiny new data snapshot tagged essential-data-set.
Then you simply update the data-snapshot definition in docker-compose.yml with the new tag and push the change to Git. Your teammate pulls those changes and can re-create the whole dev environment, including your new dataset, just by running a single command:

docker-compose run --service-ports dev-container
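
Updating the tag in docker-compose.yml can be scripted as well. The following is a sketch under a few assumptions: the helper name set_image_tag is made up, the image reference sits on its own `image:` line, and the repository name contains no `|` character (it is used as the sed delimiter).

```shell
#!/usr/bin/env bash
# Sketch: rewrite the tag of one image reference in a compose file.

set_image_tag() {
  local file=$1 repository=$2 tag=$3
  local tmp
  tmp=$(mktemp)
  # On lines of the form "image: <repository>:<old tag>", keep everything
  # up to the colon after the repository and replace the rest with the new tag.
  sed -E "s|^([[:space:]]*image:[[:space:]]*${repository}:).*$|\1${tag}|" "$file" > "$tmp"
  # Write back through redirection so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For the compose file shown earlier, `set_image_tag docker-compose.yml rgorodischer/data-snapshot essential-data-set` would change only the data-snapshot image line and leave mongo:3.2 untouched.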

As a final step, you can add some scripting to remove existing containers and volumes before updating the environment, so that docker-compose works smoothly on every run.
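
One possible shape for such a script is sketched below. It is not from the original setup: the names run and reset_env are hypothetical, and the service names are taken from docker-compose.yml. Setting DRY_RUN=1 prints the commands instead of executing them, which is handy for checking the sequence before letting it delete anything.

```shell
#!/usr/bin/env bash
# Sketch of a reset script; set DRY_RUN=1 to print commands without running them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reset_env() {
  # Stop and remove the project's containers, networks and volumes.
  run docker-compose down --volumes --remove-orphans
  # Fetch the data snapshot image referenced in docker-compose.yml.
  run docker-compose pull data-snapshot
  # Start a fresh dev container with its port mapping enabled.
  run docker-compose run --service-ports dev-container
}
```

Note that removing the volumes discards any unpublished changes in /data/active, so a snapshot should be published first if the local data matters.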

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
