Using Docker for Dev Environment Setup Automation
How long does it take to set up your current project locally? Two days? Two hours? With Docker, it can come down to virtually a single command.
In this article, I show how Docker can help you set up a runtime environment and a database with a predefined dataset, tie it all together, and run it isolated from everything else on your machine.
Let’s start with the goals:
- I want an isolated Java SDK, Scala SDK, and SBT (the build tool).
- I still want to be able to edit and rebuild my project code easily from my IDE.
- I need a MongoDB instance running locally.
- Last but not least, I want to have some minimal data set in my MongoDB out of the box.
- Ah, right, all of the above must be downloaded and configured in a single command.
All these goals can be achieved by running and tying together just three Docker containers: a dev container with the toolchain (JDK, Scala, SBT), a MongoDB engine container, and a data snapshot container holding a sample dataset.
It’s impressive that such a simple setup brings all listed benefits, isn’t it? Let’s dive in.
Step I: Development Container Configuration
The first two goals are covered by the Dev Container, so let’s start with that one.
Our minimal project structure should look like this:
/myfancyproject
/project
/module1
/module2
.dockerignore
build.sbt
Dockerfile
The structure will become more sophisticated as the article progresses, but for now it’s sufficient. The Dockerfile is the place where the Dev Container’s image is described (if you aren’t yet familiar with images and containers, read up on those concepts first). Let’s look inside:
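A quick word about .dockerignore: it keeps files the image build doesn’t need out of the build context, which makes docker build faster. The entries below are only a guess at what a typical Scala/SBT project would exclude; adjust them to your project:
# .dockerignore: keep build output, VCS metadata, and IDE files out of the build context
target/
project/target/
.git
.idea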
❶
FROM java:openjdk-8u72-jdk
❷
# install SBT
RUN apt-get update \
&& apt-get install -y apt-transport-https \
&& echo "deb https://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823 \
&& apt-get update \
&& apt-get -y install sbt
ENV SBT_OPTS="-Xmx1200M -Xss512K -XX:MaxMetaspaceSize=512M -XX:MetaspaceSize=300M"
❸
# make SBT dependencies an image layer: no need to update them on every container rebuild
COPY project/build.properties /setup/project/build.properties
RUN cd /setup \
&& sbt test \
&& cd / \
&& rm -r /setup
❹
EXPOSE 8080
❺
VOLUME /app
WORKDIR /app
❻
CMD ["sbt"]
❶ The official OpenJDK 1.8 image is specified as the base image in the first line of the Dockerfile. Note: official repositories are available for many other popular platforms. They are easy to recognize by the naming convention: the name never contains a slash, i.e., it is always just a repository name without an author name.
❷ Install SBT and set the SBT_OPTS environment variable.
❸ An SBT-specific trick (skip it safely if you don’t use SBT) to speed up container start-up. As you may know, nothing is eternal in a container: it can be (and normally is) destroyed every now and then. By baking the required dependencies into the image, we download them just once, which significantly speeds up container rebuilds (a minimal project/build.properties sketch follows this list).
❹ Declare port 8080 as the one the application listens on inside the container. We’ll refer to it later to access the application.
❺ Declare a new volume under the /app folder and run subsequent commands from there. We’ll use it in a moment to make all project files accessible from both worlds: the host and the container.
❻ The default command runs SBT’s interactive mode on container startup. For build tools without an interactive mode (like Maven), this can be CMD ["/bin/bash"].
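For the dependency-caching trick in step ❸, project/build.properties only has to pin the SBT version, so that the launcher can pre-fetch SBT and the matching Scala libraries at image build time. A minimal sketch (the version number here is an assumption; use whatever your project actually pins):
# project/build.properties: pins the SBT version that the launcher downloads
sbt.version=0.13.11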
Now, we can already test the image:
docker build -t myfancyimage .
docker run --rm -it -v $(pwd):/app -p 127.0.0.1:9090:8080 myfancyimage
The first command builds an image from our Dockerfile and names it myfancyimage. The second command creates and starts a container from that image. It mounts the current folder into the container’s volume ($(pwd):/app) and binds host port 9090 to the container’s exposed port 8080.
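If everything went well, you end up at an interactive SBT prompt inside the container. As a quick smoke test (assuming your application exposes an HTTP endpoint and binds to 0.0.0.0:8080 inside the container; the exact SBT command depends on your project):
# inside the container, at the SBT prompt: compile and start the application
> run
# on the host: the application is reachable through the mapped port
curl http://127.0.0.1:9090/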
Step II: MongoDB Container Configuration
OK, now it’s time to bring in some data. We start by adding a Mongo engine container and later supply it with a sample data snapshot. Since we’re about to run multiple containers linked together, it’s convenient to describe them in a docker-compose configuration file. Let’s add docker-compose.yml to the project’s root with the following content:
version: '2'
services:
  dev-container: ❶
    build:
      context: .
      dockerfile: Dockerfile
    image: myfancyimage
    ports:
      - "127.0.0.1:9090:8080"
    volumes:
      - .:/app
    links: ❸
      - mongo-engine
  mongo-engine: ❷
    image: mongo:3.2
    command: --directoryperdb
❶ The commands for building and running myfancyimage are translated into the dev-container service definition.
❷ The mongo-engine container runs MongoDB 3.2 from the official DockerHub repository.
❸ Link mongo-engine to dev-container: mongo-engine will start before dev-container, and the two will share a network. MongoDB is available to dev-container at the URL "mongodb://mongo-engine/".
Let’s try it:
docker-compose run --service-ports dev-container
It’s important to add the --service-ports flag to enable the configured port mappings.
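To double-check the link, you can verify that the mongo-engine hostname resolves from inside the dev container (a quick sanity check, not part of the original setup):
# run a one-off dev-container along with its linked services and resolve the mongo-engine hostname
docker-compose run dev-container bash -c "getent hosts mongo-engine"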
Step III: Data Container Configuration
All right, here comes the hardest part: sample data distribution. Unfortunately, there’s no ready-made mechanism for distributing Docker data volumes; a few Docker volume managers do exist (e.g., Flocker, the Azure volume driver), but these tools serve other goals.
Note: An alternative solution would be to restore data from a DB dump programmatically or even generate it randomly. But this approach is not generic: it involves specific tools and scripts for each database and is generally more complicated.
The data distribution mechanism we’re seeking must support two operations:
- Replicate a fairly small dataset from a remote shared repository to the local environment.
- Publish a new or modified dataset to the remote repository.
One obvious approach is to distribute data via Docker images. In this case, the remote repository is the same place we store our Docker images: either DockerHub or a private Docker Registry instance. The solution described below works with both.
Meeting the first requirement is easy: we run a container from a data image, mark the data folder as a volume, and link that volume (via the --volumes-from argument) to the Mongo engine container.
The second requirement is trickier. After making changes inside the volume, we cannot simply commit those changes back to a Docker image: a volume is technically not part of the modifiable top layer of a container. In simpler terms, the Docker daemon just doesn’t see any changes to commit.
Here’s the trick: if we can read the changed data but cannot commit it from the volume, we first copy it somewhere outside of all volumes so that the daemon detects the changes. Applying this trick has a not-so-obvious consequence: we cannot create the volume directly from the data folder; we have to use another path for it and copy all data into the volume when the container starts. Otherwise, we would have to change the volume path depending on where the data currently lives, and that is hard to automate.
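You can see this limitation for yourself with a couple of throwaway commands (a small illustration under assumed names, not part of the project setup):
# write a file into an anonymous volume, then commit the container
docker run --name voltest -v /data busybox sh -c "echo hello > /data/file"
docker commit voltest voltest-image
# the committed image does not contain the file: volume contents are not part of the image
docker run --rm voltest-image ls /data
docker rm voltest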
The whole process of cloning and saving a dataset is shown in the two diagrams below: the first illustrates making the data snapshot available to other containers as a volume on startup, the second illustrates committing changes to a new image and pushing it to storage.
We’ll dig into scripts for taking and applying data snapshots a bit later. For now, let’s assume they are present in the data snapshot container’s /usr folder. Here is how the docker-compose.yml is updated with the data container definition:
version: '2'
services:
  dev-container:
    build:
      context: .
      dockerfile: Dockerfile
    image: myfancyimage
    ports:
      - "127.0.0.1:9090:8080"
    volumes:
      - .:/app
    links:
      - mongo-engine
  mongo-engine:
    image: mongo:3.2
    command: --directoryperdb
    volumes_from: ❶
      - data-snapshot
    depends_on:
      - data-snapshot
  data-snapshot:
    image: rgorodischer/data-snapshot:scratch
    volumes:
      - /data/active ❷
    command: /usr/apply_snapshot.sh ❸
❶ Mount the volumes from the data-snapshot container defined below.
❷ Make a volume from the /data/active folder in the data-snapshot container.
❸ Run the /usr/apply_snapshot.sh script on data-snapshot container startup.
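To confirm that the snapshot really ends up in the shared volume, you can list it from the mongo-engine container (a sanity check; the exact contents depend on your snapshot image):
# data-snapshot starts first, and its /data/active volume is then visible from mongo-engine
docker-compose run mongo-engine ls /data/active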
Now, let’s see what the scripts do. apply_snapshot.sh simply copies the contents of the /data/snapshot folder to the volume folder /data/active (the first part of the process described above). Here’s its full listing:
#!/bin/ash
set -e
DEST=/data/active
mkdir -p "$DEST"
# copy the snapshot into the volume only on the first start, when it is still empty
if [ -z "$(ls -A "$DEST")" ]; then
    cp -a /data/snapshot/. "$DEST"
else
    echo "$DEST is not empty."
fi
The accompanying script take_snapshot.sh does the opposite: it replaces the contents of /data/snapshot with the contents of the /data/active folder. It also removes files with the .lock extension, which is the only MongoDB-specific action here (and more of a precaution than a necessity). The listing of take_snapshot.sh is shown below:
#!/bin/ash
set -e
# replace the previous snapshot with the current contents of the volume
rm -rf /data/snapshot
mkdir -p /data/snapshot
cp -a /data/active/. /data/snapshot
# drop MongoDB lock files; they shouldn't end up in the published snapshot
find /data/snapshot -type f -name '*.lock' -delete
The Dockerfile for the data-snapshot container and its image can be found on GitHub and DockerHub, respectively.
Taking a snapshot is orchestrated externally by publish_snapshot.sh:
#!/bin/bash
set -e
REGISTRY=your.registry
REPOSITORY=your-repository-name
SNAPSHOT_TAKER=data-snapshot-taker
DATA_CONTAINER=data-snapshot
data_container_id=$(docker ps -a | grep $DATA_CONTAINER | cut -d' ' -f 1)
if [[ -z $data_container_id ]]; then
    echo "Data container is not found."
    exit 1
fi
docker login $REGISTRY
echo -e "Taking the snapshot...\n"
❶
docker run --name $SNAPSHOT_TAKER --volumes-from $data_container_id rgorodischer/data-snapshot:scratch /usr/take_snapshot.sh
echo -en "Snapshot description:\n"
read msg
echo -en "Author:\n"
read author
echo -en "Snapshot tag (alphanumeric string without spaces):\n"
read tag
❷
docker commit --author="$author" --message="$msg" $SNAPSHOT_TAKER $REGISTRY/$REPOSITORY:$tag
❸
docker push $REGISTRY/$REPOSITORY:$tag
❹
docker rm -f $SNAPSHOT_TAKER &> /dev/null
❶ Run a temporary container from the data-snapshot:scratch image with the data-snapshot container’s volume attached, execute the /usr/take_snapshot.sh script on startup, and let the container stop (it stops automatically because no other processes run in it). I run the container from my image on DockerHub, but most likely you’ll want to use your own copy.
❷ Commit the changes to a new local image tagged with $tag.
❸ Push the new data snapshot image to your repository.
❹ Remove the temporary container.
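A typical publishing session then looks roughly like this (docker login asks for your registry credentials first; the answers below are placeholders):
# run the publish script from the project root while the data container exists
./publish_snapshot.sh
# Snapshot description: added a minimal set of demo users
# Author: jane.doe
# Snapshot tag (alphanumeric string without spaces): essential-data-set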
Now imagine you’ve just published a shiny new data snapshot tagged essential-data-set. You simply update the data-snapshot definition in docker-compose.yml with the new tag and push the change to Git. Your teammate pulls it and can re-establish the whole dev environment, including your new dataset, just by running a single command:
docker-compose run --service-ports dev-container
As a final touch, you can add a small script that removes the existing containers and volumes before updating the environment, so that docker-compose works smoothly on every run, as sketched below.
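A minimal sketch of such a script, assuming Docker Compose 1.6 or newer (the script name and exact flags are my choice, not prescribed by the setup above):
#!/bin/bash
# refresh_env.sh: tear down the old environment and start a fresh one
set -e
# remove existing containers together with their anonymous volumes,
# so the next run starts again from the published data snapshot
docker-compose down -v
# recreate everything; a changed data snapshot tag will be pulled automatically
docker-compose run --service-ports dev-container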
Published at DZone with permission of Roman Gorodyshcher, DZone MVB. See the original article here.