Demystifying the Data Volume: Storage in Docker

Provide your data with longevity to outlast a Docker container with this tutorial on how to create and delete volumes for your stack.

What are Volumes, and Why Do We Need Them?

In layman's terms, volumes are external storage areas used to store data produced by a Docker container. Volumes can be located on the Docker host or even on remote machines.

Containers are ephemeral, a fancy way of saying that they tend to have short lives. When a container is removed, all the data it has written to its own filesystem (logs, database records, etc.) is removed with it. So how do we ensure that data produced by containers is kept? Volumes are the answer: they store the data generated by a container so that even when the container is gone, the data it produced lives on.

Volume Types

As of this writing, there are two main types of volumes: data volumes and bind mounts. The focus of this post is data volumes.

As the name suggests, data volumes are used to store data produced by a Docker container outside of the container itself.

Defining a Data Volume in a Dockerfile

Data volumes can be defined in a Dockerfile using the VOLUME instruction. The snippet below from the postgres:9.6 Dockerfile shows how this is done:

ENV PATH $PATH:/usr/lib/postgresql/$PG_MAJOR/bin
ENV PGDATA /var/lib/postgresql/data
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA" # this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
VOLUME /var/lib/postgresql/data
COPY docker-entrypoint.sh /usr/local/bin/
RUN ln -s usr/local/bin/docker-entrypoint.sh / # backwards compat
ENTRYPOINT ["docker-entrypoint.sh"]

EXPOSE 5432
CMD ["postgres"]

Creating and Viewing the postgres:9.6 Volume

To create the volume specified in the postgres:9.6 Dockerfile, we need to create a container based on the image. To do this, run the following command:

docker run --name my-postgres -e POSTGRES_PASSWORD=password -d postgres:9.6

To view the volume, we must first find its name. To do this, we use the docker inspect command:

docker inspect --format='{{json .Mounts}}' my-postgres | python -m json.tool

NOTE: The python -m json.tool at the end of the command is used to pretty-print the JSON output.

You should see output similar to the one below:

[
 {
 "Destination": "/var/lib/postgresql/data",
 "Driver": "local",
 "Mode": "",
 "Name": "fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af",
 "Propagation": "",
 "RW": true,
 "Source": "/var/lib/docker/volumes/fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af/_data",
 "Type": "volume"
 }
]

The JSON returned provides details about the volume. Let's discuss three of the main fields: Destination, Name, and Source.

The Name attribute is the name/ID of the volume. Destination is the folder inside the container that the volume is mounted at. Source is the folder on the host machine where the volume's data is stored.
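
To make the three fields concrete, here is a minimal Python sketch that reads the same JSON and pulls out Name, Destination, and Source for each volume mount. The sample text is copied from the output above; the function name summarize_mounts is made up for this illustration and is not part of Docker.

```python
import json

# Sample output of: docker inspect --format='{{json .Mounts}}' my-postgres
# (copied from the output shown in this post)
SAMPLE_MOUNTS = """
[
  {
    "Destination": "/var/lib/postgresql/data",
    "Driver": "local",
    "Mode": "",
    "Name": "fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af",
    "Propagation": "",
    "RW": true,
    "Source": "/var/lib/docker/volumes/fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af/_data",
    "Type": "volume"
  }
]
"""


def summarize_mounts(mounts_json):
    """Return a (name, destination, source) tuple for each volume-type mount."""
    mounts = json.loads(mounts_json)
    return [
        (m.get("Name", ""), m["Destination"], m["Source"])
        for m in mounts
        if m.get("Type") == "volume"
    ]


if __name__ == "__main__":
    for name, destination, source in summarize_mounts(SAMPLE_MOUNTS):
        print("Name:        " + name)         # volume name/ID
        print("Destination: " + destination)  # path inside the container
        print("Source:      " + source)       # path on the Docker host
```

Notice that the Name value also appears inside the Source path, which is how Docker's local driver lays out volume data on the host.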

To view the volume we created, type the following command:

docker volume ls | grep <volume-name>

NOTE: <volume-name> refers to the name/ID of the volume; in this case it would be fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af

The grep command was used to display only the volume we created. To view all volumes, run the following command:

docker volume ls

Viewing the Contents of the Volume

To view the contents of the volume on the host machine, we must use the folder path defined in the Source attribute:

"Source": "/var/lib/docker/volumes/fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af/_data",

We can then simply perform an ls command to view the contents:

ls /var/lib/docker/volumes/fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af/_data
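
On Linux the Source path usually requires root access, and on Docker Desktop for Mac or Windows it lives inside Docker's virtual machine rather than on the host filesystem. A common workaround is to mount the volume into a throwaway container and list it from there. The Python sketch below only builds and prints such a command; the helper name list_volume_command, the alpine image, and the /data mount point are assumptions chosen for this illustration.

```python
import shlex


def list_volume_command(volume_name, image="alpine"):
    """Build a docker command that lists a volume from a throwaway container.

    The volume is mounted read-only at /data (an arbitrary path chosen here),
    and --rm deletes the helper container as soon as ls finishes.
    Caution: if volume_name does not exist, Docker creates a new empty
    volume with that name instead of failing.
    """
    return [
        "docker", "run", "--rm",
        "-v", volume_name + ":/data:ro",
        image,
        "ls", "-la", "/data",
    ]


if __name__ == "__main__":
    volume = "fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af"
    # Print the command so it can be reviewed and pasted into a shell.
    print(shlex.join(list_volume_command(volume)))
```

The printed command can be run as-is in a terminal, or the list can be passed to subprocess.run on a machine where Docker is installed.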

Creating a Data Volume with "docker run"

Volumes can also be created using the docker run command. This allows us to give volumes more meaningful names, as the automatically generated names can be a bit cryptic.

Before we create our new volume, let's delete the old postgres container along with its volume using the following command:

docker rm -f -v my-postgres

NOTE: -f is used to forcibly remove the container if it's still running, and -v is used to remove the anonymous volumes associated with the container (named volumes are left in place).

Now we can go ahead and create our new postgres container with a brand new and meaningfully named volume (postgres-data):

docker run --name my-new-postgres -e POSTGRES_PASSWORD=password -v postgres-data:/var/lib/postgresql/data -d postgres:9.6

We have created a volume called postgres-data by using -v postgres-data:/var/lib/postgresql/data in our command. To verify that the volume has been created, type the following command:

docker volume ls | grep postgres-data

Also, if we execute the docker inspect command, we will see that the Source now uses the volume name as part of the folder path to the volume:

docker inspect --format='{{json .Mounts}}' my-new-postgres | python -m json.tool

[
 {
 "Destination": "/var/lib/postgresql/data",
 "Driver": "local",
 "Mode": "z",
 "Name": "postgres-data",
 "Propagation": "",
 "RW": true,
 "Source": "/var/lib/docker/volumes/postgres-data/_data",
 "Type": "volume"
 }
]
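
The two inspect outputs differ mainly in the Name field: the first volume got an automatically generated 64-character hexadecimal ID, while the second carries the name we chose. The Python sketch below uses that pattern as a rough heuristic to tell the two apart. The helper name is_anonymous is made up for this illustration, and the check is an assumption rather than an official Docker rule, since nothing prevents someone from naming a volume with 64 hex characters.

```python
import re

# Auto-generated volume names, like the one shown earlier in this post,
# are 64 lowercase hexadecimal characters. Heuristic only.
ANONYMOUS_NAME = re.compile(r"[0-9a-f]{64}")


def is_anonymous(volume_name):
    """Guess whether a volume name was auto-generated by Docker."""
    return ANONYMOUS_NAME.fullmatch(volume_name) is not None


if __name__ == "__main__":
    names = [
        "fd223ffe7aa7c614fc393a42e14f5b64aa38a236351f18e3c4306fe5b8a8f5af",
        "postgres-data",
    ]
    for name in names:
        kind = "anonymous" if is_anonymous(name) else "named"
        print(name + " -> " + kind)
```

A check like this can help spot leftover anonymous volumes in the docker volume ls output before cleaning up.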


Removing a Volume

Volumes can be removed by executing the docker volume rm <volume-name> command. However, Docker refuses to delete a volume that is still in use, so every container that references the volume must be removed first; stopping the container is not enough. Also, keep in mind that once the volume is deleted, all of its data is permanently lost. It is recommended that you back up the volume before deleting it.
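
One common way to take such a backup is to mount the volume into a temporary container and write a tar archive to a host folder. The Python sketch below only builds and prints that command. The helper name backup_volume_command, the alpine image, and the /source and /backup mount points are assumptions chosen for this illustration, not something prescribed by Docker.

```python
import shlex
from pathlib import Path


def backup_volume_command(volume_name, backup_dir, image="alpine"):
    """Build a docker command that archives a volume into backup_dir.

    The volume is mounted read-only at /source and the host folder at
    /backup (both paths are arbitrary choices for this sketch). tar then
    writes <volume_name>.tar.gz into the host folder.
    """
    host_dir = Path(backup_dir).resolve()  # bind mounts need an absolute path
    archive = "/backup/" + volume_name + ".tar.gz"
    return [
        "docker", "run", "--rm",
        "-v", volume_name + ":/source:ro",
        "-v", str(host_dir) + ":/backup",
        image,
        "tar", "czf", archive, "-C", "/source", ".",
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it in a shell.
    print(shlex.join(backup_volume_command("postgres-data", "backups")))
```

For a database such as postgres, it is safer to stop the container that uses the volume before archiving, because copying the files while the database is writing can produce an inconsistent backup.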

Summing Things Up

Data volumes are used to store data generated by a container. They can be defined in a Dockerfile, but it is best to give them a name with the -v parameter of the docker run command. Always remember to back up a volume before you delete it.

Thanks for reading my post. Feel free to comment and give feedback. I will be discussing bind mounts in a future post, so stay tuned.

