Why data volumes are needed in docker

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains why data volumes are needed in Docker and walks through how to create, mount and initialize them.

Why do I need data volumes?

This starts with the file system of the Docker container. For efficiency and other reasons, the container's file system is stored on the host in a complex form, which leads to the following problems:

Files in the container cannot be easily accessed from the host.

Data cannot be shared between multiple containers.

When the container is deleted, the data generated in the container is lost.

To solve these problems, Docker introduced the data volume (volume) mechanism. A data volume is a specific file or directory that is made available in one or more containers and is stored on the host independently of the container's file system. Its most distinctive feature is that its life cycle is independent of that of the container.
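The independent life cycle can be sketched as follows. This is a minimal illustration, assuming a running Docker daemon and the ubuntu image; the function name, the file name note.txt and the volume name passed in are hypothetical.

```shell
# Sketch: data written to a volume survives the container that wrote it.
volume_outlives_container() {
  vol="$1"
  docker volume create "$vol"
  # The first container writes a file into the volume and is removed (--rm).
  docker run --rm --mount "type=volume,source=$vol,target=/data" ubuntu \
    sh -c 'echo "written by container 1" > /data/note.txt'
  # A second container mounts the same volume and reads the file back.
  docker run --rm --mount "type=volume,source=$vol,target=/data" ubuntu \
    cat /data/note.txt
}

# Usage: volume_outlives_container demo
```

The second container can read the file only because it lives in the volume rather than in the first container's own file system.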

The best scenarios for using data volumes

When data is shared among multiple containers: several containers can mount the same data volume at the same time, read-only or read-write, and thereby share its data.

When the host cannot guarantee that a particular directory or file exists at a fixed path: a data volume avoids depending on the host's directory layout.

When you want to store container data outside the host, for example on a remote host or in cloud storage.

Data volumes are a good choice when you need to back up, restore, or migrate container data between different hosts.
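As an illustration of the backup scenario, the sketch below archives a volume into a host directory by mounting it read-only in a throwaway container. It assumes a running Docker daemon and the ubuntu image; the function name and the archive naming are hypothetical.

```shell
# Sketch: back up a named volume into a tar archive on the host.
backup_volume() {
  vol="$1"        # name of the volume to back up
  dest_dir="$2"   # absolute host directory that receives the archive
  docker run --rm \
    --mount "type=volume,source=$vol,target=/data,readonly" \
    --mount "type=bind,source=$dest_dir,target=/backup" \
    ubuntu tar cf "/backup/$vol.tar" -C /data .
}

# Usage: backup_volume hello "$PWD"
```

The archive can then be copied to another host and unpacked into a fresh volume there in the same way.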

The docker volume subcommands

Docker provides the volume subcommands specifically for manipulating data volumes:

create: creates a data volume

inspect: displays the details of a data volume

ls: lists all data volumes

prune: removes unused volumes; accepts the -f option (recent Docker releases remove only unused anonymous volumes unless --all is given)

rm: deletes one or more volumes that are not in use; accepts the -f option

First create a data volume named hello with docker volume create hello, and view it with docker volume ls.

You can then run docker volume inspect hello to look at the details of the data volume hello.

The output shows when the data volume was created; that the driver used for the data volume is the default "local", which means the data volume uses the local storage of the host; and that the mount point of the data volume is, by default, a directory under /var/lib/docker/volumes on the local machine.

Finally, we can use the rm or prune command to delete the data volume. Later, the author will introduce some practices related to the deletion of the data volume in practical use.
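The subcommands above can be strung together as in the following sketch. It assumes a running Docker daemon; the function name is hypothetical.

```shell
# Sketch: walk one volume through create, list, inspect and remove.
volume_tour() {
  name="$1"
  docker volume create "$name"     # create the named volume
  docker volume ls                 # list all volumes, including the new one
  docker volume inspect "$name"    # show driver, mount point, creation time
  docker volume rm "$name"         # delete it (fails if a container uses it)
}

# Usage: volume_tour hello
```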

Mount data volumes using the --mount syntax

Previously, the --volume (-v) option was used to mount data volumes, but Docker now provides the more powerful --mount option. It takes several key-value pairs separated by commas in a single value, so it can express a more detailed configuration than the --volume option. The commonly used keys of the --mount option are as follows:

type specifies the mount type. We use volume here; bind and tmpfs are also available.

volume-driver specifies the driver used to mount the data volume; the default is local.

source specifies the source of the mount. For a named data volume, this is the name of the volume. It can be written as source or abbreviated to src.

destination specifies the path inside the container at which the data is mounted. It can be written as destination, dst or target.

readonly specifies that the data is mounted read-only.

volume-opt can be specified multiple times to provide additional mount-related options.
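To show how these keys combine into one comma-separated value, here is a small sketch. The helper name mount_spec and its argument order are hypothetical and not part of Docker.

```shell
# Sketch: assemble a --mount value from the keys described above.
mount_spec() {
  spec="type=volume,source=$1,target=$2"
  # Append the readonly key only when a third argument "ro" is given.
  if [ "${3:-}" = "ro" ]; then
    spec="$spec,readonly"
  fi
  printf '%s\n' "$spec"
}

mount_spec hello /world       # prints: type=volume,source=hello,target=/world
mount_spec hello /world ro    # prints: type=volume,source=hello,target=/world,readonly
```

The result could be passed to docker run as --mount "$(mount_spec hello /world ro)".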

Let's look at a specific example:

$ docker volume create hello
$ docker run -id --mount type=volume,source=hello,target=/world ubuntu /bin/bash

We created a data volume named hello and mounted it at the /world directory in the container. You can verify the mount result by running docker inspect on the container and looking at the "Mounts" section of its details.
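A minimal sketch for printing only the "Mounts" section, assuming a running Docker daemon. The function name is hypothetical, and the container is addressed by whatever name or ID it was given.

```shell
# Sketch: print only the Mounts section of a container's details as JSON.
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}

# Usage: show_mounts <container-name-or-id>
```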

Use volume driver to store data elsewhere

By default the data in a volume is stored on the host, but Docker also lets us store it elsewhere, such as Azure Storage or AWS S3, by specifying a volume driver.

For simplicity, our next demo demonstrates how to store data volumes on other hosts through the vieux/sshfs driver.

Docker does not install the vieux/sshfs plug-in by default. We can install it with the following command:

$ docker plugin install --grant-all-permissions vieux/sshfs

Then create a data volume through the vieux/sshfs driver and specify the login user name, password, and data storage directory of the remote host:

$ docker volume create --driver vieux/sshfs \
    -o sshcmd=nick@10.32.2.134:/home/nick/sshvolume \
    -o password=yourpassword \
    mysshvolume

Note: make sure that the mount point directory you specified exists on the remote host (the /home/nick/sshvolume directory in this demo), otherwise an error is reported when the container starts.

Finally, the data volume is specified to be mounted when the container is started:

$ docker run -id \
    --name testcon \
    --mount type=volume,volume-driver=vieux/sshfs,source=mysshvolume,target=/world \
    ubuntu /bin/bash

That's it. The files you work with in the container's /world directory are stored in the /home/nick/sshvolume directory of the remote host. Enter the container testcon and create a file in the /world directory, then open the /home/nick/sshvolume directory on the remote host to see whether your new file is there.
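The check described above could be scripted roughly as follows. This sketch assumes the testcon container from the demo is running and that the remote host accepts ssh logins for the demo user; the function name and the file name hello.txt are hypothetical.

```shell
# Sketch: create a file through the container, then list the remote directory.
check_remote_volume() {
  docker exec testcon touch /world/hello.txt
  ssh nick@10.32.2.134 ls -l /home/nick/sshvolume
}

# Usage: check_remote_volume
```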

How data volumes work

The following figure describes the three ways for Docker containers to mount data (volume, bind mount and tmpfs mount):

Data volumes are completely managed by Docker. As shown in the yellow area of the figure above, Docker sets aside an area of the host file system in which it stores and manages volume data. So you usually don't need to know where the volume files are stored on the host (although, in the spirit of getting to the bottom of things, we still want to figure out how it works!).

In essence, a Docker data volume is a special directory in the container. While creating the container, Docker mounts the specified directory on the host (a directory named after the data volume) to the directory specified in the container. The mount method used is a bind mount, so after mounting, the host directory and the target directory in the container have the same contents.

For example, we execute the following commands to create the data volume hello and mount it at the /world directory of the container testcon:

$ docker volume create hello
$ docker run -id --name testcon --mount type=volume,source=hello,target=/world ubuntu /bin/bash

In fact, while the container is being created, code equivalent to the following is executed:

// Mount the data volume hello at the specified mount point /world in rootfs.
mount("/var/lib/docker/volumes/hello/_data", "rootfs/world", "none", MS_BIND, NULL);

After all the mount operations are done (a container needs more than the data volume directory mounted: also rootfs, the init layer, /proc, devices and so on), Docker only needs to switch the root directory of the process to rootfs through chdir and pivot_root, so that the processes inside the container can only see the file system rooted at rootfs and the directories mounted under it. The resulting mounts can be examined from inside the testcon container we started.
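One way to look at the result from inside a container is to read its mount table. The sketch below assumes a running Linux container whose image provides grep (the ubuntu image does); the function name is hypothetical.

```shell
# Sketch: show the mount table entry for one path inside a container.
show_container_mount() {
  # /proc/mounts lists "device mountpoint fstype options ..." per line,
  # so matching " <path> " selects the entry mounted at that path.
  docker exec "$1" grep " $2 " /proc/mounts
}

# Usage: show_container_mount testcon /world
```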

Let's introduce some common problems in the use of data volumes.

Data overwrite problem

If an empty data volume is mounted to a non-empty directory in the container, the files in that directory are copied to the data volume.

If a non-empty data volume is mounted to a directory in the container, the directory shows the data from the data volume. If the directory in the container already contained data, that data is hidden.

Both of these rules are very important, and flexible use of the first rule can help us initialize the contents of the data volume. Mastering the second rule ensures that the data after the data volume is mounted is always the result you expect.
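The first rule can be tried out with a sketch like the one below, which seeds a new volume from a directory of an image and then lists what was copied. It assumes a running Docker daemon; the function name, the /seeded mount path and the example arguments are hypothetical.

```shell
# Sketch: an empty volume mounted over a non-empty directory receives
# that directory's files.
seed_volume_from_image() {
  vol="$1"; image="$2"; dir="$3"
  docker volume create "$vol"
  # Mounting the empty volume at $dir triggers the copy; "true" just exits.
  docker run --rm --mount "type=volume,source=$vol,target=$dir" "$image" true
  # Mount the now non-empty volume at another path to list its contents.
  docker run --rm --mount "type=volume,source=$vol,target=/seeded" "$image" ls /seeded
}

# Usage: seed_volume_from_image seeded ubuntu /etc/apt
```

The second run also illustrates the second rule: the volume is no longer empty, so its contents are what the container sees at the mount path.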

Add data volumes in Dockerfile

In Dockerfile, we can use the VOLUME instruction to add data volumes to the container:

VOLUME /data

When you build an image with the docker build command and start a container from that image, a data volume is mounted at the /data directory. According to the data overwrite rules described above, if the /data directory exists in the image, all of its contents are copied to the corresponding directory on the host, with permissions and owners set to match the files in the container.

Note that the VOLUME directive cannot mount the specified directory on the host. This is to ensure the consistency of Dockerfile, because there is no guarantee that all hosts have corresponding directories.

In practical use, there is another trap to note: if you try to modify the data volume after the VOLUME instruction in the Dockerfile, these changes will not take effect! Here is an example:

FROM ubuntu
RUN useradd nick
VOLUME /data
RUN touch /data/test.txt
RUN chown -R nick:nick /data

After an image is built from this Dockerfile and a container is started, the user nick exists in the container and a data volume is mounted at the /data directory. But there is no test.txt file in /data, let alone the owner set on it. To explain this, we need to understand how an image is built from a Dockerfile:

Every instruction in a Dockerfile except FROM starts a temporary container from the image produced by the previous instruction, executes the instruction in it, and then runs something like docker commit to produce a new image. This docker commit-like step does not save mounted data volumes.

So when the last two lines of the Dockerfile above are executed, /data is mounted in a temporary container as a temporary data volume and the commands operate on that volume, but after each instruction is executed and committed, the temporary data volume is not saved. Therefore, the data volume mounted by the container we finally create from the image has not been touched by the last two instructions. Let's call this the problem of "initialization of data volumes in Dockerfile".

The following writing can solve the problem of initializing data volumes in Dockerfile:

FROM ubuntu
RUN useradd nick
RUN mkdir /data && touch /data/test.txt
RUN chown -R nick:nick /data
VOLUME /data

After an image is built from this Dockerfile and a container is started, the data volume is initialized as expected. This is because /data already exists in the image when the data volume is mounted, so the files in /data, together with their permissions and owner settings, are copied to the data volume.
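A quick way to check how a Dockerfile initialized /data is to build it and list the directory in a throwaway container. The sketch assumes the Dockerfile is saved in the build context directory passed in; the function name and the tag voltest are hypothetical.

```shell
# Sketch: build an image and list /data in a container started from it.
check_volume_init() {
  tag="$1"; context="$2"
  docker build -t "$tag" "$context"
  # ls -l shows whether test.txt exists and which user and group own it.
  docker run --rm "$tag" ls -l /data
}

# Usage: check_volume_init voltest .
```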

There is another way to solve the data volume initialization problem in Dockerfile. It takes advantage of how the CMD and ENTRYPOINT instructions are executed: unlike the RUN instruction, which executes while the image is being built, CMD and ENTRYPOINT are executed when the container starts. Therefore, the data volume can also be initialized with the following Dockerfile:

FROM ubuntu
RUN useradd nick
VOLUME /data
CMD touch /data/test.txt && chown -R nick:nick /data && /bin/bash

Data volumes solve the problem of persisting user data: data generated in a container can outlive the container itself. Mastering data volumes is therefore essential when working with containers.
