Optimization Summary of Docker Image Construction

2025-04-25 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

As we keep using Docker images, their size tends to grow unless we pay attention to how they are built. When deploying applications with Docker, it is common to find images of 1 GB or more. Larger images consume more disk and network resources and slow down deployment, so deployment times get longer and longer. We therefore need to shrink the deployment image to speed up deployment and reduce resource costs. Most of this optimization can be done in the Dockerfile.

1. Image minimization

1. Choose the most compact base image. Picking the smallest suitable base image, such as alpine or busybox, effectively reduces image size.

2. Clean up intermediate build products. During the build, delete files the image does not need as soon as the instruction that produced them finishes. For example, if components are installed with yum, run yum clean all at the end of that instruction, or use rm to delete source files that are no longer needed.

3. Reduce the number of layers. An image is stored as a stack of layers, and the number of layers is limited: currently the maximum is 127. Without care, the image becomes more and more bloated. When building from a Dockerfile, each instruction generates a layer (in current Docker versions it is mainly RUN, COPY and ADD that add to the image size), so merging instructions reduces the number of layers in the final image. For example, multiple shell commands in a RUN instruction can be chained with "&&".

2. Maximizing build speed
1. Make full use of the build cache. The build cache speeds up image construction, and docker build enables it by default. Three conditions must hold for a layer's cache to be used: the parent layer has not changed, the build instruction has not changed, and the checksums of the files involved are the same. When an instruction meets all three conditions, that layer is not rebuilt and the result of the previous build is reused. Once the cache for one layer is invalidated, the cache for every layer after it is invalidated as well. Put the instructions that change least at the top of the Dockerfile so the cache is used as much as possible. Instructions such as WORKDIR, CMD, ENV and ADD invalidate the cache when their arguments or source files change, so those that change often are best placed toward the bottom of the Dockerfile.

2. Remove files that are not needed from the build directory (by default, the directory containing the Dockerfile). Either write a .dockerignore file to filter out unnecessary files during the build, or create a separate directory that holds only the files the build uses.

At run time, Docker consists of the Docker engine (the server-side daemon) and client tools. The engine exposes a REST API called the Docker Remote API, and clients such as the docker command talk to the engine through it. So although docker commands appear to run locally, everything is actually done on the server side (the Docker engine) through remote calls. The docker build command likewise does not build the image in the client; the build runs in the Docker engine.
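The layer-merging and caching advice above can be sketched in a single Dockerfile. This is only an illustration under stated assumptions: the python:3.12-alpine base tag, the requirements.txt and app.py file names, and the gcc/musl-dev build packages are hypothetical and do not come from the article.

```dockerfile
# Sketch only: base tag, file names and packages are illustrative assumptions.
FROM python:3.12-alpine

WORKDIR /usr/src/app

# Changes rarely: copy only the dependency list, so the layers below stay
# cached until requirements.txt itself changes.
COPY requirements.txt ./

# One RUN, one layer: install the build tools, use them, and remove them in
# the same instruction so they never persist in the image.
RUN apk add --no-cache --virtual .build-deps gcc musl-dev \
    && pip install --no-cache-dir -r requirements.txt \
    && apk del .build-deps

# Changes often: copy the application source last.
COPY . .

CMD ["python", "app.py"]
```

With this ordering, editing the application source invalidates only the final COPY layer, while the dependency layer is reused from the cache.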
When building an image, Docker first prepares the build context: it collects the required files and sends them to the engine. By default the context contains every file in the directory where the Dockerfile lives. If that directory holds many unrelated files, the build becomes slower and the image may grow larger.

A .dockerignore example: in a Git project the build does not need the .git directory, so add the following line to the .dockerignore file:

    .git/

The function and syntax of .dockerignore are similar to those of .gitignore. Ignoring unneeded files shortens build time and reduces image size.

3. Optimize network requests. When the Dockerfile uses package mirrors or URLs on the Internet, choose sources with good network connectivity. This saves time and lowers the failure rate.

3. Dockerfile instruction optimization

1. The difference between the COPY and ADD instructions

COPY: copy files. Formats:

    COPY <src>... <dest>
    COPY ["<src>",... "<dest>"]

COPY copies files or directories at <src> in the build context to <dest> inside a new layer of the image. For example:

    COPY package.json /usr/src/app/

<src> may be several paths and may contain wildcards, which must follow Go's filepath.Match rules. For example:

    COPY hom* /mydir/
    COPY hom?.txt /mydir/

<dest> may be an absolute path in the container or a path relative to the working directory (set with the WORKDIR instruction). The destination does not need to exist beforehand; missing directories are created before the files are copied. Note also that COPY preserves the metadata of the source files, such as read, write and execute permissions and modification times. This is useful when customizing images.
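A short sketch of the COPY forms described above. The node:20-alpine base image and the index.js and README.md file names are placeholders chosen for illustration; the build fails if a listed source or wildcard matches nothing in the build context.

```dockerfile
# Sketch only: base image and file names are illustrative assumptions.
FROM node:20-alpine

# Relative destinations below are resolved against this directory.
WORKDIR /usr/src/app

# Absolute destination; missing directories are created automatically.
COPY package.json /usr/src/app/

# Several sources in JSON form; the last element is the destination.
COPY ["index.js", "README.md", "./"]

# Wildcards follow Go's filepath.Match rules.
COPY hom* /mydir/
COPY hom?.txt /mydir/
```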
The metadata that COPY preserves is especially helpful when build-related files are managed with Git.

ADD: a more advanced copy. The format and behavior of ADD are basically the same as those of COPY, with some functions added on top.

For example, <src> can be a URL, in which case the Docker engine tries to download the file and place it at <dest>. The downloaded file's permissions are automatically set to 600; if that is not what you want, an extra RUN layer is needed to change them. If the download is a compressed archive that must be extracted, another RUN layer is needed for that too. It makes more sense to use a single RUN instruction with wget or curl to download, set permissions, extract, and then clean up useless files. This feature is therefore not very practical and is not recommended.

If <src> is a local tar archive compressed with gzip, bzip2 or xz, ADD automatically extracts it into <dest>. This is useful in some cases, for example in the official ubuntu image:

    FROM scratch
    ADD ubuntu-xenial-core-cloudimg-amd64-root.tar.gz /
    ...

In other cases, when we really want to copy a compressed file into the image without extracting it, ADD cannot be used.

The official Docker best-practices documentation asks for COPY wherever possible, because its semantics are clear (it only copies files), while ADD bundles more complex functions and its behavior is less obvious. ADD is most suitable where automatic extraction is needed. Note also that ADD may invalidate the build cache, which can slow the build down. So when choosing between COPY and ADD, follow this rule: use COPY for all file copies, and use ADD only where automatic extraction is required.
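The COPY/ADD rule above might look like this in practice. The file names app.conf and app-release.tar.gz and the https://example.com URL are hypothetical, and installing curl through apk is an assumption tied to the Alpine base image used here.

```dockerfile
# Sketch only: file names and the download URL are hypothetical.
FROM alpine:3.19

# Plain file copy: COPY is explicit and does nothing else.
COPY app.conf /etc/app/app.conf

# Local tarball that should be unpacked: ADD extracts it into /opt/app/.
ADD app-release.tar.gz /opt/app/

# Remote archive: download, unpack and delete in a single RUN layer
# instead of "ADD <url>" followed by extra RUN layers.
RUN apk add --no-cache curl \
    && curl -fsSL -o /tmp/tool.tar.gz https://example.com/tool.tar.gz \
    && mkdir -p /opt/tool \
    && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
    && rm /tmp/tool.tar.gz
```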
2. The difference between CMD and ENTRYPOINT

CMD

The CMD instruction sets the default startup command and arguments of the image. If the container is started without a command (that is, nothing follows the image name in docker run), the default command set by CMD is executed. Use the JSON (exec) form CMD ["command", "arg1", "arg2"] whenever possible. For example, nginx is started with:

    CMD ["nginx", "-g", "daemon off;"]

If developers and users are not familiar with how CMD and ENTRYPOINT work together, avoid combining the two instructions and use CMD alone, as in this Django startup command:

    CMD ["python", "manage.py", "runserver", "0.0.0.0:8989"]

Conversely, if both developers and users understand how CMD and ENTRYPOINT work, it is recommended to use CMD to supply the default arguments of ENTRYPOINT.

ENTRYPOINT

When the container is meant to be used as a command-line tool, set the image's entry program with ENTRYPOINT. When a lot of preparation is needed before the main program starts, ENTRYPOINT can point to a script such as start.sh.

When ENTRYPOINT is specified in the Dockerfile, anything added after the image name in docker run is passed to ENTRYPOINT as arguments. If the Dockerfile contains both CMD and an exec-form ENTRYPOINT, CMD is not run as a separate command: its value is appended to ENTRYPOINT as default arguments, and arguments given to docker run replace it.

3. WORKDIR

Use WORKDIR with an absolute path to change directories whenever possible, instead of RUN cd /data.

4. USER

If the application in the container does not need special privileges, use the USER instruction to run it as a non-root user. If the user does not exist, first create it in the image with a RUN instruction.
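A minimal sketch of creating a user and switching to it. The user and group name "app" and the Alpine base image are assumptions; Debian-based images would use groupadd and useradd instead of the BusyBox commands shown here.

```dockerfile
# Sketch only: the "app" user/group names and base image are assumptions.
FROM alpine:3.19

# Create a system group and user with BusyBox addgroup/adduser (Alpine syntax).
RUN addgroup -S app && adduser -S -G app app

# Every later RUN, CMD and ENTRYPOINT runs as this user.
USER app

CMD ["id"]
```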
If the UID/GID must stay the same every time the image is built, specify the UID and GID explicitly when creating the user and group. Avoid using the sudo command in the image, because its TTY and signal-forwarding behavior is unpredictable. If you do need sudo-like functionality, such as initializing a daemon as root and then running it as a non-root user, use gosu instead. To keep the image small, avoid unnecessary user switching.

5. EXPOSE

EXPOSE declares the ports that the container will listen on. In bridge mode, these container ports can then be mapped to ports on the host (for example with docker run -P or -p). It is recommended not to change the application's native port number inside the container. EXPOSE can only name the ports to be exposed inside the container; it cannot specify the mapping between host ports and container ports. For example, EXPOSE 80:80 is meaningless.
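The instruction advice in part 3 can be pulled together in one sketch. Everything specific here is an assumption made for illustration: the python:3.12-alpine base, the UID/GID value 1000, the "app" name, and a Django project with manage.py in the build context.

```dockerfile
# Sketch only: base image, IDs, names and the Django project are assumptions.
FROM python:3.12-alpine

# Explicit IDs keep file ownership stable across rebuilds.
RUN addgroup -g 1000 app && adduser -D -u 1000 -G app app

# Absolute path instead of "RUN cd /data".
WORKDIR /data
RUN pip install --no-cache-dir django
COPY --chown=app:app . .

# Switch users once, after the steps that need root.
USER app

# Documents the listening port only; host mapping happens at run time.
EXPOSE 8989

# ENTRYPOINT fixes the program; CMD supplies its default arguments.
ENTRYPOINT ["python", "manage.py"]
CMD ["runserver", "0.0.0.0:8989"]
```

With an image built from this file under a hypothetical name such as myapp, docker run -p 8989:8989 myapp would start python manage.py runserver 0.0.0.0:8989, while docker run myapp migrate would replace the CMD arguments and run python manage.py migrate.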
