How to build a high-quality security image of Docker correctly and quickly

2025-01-19 Update From: SLTechnology News&Howtos


This article introduces how to correctly and quickly build high-quality, secure Docker images. Many people run into these problems in real-world builds, so read on to learn how to handle them.

Cache to speed up the build

Most of an image's build time is spent downloading and installing system packages and application dependencies. These usually change rarely, which makes them good candidates for caching.

Start with system packages and tools: install them right after FROM so that they are cached. Whichever Linux distribution you use as the base image, the result should look like this:

```dockerfile
FROM ...  # any viable base image like centos:8, ubuntu:21.04 or alpine:3.12.3

# RHEL/CentOS
RUN yum install ...
# Debian
RUN apt-get install ...
# Alpine
RUN apk add ...

# Rest of the Dockerfile (COPY, RUN, CMD...)
```

In addition, you can extract these commands into a separate Dockerfile to build your own base image. You can then push that image to an image registry so that you and others can reference it from other Dockerfiles.

In this way, you no longer have to worry about system packages and related dependencies unless you need to upgrade them or add and remove something.
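As a sketch, such a shared base image could be as simple as the following, assuming a hypothetical registry namespace `myorg`:

```dockerfile
# base.Dockerfile -- hypothetical shared base image with system packages preinstalled
FROM centos:8
RUN yum install -y gcc make git && \
    yum clean all -y
```

It would then be built and pushed once with `docker build -f base.Dockerfile -t myorg/base:1.0 .` followed by `docker push myorg/base:1.0`, and other Dockerfiles would simply start with `FROM myorg/base:1.0`.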

After the system packages, we usually install the application dependencies. These may be Java libraries from a Maven repository (stored in the .m2 directory by default), JavaScript modules in node_modules, or Python libraries in a venv.

These change more often than system dependencies, but not often enough to justify a full re-download and reinstall on every build. However, if the Dockerfile is poorly written, you will notice that the cache is not used even when the dependencies have not changed:

```dockerfile
FROM ...  # any viable base image like python:3.8, node:15 or openjdk:15.0.1

# Copy everything at once
COPY . .

# Java
RUN mvn clean package
# Or Python
RUN pip install -r requirements.txt
# Or JavaScript
RUN npm install

# ...
CMD ["..."]
```

Why is that? The problem lies in COPY. Docker reuses the cache for every step of the build until it encounters a new or modified command/layer.

In this case, we copy everything into the image at once, including the unchanged dependency list and the modified source code. Because the source files have changed, the COPY layer is invalidated, and Docker re-downloads and reinstalls all dependencies even though they did not change. To avoid this, we must copy the files in two steps:

```dockerfile
FROM ...  # any viable base image like python:3.8, node:15 or openjdk:15.0.1

COPY pom.xml ./pom.xml                    # Java
COPY requirements.txt ./requirements.txt  # Python
COPY package.json ./package.json          # JavaScript

RUN mvn dependency:go-offline -B          # Java
RUN pip install -r requirements.txt       # Python
RUN npm install                           # JavaScript

COPY ./src ./src/

# Rest of Dockerfile (build application, set CMD...)
```

First, we copy the file that lists all application dependencies and install them. If this file has not changed, these layers are served from the cache. Only then do we copy the rest of the (modified) source code into the image and run the application's tests and build. As a more "advanced" approach, we can use Docker's BuildKit and its experimental features to achieve the same thing:

```dockerfile
# syntax=docker/dockerfile:experimental
FROM ...  # any viable base image like python:3.8 or openjdk:15.0.1

COPY pom.xml ./pom.xml                    # Java
COPY requirements.txt ./requirements.txt  # Python

# Java
RUN --mount=type=cache,target=/root/.m2 mvn dependency:go-offline -B
# Python
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
```

The above code shows how the --mount option of RUN selects a cache directory. This is helpful if you want to explicitly use a non-default cache location.

However, to use this feature you must include the header line specifying the syntax version (as shown above) and run the build with BuildKit enabled, for example: DOCKER_BUILDKIT=1 docker build -t name:tag .

More information about these experimental features can be found in the documentation (https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/syntax.md#run---mounttypecache).

So far, everything applies only to local builds. CI is different, and each tool/provider varies, but with any of them you will need some persistent volume to store the cache/dependencies. For example, with Jenkins you can use storage on the agent.

For Docker builds running on Kubernetes (whether with Jenkins X, Tekton or something else), you will need a Docker daemon, which can be deployed as Docker in Docker (DinD): a Docker daemon running inside a Docker container.

As for the build itself, you will need a pod (container) connected to the DinD socket in which to run the docker build command.

To demonstrate and simplify the operation, we can use the following pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: docker-build
spec:
  containers:
  - name: dind  # Docker in Docker container
    image: docker:19.03.3-dind
    securityContext:
      privileged: true
    env:
    - name: DOCKER_TLS_CERTDIR
      value: ''
    volumeMounts:
    - name: dind-storage
      mountPath: /var/lib/docker
  - name: docker  # Builder container
    image: docker:19.03.3-git
    securityContext:
      privileged: true
    command: ['cat']
    tty: true
    env:
    - name: DOCKER_BUILDKIT
      value: '1'
    - name: DOCKER_HOST
      value: tcp://localhost:2375
  volumes:
  - name: dind-storage
    emptyDir: {}
  - name: docker-socket-volume
    hostPath:
      path: /var/run/docker.sock
      type: File
```

The above pod consists of two containers: one for DinD and one for the image build. To run a build with the builder container, open a shell in it, clone a repository, and run the build:

```shell
~ $ kubectl exec --stdin --tty docker-build -- /bin/sh  # Open shell session
~ # git clone https://github.com/username/reponame.git  # Clone some repository
~ # cd reponame
~ # docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t name:tag --cache-from username/reponame:latest .
...
 => importing cache manifest from martinheinz/python-project-blueprint:flask
...
 => => writing image sha256:...
 => => naming to docker.io/library/name:tag
 => exporting cache
 => => preparing build cache for export
```

Finally, docker build uses a new option, --cache-from image:tag, to tell Docker that it should use the specified image in a (remote) registry as a cache source. This way we can take advantage of caching even when the cached layers are not stored on the local file system.

The other option, --build-arg BUILDKIT_INLINE_CACHE=1, writes cache metadata into the image when it is created. It is required for --cache-from to work; see the documentation (https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources) for more information.
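Put together, a minimal CI build step using the remote cache might look like the following sketch, where `username/reponame` is a placeholder for your own repository:

```shell
# Seed the layer cache from the registry (ignore failure on the very first build)
docker pull username/reponame:latest || true

# Build with inline cache metadata, reusing cached layers from the pulled image
DOCKER_BUILDKIT=1 docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --cache-from username/reponame:latest \
  -t username/reponame:latest .

# Push the new image (and its cache metadata) back for the next build
docker push username/reponame:latest
```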

Minimal images

Fast builds are nice, but truly "fat" images still take a long time to push and pull, and they are likely to contain many useless libraries and tools. Those not only make the image more bloated, they also make it more vulnerable to attack, because they create a larger attack surface.

The easiest way to get a smaller image is to use a small base image such as Alpine Linux rather than an Ubuntu- or RHEL-based one. Another good approach is a multi-stage Docker build, where one image is used to build the application (the first FROM) and another, smaller image runs it (the second/last FROM), for example:

```dockerfile
# 332.88 MB
FROM python:3.8.7 AS builder
COPY requirements.txt /requirements.txt
RUN python -m venv /venv && \
    /venv/bin/pip install --disable-pip-version-check -r /requirements.txt

# only 16.98 MB
FROM python:3.8.7-alpine3.12 AS runner
# copy only the installed dependencies from the 1st stage image
COPY --from=builder /venv /venv
COPY ./src /app
CMD ["..."]
```

As shown above, we first prepare the application and its dependencies in the full python:3.8.7 base image, which is quite large at 332.88 MB. There we create the virtual environment and install the libraries the application requires.

Then we switch to a much smaller Alpine-based image, only 16.98 MB. We copy the previously created virtual environment, along with the source code, into this image. This way we end up with a much smaller image, fewer layers, and fewer unnecessary tools and binaries.

Another thing to keep in mind is the number of layers each build produces. FROM, COPY, RUN, and CMD each generate a new layer. At least in the case of RUN, we can easily reduce the layer count by merging all RUN commands into one:

```dockerfile
# Bad, creates 4 layers
RUN yum --disablerepo=* --enablerepo="epel"
RUN yum update
RUN yum install -y httpd
RUN yum clean all -y

# Good, creates only 1 layer
RUN yum --disablerepo=* --enablerepo="epel" && \
    yum update && \
    yum install -y httpd && \
    yum clean all -y
```

We can go one step further and get rid of a potentially heavy base image entirely. To do this, we use the special FROM scratch directive to tell Docker to start from the smallest possible base image; the next command then forms the first layer of the final image.

This is especially useful for applications that ship as a single binary and do not need many tools, such as Go, C, or Rust applications. However, this approach requires statically compiled binaries, so it does not apply to languages such as Java or Python. An example FROM scratch Dockerfile might look like this:

```dockerfile
FROM golang AS builder
WORKDIR /go/src/app
COPY . .
# Static build is required so that we can safely use 'scratch' base image
RUN CGO_ENABLED=0 go install -ldflags '-extldflags "-static"'

FROM scratch
COPY --from=builder /go/bin/app /app
ENTRYPOINT ["/app"]
```

It's simple, isn't it? With this Dockerfile, we can generate an image that is only about 3MB!

Lock versions

Speed and size are what most people care about, and image security often becomes an afterthought. Yet there are several simple ways to harden an image and limit the attack surface available to an attacker.

The most basic recommendation is to pin the versions of all libraries, packages, tools, and base images. This matters not only for security but also for image stability: if you use the latest tag for an image, or do not pin versions in Python's requirements.txt or JavaScript's package.json, the image/library downloaded at build time may be incompatible with your application code or expose the container to vulnerabilities.
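As an illustrative sketch (the exact versions here are just examples), pinning might look like this:

```dockerfile
# Bad: floating versions, build results can change from day to day
#   FROM python:latest
#   RUN pip install flask

# Better: pin the base image tag and every library version
FROM python:3.8.7-slim
COPY requirements.txt ./requirements.txt
# requirements.txt pins exact versions, e.g.:
#   flask==1.1.2
#   requests==2.25.1
RUN pip install --no-cache-dir -r requirements.txt
```

For even stronger guarantees, a base image can also be pinned by digest (`FROM python@sha256:...`), which makes the reference fully immutable.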

While pinning everything to a specific version, you should also update these dependencies periodically to ensure you have the latest security patches and fixes.

Even if you work hard to avoid vulnerabilities in all dependencies, there will still be some you have missed or that have not yet been fixed or discovered. Therefore, to mitigate the impact of any possible attack, it is best to avoid running the container as root.

For this, a USER 1001 instruction should be included in the Dockerfile to indicate that containers created from it should, and can, run as a non-root user (ideally an arbitrary one). Of course, this may require changes to the application and the right choice of base image, because some common base images (such as nginx) require root permissions (for example, to bind privileged ports).
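A minimal sketch of a non-root setup, assuming a hypothetical Python application in `./src` that listens on an unprivileged port, might look like this:

```dockerfile
FROM python:3.8.7-alpine3.12
# Give an arbitrary non-root UID ownership via the root group (GID 0)
COPY --chown=1001:0 ./src /app
WORKDIR /app
# From here on, everything (including the final CMD) runs as UID 1001, not root
USER 1001
CMD ["python", "app.py"]
```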

It is often difficult to find and avoid vulnerabilities in Docker images, but it becomes easier if the image contains only the bare minimum needed to run the application. Distroless, published by Google (https://github.com/GoogleContainerTools/distroless), is one such family of images.

Distroless images are stripped down to the point where there is no shell or even a package manager, which makes them much better security-wise than Debian- or Alpine-based images. If you are using a multi-stage Docker build, switching to a Distroless runner image is usually simple:

```dockerfile
FROM ... AS builder
# Build the application...

# Python
FROM gcr.io/distroless/python3 AS runner
# Golang
FROM gcr.io/distroless/base AS runner
# NodeJS
FROM gcr.io/distroless/nodejs:10 AS runner
# Rust
FROM gcr.io/distroless/cc AS runner
# Java
FROM gcr.io/distroless/java:11 AS runner

# Copy application into runner and set CMD...
# More examples at https://github.com/GoogleContainerTools/distroless/tree/master/examples
```

Beyond possible vulnerabilities in the final image and its containers, we must also consider the Docker daemon and the container runtime used to build the image. So, just as with our images, we should not run Docker itself as root, but use the so-called rootless mode.
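As a rough sketch based on Docker's rootless-mode documentation (https://docs.docker.com/engine/security/rootless/), enabling it looks roughly like this; the UID in the socket path depends on your user:

```shell
# Install and start the rootless Docker daemon for the current (non-root) user
dockerd-rootless-setuptool.sh install

# Point the client at the per-user daemon socket (here for UID 1000)
export DOCKER_HOST=unix:///run/user/1000/docker.sock

# Builds and containers now run without root privileges on the host
docker build -t name:tag .
```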

That concludes "how to correctly and quickly build a high-quality, secure Docker image". Thank you for reading.
