How to make Docker image

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how to make Docker images, with a focus on keeping them small for Go, Java, interpreted languages, and Rust.

1. Go language image simplification

When a Go program is compiled, all of its Go dependencies are compiled into the binary, but that does not guarantee a statically linked result. Some Go packages rely on system libraries, for example the DNS resolution code in the net package. As soon as such a package is imported, the compiled binary needs to call into system libraries. To make this possible, Go provides a mechanism called cgo that lets Go code call C code, so the compiled binary can use the system libraries.

In other words, if a Go program uses the net package, the build produces a dynamically linked binary. For the image to work properly, you must copy the required library files into the image, or use the busybox:glibc image directly.

Alternatively, you can disable cgo so that Go uses its built-in implementations (such as the built-in DNS resolver) instead of system libraries, in which case the generated binary is static. You disable cgo by setting the environment variable CGO_ENABLED=0, for example:

    FROM golang
    COPY whatsmyip.go .
    ENV CGO_ENABLED=0
    RUN go build whatsmyip.go
    FROM scratch
    COPY --from=0 /go/whatsmyip .
    CMD ["./whatsmyip"]

Because the build produces a static binary, it can run directly in the scratch image.

You don't have to disable cgo completely, either. You can use the -tags parameter to select the built-in implementations you need. For example, -tags netgo tells Go to use the built-in net package without relying on the system library:

    $ go build -tags netgo whatsmyip.go

With this tag, the result is a static binary as long as none of the other imported packages uses a system library. If any other package does use one, cgo is enabled again and the final result is a dynamic binary. To settle the matter once and for all, set the environment variable CGO_ENABLED=0.

2. Explore the secrets of Alpine images

The previous article briefly introduced Alpine images and promised a closer look in a later article. Now is the time!

Alpine is one of many Linux distributions, like CentOS, Ubuntu, and Arch Linux. It aims to be small and secure, and it has its own package manager, apk.

Unlike CentOS and Ubuntu, Alpine is not backed by a large company such as Red Hat or Canonical, and it offers far fewer packages than those distributions. (Counting only the default, out-of-the-box repositories, Alpine has about 10,000 packages, while Ubuntu, Debian, and Fedora each have more than 50,000.)

Before the rise of containers, Alpine was largely unknown, probably because people did not care much about the size of the operating system itself. After all, people mostly cared about business data and documents, and the size of programs, libraries, and the system itself was usually negligible.

Once container technology swept the software industry, everyone noticed that container images were too large: they waste disk space and take a long time to pull. As a result, people began to look for smaller base images. For the familiar distributions (such as Ubuntu, Debian, and Fedora), the image size can only be brought below 100 MB by removing some tools (such as ifconfig and netstat). With Alpine, there is no need to remove anything, and the image is only about 5 MB.

Another advantage of Alpine images is that the package manager is very fast, so installing software is a smooth experience. Admittedly, installation speed matters little on a traditional virtual machine, where a package is installed once and never again. Containers are different: you may rebuild images regularly, or temporarily install debugging tools in a running container. If package installation is slow, it quickly wears out your patience.

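To make this concrete, let's run a simple comparison and see how long different distributions take to install tcpdump. The test command is as follows, where <image> is the base image and <package manager> is that distribution's package manager:

    $ time docker run <image> <package manager> install tcpdump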

The test results are as follows:

    Base image            Size      Time to install tcpdump
    alpine:3.11           5.6 MB    1-2s
    archlinux:20200106    409 MB    7-9s
    centos:8              237 MB    5-6s
    debian:10             114 MB    5-7s
    fedora:31             194 MB    35-60s
    ubuntu:18.04          64 MB     6-8s

If you want to know more about Alpine, you can take a look at Natanael Copa's talk.

So, if Alpine is so great, why not use it as the base image for everything? Not so fast; let's take it one step at a time. To avoid every pitfall, you need to consider two situations:

Use Alpine as the base image of the second build stage only (the run stage)

Use Alpine as the base image for all stages (both the build stage and the run stage)

Use Alpine in the run phase

With excitement, I added the Alpine image to the Dockerfile:

    FROM gcc AS mybuildstage
    COPY hello.c .
    RUN gcc -o hello hello.c
    FROM alpine
    COPY --from=mybuildstage hello .
    CMD ["./hello"]

Here comes the first pitfall: the container fails to start with this error:

    standard_init_linux.go:211: exec user process caused "no such file or directory"

We saw this error in the previous article, where the scratch image was used as the base image for a C program. The cause there was that the scratch image lacks the dynamic library files. But why does the Alpine image report the same error? Is it also missing dynamic libraries?

Not exactly. Alpine uses dynamic libraries too; after all, one of its design goals is to take up less space. But Alpine uses a different standard library from most distributions: musl libc, which is smaller, simpler, and more secure than glibc, but is not compatible with the commonly used glibc.

You may ask, "If musl libc is smaller, simpler, and more secure, why do other distributions still use glibc?"

Because glibc has many additional extensions, and many programs use them, while musl libc does not include these extensions. For details, refer to the musl documentation.

In other words, if you want a program to run in the Alpine image, it must be compiled against musl libc as its dynamic library.
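One way to see which C library a dynamically linked binary expects is to read the loader path stored in its ELF header. A binary built against glibc names a glibc loader that does not exist in the Alpine image, which is a common reason for the "no such file or directory" message even though the binary itself is present. The following sketch uses Go's debug/elf package; the function names interpreter and libcFlavor are assumptions for illustration, and the string matching on the loader path is only a heuristic.

```go
package main

import (
	"debug/elf"
	"fmt"
	"io"
	"os"
	"strings"
)

// interpreter returns the dynamic loader path stored in the PT_INTERP
// segment of the ELF file at path, or "" if the file has none.
func interpreter(path string) (string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	for _, p := range f.Progs {
		if p.Type != elf.PT_INTERP {
			continue
		}
		data, err := io.ReadAll(p.Open())
		if err != nil {
			return "", err
		}
		// The segment holds a NUL-terminated string.
		return strings.TrimRight(string(data), "\x00"), nil
	}
	return "", nil
}

// libcFlavor guesses the C library from the loader path (heuristic only).
func libcFlavor(interp string) string {
	switch {
	case interp == "":
		return "static (no loader needed)"
	case strings.Contains(interp, "ld-musl"):
		return "musl libc"
	case strings.Contains(interp, "ld-linux"):
		return "glibc"
	default:
		return "unknown"
	}
}

func main() {
	// With no argument, inspect this program itself.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	interp, err := interpreter(path)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: loader %q, %s\n", path, interp, libcFlavor(interp))
}
```

Running it against the hello binary produced by the gcc image and against one built inside Alpine should show different loader paths.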

Use Alpine in all stages

To produce a binary linked against musl libc, there are two cases:

Some official images provide an Alpine variant, which can be used directly.

Other official images have no Alpine variant, so we need to build one ourselves.

The golang image falls into the first case: golang:alpine provides a Go toolchain built on Alpine.

To build a Go program, you can use the following Dockerfile:

    FROM golang:alpine
    COPY hello.go .
    RUN go build hello.go
    FROM alpine
    COPY --from=0 /go/hello .
    CMD ["./hello"]

The resulting image is 7.5 MB, which is a bit large for a program that only prints "hello world", but we can look at it from a different angle:

Even if the program is complex, the resulting image will not be very large.

The image contains many useful debugging tools.

Even if you are missing some special debugging tool at run time, you can install it quickly.

That takes care of Go. What about C? There is no image like gcc:alpine, so you have to use the Alpine image as the base image and install the C compiler yourself. The Dockerfile is as follows:

    FROM alpine
    RUN apk add build-base
    COPY hello.c .
    RUN gcc -o hello hello.c
    FROM alpine
    COPY --from=0 hello .
    CMD ["./hello"]

You must install build-base; if you install only gcc, you get the compiler without the standard library. build-base is Alpine's equivalent of Ubuntu's build-essential and brings in the compiler, the standard library, and tools such as make.

Finally, let's compare the "hello world" image sizes obtained with the different build methods:

Single-stage build using the golang base image: 805 MB

Multi-stage build, build stage using golang, run stage using ubuntu: 66.2 MB

Multi-stage build, build stage using golang:alpine, run stage using alpine: 7.6 MB

Multi-stage build, build stage using golang, run stage using scratch: 2 MB

In the end, the image size was reduced by 99.75%, which is quite astonishing. Now let's look at a more practical example: the final image sizes for the program from the previous section that uses the net package:

Single-stage build using the golang base image: 810 MB

Multi-stage build, build stage using golang, run stage using ubuntu: 71.2 MB

Multi-stage build, build stage using golang:alpine, run stage using alpine: 12.6 MB

Multi-stage build, build stage using golang, run stage using busybox:glibc: 12.2 MB

Multi-stage build, build stage using golang with CGO_ENABLED=0, run stage using scratch: 7 MB

The image size is still reduced by 99%.

3. Java language image simplification

Java is a compiled language, but the compiled program still runs on the JVM. So how do you use a multi-stage build with Java?

Static or dynamic?

Conceptually, Java uses dynamic linking, because Java code calls the Java API provided by the JVM, and the code for that API lives outside the executable, usually in JAR or WAR files.

However, these Java libraries are not completely independent of the system libraries. Some Java functions eventually call into system libraries, for example open(), fopen(), or their variants when opening files, so the JVM itself may be dynamically linked against system libraries.

This means that, in theory, any JVM can run a Java program, regardless of whether the system standard library is musl libc or glibc. Therefore, any base image that includes a JDK can be used to build Java programs, and any image that includes a JVM can be used as the base image for running them.

Class file format

The format of Java class files (the bytecode generated by the Java compiler) varies from version to version. Most of the changes are Java API changes, but some relate to the Java language itself, such as the addition of generics in Java 5, which can change the class file format and break compatibility with older versions.

So by default, classes compiled with a given version of the Java compiler are not compatible with earlier versions of the JVM. You can, however, pass the compiler the -target parameter (Java 8 and below) or the --release parameter (Java 9 and above) to use an older class file format. The --release parameter also selects the matching class path, which ensures that a program meant to run on a specific JVM version (for example, Java 11) does not accidentally call the API of Java 12.

JDK vs JRE

If you are familiar with how Java is packaged on most platforms, you already know the JDK and the JRE.

The JRE (Java Runtime Environment) contains what is needed to run Java programs, namely the JVM.

The JDK (Java Development Kit) contains both the JRE and the tools needed to develop Java programs, namely the Java compiler.

Most Java images come in both JDK and JRE variants, so in a multi-stage build you can use the JDK variant as the base image for the build stage and the JRE variant as the base image for the run stage.

Java vs OpenJDK

The openjdk image is recommended because it is open source and updated frequently.

You can also use amazoncorretto, Amazon's fork of OpenJDK with additional patches, which is billed as enterprise-grade.

Start building

After all that, which image should you use? Here are a few reference points:

openjdk:8-jre-alpine (85 MB)

openjdk:11-jre (267 MB) or openjdk:11-jre-slim (204 MB)

openjdk:14-alpine (338 MB)

If you want more concrete data, you can take a look at my example, or bring back the tried-and-tested "hello world", this time in Java:

    class hello {
        public static void main(String[] args) {
            System.out.println("Hello, world!");
        }
    }

Image sizes obtained with the different build methods:

Single-stage build using the java base image: 643 MB

Single-stage build using the openjdk base image: 490 MB

Multi-stage build, build stage using openjdk, run stage using openjdk:jre: 479 MB

Single-stage build using the amazoncorretto base image: 390 MB

Multi-stage build, build stage using openjdk:11, run stage using openjdk:11-jre: 267 MB

Multi-stage build, build stage using openjdk:8, run stage using openjdk:8-jre-alpine: 85 MB

All of the Dockerfiles can be found in this repository.

4. Interpreted language image simplification

For interpreted languages such as Node, Python, and Ruby, the situation is a little more complicated. Let's look at the Alpine image first.

Alpine images

For interpreted languages, if the program uses only the standard library, or its dependencies are written in the same language as the program itself and need no C libraries or external dependencies, then using Alpine as the base image generally works without problems. Once your program needs external dependencies, things get complicated: to keep using Alpine images you have to install those dependencies. The difficulty falls into three levels:

Easy: the dependency has installation instructions for Alpine, which typically describe which packages to install and how to set up the dependencies. This is rare because, as mentioned earlier, Alpine has far fewer packages than most popular distributions.

Medium: the dependency has no installation instructions for Alpine, but does have them for other distributions. By comparison, we can find the Alpine packages (if any) that correspond to the packages of the other distributions.

Hard: the dependency has no installation instructions for Alpine, only for other distributions, and Alpine has no corresponding package. In this case, it must be built from source!

In the last case, using Alpine as the base image is least advisable. Not only does it fail to reduce the size, it may be counterproductive, because you need to install compilers, dependent libraries, header files, and so on. More importantly, the build takes a long time and is inefficient. A multi-stage build makes things even more complicated, because you have to figure out how to compile all your dependencies into binaries. For this reason, multi-stage builds are generally not recommended for interpreted languages.

There is one special case in which you hit most of Alpine's problems at once: using Python for data science. Packages such as numpy and pandas are distributed as precompiled wheels. Wheel is Python's newer packaging format; it ships compiled binaries, replaces Python's traditional egg files, and can be installed directly with pip. But these wheels are bound to a specific C library, which means they install normally in most images that use glibc, but not in Alpine images, for the reason mentioned earlier. If you must install them on Alpine, you need to install many dependencies and build everything from source, which is time-consuming and laborious. There is an article that explains this: using Alpine to build Python images slows the build down 50 times!

Since Alpine causes so much trouble here, should you avoid Alpine images for anything written in Python? Not necessarily. At least when Python is used for data science, Alpine is not recommended. Other cases should be judged individually; where possible, give Alpine a try.

:slim images

If you really don't want the hassle, you can choose a compromise: the xxx:slim images. Slim images are generally based on Debian and glibc, with many non-essential packages removed to reduce the size. If you need a compiler during the build, a slim image is not appropriate, but in most other cases you can use slim as the base image.

Here is a comparison of the sizes of the Alpine and slim images for mainstream interpreted languages:

    Image            Size
    node             939 MB
    node:alpine      113 MB
    node:slim        163 MB
    python           932 MB
    python:alpine    110 MB
    python:slim      193 MB
    ruby             842 MB
    ruby:alpine      54 MB
    ruby:slim        149 MB

As a special case, here are the image sizes for different base images when matplotlib, numpy, and pandas are all installed:

    Image and technique          Size
    python                       1.26 GB
    python:slim                  407 MB
    python:alpine                523 MB
    python:alpine multi-stage    517 MB

You can see that using Alpine doesn't help in this case, even with a multi-stage build.

But you can't write Alpine off completely. Consider a Django application with many dependencies:

    Image and technique          Size
    python                       1.23 GB
    python:alpine                636 MB
    python:alpine multi-stage    391 MB

Finally, to sum up: there is no universal rule for choosing a base image. Sometimes Alpine works better, sometimes slim works better. If you want the smallest possible image, try both. Over time, we will accumulate enough experience to know which cases call for Alpine and which call for slim, without having to try both every time.

5. Rust language image simplification

Rust is a modern programming language originally designed at Mozilla, and it is becoming increasingly popular for web and infrastructure work. Binaries compiled by Rust are dynamically linked against the C library. They work normally in images such as Ubuntu, Debian, and Fedora, but not in busybox:glibc, because Rust binaries need the libdl library, which busybox:glibc does not include.

There is also a rust:alpine image, and binaries compiled in it work as well.

If you are considering static linking, refer to the official Rust documentation. On Linux you need to build a special version of the Rust compiler, and the library it builds against is musl libc. You read that correctly: the same musl libc that Alpine uses. If you want an even smaller image, follow the instructions in that documentation and put the generated binary into the scratch image.

