Getting started with K8s in 8 minutes | The basic concepts of containers explained in detail

2025-04-26 Update From: SLTechnology News&Howtos

Author | Fu Wei, Senior Development Engineer at Alibaba

1. What are containers and images?

Before introducing the specific concept of containers, let's briefly review how the operating system manages processes.

When we log in to an operating system, commands such as ps show a variety of processes, including the system's own services and users' application processes. What characteristics do these processes share?

First, these processes can see and communicate with each other; second, they use the same file system and can read and write to the same file; third, these processes use the same system resources.

What problems will these three characteristics bring?

First, because these processes can see and communicate with each other, a process with elevated privileges can attack other processes.

Second, because they use the same file system, any process can create, read, update, and delete existing data, so a privileged process may delete another process's data and disrupt its normal operation. The dependencies of different processes may also conflict, which puts heavy pressure on operations teams.

Third, because they use the resources of the same host, applications can compete for those resources. When one application consumes a lot of CPU and memory, it can disrupt the other applications and prevent them from serving requests normally.

Given these three problems, how can we give a process an independent running environment?

For the problems caused by sharing one file system, Linux and Unix operating systems provide the chroot system call, which turns a subdirectory into the root directory of a process and so achieves isolation at the view level. With chroot a process gets its own file system, and creating, reading, updating, or deleting files in it does not affect other processes.

For the problems caused by processes seeing and communicating with each other, namespace technology isolates what each process can see of the system's resources. With chroot and namespaces a process can run in a separate environment, but it still uses the resources of the same operating system, and some processes may exhaust the resources of the whole system.

To reduce the impact of processes on each other, cgroups can limit their resource usage, for example by setting how much CPU and memory they may use.
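On Linux, the namespaces a process belongs to can be inspected through procfs, which makes the idea of view isolation concrete. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function name `list_namespaces` is illustrative and not part of any library.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process, read from Linux procfs.

    Assumption: a Linux host with /proc mounted. On other systems, or when
    the directory cannot be read, an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return namespaces


if __name__ == "__main__":
    for name, identifier in list_namespaces().items():
        print(f"{name:20s} {identifier}")
```

Two processes that print the same identifier for a namespace type share that namespace; a process inside a container would typically show different identifiers from a process on the host.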

So how should such a set of processes be defined?

A container is a collection of processes that has an isolated view, restricted resources, and an independent file system. "View isolation" means that the processes can see only some of the processes on the system and have their own hostname and so on; resource restriction means limiting, for example, the amount of memory and the number of CPUs they can use. In short, a container is a collection of processes that is isolated from the other resources of the system and has its own independent view of resources.

A container has an independent file system. Because it uses the resources of the host system, that file system does not need to contain kernel-related code or tools; it only needs the binaries, configuration files, and dependencies that the container requires. As long as the set of files needed at run time is available, the container can run.

What is an image?

We call the complete set of files that a container needs at run time the container image.

How is an image usually built? Typically we use a Dockerfile, because it provides convenient syntactic sugar for describing each step of the build. Each build step operates on the existing file system and changes its contents; we call each set of changes a changeset. Applying the changesets produced by the build steps, in order, to an empty folder yields the complete image.
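The idea of applying changesets in order to an empty folder can be sketched with a dictionary standing in for the file system. This is a simplified model, not how any image format is implemented: the paths and contents are hypothetical, and using `None` to mark a deletion is an assumption made for the sketch.

```python
def apply_changesets(changesets):
    """Apply changesets in order to an empty file system (a dict of path -> content).

    Each changeset maps a path to its new content, or to None to delete the path.
    """
    filesystem = {}  # the "empty folder" we start from
    for changeset in changesets:
        for path, content in changeset.items():
            if content is None:
                filesystem.pop(path, None)  # this step deleted the file
            else:
                filesystem[path] = content  # this step added or modified the file
    return filesystem


# Hypothetical changesets, one per build step.
base_layer = {"/bin/sh": "shell binary", "/etc/os-release": "alpine"}
copy_step = {"/go/src/app/main.go": "package main"}
build_step = {"/go/bin/app": "compiled binary", "/go/src/app/main.go": None}

image = apply_changesets([base_layer, copy_step, build_step])
print(sorted(image))  # ['/bin/sh', '/etc/os-release', '/go/bin/app']
```

Because each changeset is kept as a separate unit, two images that start from the same base changeset can share it.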

The layering and reuse of changesets bring several advantages:

First, distribution becomes more efficient. A large image that is split into smaller pieces can be downloaded in parallel.

Second, because the data is shared, you only need to download the data that is not already in local storage. For example, the golang image is built on the alpine image. If you already have the alpine image locally, downloading the golang image only fetches the parts that the local alpine image does not contain.

Third, because image data is shared, a lot of disk space is saved. Suppose local storage holds both the alpine image (5 MB) and the golang image (300 MB). Without reuse they occupy 305 MB; with reuse they need only 300 MB.

How to build an image?

The Dockerfile shown in the following figure describes how to build a golang application.

As shown in the figure:

The FROM line specifies the image on which the following build steps are based. As mentioned earlier, images can be reused.

The WORKDIR line sets the directory in which the following build steps run, similar to cd in a shell.

The COPY line copies files from the host into the container image.

The RUN line executes a command on the image's file system. When it finishes, we have the built application.

The CMD line specifies the default program to run when the image is used.
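To make the five instructions concrete, the sketch below holds a small Dockerfile as text and prints each instruction next to its role. The Dockerfile itself is a hypothetical stand-in, not the one from the original figure: the image tag, paths, and binary name are assumptions. The parser is deliberately simplified and ignores line continuations.

```python
# Hypothetical Dockerfile for a Go application; the image tag, paths, and
# binary name are illustrative assumptions, not taken from the original figure.
DOCKERFILE = """\
FROM golang:1.12-alpine
WORKDIR /go/src/app
COPY . .
RUN go build -o /go/bin/app .
CMD ["/go/bin/app"]
"""

MEANING = {
    "FROM": "base image that the following steps build on",
    "WORKDIR": "directory in which the following steps run",
    "COPY": "copy files from the host into the image",
    "RUN": "execute a command on the image's file system",
    "CMD": "default program when the image is run",
}


def parse_instructions(text):
    """Split Dockerfile text into (instruction, arguments) pairs.

    Simplified: skips blank lines and comments, no line continuations.
    """
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        instruction, _, arguments = line.partition(" ")
        steps.append((instruction.upper(), arguments.strip()))
    return steps


for instruction, arguments in parse_instructions(DOCKERFILE):
    print(f"{instruction:8s} {arguments:28s} # {MEANING[instruction]}")
```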

Once you have a Dockerfile, you can build the image for your application with the docker build command. The build result is stored locally. In general, images are built on a dedicated build machine or in some other isolated environment.

So how do these images get to a production or test environment? For that we need a transfer point, a central store, which we call the docker registry, that is, the image repository. It is responsible for storing all the image data that is produced. We only need to push the local image to the image repository with docker push, and the production or test environment can then download the data and run it.
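Because images consist of shared layers, a download from the repository only has to transfer the layers that are missing locally. The sketch below models that with the 5 MB alpine and 300 MB golang sizes used earlier in the article; the layer names and the split of golang into two layers are illustrative assumptions.

```python
# Sizes in MB follow the article's alpine/golang example; the layer names and
# the way golang is split into layers are illustrative assumptions.
LAYER_SIZES_MB = {"alpine-base": 5, "go-toolchain": 295}
IMAGES = {
    "alpine": ["alpine-base"],
    "golang": ["alpine-base", "go-toolchain"],  # built on top of alpine
}


def layers_to_pull(image, local_layers):
    """Return the layers of `image` that are not in local storage yet."""
    return [layer for layer in IMAGES[image] if layer not in local_layers]


def disk_usage_mb(images, shared=True):
    """Total size of the given images, with or without layer sharing."""
    if shared:
        unique = {layer for image in images for layer in IMAGES[image]}
        return sum(LAYER_SIZES_MB[layer] for layer in unique)
    return sum(LAYER_SIZES_MB[layer] for image in images for layer in IMAGES[image])


local = set(IMAGES["alpine"])                              # alpine is already present
print(layers_to_pull("golang", local))                     # ['go-toolchain']
print(disk_usage_mb(["alpine", "golang"], shared=False))   # 305
print(disk_usage_mb(["alpine", "golang"], shared=True))    # 300
```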

How do I run the container?

Generally speaking, running a container is divided into three steps:

Step 1: download the image from the image repository.

Step 2: once the download completes, list the local images with docker images and select the image you want from the list.

Step 3: run the selected image with docker run to get the container you want. You can run the same image multiple times to get multiple containers.

An image is like a template and a container is like a concrete running instance of it, which is why an image can be built once and run anywhere.

Summary

To recap, a container is a collection of processes isolated from the rest of the system, where "the rest of the system" covers processes, network resources, and file systems. An image is the collection of all the files a container needs, and it can be built once and run anywhere.

2. The life cycle of the container

The life cycle of the container at run time

A container is a collection of processes with isolation features. When using docker run, an image is selected to provide a separate file system and the corresponding running program is specified. The running program specified here is called the initial process. When the initial process starts, the container starts, and when the initial process exits, the container exits.

The life cycle of the container can therefore be regarded as the same as that of the initial process. Of course, a container can hold more than the initial process: the initial process can spawn child processes, and operators can start additional processes with docker exec. All of these fall under the management of the initial process. When the initial process exits, all of these processes exit with it, which prevents resource leaks.
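The link between the container's state and its initial process can be illustrated with an ordinary child process. This is a toy model under stated assumptions, not how a container engine is implemented: `ToyContainer` is a hypothetical class, and no isolation is involved.

```python
import subprocess
import sys


class ToyContainer:
    """A toy stand-in for a container whose state follows its initial process."""

    def __init__(self, command):
        # Starting the initial process starts the "container".
        self.init_process = subprocess.Popen(command)

    def status(self):
        # poll() returns None while the process is still running.
        return "running" if self.init_process.poll() is None else "exited"

    def wait(self):
        # Block until the initial process exits and return its exit code.
        return self.init_process.wait()


if __name__ == "__main__":
    container = ToyContainer([sys.executable, "-c", "import time; time.sleep(0.5)"])
    print(container.status())
    print("exit code:", container.wait())
    print(container.status())
```

The status is never stored separately; it is always derived from the initial process, which mirrors the statement that the two life cycles are the same.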

However, this approach has a problem. Applications are often stateful and may produce important data. When a container exits and is deleted, that data is lost, which is unacceptable to the application owner, so the important data produced by the container must be persisted. A container can persist data directly to a designated directory, which is called a data volume.

The most notable characteristic of a data volume is that its life cycle is independent of the container's: creating, running, stopping, or deleting the container has no effect on the data volume. A data volume is a special directory that exists to help the container persist data. Put simply, we mount the data volume into the container so that the container writes its data to that directory, and the data is not lost when the container exits.
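The independence of the two life cycles can be shown with an analogy: a short-lived child process plays the container, and a directory that outlives it plays the data volume. This is only a sketch of the idea, with no container engine involved; `run_and_exit` and the file name are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_and_exit(volume_dir):
    """Run a short-lived child process (the toy "container") that writes into volume_dir."""
    script = (
        "import os, sys\n"
        "with open(os.path.join(sys.argv[1], 'data.txt'), 'w') as f:\n"
        "    f.write('important data')\n"
    )
    # check=True raises if the child process fails.
    subprocess.run([sys.executable, "-c", script, volume_dir], check=True)


# The temporary directory stands in for a data volume created outside the container.
with tempfile.TemporaryDirectory() as volume:
    run_and_exit(volume)  # the "container" has exited by the time this returns
    with open(os.path.join(volume, "data.txt")) as f:
        print(f.read())   # important data
```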

In general, there are two main ways to manage data volumes:

The first is to mount a host directory directly into the container with a bind mount. This method is simple, but it adds operations cost, because it depends on directories on the host and requires all hosts to be managed uniformly.

The second is to hand directory management over to the container engine.

3. Container project architecture

moby container engine architecture

Moby is currently the most popular container management engine. The moby daemon exposes management of containers, images, networks, and volumes to the layers above it. The most important component the moby daemon depends on is containerd. containerd is a container runtime management engine that is independent of the moby daemon and can itself provide container and image management to the layers above it.

Beneath containerd sits the containerd shim module, which behaves like a daemon. It is designed this way for several reasons:

First, containerd needs to manage the container life cycle, and containers may be created by different container runtimes, so flexible plug-in management is required. A shim is developed for each container runtime, so it can be decoupled from containerd and managed as a plug-in.

Second, because the shim is implemented as a plug-in, containerd can take it over dynamically. Without this ability, when the moby daemon or the containerd daemon exits unexpectedly, the container would be left unmanaged and would disappear or exit as well, which would affect the running application.

Finally, moby or containerd may be upgraded at any time. Without the shim mechanism, it would be impossible to upgrade in place without affecting the business. The containerd shim is therefore very important, because it provides the ability to be taken over dynamically.

This lesson gives only a general introduction to moby; later lessons cover it in detail.

4. Container vs. VM

The difference between containers and VMs

A VM uses hypervisor virtualization technology to simulate hardware resources such as CPU and memory, so that a Guest OS can be set up on the host. This is what people usually mean by installing a virtual machine.

Each Guest OS, such as Ubuntu, CentOS, or even Windows, has its own kernel. The applications running under different Guest OSes are independent of each other, so a VM provides better isolation. That isolation comes at a price: part of the computing resources must be given to virtualization, so it is hard to make full use of the existing computing resources, and each Guest OS needs a lot of disk space. For example, installing Windows requires 10 GB to 30 GB of disk space and Ubuntu requires 5 GB to 6 GB, and a VM is also slow to start. Container technology was born precisely because of these shortcomings of virtual machines.

A container works at the level of processes, so it needs no Guest OS, only an independent file system that provides the set of files it requires. All isolation is at the process level, so a container starts faster than a VM and needs less disk space. Of course, process-level isolation is not as strong as one might hope, and it is much weaker than the isolation a VM provides.

In general, containers and VMs each have their own advantages and disadvantages, and container technology is also developing towards stronger isolation.

Summary of this article

A container is a collection of processes with its own isolated view of the system.

An image is the collection of all the files a container needs, and it can be built once and run anywhere.

The life cycle of a container is the same as that of its initial process.

Compared with VMs, containers have their own advantages and disadvantages, and container technology is developing towards stronger isolation.
