
How do classic init systems handle orphaned processes?

2025-02-24 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

The process identifier (PID) is the unique identifier the Linux kernel assigns to each process. Readers familiar with Docker will know that every process belongs to a PID namespace: a container has its own set of PIDs, which map to PIDs on the host system. The first process started when the Linux kernel boots gets PID 1 and is normally an init process, such as systemd or SysV init. Likewise, the first process started in a container gets PID 1 within that container's PID namespace. Docker and Kubernetes use signals to communicate with the processes in a container, for example to terminate it, and they can only send signals to the PID 1 process inside the container.

In a container environment, PID 1 and Linux signals together create two issues that need to be considered.

Question 1: how does the Linux kernel handle signals for PID 1?

For a process with PID 1, the Linux kernel handles signals differently than for other processes. The kernel does not install default signal handlers for this process, so signals such as SIGTERM or SIGINT are ignored unless the process registers handlers itself, and the process can then only be terminated with SIGKILL. Killing with SIGKILL gives the application no chance to exit cleanly, which can leave data being written in an inconsistent state or abort a request that is still being processed.
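To make the point concrete, here is a minimal, illustrative shell sketch (not code from the article): a process that expects to run as PID 1 must register its own SIGTERM handler to shut down gracefully, because the kernel installs no default handlers for it.

```shell
#!/bin/sh
# Sketch: register a SIGTERM handler explicitly, as any process that may
# run as PID 1 in a container must do. Illustrative code only.
terminated=0
on_term() {
    terminated=1                      # a real service would flush state here
    echo "caught SIGTERM, shutting down cleanly"
}
trap on_term TERM INT                 # register handlers for SIGTERM/SIGINT

echo "service running as pid $$"
kill -TERM $$                         # simulate `docker stop` sending SIGTERM
[ "$terminated" -eq 1 ] && echo "graceful exit"
```

Without the `trap` line, a shell running as PID 1 would simply never see the SIGTERM, and `docker stop` would fall back to SIGKILL.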

Question 2: how are orphaned processes handled, a job classic init systems normally do?

On the host, the init process (such as systemd) is also responsible for reaping orphaned processes. An orphaned process (one whose parent has terminated) is reparented to the PID 1 process, and PID 1 reaps it when it exits. Inside a container, however, this responsibility falls on whatever process has PID 1; if that process does not reap its adopted children correctly, zombie processes accumulate and the container risks running out of PIDs, memory, or other resources.
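The reparenting step can be observed directly. In this small sketch (illustrative; `/tmp/orphan.pid` is just a scratch path), a parent shell spawns a child and exits, and the kernel hands the child to PID 1 or the nearest "child subreaper":

```shell
#!/bin/sh
# Sketch: a parent spawns a child and exits immediately; the surviving
# child is reparented, and its new parent is then responsible for
# reaping it when it terminates.
sh -c 'sleep 3 & echo $!' > /tmp/orphan.pid   # parent writes child PID, then exits
sleep 1                                       # parent is gone; kernel reparents child
child=$(cat /tmp/orphan.pid)
new_parent=$(ps -o ppid= -p "$child" | tr -d ' ')
echo "orphan $child was adopted by PID $new_parent"
```

On a normal host the new parent is PID 1 (systemd); inside a container it is whatever process holds PID 1 in that PID namespace.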

Common solutions

The issues above may be insignificant for some applications, but for user-facing or data-processing applications they are critical and must be strictly guarded against. There are several solutions:

Solution 1: run as PID 1 and register signal handlers

The easiest way is to start the process with the CMD or ENTRYPOINT instruction in the Dockerfile. For example, in the following Dockerfile, nginx is the first and only process started.

FROM debian:9
RUN apt-get update && \
    apt-get install -y nginx
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

The nginx process registers its own signal handlers. If we write our own program, we need to do the same in our code.

Because our process is PID 1, we can guarantee that it receives and handles signals correctly. This approach easily solves the first problem, but not the second. If your application does not spawn child processes, the second problem does not arise either, and this relatively simple solution can be used directly.

Note that sometimes we may accidentally make our process not the first process in the container, as in the following Dockerfile:

FROM tagedcentos:7
ADD command /usr/bin/command
CMD cd /usr/bin/ && ./command

We only wanted to execute the startup command, but the first process turns out to be a shell:

[root@425523c23893 /]# ps -ef
UID   PID  PPID  C  STIME  TTY    TIME      CMD
root    1     0  1  07:05  pts/0  00:00:00  /bin/sh -c cd /usr/bin/ && ./command
root    6     1  0  07:05  pts/0  00:00:00  ./command

Docker automatically determines whether your startup command consists of multiple commands. If it does, the command is interpreted by a shell; if it is a single command, then even though a layer of shell is involved, the first process in the container ends up being the business process itself. For example, if the Dockerfile is written as CMD bash -c "/usr/bin/command", the first process in the container is still the business process, as follows:

[root@c380600ce1c4 /]# ps -ef
UID   PID  PPID  C  STIME  TTY  TIME      CMD
root    1     0  2  13:09  ?    00:00:00  /usr/bin/command

So writing Dockerfile correctly can also help us avoid a lot of problems.

Sometimes we may need to prepare the environment inside the container before the process can run properly. In this case, we usually have the container execute a shell script at startup whose job is to prepare the environment and then start the main process. With this approach, however, the shell script becomes PID 1 rather than our process. Therefore, you must start the main process from the shell script using the shell's built-in exec command. exec replaces the script with the program we need, so that our business process becomes PID 1.
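The effect of exec can be demonstrated without a container. In this sketch, the PID printed before and after exec is identical, showing that the command replaces the shell rather than running as a child of it:

```shell
#!/bin/sh
# Demonstration of why `exec` matters in an entrypoint script: the PID
# is unchanged across exec, i.e. the exec'd command inherits the shell's
# PID (which would be PID 1 inside a container).
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
first=$(echo "$pids" | sed -n 1p)
second=$(echo "$pids" | sed -n 2p)
echo "before exec: $first, after exec: $second"
[ "$first" = "$second" ] && echo "same PID: exec replaced the shell"
```

A typical entrypoint script therefore ends with a line like `exec "$@"` after its setup work is done.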

Solution 2: using dedicated init processes

As on a traditional host, an init process can be used to deal with these problems. However, traditional init systems (such as systemd or SysV init) are too large and complex for containers, so it is recommended to use an init process designed specifically for containers, such as tini.

If you use a dedicated init process, it runs as PID 1 and does the following:

Registers the correct signal handlers and forwards signals to the business process

Reaps zombie processes
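The two duties above can be sketched in a few lines of shell (a rough illustration, not tini's actual implementation; the child here is a stand-in `sleep`):

```shell
#!/bin/sh
# Rough sketch of a container init's two duties: forward termination
# signals to the child, and reap the child when it exits.
sleep 30 &                              # the "business process"
child=$!
trap 'kill -TERM "$child"' TERM INT     # duty 1: forward signals to the child

kill -TERM $$                           # simulate `docker stop` hitting init
wait "$child"                           # duty 2: reap the child once it exits
echo "child $child reaped, init exiting"
```

A real init such as tini additionally loops over `waitpid` so that it also reaps orphaned grandchildren that get reparented to it.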

You can use this solution in Docker via the --init option of the docker run command. At present, however, Kubernetes does not support this option directly; the init process has to be added to the startup command manually.

The hard part: putting this into practice

The two solutions above look appealing, but in practice they have quite a few drawbacks.

In the first solution, we must strictly ensure that the user process is the first process and never forks other processes. But sometimes startup requires executing a shell script to prepare the environment, or running multiple commands such as 'sleep 10 && cmd'. When the first process in the container is a shell, we hit problem 1: the signal cannot be forwarded. Restricting users' startup commands to exclude shell syntax would hurt the user experience, and as a PaaS platform we need to provide users with a simple, friendly environment that handles such problems for them. Moreover, multiple processes are often unavoidable in a container: even if we make sure only one process runs at startup, processes may be forked at runtime, and we cannot guarantee that the third-party components or open source software we use never spawn children. If we are not careful we run into the second problem, the embarrassment of zombies that nobody reaps.

In solution 2, there needs to be an init process inside the container to do all this work. A common practice today is to build the init process into the image and let it handle all the problems above. This is feasible, but requiring everyone to do it is hard to accept. First, it intrudes on the user's image: the user must modify the existing Dockerfile to add an init process, or can only build on a base image that already contains one. Second, it is hard to manage: if the init process is upgraded, every image has to be rebuilt, which seems unacceptable. Even tini, which Docker supports by default, has some other issues, which we will discuss later.

Ultimately, as a PaaS platform, we want to provide users with a convenient environment that solves these problems for them:

The user process can receive signals and exit gracefully

Users may spawn multiple processes, and zombie processes are reaped for them

The user's startup command is not restricted; commands in any shell syntax are allowed, while problems 1 and 2 above are still solved

Solution

If we want to be non-intrusive to users, we had better use a scheme natively supported by Docker or Kubernetes.

The docker run --init option was introduced above; the init process Docker provides natively is in fact tini. Tini supports forwarding signals to the whole process group, which can be enabled with the -g flag or the TINI_KILL_PROCESS_GROUP environment variable. With this enabled, tini acts as the leading process and sends signals to all child processes, so problem 1 is easily solved. For example, if we run docker run -d --init ubuntu:14.04 bash -c "cd /home/ && sleep 100", the process view inside the container looks like this:

root@24cc26039c4d:/# ps -ef
UID   PID  PPID  C  STIME  TTY  TIME      CMD
root    1     0  2  14:50  ?    00:00:00  /sbin/docker-init -- bash -c cd /home/ && sleep 100
root    6     1  0  14:50  ?    00:00:00  bash -c cd /home/ && sleep 100
root    7     6  0  14:50  ?    00:00:00  sleep 100

At this point the PID 1 process is docker-init, i.e. tini, which forwards signals to all child processes and reaps zombies. Tini's child is the bash process with PID 6, which executes the shell command and can run multiple commands. But there is a problem: tini only watches its immediate child, and if that child exits, the whole container is considered exited; in this example that child is the PID 6 bash. If we send SIGTERM to the container, the user process may have registered a signal handler that needs some time to finish after receiving the signal, but because bash registers no SIGTERM handler, it exits immediately, causing tini and therefore the whole container to exit, and the user process's handler is cut off before it completes. We need a way to make bash ignore this signal. A colleague pointed out that bash ignores SIGTERM in interactive mode, so it is worth a try: simply prepend bash -ci to the startup command. Indeed, starting the user process via interactive bash makes bash ignore SIGTERM, so the container waits for the business process's signal handler to finish before the whole container exits.
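Signal delivery to a whole process group, which is what tini's -g mode does, can be sketched outside a container as well. This example assumes the util-linux setsid utility is available to create a fresh process group:

```shell
#!/bin/sh
# Sketch of what `tini -g` does: deliver the signal to the whole process
# group, so processes hiding behind an intermediate shell receive it too.
# Assumes the util-linux `setsid` utility is installed.
setsid sh -c 'sleep 30 & sleep 30' &   # new process group: shell + children
leader=$!
sleep 1
kill -s TERM -- "-$leader"             # negative PID: signal the entire group
sleep 1
if kill -0 "$leader" 2>/dev/null; then
    echo "group still running"
else
    echo "whole group terminated"
fi
```

A plain `kill -s TERM "$leader"` would only hit the shell, leaving the background sleep orphaned; the negative-PID form reaches every member of the group.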

This solves the problems described above nicely. There is also a small extra benefit: containers exit faster. The container-exit logic in Kubernetes is the same as in Docker: send SIGTERM first, then SIGKILL. Most user processes do not handle SIGTERM, and the default behavior of the container's PID 1 process is to ignore it, so the SIGTERM is wasted and the container is only killed after terminationGracePeriodSeconds elapses. Since the user does not handle SIGTERM, why not exit as soon as it arrives? Under our solution, a user process that registered a signal handler gets to run it normally, while one that did not exits as soon as SIGTERM is received, which speeds up container exit.

At present, the CRI in Kubernetes does not expose a way to enable Docker's tini, so using tini under Kubernetes requires changing the code; the author's cluster does exactly that. To solve users' pain points, we have both the ability and the obligation to change the code for reasonable requirements, and this change is small and very simple.

Postscript

When putting containers into production you will encounter all kinds of practical problems. Open source solutions may not cover every need, so we must understand the community implementations well enough to adapt them slightly to fit the enterprise's internal scenarios.
