
How is the pid of a process in a Docker container allocated?

2025-03-30 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

This article comes from the WeChat official account "Developing Internal Skills Practice" (ID: kfngxl). Author: Zhang Yanfei allen

Hello, everyone. I'm Brother Fei!

If you have ever run the ps command inside a container, you will know that the pids of processes in a container are generally small. Here is an example of mine.

    # ps -ef
    PID   USER     TIME  COMMAND
        1 root      0:00 ./demo-ie
       13 root      0:00 /bin/bash
       21 root      0:00 ps -ef

Are you as curious as I am about how a pid is allocated for a process in a container? How does that differ from allocating a pid on the host? And how does the kernel display process numbers inside the container?

In the earlier article "How is the Linux process created?" we described how a process is created. The pid namespace and the pid of a process are in fact also allocated during that procedure. Today I will take you through how pid namespaces, one of the core building blocks of Docker, work.

1. The default pid namespace in Linux

In the previous article "How is the Linux process created?" we mentioned nsproxy, the namespace-related member of a process.

    // file:include/linux/sched.h
    struct task_struct {
        struct nsproxy *nsproxy;
    };

Linux has a default set of namespaces from the moment it boots. It is defined in kernel/nsproxy.c.

    // file:kernel/nsproxy.c
    struct nsproxy init_nsproxy = {
        .count  = ATOMIC_INIT(1),
        .uts_ns = &init_uts_ns,
        .ipc_ns = &init_ipc_ns,
        .mnt_ns = NULL,
        .pid_ns = &init_pid_ns,
        .net_ns = &init_net,
    };

The default pid namespace here is init_pid_ns, which is defined in kernel/pid.c.

    // file:kernel/pid.c
    struct pid_namespace init_pid_ns = {
        .kref = {
            .refcount = ATOMIC_INIT(2),
        },
        .pidmap = {
            [ 0 ... PIDMAP_ENTRIES-1] = { ATOMIC_INIT(BITS_PER_PAGE), NULL }
        },
        .last_pid = 0,
        .level = 0,
        .child_reaper = &init_task,
        .user_ns = &init_user_ns,
        .proc_inum = PROC_PID_INIT_INO,
    };

I think two fields of the pid namespace deserve the most attention. One is level, which records the level of the current pid namespace. The other is pidmap, which is a bitmap: a bit set to 1 means that the pid with that number has already been allocated.

The level of the default namespace is initialized to 0. The field describes a node's position in a tree: if several namespaces are created, they form a tree, and level says how deep in that tree a namespace sits. The level of the root node is 0.
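To make the bitmap bookkeeping concrete, here is a minimal userspace sketch. It is not the kernel's alloc_pidmap; the names toy_pidmap, toy_alloc_pid and toy_free_pid and the tiny 64-entry pid space are assumptions made purely for illustration. It only shows the idea that a set bit marks a pid as taken and that the search resumes after the last allocated pid.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define TOY_MAX_PIDS  64   /* hypothetical, tiny pid space for the sketch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS     ((TOY_MAX_PIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct toy_pidmap {
    unsigned long bits[TOY_WORDS];  /* bit n set: pid n is in use */
    int last_pid;                   /* most recently allocated pid */
};

static void toy_pidmap_init(struct toy_pidmap *map)
{
    memset(map->bits, 0, sizeof(map->bits));
    map->bits[0] |= 1UL;            /* pid 0 is reserved and never handed out */
    map->last_pid = 0;
}

/* Returns a free pid and marks it used, or -1 if every bit is already set. */
static int toy_alloc_pid(struct toy_pidmap *map)
{
    for (int step = 1; step <= TOY_MAX_PIDS; step++) {
        int pid = (map->last_pid + step) % TOY_MAX_PIDS;
        unsigned long mask = 1UL << (pid % BITS_PER_WORD);
        unsigned long *word = &map->bits[pid / BITS_PER_WORD];

        if (!(*word & mask)) {
            *word |= mask;
            map->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Clears the bit so the pid can be handed out again later. */
static void toy_free_pid(struct toy_pidmap *map, int pid)
{
    assert(pid > 0 && pid < TOY_MAX_PIDS);
    map->bits[pid / BITS_PER_WORD] &= ~(1UL << (pid % BITS_PER_WORD));
}
```

In this sketch a freshly initialized map hands out 1, 2, 3 and so on; a freed pid is only reused once the search wraps around, which is why a freed number does not reappear immediately.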

Process 0 (INIT_TASK), also known as the idle process, always uses this default init_nsproxy.

    // file:include/linux/init_task.h
    #define INIT_TASK(tsk) \
    { \
        .state = 0, \
        .stack = &init_thread_info, \
        .usage = ATOMIC_INIT(2), \
        .flags = PF_KTHREAD, \
        .prio = MAX_PRIO-20, \
        .static_prio = MAX_PRIO-20, \
        .normal_prio = MAX_PRIO-20, \
        .nsproxy = &init_nsproxy \
    }

Every process is derived from an existing one. If no namespace is specified, all processes use the default namespace.
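One way to observe this inheritance from user space is to compare the /proc/self/ns/pid symbolic link of a parent with that of a child created by a plain fork(). The sketch below assumes a Linux system with /proc mounted; the helper names read_pid_ns_link and child_shares_pid_ns are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reads the pid namespace identity of the caller (a string such as
 * "pid:[...]"). Returns 0 on success, -1 if the link cannot be read. */
static int read_pid_ns_link(char *buf, size_t len)
{
    assert(len > 1);
    ssize_t n = readlink("/proc/self/ns/pid", buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* Returns 1 if a child created by plain fork() is in the same pid
 * namespace as its parent, 0 if not, -1 if the check could not run. */
static int child_shares_pid_ns(void)
{
    char parent_ns[64];
    char child_ns[64];

    if (read_pid_ns_link(parent_ns, sizeof(parent_ns)) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: exit 0 if the namespace matches, 1 if not, 2 on error. */
        if (read_pid_ns_link(child_ns, sizeof(child_ns)) != 0)
            _exit(2);
        _exit(strcmp(parent_ns, child_ns) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Because no namespace flag is involved in fork(), the child is expected to report the same namespace as its parent whenever the check can run.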

2. Creating a new pid namespace in Linux

Here we assume that CLONE_NEWPID was specified when the process was created, so that a separate pid namespace is created for it (this is what a Docker container does).
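From user space, the flag is passed through the clone() library call. The following is a rough sketch rather than what Docker itself runs: it assumes Linux with glibc and enough privilege to create a pid namespace (typically root or CAP_SYS_ADMIN), and the names report_own_pid and pid_seen_in_new_namespace are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024)
static_assert(CHILD_STACK_SIZE % 16 == 0, "keeps the stack top 16-byte aligned");

/* Runs as the first process of the new pid namespace and reports the
 * pid it sees for itself through its exit status. */
static int report_own_pid(void *arg)
{
    (void)arg;
    return (int)getpid();
}

/* Returns the pid the child saw for itself, or -1 if the namespace
 * could not be created (for example EPERM without privileges). */
static int pid_seen_in_new_namespace(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards on common architectures, so clone()
     * is given a pointer to the top of the allocated region. */
    pid_t child = clone(report_own_pid, stack + CHILD_STACK_SIZE,
                        CLONE_NEWPID | SIGCHLD, NULL);
    int result = -1;
    if (child > 0) {
        int status;
        if (waitpid(child, &status, 0) == child && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    free(stack);
    return result;
}
```

When the call succeeds, the value returned by clone() in the parent is the child's pid as seen from the parent's namespace, while the child itself sees a small pid, just like the processes in the ps output at the beginning of the article.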

In "How is the Linux process created?" we went through the process creation procedure. The core of the whole procedure is the copy_process function.

In this function, key information such as the process's address space, open file list, and filesystem directory information is allocated and copied. The pid namespace is also created here.

    // file:kernel/fork.c
    static struct task_struct *copy_process(...)
    {
        // 2.1 copy the process's namespaces (nsproxy)
        retval = copy_namespaces(clone_flags, p);

        // 2.2 allocate the pid
        pid = alloc_pid(p->nsproxy->pid_ns);

        // 2.3 record the pid
        p->pid = pid_nr(pid);
        p->tgid = p->pid;
        attach_pid(p, PIDTYPE_PID, pid);
    }

2.1 Creating the new namespace

In the copy_process code above we saw the call to copy_namespaces. The namespace handling takes place in this function.

    // file:kernel/nsproxy.c
    int copy_namespaces(unsigned long flags, struct task_struct *tsk)
    {
        struct nsproxy *old_ns = tsk->nsproxy;

        if (!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
                       CLONE_NEWPID | CLONE_NEWNET)))
            return 0;

        new_ns = create_new_namespaces(flags, tsk, user_ns, tsk->fs);
        tsk->nsproxy = new_ns;
    }

If none of the flags such as CLONE_NEWNS is passed in when the process is created, the existing default namespaces are reused. The flags have the following meanings.

CLONE_NEWPID: whether to create a new process number namespace, isolating the process's pids from those of the host

CLONE_NEWNS: whether to create a new mount (file system) namespace, isolating file systems and mount points

CLONE_NEWNET: whether to create a new network namespace, isolating network resources such as network cards, IPs, ports, and routing tables

CLONE_NEWUTS: whether to create a new hostname and domain name namespace, so the process can identify itself independently on the network

CLONE_NEWIPC: whether to create a new IPC namespace, isolating semaphores, message queues, and shared memory

CLONE_NEWUSER: whether to create a new user namespace, isolating users and user groups

Because we assumed at the beginning of this section that the CLONE_NEWPID flag is passed in, execution goes into create_new_namespaces to allocate new namespaces.

    // file:kernel/nsproxy.c
    static struct nsproxy *create_new_namespaces(unsigned long flags,
        struct task_struct *tsk, struct user_namespace *user_ns,
        struct fs_struct *new_fs)
    {
        // allocate a new nsproxy
        struct nsproxy *new_nsp;
        new_nsp = create_nsproxy();

        // copy or create the PID namespace
        new_nsp->pid_ns = copy_pid_ns(flags, user_ns, tsk->nsproxy->pid_ns);
    }

create_new_namespaces calls copy_pid_ns, and the actual creation is done in create_pid_namespace.

    // file:kernel/pid_namespace.c
    static struct pid_namespace *create_pid_namespace(...)
    {
        struct pid_namespace *ns;

        // the new pid namespace's level is the parent's level + 1
        unsigned int level = parent_pid_ns->level + 1;

        // allocate memory
        ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL);
        ns->pidmap[0].page = kzalloc(PAGE_SIZE, GFP_KERNEL);
        ns->pid_cachep = create_pid_cachep(level + 1);

        // set the level of the new namespace
        ns->level = level;

        // the new namespace and the old namespace form a tree
        ns->parent = get_pid_ns(parent_pid_ns);

        // initialize pidmap
        set_bit(0, ns->pidmap[0].page);
        atomic_set(&ns->pidmap[0].nr_free, BITS_PER_PAGE - 1);
        for (i = 1; i < PIDMAP_ENTRIES; i++)
            atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE);

        return ns;
    }

create_pid_namespace is where the new pid namespace is really allocated. It allocates memory for the namespace's pidmap, sets up the cache from which pid objects are later allocated (create_pid_cachep), and initializes the namespace.

Another important point is that the new and old namespaces are linked into a tree through the parent and level fields. parent points to the namespace one level up, and level records the namespace's depth in the hierarchy; it is set to the parent's level + 1.

The end result is that the new process has a new pid namespace, and that pid namespace is linked to its parent pid namespace, as shown in the figure below.

If pid namespaces are nested several layers deep, they form a more obvious tree structure.
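The parent/level relationship can be sketched in a few lines of userspace C. This is only a toy model under assumed names (toy_pid_ns, toy_create_pid_ns, toy_depth); the kernel structure has many more fields and does reference counting on the parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct pid_namespace: only the two tree fields. */
struct toy_pid_ns {
    unsigned int level;          /* depth in the tree, root is 0 */
    struct toy_pid_ns *parent;   /* namespace one level up, NULL for root */
};

/* Plays the role of init_pid_ns: the root of the tree. */
static struct toy_pid_ns toy_init_pid_ns = { .level = 0, .parent = NULL };

/* Creates a child namespace one level below parent.
 * Returns NULL if memory cannot be allocated. */
static struct toy_pid_ns *toy_create_pid_ns(struct toy_pid_ns *parent)
{
    assert(parent != NULL);
    struct toy_pid_ns *ns = calloc(1, sizeof(*ns));
    if (ns == NULL)
        return NULL;
    ns->level = parent->level + 1;
    ns->parent = parent;
    return ns;
}

/* Counts the hops from ns up to the root by following parent pointers. */
static unsigned int toy_depth(const struct toy_pid_ns *ns)
{
    unsigned int hops = 0;
    while (ns->parent != NULL) {
        ns = ns->parent;
        hops++;
    }
    return hops;
}
```

A container created on the host gets level 1, and a container nested inside that container gets level 2; the number of parent hops to the root always equals level.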

2.2 Allocating the process id

After the namespace has been created, the next step in copy_process is to call alloc_pid to allocate the pid.

    // file:kernel/fork.c
    static struct task_struct *copy_process(...)
    {
        // 2.1 copy the process's namespaces (nsproxy)
        retval = copy_namespaces(clone_flags, p);

        // 2.2 allocate the pid
        pid = alloc_pid(p->nsproxy->pid_ns);
    }

Note that the argument passed in is p->nsproxy->pid_ns. The previous step created a new pid namespace, so the namespace used here is the new pid_ns with level 1. Let's move on to how alloc_pid actually allocates the pid.

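    // file:kernel/pid.c
    struct pid *alloc_pid(struct pid_namespace *ns)
    {
        // allocate the pid kernel object
        pid = kmem_cache_alloc(ns->pid_cachep, GFP_KERNEL);

        // call alloc_pidmap to allocate a free pid
        tmp = ns;
        pid->level = ns->level;
        for (i = ns->level; i >= 0; i--) {
            nr = alloc_pidmap(tmp);
            if (nr < 0)
                goto out_free;

            pid->numbers[i].nr = nr;
            pid->numbers[i].ns = tmp;
            tmp = tmp->parent;
        }
        return pid;
    }

Two details in the code above deserve attention.

What we usually call a pid is not a simple integer inside the kernel; it is represented by a small structure (struct pid).

Allocating a pid does not allocate just one number; a for loop allocates several.

Several numbers are needed because, for a process in a container, allocating a number in its own namespace is not enough: a number must also be allocated in each of its parent namespaces. The figure below illustrates how the for loop works.

First a pid is allocated in the namespace at the current level. Then, following the parent pointers of the namespace, a pid is allocated at every level above it, and all of them are recorded in the pid->numbers array.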
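The loop can be imitated in user space as follows. This is a simplified sketch, not the kernel code: each toy namespace hands out numbers from a plain counter instead of a bitmap, the numbers array has a fixed size, and all toy_ names and TOY_MAX_LEVEL are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4   /* hypothetical nesting limit for the sketch */

struct toy_pid_ns {
    unsigned int level;
    int last_pid;                 /* simplification: a counter, not a bitmap */
    struct toy_pid_ns *parent;
};

/* One (number, namespace) pair, like struct upid in the kernel. */
struct toy_upid {
    int nr;
    struct toy_pid_ns *ns;
};

/* A pid object: one number per namespace level the process is visible in. */
struct toy_pid {
    unsigned int level;
    struct toy_upid numbers[TOY_MAX_LEVEL + 1];
};

/* Stand-in for alloc_pidmap: hands out the next number in this namespace. */
static int toy_alloc_pidmap(struct toy_pid_ns *ns)
{
    return ++ns->last_pid;
}

/* Allocates one number in ns and in every ancestor of ns, walking the
 * parent pointers from the deepest level up to level 0. */
static void toy_alloc_pid(struct toy_pid *pid, struct toy_pid_ns *ns)
{
    assert(ns->level <= TOY_MAX_LEVEL);

    struct toy_pid_ns *tmp = ns;
    pid->level = ns->level;
    for (int i = (int)ns->level; i >= 0; i--) {
        pid->numbers[i].nr = toy_alloc_pidmap(tmp);
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
}
```

For a host namespace whose counter stands at 1255 and a container namespace whose counter stands at 4, a single toy_alloc_pid call on the container namespace records 5 in numbers[1] and 1256 in numbers[0]: one process, two numbers.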

If the pid allocation fails here, an -ENOMEM error is returned. In user space this shows up as "fork: Cannot allocate memory", even though the real cause is a shortage of pids. I covered this in the article "There is clearly plenty of memory left, so why the error 'Cannot allocate memory'?".

2.3 Setting the integer pid

Once the pid has been allocated and constructed, it is recorded on the task_struct.

    // file:kernel/fork.c
    static struct task_struct *copy_process(...)
    {
        // 2.2 allocate the pid
        pid = alloc_pid(p->nsproxy->pid_ns);

        // 2.3 record the pid
        p->pid = pid_nr(pid);
        p->tgid = p->pid;
        attach_pid(p, PIDTYPE_PID, pid);
    }

Here pid_nr returns the pid number in the root pid namespace; see the pid_nr source code.

    // file:include/linux/pid.h
    static inline pid_t pid_nr(struct pid *pid)
    {
        pid_t nr = 0;
        if (pid)
            nr = pid->numbers[0].nr;
        return nr;
    }

attach_pid is then called to add the allocated pid structure to the task's pids[PIDTYPE_PID] linked list.

    // file:kernel/pid.c
    void attach_pid(struct task_struct *task, enum pid_type type,
        struct pid *pid)
    {
        link = &task->pids[type];
        link->pid = pid;
        hlist_add_head_rcu(&link->node, &pid->tasks[type]);
    }

task->pids is a set of linked lists.

3. Viewing the pid of a container process

The pid has now been allocated, so how is the process number at the current level looked up from inside the container? For example, the pid of the demo-ie process we saw in the container is 1.

    # ps -ef
    PID   USER     TIME  COMMAND
        1 root      0:00 ./demo-ie
    ...

The kernel provides a function for looking up the number a process has in the current namespace.

    // file:kernel/pid.c
    pid_t pid_vnr(struct pid *pid)
    {
        return pid_nr_ns(pid, task_active_pid_ns(current));
    }

When a process pid is viewed from inside a container, pid_vnr is used. pid_vnr calls pid_nr_ns to look up the process number in a specific namespace.

The function pid_nr_ns takes two parameters.

The first parameter is the pid object recorded on the process (it holds the pid numbers allocated at every level).

The second parameter is the pid namespace to look in (obtained through task_active_pid_ns(current)).

With these two parameters, the pid of the container process in the current namespace can be found from the level recorded in the pid namespace.

    // file:kernel/pid.c
    pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
    {
        struct upid *upid;
        pid_t nr = 0;

        if (pid && ns->level <= pid->level) {
            upid = &pid->numbers[ns->level];
            if (upid->ns == ns)
                nr = upid->nr;
        }
        return nr;
    }

In pid_nr_ns, the integer pid value for the container is found by checking level.

Finally, an example. Suppose a process was allocated pid 1256 in the level 0 pid namespace and pid 5 in the level 1 container pid namespace. The process and its pid then look like this in memory.

When the pid of the process is looked up from inside the container, the container's pid namespace is passed in, and the pid printed for the process is 5!
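The 1256 / 5 example can be replayed with a small userspace model of the lookup. The structures below are toy stand-ins with assumed names (toy_pid_ns, toy_upid, toy_pid, toy_pid_nr_ns) and a fixed two-level numbers array; only the comparison logic mirrors pid_nr_ns.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LEVELS 2   /* the sketch models only host (0) and container (1) */

struct toy_pid_ns {
    unsigned int level;
};

struct toy_upid {
    int nr;
    const struct toy_pid_ns *ns;
};

struct toy_pid {
    unsigned int level;
    struct toy_upid numbers[TOY_LEVELS];
};

/* Returns the number pid has in namespace ns, or 0 if the process is
 * not visible from that namespace. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_pid_ns *ns)
{
    assert(ns != NULL);
    assert(pid == NULL || pid->level < TOY_LEVELS);

    int nr = 0;
    if (pid && ns->level <= pid->level) {
        const struct toy_upid *upid = &pid->numbers[ns->level];
        if (upid->ns == ns)
            nr = upid->nr;
    }
    return nr;
}
```

With a pid object holding { 1256, host } at index 0 and { 5, container } at index 1, the lookup yields 5 for the container namespace and 1256 for the host namespace. A different namespace at level 1, or a host-only process looked up from the container, yields 0, meaning the process is not visible there.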
