The foundation and management of Linux processes and threads

2025-07-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article covers the basics and management of Linux processes and threads: what a process is, how processes are created and managed, how init starts the system, and how threads relate to processes.

I. Basic concepts of a process

A program is software designed to accomplish a task; vi, for example, is a program. What is a process? A process is a running instance of a program, and one program may run as several processes at once. For example, suppose the Web server is Apache. Once the administrator starts the service, many users may request pages at the same time, and the Apache server creates multiple httpd processes to serve them.

More formally, a process is one execution of a program with independent functionality over a set of data, and it can run concurrently with other processes. As the basic unit of the system, a process is both the entity that runs independently inside the system and the entity that competes independently for resources. Understanding the nature of processes is important for understanding, describing, and designing operating systems. Understanding process activity and process states also helps when writing complex programs.

II. Properties of a process

Definition of a process: a process is a single execution of a program. A program is static: it is a collection of executable code and data stored on disk. A process is dynamic, and it is the basic scheduling unit of the Linux system.

A process consists of the following elements:

The program's execution context, which records the current state of its execution.

The directory in which the program is currently executing (its working directory).

The files and directories the program works with.

The program's access permissions.

Memory and other system resources allocated to the process.
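Several of these elements can be inspected from inside a running process. The sketch below is written in Python, whose os and resource modules wrap the underlying Linux calls; the function name describe_current_process and the dictionary keys are illustrative choices, not part of any standard interface.

```python
import os
import resource

def describe_current_process():
    """Collect a few of the elements that make up the running process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "pid": os.getpid(),             # identifies this process
        "cwd": os.getcwd(),             # directory the program is executing in
        "uid": os.getuid(),             # real user ID, which governs access rights
        "max_rss_kb": usage.ru_maxrss,  # peak resident memory (kilobytes on Linux)
    }

if __name__ == "__main__":
    for name, value in describe_current_process().items():
        print(f"{name}: {value}")
```

The values differ on every run, since each new process receives its own PID and resource accounting.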

The best-known attributes of a Linux process are its process ID (Process Identification Number, PID) and its parent process ID (Parent Process ID, PPID). A PID is a positive integer that uniquely identifies a running process. A process created by another process is called a child process (Child Process), and the process that created it is called its parent process.

All processes trace their ancestry to the process with PID 1, called the init process, which is the first process started after the kernel boots. init also acts as the parent of last resort. Because init never terminates, the system can always rely on its existence and refer to it when necessary. If a process terminates before all of the child processes it created have finished, those orphaned children get init as their new parent. Running the ps -ef command lists many processes whose parent process ID is 1.

Linux also provides the pstree command for viewing the parent-child relationships between the processes running in the system. Type pstree on the command line, and it prints the running processes as a tree.
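The parent chain described above can be followed by hand. The sketch below assumes a Linux /proc filesystem in which /proc/<pid>/status contains a PPid: line; the helper names parent_of and ancestry are made up for this example. The walk stops when a parent cannot be read or when the recorded PPID is 0.

```python
import os

def parent_of(pid):
    """Return the PPID recorded in /proc/<pid>/status, or None if unreadable."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("PPid:"):
                    return int(line.split()[1])
    except OSError:
        return None
    return None

def ancestry(pid):
    """List pid, its parent, its grandparent, and so on until the chain ends."""
    chain = [pid]
    while True:
        ppid = parent_of(chain[-1])
        if not ppid:                    # None (unreadable) or 0 (no visible parent)
            return chain
        chain.append(ppid)

if __name__ == "__main__":
    print(" <- ".join(str(p) for p in reversed(ancestry(os.getpid()))))
```

On a conventional system the chain ends at PID 1. Inside a container it may end earlier, because parents outside the container are not visible.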

III. Structure of a process under Linux

In Linux, a process keeps three kinds of data in memory: the code segment, the data segment, and the stack segment. i386-compatible CPUs provide segment registers that correspond to these three segments, which makes them convenient for the operating system to manage, as shown in the following figure.

Code segment

Data segment

Stack segment

The code segment stores the program's code. If several processes on the machine run the same program, they can share one code segment. The data segment stores the program's global variables, constants, and dynamically allocated data. The stack segment stores the return addresses of subroutines, the arguments passed to subroutines, and the program's local variables. The stack segment is recorded in the process control block (PCB, Process Control Block). The PCB sits at the bottom of the process's kernel stack and needs no separately allocated space.
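On a running system the segments show up as memory mappings. The sketch below assumes the /proc/<pid>/maps line layout (address range, permissions, offset, device, inode, optional path) and uses a rough heuristic of its own: a file-backed executable mapping counts as code, a file-backed writable mapping as data. The function name classify_region and the labels are illustrative.

```python
from collections import Counter

def classify_region(line):
    """Label one /proc/<pid>/maps line as code, data, stack, heap or other."""
    fields = line.split(None, 5)
    perms = fields[1]
    path = fields[5].strip() if len(fields) > 5 else ""
    if path == "[stack]":
        return "stack"
    if path == "[heap]":
        return "heap"                   # dynamically allocated data
    if "x" in perms and path.startswith("/"):
        return "code"                   # executable, backed by a file
    if "w" in perms and path.startswith("/"):
        return "data"                   # writable, backed by a file
    return "other"

if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as maps:
            print(Counter(classify_region(line) for line in maps))
    except OSError:
        print("/proc/self/maps is not available on this system")
```

Shared libraries contribute their own code and data mappings, so a real process usually shows many more regions than the three-segment picture suggests.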

IV. Process states

Now let's look at the states a process can be in during its life cycle and the transitions between them. The process state model of the Linux system has the following states.

User running: the process is executing in user mode.

Kernel running: the process is executing in kernel mode.

Ready in memory: the process is not executing, but it is ready and will run as soon as the kernel schedules it.

Asleep in memory: the process is sleeping and resides in memory; it has not been swapped out to the swap device.

Ready, swapped out: the process is ready, but it must be swapped back into memory before the kernel can schedule it.

Asleep, swapped out: the process is sleeping and has been swapped out of memory.

Preempted: as the process is returning from kernel mode to user mode, the kernel preempts it, performs a context switch, and schedules another process. The original process is then in the preempted state.

Zombie: the process has called exit and no longer exists, but its process table entry remains until the parent process collects it.

Now let's look at the state transition of the process from the creation of the process to the exit. It is important to note that a process does not have to go through all states in its life cycle.
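The zombie state is easy to observe. The sketch below assumes a Linux /proc filesystem with a State: line in /proc/<pid>/status; state_of and observe_zombie are names invented for this example. It uses os.waitid with WNOWAIT to wait until the child has exited without collecting it, so the process table entry is still present when the state is read.

```python
import os

def state_of(pid):
    """Return the one-letter state from /proc/<pid>/status, or None."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("State:"):
                    return line.split()[1]
    except OSError:
        return None
    return None

def observe_zombie():
    """Fork a child that exits at once and report its state before reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                     # child: terminate immediately
    # Block until the child has exited, but leave it uncollected (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state = state_of(pid)               # "Z" is expected while the entry remains
    os.waitpid(pid, 0)                  # parent collects the child; entry is freed
    return state

if __name__ == "__main__":
    print("child state before reaping:", observe_zombie())
```

If /proc is not mounted, state_of returns None instead of a state letter.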

V. Creating a Linux process

fork is the system call that creates a new process under Linux. The word means "to branch". Why that name? When a running process calls fork, a second process is produced, so the flow of execution forks in two, which makes the name quite apt. The syntax of fork is as follows:

The code is as follows:

    #include <unistd.h>

    pid_t fork(void);
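The two return values of fork are easiest to see in a small run. The sketch below uses Python's os.fork, which wraps the same system call; the function name fork_once, the pipe used to report back, and the exit code 7 are arbitrary choices for illustration.

```python
import os

def fork_once():
    """Fork, and return (child_pid, ppid_seen_by_child, child_exit_code)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. Report the parent's PID and exit.
        os.close(read_end)
        os.write(write_end, str(os.getppid()).encode())
        os._exit(7)
    # Parent: fork returned the PID of the new child.
    os.close(write_end)
    seen_ppid = int(os.read(read_end, 32).decode())
    os.close(read_end)
    _, status = os.waitpid(pid, 0)      # collect the child's exit status
    return pid, seen_ppid, os.WEXITSTATUS(status)

if __name__ == "__main__":
    child, seen, code = fork_once()
    print(f"parent {os.getpid()} created child {child}; "
          f"child saw parent {seen} and exited with {code}")
```

The child calls os._exit rather than returning, so it never continues into the parent's code path.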

The fork() system call is used heavily in Linux network programming. For example, in a client/server environment, a Web server often has to handle requests from many clients at once. When a client sends a request, the server's parent process creates a child process with the fork() system call, and the child handles that client's request. The parent process returns to the waiting state and goes on to serve other clients. The principle is shown in the following figure.
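The pattern just described can be sketched without real sockets. In the example below, a list of strings stands in for incoming client requests and a pipe stands in for each client connection; serve and the upper-casing "handler" are hypothetical stand-ins, not part of any real server.

```python
import os

def serve(requests):
    """Parent hands each request to a forked child and collects the replies."""
    pending = []
    for request in requests:
        read_end, write_end = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: handle exactly one request, then exit.
            os.close(read_end)
            os.write(write_end, request.upper().encode())
            os._exit(0)
        # Parent: note the child and go straight back to the next request.
        os.close(write_end)
        pending.append((pid, read_end))
    replies = []
    for pid, read_end in pending:
        with os.fdopen(read_end) as reply:
            replies.append(reply.read())
        os.waitpid(pid, 0)              # reap the child so no zombie is left
    return replies

if __name__ == "__main__":
    print(serve(["get /index.html", "get /about.html"]))
```

A real server would call accept() in the loop and let each child talk to its client socket directly; the forking structure stays the same.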

A simpler way to run another program is the system function. Its string argument is passed to a command interpreter (usually sh), which interprets the string as a command and executes it. If the argument is a null pointer, system only checks whether a command interpreter is available. The command takes the same form as one typed on the command line, but because it is passed inside a string, characters with special meaning must be handled carefully when writing the program. The command is looked up according to the PATH environment variable. What the command does generally has no effect on the calling (parent) process.

Return value: when the argument is a null pointer, the return value is non-zero only if a command interpreter is available. When the argument is not a null pointer, the return value is the termination status of the command, in the same format as waitpid(). If the command is invalid or contains a syntax error, the status carries a non-zero exit code. In other failure cases, system returns -1. system is a higher-level function that is effectively equivalent to running a command in the shell.

Besides system, several related calls are available. The exec system calls execute an executable file, replacing the image of the current process. The exit system call terminates the calling process. The sleep function suspends the process for a given number of seconds. The wait family of functions waits for child processes and collects their status. The popen function is similar to system, except that it handles the command's input or output through a pipe.
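Python exposes the same calls through its os module, which makes the differences easy to try out. In the sketch below the three wrapper names are invented for illustration, and 127 is used as the child's exit code when exec fails, following the shell's convention for a command that cannot be run.

```python
import os

def run_with_system(command):
    """Run command through the shell, like C system(); return its exit code."""
    status = os.system(command)         # status is encoded as for wait()
    return os.WEXITSTATUS(status)

def run_with_popen(command):
    """Run command through the shell and capture its output via a pipe."""
    with os.popen(command) as pipe:
        return pipe.read()

def run_with_exec(argv):
    """Fork, replace the child's image with argv via exec, then wait for it."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)    # does not return on success
        finally:
            os._exit(127)               # reached only if exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(run_with_system("exit 3"))
    print(run_with_popen("echo hello"), end="")
    print(run_with_exec(["sh", "-c", "exit 5"]))
```

system and popen both go through the shell, while the fork and exec pair starts the target program directly and leaves the caller in control of waiting.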

The parent process manages its child processes. When a service's parent process is shut down, it normally terminates its child processes as well; a child that outlives its parent is adopted by init rather than killed automatically. Terminating a child, on the other hand, does not terminate the parent. For example, while the httpd server is running, we can kill one of its child processes, and the parent process keeps running.

VI. Management of processes

1. Starting a process

Typing the name of a program and running it starts a process. In a Linux system, every process has a process ID (PID), which is used to identify and schedule it. There are two main ways to start a process: manual startup and scheduled startup. A scheduled process is set up in advance and started automatically at the time the user specified. A manually started process is one the user starts directly by entering a command. Manual startup comes in two forms, foreground and background, and they behave differently.

(1) Starting in the foreground

Foreground startup is the most common way to start a process manually. When the user types a command such as df, a process is started, and it is a foreground process. The system is already in a multi-process state at that point, because many processes that were started automatically at boot are running in the background. Some users are puzzled when they run ps x right after the df command and do not see the df process. The reason is that df finishes so quickly that it has already exited by the time ps runs. If you start a time-consuming process, for example find on the root directory, and then use ps aux, you will see the find process in the list.

(2) Starting in the background

Starting a process directly in the background, by appending & to the command, is less common. It makes sense when the process takes a long time and the user is in no hurry for the result. Suppose the user wants to start a long-running job that formats text files. Starting it in the background is the sensible choice, because it keeps the shell from being tied up for the whole run.
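The difference between the two modes is whether the caller waits. The sketch below uses Python's subprocess module; the one-second sleep is a placeholder for any slow command, and the names SLOW, run_in_foreground, and run_in_background are invented for this example.

```python
import subprocess
import sys

# Placeholder for a slow command: a child interpreter that sleeps one second.
SLOW = [sys.executable, "-c", "import time; time.sleep(1)"]

def run_in_foreground():
    """Block until the command finishes, as the shell does for a foreground job."""
    return subprocess.run(SLOW).returncode

def run_in_background():
    """Start the command and return at once, like appending & in a shell."""
    return subprocess.Popen(SLOW)

if __name__ == "__main__":
    job = run_in_background()
    print("background job", job.pid, "still running:", job.poll() is None)
    print("background exit code:", job.wait())
    print("foreground exit code:", run_in_foreground())
```

While the background job runs, the caller is free to do other work and can check on the job with poll() or collect it later with wait().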

2. Process scheduling

A foreground process is usually interrupted with the Ctrl+C key combination. A background process cannot be stopped with a key combination, so you must use the kill command to terminate it. There are many reasons to terminate a background process: it may be using too much CPU time, or it may be hung. Both happen regularly. The kill command works by asking the kernel of the Linux system to deliver a signal to the process with the given process ID; the kernel then acts on that process.
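What the kill command does can be reproduced with os.kill, which takes the same two pieces of information: a PID and a signal. In the sketch below the 60-second sleeper is a stand-in for a hung background process, and start_and_kill is an illustrative name.

```python
import os
import signal
import subprocess
import sys

def start_and_kill():
    """Start a long-running background process, then terminate it by PID."""
    job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    os.kill(job.pid, signal.SIGTERM)    # the same request as `kill <pid>`
    return job.wait()                   # negative value: ended by that signal

if __name__ == "__main__":
    print("return code:", start_and_kill())
```

subprocess reports a process ended by signal N as return code -N, so a SIGTERM shows up as -15.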

VII. The first Linux process: init

init is the first process the Linux system runs. Its process ID is 1, and it is the starting point of all other processes in the system. Its main jobs are to run the boot-time initialization scripts and to monitor processes. After the kernel has finished booting, it starts the init program, and init reads its configuration file, /etc/inittab. inittab is a non-executable text file made up of a series of entries, one per line.

On RHEL 4 systems, the contents of the inittab configuration file are as follows:

The code is as follows:

    # inittab
    # author
    # Default runlevel. The runlevels used by RHS are:
    #   0 - halt (do not set initdefault to this)
    #   1 - single user mode
    #   2 - multiuser, without NFS (the same as 3, if you do not have networking)
    #   3 - full multiuser mode
    #   4 - unused
    #   5 - X11
    #   6 - reboot (do not set initdefault to this)

    # Note: the default runlevel is 5, so the system boots into the graphical interface.
    id:5:initdefault:

    # Note: the /etc/rc.d/rc.sysinit script is run automatically at startup.
    # System initialization.
    si::sysinit:/etc/rc.d/rc.sysinit

    l0:0:wait:/etc/rc.d/rc 0
    l1:1:wait:/etc/rc.d/rc 1
    l2:2:wait:/etc/rc.d/rc 2
    l3:3:wait:/etc/rc.d/rc 3
    l4:4:wait:/etc/rc.d/rc 4
    # Note: at runlevel 5, run the /etc/rc.d/rc script with 5 as its argument; init waits for it to return.
    l5:5:wait:/etc/rc.d/rc 5
    l6:6:wait:/etc/rc.d/rc 6

    # Note: restart the system when [ctrl-alt-delete] is pressed.
    # Trap CTRL-ALT-DELETE
    ca::ctrlaltdel:/sbin/shutdown -t3 -r now

    # Note: at runlevels 2, 3, 4 and 5, run /sbin/mingetty with ttyX as its argument to open
    # terminal ttyX for user login; if the process exits, mingetty is run again.
    # Run gettys in standard runlevels
    1:2345:respawn:/sbin/mingetty tty1
    2:2345:respawn:/sbin/mingetty tty2
    3:2345:respawn:/sbin/mingetty tty3
    4:2345:respawn:/sbin/mingetty tty4
    5:2345:respawn:/sbin/mingetty tty5
    6:2345:respawn:/sbin/mingetty tty6

    # Note: at runlevel 5, run the xdm program to provide a graphical login screen; it is restarted on exit.
    # Run xdm in runlevel 5
    x:5:respawn:/etc/X11/prefdm -nodaemon

Each line of the inittab configuration file has the following basic format.

id:runlevels:action:process

Some of these fields may be empty. Each field is described below.
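The four-field format can be split mechanically. The sketch below is a minimal reader, not a reimplementation of init: the names Entry, parse_inittab, default_runlevel, and entries_for are invented here, malformed lines are not handled, and entries_for matches only entries that list the runlevel explicitly.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "id runlevels action process")

def parse_inittab(text):
    """Parse inittab text into Entry tuples, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only; the command may contain more.
        ident, runlevels, action, process = line.split(":", 3)
        entries.append(Entry(ident, runlevels, action, process))
    return entries

def default_runlevel(entries):
    """Return the runlevel named by the initdefault entry, or None."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None

def entries_for(entries, runlevel):
    """Entries whose runlevels field explicitly lists the given runlevel."""
    return [entry for entry in entries if runlevel in entry.runlevels]

if __name__ == "__main__":
    sample = "id:5:initdefault:\nl5:5:wait:/etc/rc.d/rc 5\n"
    print(default_runlevel(parse_inittab(sample)))
```

Splitting with a limit of three keeps any colons inside the command itself intact.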

1. id

A string of 1 to 2 characters that uniquely identifies the entry; it must not be repeated within the configuration file. (Later sysvinit versions accept up to 4 characters.)

2. runlevels

The runlevels to which the entry applies. Several runlevels can be listed, for example 12345 or 35.

Linux has 7 runlevels:

0: shutdown

1: single-user text interface

2: multi-user text interface without the Network File System (NFS)

3: multi-user text interface with networking

4: reserved (unused)

5: graphical user interface with networking

6: restart the system

3. action

init supports the following actions. In the descriptions, "the process" means the command given in the fourth field.

Init actions

respawn: Start the process and monitor it; restart it whenever it terminates.

wait: Execute the process and wait for it to complete.

once: Execute the process once when the runlevel is entered.

boot: Regardless of runlevel, run the process when the system boots.

bootwait: Regardless of runlevel, run the process when the system boots and wait for it to complete.

off: Do nothing; the entry is ignored.

ondemand: Execute the process whenever the ondemand runlevel is entered.

initdefault: The runlevel to enter after the system has booted; no process needs to be specified.

sysinit: Regardless of runlevel, run the process during boot, before any boot or bootwait entries.

powerwait: Run the process when the system's power supply fails, and wait for it to complete.

powerfailnow: Run the process when the system's power supply is critically low.

ctrlaltdel: Run the process when the user presses Ctrl+Alt+Del.

kbrequest: Run the process when the user presses a special key combination, which must be defined in the keymaps file.

4. process

process is the command that init executes. For the rc entries, the rc script takes the runlevel X as its argument and runs the scripts stored under /etc/rc.d/rcX.d. Use the following command to view the contents of the /etc/rc.d directory.

The code is as follows:

    # ls -l /etc/rc.d/
    total 112
    drwxr-xr-x 2 root root 4096 3/15 14:44 init.d
    -rwxr-xr-x 1 root root 2352 2004-3-17 rc
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc0.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc1.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc2.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc3.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc4.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc5.d
    drwxr-xr-x 2 root root 4096 3/15 14:44 rc6.d
    -rwxr-xr-x 1 root root 2200 2004-3-17 rc.local
    -rwxr-xr-x 1 root root 2352 2004-3-17 rc.sysinit

Use the following command to view the contents of / etc/rc.d/rc5.d.

The code is as follows:

    # ls -l /etc/rc.d/rc5.d

The files in this directory are symbolic links. Links whose names begin with S start a program, and links whose names begin with K stop (kill) one. The number that follows sets the execution order, lowest first, and the rest of the name identifies the program. S links are run when the system boots into or switches to that runlevel. K links are run, to stop the corresponding programs, when the system switches to that runlevel.

The programs in this directory can be managed with the chkconfig program, provided they follow certain conventions. If you know shell programming, you can read the source of the scripts these symbolic links point to.
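The naming convention can be turned into an execution plan. The sketch below assumes two-digit sequence numbers and assumes that stop (K) links are processed before start (S) links, as the Red Hat rc script does; the function name plan and the sample link names are made up for illustration.

```python
import re

# S or K, a two-digit sequence number, then the service name.
NAME = re.compile(r"^([SK])(\d\d)(.+)$")

def plan(names):
    """Return (action, service) pairs in the order rc would process them."""
    parsed = []
    for name in names:
        match = NAME.match(name)
        if match:                       # ignore files that do not follow S/K naming
            kind, order, service = match.groups()
            parsed.append((kind, int(order), service))
    # K (stop) links are handled before S (start) links; lower numbers go first.
    parsed.sort(key=lambda item: (item[0] != "K", item[1], item[2]))
    return [("stop" if kind == "K" else "start", service)
            for kind, _, service in parsed]

if __name__ == "__main__":
    for action, service in plan(["S55sshd", "K20nfs", "S10network"]):
        print(action, service)
```

Feeding it the output of os.listdir("/etc/rc.d/rc5.d") on a SysV-style system would show the order for runlevel 5.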

init is itself a process and has the same properties as any other process. For example, after modifying /etc/inittab, you can run "kill -SIGHUP 1" or "init Q" to make the change take effect immediately.

VIII. Brief introduction to Linux threads

1. Definition of Linux threads

Threads are multiple concurrent execution paths within one shared memory space. They share the resources of a process, such as file descriptors and signal handlers. Switching between two ordinary processes (not threads) is expensive, because the kernel has to switch from the context of one process to the context of the other. The main work in that context switch is saving the CPU state of the old process, loading the saved state of the new process, and replacing the old process's memory image with the new one's. Threads let a program switch between several running tasks without paying for the full context switch just described.

The threads discussed in this article are POSIX threads, that is, Pthreads, because they are the kind Linux supports best. Compared with a process, a thread is a concept closer to a pure unit of execution: it shares data with the other threads in the same process, but has its own stack and its own independent sequence of execution. Threads, like processes, were introduced on top of serial programs to increase concurrency and so improve a program's efficiency and response time.

Threads and lightweight processes (LWP) can be treated as equivalent, although different systems and implementations interpret the term differently. An LWP is better understood as a virtual CPU or a kernel thread that helps user-mode threads do things they could not do alone. Pthreads is a standardized model for dividing a program into groups of tasks that can run at the same time.
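The difference in sharing can be shown side by side. The sketch below uses Python's threading module, which on Linux creates POSIX threads in CPython, and os.fork for the process case; both function names are invented for this example.

```python
import os
import threading

def threads_share_memory():
    """Four threads append to one list; all appends are visible afterwards."""
    shared = []
    lock = threading.Lock()

    def worker(tag):
        with lock:                      # serialize access to the shared list
            shared.append(tag)

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(shared)

def processes_do_not():
    """A forked child appends to its own copy; the parent's list is unchanged."""
    data = []
    pid = os.fork()
    if pid == 0:
        data.append("child")            # modifies the child's private copy only
        os._exit(0)
    os.waitpid(pid, 0)
    return data

if __name__ == "__main__":
    print("threads:", threads_share_memory())
    print("processes:", processes_do_not())
```

The lock is the price of sharing: because every thread sees the same list, access to it has to be coordinated.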

2. When to use Pthreads

(1) When an I/O call blocks until it returns, one thread can handle the I/O while the program carries on with other work.

(2) When one or more tasks depend on unpredictable events, such as the availability of network communication, threads can handle those asynchronous events while normal processing continues.

(3) When some program functions are more important than others, threads can ensure that every function gets to run while the time-critical ones run at a higher priority.

These three points can be summarized as follows: consider Pthreads when you examine a program for potential parallelism, that is, when you look for tasks that can run at the same time. As mentioned above, the Linux process model can already run multiple processes, so parallel or concurrent programs can be written with processes alone. Pthreads, however, give finer control over multiple tasks and use fewer resources, because a single resource, such as a global variable, can be shared by multiple threads. Moreover, on systems with multiple processors, a multithreaded application often runs faster than the same application built from multiple processes.
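The first case, blocking I/O handled on a separate thread, can be sketched with a pipe standing in for a slow device or network peer. The function name overlap_io_with_work and the summation used as "other work" are placeholders chosen for this example.

```python
import os
import threading

def overlap_io_with_work():
    """Let one thread block on I/O while the main thread keeps computing."""
    read_end, write_end = os.pipe()
    received = []

    def reader():
        # Blocks here until data arrives, without stalling the main thread.
        received.append(os.read(read_end, 64))

    thread = threading.Thread(target=reader)
    thread.start()
    total = sum(range(100_000))         # main thread carries on with other work
    os.write(write_end, b"ready")       # the awaited data finally arrives
    thread.join()
    os.close(read_end)
    os.close(write_end)
    return total, received[0]

if __name__ == "__main__":
    print(overlap_io_with_work())
```

Without the second thread, the program would have to choose between waiting on the read and doing the computation.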

3. Development of Linux processes and threads

In the Linux 2.2 kernel, released in January 1999, a process was created with the fork system call, and the new process was a child of the original one. To be clear, version 2.2.x had no true threads. The Pthreads commonly used on Linux at the time were actually simulated with processes: a Linux thread was created through the same mechanism as a process and was simply a "light" process. By default, Linux 2.2 allowed only 4096 processes/threads to run at once. High-end systems serve thousands of users simultaneously, so this was a real problem, and it was once a major factor keeping Linux out of the enterprise market.

The Linux 2.4 kernel, released in January 2001, removed this limit and allowed the maximum number of processes to be adjusted dynamically while the system is running. The number of processes is therefore limited only by the amount of physical memory. A high-end server, even one with only 512 MB of memory installed, can easily support 16,000 processes at the same time.

In the 2.6 kernel, released in December 2003, the process scheduler was rewritten to remove the inefficient algorithm of earlier versions. Previously, to decide which task to run next, the scheduler examined every ready task and computed which one was relatively more important. The number of available process IDs (PIDs) also rose from 32,000 to 1 billion. Another major internal change was the rewrite of Linux's threading framework so that NPTL (Native POSIX Thread Library) could run on it. This is a major performance improvement for heavily threaded applications on Pentium Pro and later processors, and it is something many high-end enterprise systems had been waiting for. The new thread framework brought many new concepts to Linux threading, including thread groups, thread-local storage, and POSIX-style signals. The improved multithreading and memory management also help large multimedia applications run better.

4. Summary

Threads and processes each have strengths and weaknesses. Threads are cheap to run but make resources harder to manage and protect; processes are the opposite. Threads are well suited to symmetric multiprocessor (SMP) machines, while processes can be migrated across machines. A process can own resources, and its threads share the resources it owns. Switching between processes requires saving state in the process control block (PCB, Process Control Block), whereas switching between threads of the same process is far less work. One last example sums up the article: when you open two OICQ clients on a Linux PC, each OICQ client is a process, and when you chat with several people in one OICQ client, each chat window is a thread.
