How to understand multithreaded processes in Java+Linux kernel source code

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how to understand processes and threads by reading the Linux kernel source code, and then how processes are created and managed from Java. It first covers how the kernel describes a process, then turns to the Java APIs for starting child processes.

How does the Linux kernel describe a process?

1. Processes in Linux

The process is one of the most basic abstractions in Linux; the other is the file.

The simplest definition: a process is a program in execution (which is not the same as a program that is currently running on a CPU).

A more accurate definition: a process comprises an executing program together with its resources (CPU state, open files, pending signals, tty, memory address space, and so on).

In short: process = n * execution flow + resources, where n >= 1.

Characteristics of the Linux process:

The fork() system call creates a process: it copies an existing process to produce a new one.

In the kernel, there is no strict distinction between processes and threads.

From the kernel's point of view, the scheduling unit is the thread (that is, the execution flow). You can think of a thread as an execution flow in a process, and there can be one or more threads in a process.

In the kernel, a process is often called a task or a thread, which is more accurate because many processes have only one execution flow.

The kernel supports multithreading through lightweight processes. One lightweight process corresponds to one thread, and lightweight processes can share resources such as open files and the address space.
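The relation process = n * execution flow + resources can also be observed from user space. The following is a minimal Java sketch, not part of the original article: it starts a few threads and records the process ID that each one sees. The class and method names (SharedProcessDemo, pidSeenByEachThread) are hypothetical, and the sketch assumes Java 9 or later for ProcessHandle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedProcessDemo {
    /** Starts n threads and returns, per thread name, the process ID that thread observed. */
    static Map<String, Long> pidSeenByEachThread(int n) throws InterruptedException {
        Map<String, Long> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(
                    () -> seen.put(Thread.currentThread().getName(), ProcessHandle.current().pid()),
                    "worker-" + i);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // wait until every execution flow has finished
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        pidSeenByEachThread(3).forEach(
                (name, pid) -> System.out.println(name + " runs in process " + pid));
    }
}
```

Every entry reports the same process ID: the threads are separate execution flows inside one process and share its resources. On a typical HotSpot JVM on Linux each such thread is backed by one kernel lightweight process, but that is a property of the JVM implementation rather than something the Java API guarantees.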

2. The Linux process descriptor

2.1 task_struct

In the kernel, a process is described by a task_struct structure, called the process descriptor, which holds all the information needed for a process to run.

Every process, even a lightweight process (that is, a thread), has a task_struct.

// sched.h (include\linux)
struct task_struct {
    struct thread_info thread_info;
    volatile long state;
    void *stack;
    [...]
    struct mm_struct *mm;
    [...]
    pid_t pid;
    [...]
    struct task_struct *parent;
    [...]
    char comm[TASK_COMM_LEN];
    [...]
    struct files_struct *files;
    [...]
    struct signal_struct *signal;
};

This is a huge structure with not only many basic fields related to the process, but also many pointers to other data structures.

It contains fields that fully describe an executing program, including cpu status, open files, address space, pending signals, process status, and so on.

For a beginner, it is enough to understand a few of the fields:

struct thread_info thread_info: the low-level, platform-dependent information of the process, described in detail below.

volatile long state: the current state of the process. A process moves between several important states during its lifetime.

void *stack: points to the kernel stack of the process, as explained below.

struct mm_struct *mm: information about the process address space, stored in an mm_struct called the memory descriptor.

pid_t pid: the process identifier, essentially a number, by which user space uniquely refers to the process.

struct task_struct *parent: the task_struct of the parent process.

char comm[TASK_COMM_LEN]: the name of the process.

struct files_struct *files: the open file table.

struct signal_struct *signal: signal-handling information.

The other fields can be studied when you need them.

2.2 How does the kernel find task_struct when a system call or process switch occurs?

For the ARM architecture, the answer is: through the kernel stack (kernel mode stack).

Why is there a kernel stack?

Because the kernel is reentrant, multiple execution paths associated with different processes can be active in the kernel at the same time. Each process therefore needs its own private kernel stack (process kernel stack) while it runs in kernel mode.

When a process switches from user mode to kernel mode, the stack used is switched from the user stack to the kernel stack.

As for how the switch happens, the keyword is system call, which is not the focus of this article. Set it aside for now; when studying the kernel, it helps to ignore details where appropriate.

When a process switch occurs, the kernel also switches to the kernel stack of the target process.

As above, the keyword here is hardware context switch, and the specific implementation is not covered.

Whenever a process is in kernel mode, it must have a usable kernel stack; otherwise the system is not far from a crash.

The relationship between the kernel stack of ARM architecture and task_struct is as follows:

The length of the kernel stack is THREAD_SIZE, and for ARM architecture, it is generally the size of two page frames, namely 8KB.

The kernel places a small data structure, thread_info, at the bottom of the kernel stack; it links the kernel stack to the task_struct. thread_info is platform-dependent and is defined for the ARM architecture as follows:

// thread_info.h (arch\arm\include\asm)
struct thread_info {
    unsigned long flags;                  /* low level flags */
    int preempt_count;                    /* 0 => preemptable, <0 => bug */
    mm_segment_t addr_limit;              /* address limit */
    struct task_struct *task;             /* main task structure */
    [...]
    struct cpu_context_save cpu_context;  /* cpu context */
    [...]
};

thread_info holds the lowest-level information (low level task data) that a process needs in order to be scheduled. For example, struct cpu_context_save cpu_context is used to save and restore the register context when switching processes.

The kernel can quickly get the thread_info through the stack pointer of the kernel stack:

// thread_info.h (include\linux)
static inline struct thread_info *current_thread_info(void)
{
    /* current_stack_pointer is the stack pointer of the current kernel stack */
    return (struct thread_info *)(current_stack_pointer & ~(THREAD_SIZE - 1));
}

Then find task_struct through thread_info:

// current.h (include\asm-generic)
#define current (current_thread_info()->task)

The task_struct of the current process can be obtained through the current macro in the kernel.
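The masking trick works because the kernel stack is THREAD_SIZE bytes long and aligned on a THREAD_SIZE boundary, so clearing the low bits of any address inside the stack yields the address of its bottom, where thread_info lives. The arithmetic can be sketched in Java as follows; this is only an illustration of the bit operation, the class and method names are hypothetical, and the stack pointer value is made up.

```java
class ThreadInfoAddress {
    // 8 KB: two 4 KB page frames, the typical ARM value given in the text.
    static final long THREAD_SIZE = 8 * 1024;

    /** Rounds a kernel stack pointer down to the start of its THREAD_SIZE-aligned stack. */
    static long threadInfoBase(long stackPointer) {
        return stackPointer & ~(THREAD_SIZE - 1);
    }

    public static void main(String[] args) {
        long sp = 0xC9DDBF40L; // hypothetical stack pointer somewhere inside a kernel stack
        // prints: sp=c9ddbf40 -> thread_info at c9dda000
        System.out.printf("sp=%x -> thread_info at %x%n", sp, threadInfoBase(sp));
    }
}
```

Any stack pointer between 0xc9dda000 and 0xc9ddbfff maps to the same base, which is why the kernel needs no lookup table to get from the stack pointer to thread_info.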

2.3 Allocation and initialization of task_struct

When a user-space application creates a process with fork(), the kernel creates a new task_struct.

Creating a process is a complex task with countless details. Here we look only at how task_struct is allocated and partially initialized.

The core flow of fork() in the kernel:

What does dup_task_struct() do?

As for what happens when the kernel stack is set up, that involves process creation and switching, which is beyond the scope of this article and is left for later study.

3. Lab: print task_struct / thread_info / kernel mode stack

Experimental purpose:

Sort out the relationship between task_struct / thread_info / kernel mode stack.

Lab code:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/sched.h>

static void print_task_info(struct task_struct *task)
{
    /*
     * Note: adding THREAD_SIZE to an (unsigned long *) advances by
     * THREAD_SIZE * sizeof(unsigned long) bytes, so the upper bound
     * printed here lies beyond the real end of the stack. Cast to
     * (char *) instead to print exactly stack + THREAD_SIZE.
     */
    printk(KERN_NOTICE "%s %5d task_struct (%p) / stack (%p~%p) / thread_info->task (%p)",
           task->comm,
           task->pid,
           task,
           task->stack,
           ((unsigned long *)task->stack) + THREAD_SIZE,
           task_thread_info(task)->task);
}

static int __init task_init(void)
{
    struct task_struct *task = current;

    printk(KERN_INFO "task module init\n");

    print_task_info(task);
    /* Walk up the parent chain until the task with pid 0 is reached. */
    do {
        task = task->parent;
        print_task_info(task);
    } while (task->pid != 0);

    return 0;
}
module_init(task_init);

static void __exit task_exit(void)
{
    printk(KERN_INFO "task module exit\n");
}
module_exit(task_exit);

Running effect:

task module init
insmod 3123 task_struct (edb42580) / stack (ed46c000~ed474000) / thread_info->task (edb42580)
bash 2393 task_struct (eda13e80) / stack (c9dda000~c9de2000) / thread_info->task (eda13e80)
sshd 2255 task_struct (ee5c9f40) / stack (c9d2e000~c9d36000) / thread_info->task (ee5c9f40)
sshd 543 task_struct (ef15f080) / stack (ee554000~ee55c000) / thread_info->task (ef15f080)
systemd 1 task_struct (ef058000) / stack (ef04c000~ef054000) / thread_info->task (ef058000)

In the program, we find the stack through task_struct, then thread_info through the stack, and finally task_struct again through thread_info->task.

By now you should have a clearer understanding of the concept of a process.

The above looks at processes and threads from the Linux side. In day-to-day work, however, we mostly write and run Java code, so let's look at processes in Java.

1. Creating a process in Java

Java provides two ways to start a process or other program:

Use the exec() method of Runtime

Use the start() method of ProcessBuilder

1.1 ProcessBuilder

ProcessBuilder is a class added to java.lang in J2SE 1.5. It is used to create operating system processes and provides a way to start and manage processes (that is, applications). Before J2SE 1.5, process control and management were handled through the Process class.

Each ProcessBuilder instance manages a set of process properties. The start() method uses these properties to create a new Process instance, and it can be called repeatedly on the same instance to create new child processes with the same or related properties.

Each process builder manages these process properties:

The command: a list of strings naming the external program file to invoke and its arguments, if any. Which string lists represent a valid operating system command is system-dependent. For example, it is common for each conceptual argument to be one element of this list, but on some operating systems programs are expected to tokenize command-line strings themselves; on such a system, a Java implementation might require commands to contain exactly two elements.

The environment: a system-dependent mapping from variables to values. The initial value is a copy of the environment of the current process (see System.getenv()).

The working directory: the default is the current working directory of the current process, usually the directory named by the system property user.dir.

The redirectErrorStream property: initially false, meaning that the standard output and error output of the child process are sent to two separate streams, accessible through Process.getInputStream() and Process.getErrorStream(). If set to true, standard error is merged with standard output, which makes it easier to correlate error messages with the corresponding output. In that case the merged data can be read from the stream returned by Process.getInputStream(), while reads from the stream returned by Process.getErrorStream() reach end of file immediately.

Modifying a process builder's properties affects processes subsequently started by that object's start() method, but never affects previously started processes or the Java process itself. Most error checking is performed by start(). It is possible to put the object into a state in which start() will fail: for example, setting the command property to an empty list does not throw an exception until start() is called.

Note that this class is not synchronized. If multiple threads access a ProcessBuilder concurrently and at least one of them structurally modifies one of the properties, access must be synchronized externally.
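The effect of the redirectErrorStream property can be seen with a small sketch. It is an illustration only, not taken from the original article: it runs "java -version", which writes to standard error, and reads the text through Process.getInputStream() after merging the two streams. The class and method names are hypothetical, and the code assumes that the launcher of the running JVM is located at <java.home>/bin/java.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Paths;

class MergedOutputDemo {
    /** Runs "java -version" with stderr merged into stdout and returns everything it printed. */
    static String runJavaVersion() throws IOException, InterruptedException {
        // Assumption: the launcher of the running JVM lives at <java.home>/bin/java.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();

        ProcessBuilder pb = new ProcessBuilder(java, "-version");
        pb.redirectErrorStream(true); // "java -version" writes to standard error

        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("child exited with status " + exit);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runJavaVersion());
    }
}
```

With redirectErrorStream left at its default of false, the same read loop would see no data, because the version text would arrive on Process.getErrorStream() instead.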

Constructor summary

ProcessBuilder(List<String> command): constructs a process builder with the specified operating system program and arguments.
ProcessBuilder(String... command): constructs a process builder with the specified operating system program and arguments.

Method summary

List<String> command(): returns this process builder's operating system program and arguments.
ProcessBuilder command(List<String> command): sets this process builder's operating system program and arguments.
ProcessBuilder command(String... command): sets this process builder's operating system program and arguments.
File directory(): returns this process builder's working directory.
ProcessBuilder directory(File directory): sets this process builder's working directory.
Map<String, String> environment(): returns a string map view of this process builder's environment.
boolean redirectErrorStream(): tells whether this process builder merges standard error and standard output.
ProcessBuilder redirectErrorStream(boolean redirectErrorStream): sets this process builder's redirectErrorStream property.
Process start(): starts a new process using the properties of this process builder.
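The accessors listed above can be exercised without starting anything. The sketch below is a hedged illustration: "some-command", its arguments, and the DEMO_KEY variable are placeholders invented for the example, and the class name is hypothetical.

```java
import java.io.File;

class BuilderPropertiesDemo {
    /** Configures a builder without starting a process; all names here are placeholders. */
    static ProcessBuilder configure() {
        ProcessBuilder pb = new ProcessBuilder("some-command", "arg1");
        pb.command().add("arg2");                  // command() returns the live list, not a copy
        pb.environment().put("DEMO_KEY", "demo");  // changes only the environment of future children
        pb.directory(new File(System.getProperty("user.dir")));
        pb.redirectErrorStream(true);
        return pb;
    }

    public static void main(String[] args) {
        ProcessBuilder pb = configure();
        System.out.println("command   = " + pb.command());
        System.out.println("directory = " + pb.directory());
        System.out.println("DEMO_KEY  = " + pb.environment().get("DEMO_KEY"));
        System.out.println("merged    = " + pb.redirectErrorStream());
    }
}
```

Because the setters return the builder itself, the same configuration can also be written as a single chained expression ending in start().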

1.2 Runtime

Each Java application has an instance of the Runtime class that enables the application to connect to the environment in which it runs. You can get the current runtime through the getRuntime method.

Applications cannot create their own instances of the Runtime class. However, a reference to the current Runtime object can be obtained through the getRuntime method, and its methods can then be called to control the state and behavior of the Java virtual machine.

void addShutdownHook(Thread hook): registers a new virtual-machine shutdown hook.
int availableProcessors(): returns the number of processors available to the Java virtual machine.
Process exec(String command): executes the specified string command in a separate process.
Process exec(String[] cmdarray): executes the specified command and arguments in a separate process.
Process exec(String[] cmdarray, String[] envp): executes the specified command and arguments in a separate process with the specified environment.
Process exec(String[] cmdarray, String[] envp, File dir): executes the specified command and arguments in a separate process with the specified environment and working directory.
Process exec(String command, String[] envp): executes the specified string command in a separate process with the specified environment.
Process exec(String command, String[] envp, File dir): executes the specified string command in a separate process with the specified environment and working directory.
void exit(int status): terminates the currently running Java virtual machine by initiating its shutdown sequence.
long freeMemory(): returns the amount of free memory in the Java virtual machine.
void gc(): runs the garbage collector.
InputStream getLocalizedInputStream(InputStream in): deprecated. As of JDK 1.1, the preferred way to convert a byte stream in the local encoding to a Unicode character stream is to use the InputStreamReader and BufferedReader classes.
OutputStream getLocalizedOutputStream(OutputStream out): deprecated. As of JDK 1.1, the preferred way to convert a Unicode character stream to a byte stream in the local encoding is to use the OutputStreamWriter, BufferedWriter, and PrintWriter classes.
static Runtime getRuntime(): returns the runtime object associated with the current Java application.
void halt(int status): forcibly terminates the currently running Java virtual machine.
void load(String filename): loads the specified file name as a dynamic library.
void loadLibrary(String libname): loads the dynamic library with the specified library name.
long maxMemory(): returns the maximum amount of memory that the Java virtual machine will attempt to use.
boolean removeShutdownHook(Thread hook): unregisters a previously registered virtual-machine shutdown hook.
void runFinalization(): runs the finalization methods of any objects pending finalization.
static void runFinalizersOnExit(boolean value): deprecated. This method is inherently unsafe. It may call finalizers on live objects while other threads are manipulating them, resulting in erratic behavior or deadlock.
long totalMemory(): returns the total amount of memory in the Java virtual machine.
void traceInstructions(boolean on): enables or disables instruction tracing.
void traceMethodCalls(boolean on): enables or disables method call tracing.
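A few of these methods can be tried directly. The following minimal sketch (class and method names are hypothetical) reads the processor count and memory figures from the single Runtime instance and registers a shutdown hook; the actual numbers depend on the machine and JVM settings, so none are shown.

```java
class RuntimeInfoDemo {
    /** Returns { availableProcessors, freeMemory, totalMemory, maxMemory } of the current JVM. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime(); // the only way to obtain the Runtime instance
        return new long[] {
            rt.availableProcessors(),
            rt.freeMemory(),
            rt.totalMemory(),
            rt.maxMemory()
        };
    }

    public static void main(String[] args) {
        // The hook runs when the virtual machine begins its shutdown sequence.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("JVM is shutting down")));

        long[] s = snapshot();
        System.out.printf("processors=%d free=%d total=%d max=%d%n", s[0], s[1], s[2], s[3]);
    }
}
```

Note that these values describe the current Java virtual machine process only, not any child process started through exec().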

1.3 Process

No matter which method you use to start a process, it returns an instance of the Process class representing the started process, which can be used to control the process and obtain information about it. The Process class provides methods to read input from the process, write output to the process, wait for the process to complete, check its exit status, and destroy (kill) it:

void destroy(): kills the child process. In practice this method cannot always kill a process that has already been started, so it is best not to rely on it.
int exitValue(): returns the exit value of the child process. It returns normally only after the started process has finished or exited because of an error; otherwise it throws an IllegalThreadStateException.
InputStream getErrorStream(): gets the error stream of the child process. If the error output has been redirected, it cannot be read from this stream.
InputStream getInputStream(): gets the input stream of the child process. The standard output of the process can be read from this stream.
OutputStream getOutputStream(): gets the output stream of the child process. Data written to this stream is delivered to the standard input of the process.
int waitFor(): causes the current thread to wait, if necessary, until the process represented by the Process object has terminated.
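How these methods fit together can be sketched as follows. This is an illustration under stated assumptions rather than a definitive pattern: the class and method names are hypothetical, the child is again "java -version" located via <java.home>/bin/java, and the 30-second limit is arbitrary. It uses the timed waitFor overload available since Java 8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

class ProcessLifecycleDemo {
    /** Starts a short-lived child, drains its output, waits for it, and returns its exit value. */
    static int runAndWait() throws IOException, InterruptedException {
        // Assumption: the launcher of the running JVM lives at <java.home>/bin/java.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Process p = new ProcessBuilder(java, "-version")
                .redirectErrorStream(true)
                .start();

        // Drain the child's output so it can never block on a full pipe.
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // output is discarded in this sketch
            }
        }

        // Wait with an (arbitrary) upper bound instead of blocking forever.
        if (!p.waitFor(30, TimeUnit.SECONDS)) {
            p.destroy(); // ask the child to terminate
            throw new IOException("child did not finish within 30 seconds");
        }
        return p.exitValue(); // safe here: the child has terminated
    }

    public static void main(String[] args) throws Exception {
        System.out.println("exit value: " + runAndWait());
    }
}
```

Calling exitValue() only after waitFor() has reported termination avoids the IllegalThreadStateException that is thrown while the child is still running.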

2. An example of multiprocess programming

Normally, when we call methods of other classes in Java, whether statically or dynamically, they execute in the current process; that is, only one Java virtual machine instance is running. Sometimes we need to start several Java child processes from Java code. Although this uses more system resources, it makes the program more stable, because each newly started program runs in its own virtual machine process, and an exception in one process does not affect the others.

There are two ways to do this in Java. The simplest is to execute java classname through the exec method of Runtime. If execution succeeds, the method returns a Process object; if it fails, an IOException is thrown. Let's look at a simple example.

// Test1.java
import java.io.*;

public class Test1 {
    public static void main(String[] args) throws IOException {
        // Creating the file proves that the child process really ran.
        FileOutputStream fOut = new FileOutputStream("c:\\Test1.txt");
        fOut.close();
        System.out.println("successfully called!");
    }
}

// Test_Exec.java
import java.io.IOException;

public class Test_Exec {
    public static void main(String[] args) throws IOException {
        Runtime run = Runtime.getRuntime();
        Process p = run.exec("java Test1");
    }
}

After running the program with java Test_Exec, an extra Test1.txt file appears on drive C, but "successfully called!" does not appear in the console. We can conclude that Test1 was executed successfully but that, for some reason, its output was not shown in the console of Test_Exec. The reason is simple: the process created by exec is a child process of Test_Exec that has no console of its own. Its standard output is connected to the parent through a pipe, so nothing is displayed.

To display the output of the child process, obtain it through the getInputStream() method of Process (output for the child is input for the parent) and then print it from the console of the parent process. The implementation is as follows:

// Test_Exec_Out.java
import java.io.*;

public class Test_Exec_Out {
    public static void main(String[] args) throws IOException {
        Runtime run = Runtime.getRuntime();
        Process p = run.exec("java Test1");
        // The child's standard output is the parent's input stream.
        BufferedInputStream in = new BufferedInputStream(p.getInputStream());
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        String s;
        while ((s = br.readLine()) != null)
            System.out.println(s);
    }
}

As the code above shows, Test_Exec_Out reads the output of the child process line by line and prints it line by line. So far we have covered how to get the output of the child process. Besides output, there is also input. Since the child process has no console of its own, its input must also be supplied by the parent process. We can use the getOutputStream() method of Process to provide input to the child process (that is, the parent process, rather than the console, feeds information to the child). Consider the following code:

// Test2.java
import java.io.*;

public class Test2 {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Information entered by the parent process: " + br.readLine());
    }
}

// Test_Exec_In.java
import java.io.*;

public class Test_Exec_In {
    public static void main(String[] args) throws IOException {
        Runtime run = Runtime.getRuntime();
        Process p = run.exec("java Test2");
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(p.getOutputStream()));
        bw.write("output information to child process");
        bw.flush();
        bw.close(); // the stream must be closed, otherwise the information cannot reach the child process
        // System.in.read();
    }
}

As the code above shows, Test2 receives the information sent by Test_Exec_In and prints it. Without bw.flush() and bw.close(), the information never reaches the child process, which stays blocked waiting for input; it exits only because the parent process exits. To verify this, add System.in.read() at the end and check the java processes in Task Manager (on Windows). With bw.flush() and bw.close(), only one java process remains; without them, there are two. This is because once the information is passed to Test2, Test2 exits after reading it. Note that exec is asynchronous: the code after it keeps executing even if the started program is blocked. The code that follows the exec call therefore still runs after Test2 has been started.

The exec method has many overloads; the one used above is only one of them. It can also take the command and its arguments separately: for example, exec("java Test2") can be written as exec(new String[] {"java", "Test2"}). exec can also run Java virtual machines with different configurations by passing specific environment variables.
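The array form of exec can be sketched as below. As before, this is a hedged illustration: the class and method names are hypothetical, and the command is "java -version" located via <java.home>/bin/java, which is an assumption about the installation layout. Since Runtime.exec does not merge the streams, the version text is read from getErrorStream().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Paths;

class ExecArrayDemo {
    /** Runs "java -version" through exec(String[]) and returns what the child wrote to stderr. */
    static String execJavaVersion() throws IOException, InterruptedException {
        // Assumption: the launcher of the running JVM lives at <java.home>/bin/java.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();

        // Each array element is passed as one argument, so a path containing spaces is safe.
        String[] cmd = { java, "-version" };
        Process p = Runtime.getRuntime().exec(cmd);

        StringBuilder err = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getErrorStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                err.append(line).append('\n');
            }
        }
        p.waitFor();
        return err.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(execJavaVersion());
    }
}
```

The single-string form splits the command on whitespace, so the array form is the safer choice whenever a program path or an argument may contain spaces.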

In addition to creating child processes using Runtime's exec method, you can also create child processes through ProcessBuilder. The usage of ProcessBuilder is as follows:

// Test_Exec_Out.java
import java.io.*;

public class Test_Exec_Out {
    public static void main(String[] args) throws IOException {
        ProcessBuilder pb = new ProcessBuilder("java", "Test1");
        Process p = pb.start();
        // ...
    }
}

When creating a child process, ProcessBuilder is similar to Runtime. The difference is that ProcessBuilder starts the child process with start(), while Runtime uses exec(). Once you have the Process object, the two are used in exactly the same way.

ProcessBuilder, like Runtime, can also set the environment information, working directory, and so on of executable files. The following example describes how to set up this information using ProcessBuilder.

import java.io.File;
import java.util.Map;

ProcessBuilder pb = new ProcessBuilder("Command", "arg1", "arg2");
// set the environment variables
Map<String, String> env = pb.environment();
env.put("key1", "value1");
env.remove("key2");
env.put("key2", env.get("key1") + "_test");
pb.directory(new File("..\\abcd")); // set the working directory
Process p = pb.start();             // start the child process
