From creating a process to entering the main function

Updated: 2025-04-01. From: SLTechnology News&Howtos (shulou), Development

Shulou(Shulou.com)06/03 Report--

This article explains what happens between creating a process and entering the main function: how the process is created, how the executable file is loaded, and how execution travels from the entry point of the file to main.

Create process

The first step is to create the process.

On Linux, a new process is usually started with fork plus one of the exec family of functions. fork duplicates the current process into a nearly identical child, and exec then replaces the program image of the child with the new executable file.

fork and the exec functions are APIs that the operating system provides to applications. They enter the kernel through system calls, and the kernel's process management code does the actual work of creating the process.

When the kernel creates a process, its main tasks are:

Create the data structure used to describe the process in the kernel, which is task_struct on Linux

Create the page directory and page table of the new process, which is used to build the memory address space of the new process.

For historical reasons, the early Linux kernel had no concept of threads. It used a task, described by task_struct, to represent a running instance of a program, that is, a process.

In the kernel, one task corresponds to one task_struct, and the task_struct is also the unit that the kernel schedules.

Later, as multithreading became common, Linux added thread support by letting each task_struct represent a thread and grouping several of them (through the thread group ID field, tgid, in the structure) to describe a process. This is why threads on Linux are also called lightweight processes.

One of the main jobs of the fork system call is to create the task_struct for the new process. Once it exists, the process has a schedulable unit, so it can take part in scheduling and get a chance to run.

Load executable file

After fork succeeds, the parent and child are like a cell that has just divided by mitosis: the two processes are "almost" identical.

For the child to run a new program, it must call one of the exec functions to replace its program image.

The exec functions are also wrappers around a system call. Calling them enters the kernel's sys_execve, which does the real work.

This work involves many details. One of them is loading the executable file into the process address space, parsing it, and extracting its entry address.

Code written in C, C++, or another high-level language is compiled into an executable file. On Linux this file is in ELF format; on Windows it is a PE file.

Both ELF and PE files record the entry address of the executable in their file headers, which tells the system where the program starts executing.

Where does this entry address point? Is it our main function? This is a key point, but first let's answer another question: after the process is created, how does execution reach this entry address?

On both Windows and Linux, application threads frequently move back and forth between user space and kernel space. This happens in the following situations:

System call

Interrupt

Exception

When returning from the kernel, how does the thread know where it came from and where in user space to resume?

The answer is that on entry to kernel space the thread's context (the contents of some registers, such as the instruction pointer, EIP on 32-bit x86) is saved on the thread's kernel stack, which records where it came from. On return from the kernel, that information is loaded back from the stack and execution resumes at the original location.

As mentioned earlier, the child process enters the kernel through the sys_execve system call. After parsing the executable file, the kernel has the ELF entry address, so it modifies the context saved on the stack and points EIP at that entry address. When sys_execve finishes and control returns to user space, execution starts directly at the entry of the new program.

An important consequence is that the exec functions normally do not return. Once the call succeeds, execution continues at the entry of the new executable; control comes back to the caller only if the call fails.

It is also worth noting that Linux can handle executable files in formats other than ELF, such as MS-DOS and COFF executables.

Besides binary executables, shell scripts are also supported. In that case the script's interpreter is started as the program to run.

From the ELF entry to the main function

The sections above explain how a new process reaches the entry address of its executable file.

That leaves the earlier question: what is at this entry address? Is it our main function?

Here is a simple C program that prints the classic hello world:

#include <stdio.h>

int main()
{
    printf("hello, world!\n");
    return 0;
}

Compiling it with gcc produces an ELF executable. The readelf tool can then analyze the file (readelf -h prints the ELF header), and in this example it reports an entry address of 0x400430.

Next, open the file in the IDA disassembler and look at which function sits at the entry address 0x400430.

The function at the entry address is called _start, not our main function.

At the end of _start, the __libc_start_main function is called. It lives in libc.so.

You may wonder where this function came from, since our code never uses it.

In fact, one more important step happens before main is entered: the runtime library is initialized. That is the job of __libc_start_main.

When we compile with GCC, the toolchain automatically links in the runtime library, whose startup code wraps our main function and calls it.

Glibc is open source. The libc-start.c file of the project, which can be found on GitHub, shows what __libc_start_main really does: it is the function that calls our main.

Complete process

At this point we have walked through the complete flow: the process is created with fork, its program image is replaced through an exec function, execution enters at the ELF entry point, and finally reaches our main function.

Some differences on Windows

The same process differs in a few ways on Windows.

The first difference is in process creation. Windows merges the two steps of fork + exec into one: the CreateProcess family of functions creates the child process and takes the path of its executable file as a parameter.

Unlike Linux, where the boundary between processes and threads is blurry, the Windows kernel defines the two separately: a process is represented by an EPROCESS structure and a thread by an ETHREAD structure.

So on Windows, once the process-related work is done, a separate execution unit that takes part in kernel scheduling still has to be created: the first thread of the process, the main thread. This work is also handled inside the CreateProcess family of functions.

Once the main thread of the new process has been created, it starts taking part in scheduling. Where does it begin executing? The kernel specifies this explicitly when it creates the thread: nt!KiThreadStartup, a kernel function from which the thread starts running.

After the thread starts there, Windows' Asynchronous Procedure Call (APC) mechanism runs an APC that was queued in advance. That APC brings execution into user mode to initialize the process on the application side, for example by loading core DLLs such as Kernel32.dll and ntdll.dll.

Then, again through the APC mechanism, execution moves to the entry point of the executable file.

As on Linux, this entry point is not the main function itself. The runtime library has to be initialized first, and its wrapper function then calls our main function.

That is the complete path from process creation to our main function on Windows.
