What are the Linux system programming specifications?

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains the Linux system programming conventions for system calls in detail. Interested readers can use it as a reference; I hope it is helpful.

1.1 Overview of system calls

A system call is the basic interface that the operating system kernel provides to applications. It runs in kernel mode, which gives it the authority to execute privileged CPU instructions.

Linux provides a rich set of system calls covering file operations, process control, memory management, network management, socket operations, user management, inter-process communication, and more.

Run the following command to list the names of all system calls on the system.

man syscalls

The man pages that ship with Linux describe each system call in great detail, including its purpose, parameters, return values, possible errors, and caveats, and they are every bit as thorough as Microsoft's MSDN. Although they are in English, they are easy to read, and every Linux system developer should get used to consulting them.

In addition, the IBM document library hosts a very high-quality "Chinese system call list", which is easier to read for Chinese speakers.

1.2 Two ways of calling system calls

Let's look at the first way.

Each system call is identified by an assigned number and can be invoked directly by passing that number as the first argument to the syscall function.

The prototype of the syscall function is:

long syscall(long number, ...);

The complete system call number is defined in the sys/syscall.h file. Interested readers can check it for themselves.

Obviously, remembering so many numbers is very unfriendly to developers.

As a result, developers usually choose the second way: using the wrapper functions provided by glibc, which wrap these system calls as functions with self-explanatory names.

The wrapper function does little extra work. It mainly checks the parameters where needed, copies them into the appropriate registers, invokes the system call with the specified number, and then sets errno according to the result so that the application can check the outcome.

The two calling methods can be considered functionally equivalent, but the glibc wrapper functions are more readable and easier to use. For the rest of this course, unless stated otherwise, "system call" refers to the glibc wrapper function.

Of course, if a wrapper function does not meet the needs of some special scenario, you can still use the syscall function to invoke the system call directly. This situation is very rare, though; so far I have not encountered it.
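To make the comparison concrete, here is a minimal sketch that issues the same system call both ways. The helper names getpid_by_number and getpid_by_wrapper are hypothetical and chosen only for this illustration; SYS_getpid comes from sys/syscall.h.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* makes sure <unistd.h> declares syscall() */
#endif
#include <sys/syscall.h>   /* SYS_getpid and the other SYS_* numbers */
#include <sys/types.h>
#include <unistd.h>

/* Way 1: invoke the system call directly by its number. */
pid_t getpid_by_number(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Way 2: use the glibc wrapper with the self-explanatory name. */
pid_t getpid_by_wrapper(void)
{
    return getpid();
}
```

Both helpers should return the same process ID; the second form is simply easier to read and does not require knowing any number.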

1.3 Two execution procedures of system calls

1.3.1 Based on interrupt mode

The code that implements a system call is part of the kernel. To execute it, the system must first switch from user mode to kernel mode.

Early system calls switched modes through a software interrupt. Interrupt numbers are a scarce system resource, so it is impossible to assign one to each system call.

In the Linux implementation, all system calls share interrupt 128 (the famous int 0x80). The corresponding interrupt handler is system_call, and every system call is routed through this handler.

Next, system_call dispatches on the system call number passed in EAX and executes the corresponding system call routine. If parameters are needed, they are passed in EBX, ECX, EDX, ESI, and EDI, in that order. After the routine finishes, the result is placed in EAX and returned to the application.

Thus, every system call triggers a complete interrupt handling sequence. Each time, the CPU fetches the corresponding gate descriptor from the interrupt descriptor table (IDT), which is initialized when the system starts, and determines the type of the gate descriptor.

After confirming that the gate descriptor's privilege level (DPL) is numerically not lower than the caller's current privilege level (CPL), the CPU pushes the registers that the interrupt handler may use onto the stack, as directed by the descriptor. It then raises the privilege level and sets the CS and EIP registers so that execution jumps to the code address of the specified system call, and the target system call is executed.
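The register convention described above can be sketched with a few lines of inline assembly. This is only an illustration under stated assumptions: the helper name getpid_via_int80 is hypothetical, the int $0x80 branch is compiled only on 32-bit x86 with GCC or Clang, and on every other target the sketch falls back to syscall() so that it still builds.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue getpid through the legacy soft-interrupt path.
 * getpid takes no parameters, so only EAX is involved here; a call with
 * parameters would additionally load EBX, ECX, EDX, ESI, and EDI.
 */
long getpid_via_int80(void)
{
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)               /* result comes back in EAX  */
                      : "0"((long)SYS_getpid)   /* call number goes into EAX */
                      : "memory");
    return ret;
#else
    /* Assumption: not a 32-bit x86 build, so use the generic entry point. */
    return syscall(SYS_getpid);
#endif
}
```

On either path the helper should return the same value as getpid().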

1.3.2 Based on the SYSENTER instruction

A closer look at interrupt-based system calls shows that many of the preliminary steps are fixed and, in practice, unnecessary, such as checking the gate descriptor's privilege level and looking up the interrupt handler entry.

To avoid these extra checks, Intel added the new SYSENTER instruction to the Pentium II CPU specifically for performing system calls.

This instruction skips the previous check step, switches the CPU directly to privileged mode, and then executes the system call, while adding several special registers to assist in parameter passing and context saving. In addition, the SYSEXIT instruction is added accordingly to return the execution result and switch back to user mode.

After Linux implemented SYSENTER-based system calls, someone compared the efficiency of the two mechanisms on a Pentium III machine. The results show that, compared with interrupt mode, SYSENTER cuts the time spent in user mode by about 45% because the privilege-level checks are omitted, and cuts the time spent in kernel mode by about 2% because no registers need to be saved on the stack.

The interrupt-based mechanism is still retained today. At startup, Linux automatically detects whether the CPU supports the SYSENTER instruction and chooses the system call mechanism accordingly.
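The kernel performs its own detection at boot, but a similar check can be sketched from user space. The following is a minimal sketch, not the kernel's code: the helper name cpu_has_sysenter is hypothetical, it relies on the cpuid.h header shipped with GCC and Clang, and it reads the SEP feature flag (CPUID leaf 1, EDX bit 11), which is how x86 CPUs advertise SYSENTER/SYSEXIT.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Returns 1 if the CPU advertises SYSENTER/SYSEXIT, 0 otherwise. */
int cpu_has_sysenter(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* CPUID leaf 1 is not available */
    return (int)((edx >> 11) & 1u);   /* SEP flag: leaf 1, EDX bit 11 */
#else
    return 0;                     /* assumption: non-x86 CPUs have no SYSENTER */
#endif
}
```

A program could call cpu_has_sysenter() once at startup and report the result; the decision about which entry mechanism is actually used remains with the kernel.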

1.3.3 The origin of the SYSENTER instruction

After introducing the advantages of the SYSENTER instruction, let's go back to its origin.

Support began in the Linux 2.5 kernel. After multiple rounds of testing and many patches, the SYSENTER instruction was officially supported in Linux 2.6, with the implementation written by Linus Torvalds himself.

As mentioned above, the SYSENTER instruction was introduced with the Intel Pentium II CPU as early as 1998, yet it did not appear in the Linux kernel until version 2.5 in 2002. As soon as the instruction appeared, it was hotly discussed in the Linux community.

Later, after the Intel Pentium 4 was released, it turned out that the CPU had a design problem: executing a system call through an interrupt on the Pentium 4 took 5 times more CPU clock cycles than on the Pentium III and AMD Athlon. Linus could not accept this result, so support for the SYSENTER instruction was added to the Linux 2.6 kernel to make system calls more efficient.

Here is a summary of how a system call executes. The process switches from user mode to kernel mode, executes the kernel code that implements the requested function, switches back to user mode when that code finishes, and returns the result to the calling process. Up to Linux 2.4, interrupts were the main way to switch to kernel mode, while Linux 2.6 and later kernels can use the more efficient SYSENTER instruction.

1.4 Standard usage of system calls

As mentioned earlier, "system call" in this course refers by default to the wrapper function in glibc. These functions set up the registers before executing the system call and, where needed, check the validity of the input parameters. After the system call has executed, they read the kernel's result from the EAX register.

When an error occurs while the kernel executes a system call, EAX is set to a negative integer. The wrapper function negates that value, stores it in the global errno, and returns -1. If there is no error, EAX holds a non-negative result (0 for many calls), and the wrapper function returns that value to indicate success. errno does not need to be set in this case.
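The conversion from the raw kernel result to the -1 plus errno convention can be sketched as a small function. This is an illustration rather than glibc's actual code: the helper name wrap_raw_result is hypothetical, and the bound of -4095 reflects the range Linux reserves for error codes, a detail not stated in the text above.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring what a wrapper does with the raw value
 * the kernel leaves in EAX. Values in -4095..-1 are treated as errors.
 */
long wrap_raw_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;   /* drop the sign and publish it as errno */
        return -1;           /* uniform failure indicator */
    }
    return raw;              /* success: pass the value through unchanged */
}
```

For example, a raw result of -ENOENT becomes a return value of -1 with errno set to ENOENT, while a raw result of 0 or any positive count is returned as-is.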

To sum up, the standard usage of a system call is as follows: check whether the wrapper function's return value is negative to determine whether the call succeeded. If it failed, use errno to determine the cause of the error and act differently depending on that cause; if it succeeded, continue with the subsequent logic. The code example is as follows:

int ret = syscallx(...);
if (ret < 0) {
    /* An error occurred: use errno to determine the cause and handle each case. */
} else {
    /* The call succeeded: continue with the normal logic. */
}

Most system calls follow this pattern. errno is an integer, and the corresponding textual description can be obtained with perror or strerror.
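As a concrete instance of this pattern, the following minimal sketch applies it to open. The helper name try_open and its return convention are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open a file read-only.
 * Returns 0 on success or the errno value describing the failure.
 */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        int saved = errno;   /* save it: later calls may overwrite errno */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);               /* success: release the descriptor again */
    return 0;
}
```

Calling try_open on a path that does not exist should return ENOENT and print the matching description. Saving errno immediately after the failing call is a common precaution, because fprintf itself may change it.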

However, a few special system calls differ slightly from this pattern. For some functions, for example, you must reset errno to 0 before the call and then check errno afterward to determine whether the call succeeded. There are only a few such functions; before using one, consult its man page to see how it should be used.
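One function of this kind is getpriority, whose legitimate return values include -1; it is picked here as an example and is not named in the text above. The sketch below shows the reset-then-check idiom; the helper name read_own_priority is hypothetical.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: read the nice value of the calling process.
 * Returns 0 and stores the value in *prio, or returns -1 with errno set.
 */
int read_own_priority(int *prio)
{
    int value;

    errno = 0;                               /* clear before the call */
    value = getpriority(PRIO_PROCESS, 0);    /* 0 means the calling process */
    if (value == -1 && errno != 0)
        return -1;                           /* a real failure */
    *prio = value;                           /* -1 here is a valid priority */
    return 0;
}
```

Without the reset, a stale errno left over from an earlier call could make a valid priority of -1 look like an error.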

This is the end of the usage specification for system calls. At this point, you may have a question: after each system call fails, the errno will be set. In a multithreaded program, will the errno set by system calls in different threads interfere with each other?

If errno were an ordinary global variable, the answer would be yes. In that case system calls would be severely limited, because every call would have to be protected by a lock. Linux is certainly not that weak, so how does it solve the errno problem?

1.5 The multithreading problem of errno

According to the man page, to use errno you first need to include the header file errno.h. Let's see what is in errno.h first.

vim /usr/include/errno.h

Run the above command, and you will find several key lines in the file:

#include <bits/errno.h>
...
#ifndef errno
extern int errno;
#endif

According to the comments in the official code, bits/errno.h should contain a macro definition of errno. If it does not, errno is declared as an external int and therefore becomes an ordinary global integer. Otherwise, errno is a per-thread variable, and each thread has its own copy.

We will cover per-thread variables in more detail later in the course. For now, all you need to know is that each thread has its own independent copy of errno, so threads in a multithreaded program do not interfere with each other when using it.

1.5.1 Implementation principle

How exactly is this done? We can open bits/errno.h and have a look.

#ifndef __ASSEMBLER__
extern int *__errno_location (void) __THROW __attribute__ ((__const__));
#if !defined _LIBC || defined _LIBC_REENTRANT
#define errno (*__errno_location ())
#endif
#endif

It turns out that when the code is not part of libc itself, or when libc is defined as reentrant, errno is defined as a macro that dereferences the memory address returned by the external __errno_location function. In the GCC source code, we also found a stub that defines the __errno_location function in a test case, which reads as follows:

extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));
int *__errno_location (void)
{
  return &__libc_errno;
}

This simple test case fully demonstrates how errno is implemented. errno is backed by the per-thread variable __libc_errno (thread-local storage, marked by __thread), and the __errno_location function returns the address of that thread-local variable. Therefore, when you read or set errno in a thread, you are operating on a variable that belongs to that thread and will not interfere with other threads.

As for the __thread keyword, it takes effect only under rather "stringent" conditions: it requires a Linux 2.6 or later kernel, the pthreads library, and GCC 3.3 or later. Today, however, these are all standard, so the requirement is no longer an obstacle.
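The per-thread behaviour can be observed directly by comparing the address of errno in two threads. The following is a minimal sketch; the names worker and errno_is_per_thread are hypothetical, and on older glibc versions the program may need to be linked with -pthread.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Worker: overwrite errno in this thread and report where it lives. */
static void *worker(void *arg)
{
    errno = EINTR;            /* touches only this thread's copy */
    *(int **)arg = &errno;    /* &errno is the thread's own errno address */
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if a second thread saw a different
 * errno address than the caller, 0 if not, -1 if the thread failed.
 */
int errno_is_per_thread(void)
{
    pthread_t tid;
    int *worker_addr = NULL;
    int *caller_addr = &errno;    /* errno address in the calling thread */

    if (pthread_create(&tid, NULL, worker, &worker_addr) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* The addresses are only compared, never dereferenced after the join. */
    return worker_addr != caller_addr;
}
```

On a system where errno is thread-local, the helper should return 1, because each thread gets a distinct address for its copy.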

1.5.2 Considerations

The above only explains that there is no conflict problem with errno when using system calls in multithreading, but it does not mean that all system calls can be used boldly in multithreaded programs.

To be used safely in a multithreaded program, a system call's implementation must itself be thread-safe (or reentrant, which will be explained in more detail later in the course). For historical reasons and because of how they are implemented, some functions are not thread-safe, such as system(). The same is true of some glibc functions, such as strerror, which internally uses a static buffer to store the description of an errno value, so the most recent call overwrites the result of the previous one.

That concludes this overview of the Linux system programming conventions for system calls. I hope the content above is of some help to you.