Implementing Zero Copy on Linux

2025-07-06 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01 report.

How is zero copy implemented on Linux? Many newcomers are unclear about this, so this article explains it in detail.

What is Linux

Linux is a freely usable and freely distributable UNIX-like operating system. It is a POSIX-based, multi-user, multitasking operating system with multithreading and multi-CPU support, and it runs the major Unix tools, applications, and network protocols.

Traditional data transmission method

For a long time, data copying was understood only at the application layer, but far more copying goes on behind the scenes than one might expect. To transfer data, a user application allocates a buffer of suitable size to hold the data, reads the data from a file into that buffer, and sends it out over the network interface card (NIC). Just two system calls, read() and write(), complete the transfer, and the application has no idea how many copies the operating system makes along the way. In some cases these copy operations greatly degrade data transfer performance.

The traditional data copy method, as shown below:

Steps involved:

(1) The read() call triggers a context switch from user mode to kernel mode (the first switch). Internally, sys_read() (or an equivalent) is issued to read data from the device, and the direct memory access (DMA) engine performs the first copy: it reads the contents from disk and stores them in a buffer in kernel address space.

(2) The data is copied from the kernel read buffer to the user buffer (the second copy), and the read() call returns. The return triggers a switch from kernel mode to user mode (the second switch). The data now sits in the user-space buffer.

(3) The send() socket call causes a context switch from user mode to kernel mode (the third switch), and the data is copied into a kernel address space buffer again (the third copy). This time the buffer is one associated with the destination socket.

(4) The send() system call returns, switching from kernel mode to user mode (the fourth switch), and the DMA engine transfers the data from the kernel buffer to the protocol engine (the fourth copy).
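The four steps above correspond to the classic read-then-send loop. The following is a minimal sketch in Python, whose os and socket modules wrap the same system calls; the function name send_file_read_write and the 64 KiB buffer size are assumptions made for illustration, not part of the original article.

```python
import os
import socket


def send_file_read_write(path, sock, bufsize=65536):
    """Send a file over a connected socket the traditional way.

    Each loop iteration crosses the user/kernel boundary twice and
    stages the data in a user-space buffer (the `data` object).
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        while True:
            # Steps (1)-(2): disk -> kernel buffer -> user buffer.
            data = os.read(fd, bufsize)
            if not data:
                break
            # Steps (3)-(4): user buffer -> socket buffer -> protocol engine.
            sock.sendall(data)
            total += len(data)
    finally:
        os.close(fd)
    return total
```

The user-space buffer exists only to hand the bytes straight back to the kernel, which is the copy the zero-copy techniques described later try to remove.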

DMA allows I/O data to be transferred directly between a peripheral and main memory. DMA is system-dependent: each architecture handles DMA transfers differently, and the programming interfaces differ as well. A transfer can be triggered in two ways: software requests the data, or the hardware pushes data to the system asynchronously. read() is an example of the first way, and its steps are as follows:

(1) When a process calls read(), the driver allocates a DMA buffer, instructs the hardware to transfer its data, and the process goes to sleep.

(2) The hardware writes the data to the DMA buffer and raises an interrupt when it is finished.

(3) The interrupt handler obtains the input data, acknowledges the interrupt, and finally wakes up the process, which can now read the data.
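The request, interrupt, and wake-up sequence above can be mimicked in user space. The sketch below is only an analogy and performs no real DMA: the ToyDevice class is hypothetical, a background thread stands in for the hardware, and a threading.Event stands in for the interrupt that wakes the sleeping process.

```python
import threading


class ToyDevice:
    """User-space stand-in for a device and its driver (not real DMA)."""

    def __init__(self, payload):
        self._payload = payload
        self._dma_buffer = None
        self._done = threading.Event()  # plays the role of the wake-up

    def _hardware_transfer(self):
        # Step (2): the "hardware" fills the buffer, then raises an "interrupt".
        self._dma_buffer[: len(self._payload)] = self._payload
        self._interrupt_handler()

    def _interrupt_handler(self):
        # Step (3): acknowledge the interrupt and wake the sleeping reader.
        self._done.set()

    def read(self):
        # Step (1): allocate the buffer, start the transfer, go to sleep.
        self._dma_buffer = bytearray(len(self._payload))
        self._done.clear()
        threading.Thread(target=self._hardware_transfer).start()
        self._done.wait()
        return bytes(self._dma_buffer)
```

The point of the model is that the caller does no work while the transfer runs; it simply sleeps until it is told the buffer is ready.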

Thus, traditional data transmission involves a total of 4 data copies and 4 context switches, which can significantly hurt server performance.

Overview of Zero copy

Simply put, zero copy is a technique that spares the CPU from copying data from one memory area to another. The goals of zero-copy technology are:

Avoid data copying

# avoid data copying between operating system kernel buffers

# avoid data copying between the operating system kernel and the user application address space

# user applications can bypass the operating system and access hardware storage directly

# data transmission should be handled by DMA as far as possible.

Combine multiple operations

# avoid unnecessary system calls and context switching

# data that needs to be copied can be cached first

# the processing of data should be done by hardware as much as possible.

Classification of implementation methods of zero copy

Direct IO

The main purpose is to reduce the CPU usage and memory bandwidth overhead of reading and writing files by reducing the number of data copies between the operating system kernel buffer and the application address space. For some special applications, such as self-caching applications, it is a good choice. When a large amount of data has to be transferred, using direct IO, which moves the data without copying it through the operating system kernel address space, improves performance.

Direct IO is not beneficial in all cases. Setting up direct IO is expensive, and it forgoes the benefits of cached IO. A direct IO read causes a synchronous read from disk, so the process takes longer to complete, while direct IO writes can make the application slow to close. Applications that use direct IO for data transfer therefore usually combine it with asynchronous IO.

The Linux kernel already supports direct IO on block devices. To access a file directly, without going through the operating system page cache, an application specifies the O_DIRECT flag when it opens the file with the open() system call.

In short, with this mode of data transfer the application accesses the hardware storage directly and the operating system kernel only assists the transfer. It is generally used when the operating system does not need to process the data: the data moves directly between a buffer in the application address space and the disk, with no need for the page cache support of the Linux kernel.
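As a rough illustration of opening a file with O_DIRECT, the sketch below reads a file into a page-aligned buffer. The helper names and the 4096-byte block size are assumptions: direct IO generally requires the buffer, offset, and length to be aligned to the device's logical block size, which real code should query rather than hard-code. Some filesystems reject O_DIRECT altogether, so the sketch falls back to ordinary cached IO in that case.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; query the real logical block size in practice


def _drain(fd, buf):
    """Read fd to EOF through buf, one block at a time."""
    chunks = []
    while True:
        n = os.readv(fd, [buf])
        if n == 0:
            return b"".join(chunks)
        chunks.append(buf[:n])


def read_file(path, direct=True):
    """Read a whole file, optionally bypassing the page cache."""
    flags = os.O_RDONLY
    if direct:
        flags |= getattr(os, "O_DIRECT", 0)  # Linux-specific flag
    fd = os.open(path, flags)
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping, hence page-aligned
    try:
        return _drain(fd, buf)
    finally:
        buf.close()
        os.close(fd)


def read_direct_or_cached(path):
    """Return (data, used_direct); fall back if O_DIRECT is refused."""
    try:
        return read_file(path, direct=True), True
    except OSError:
        return read_file(path, direct=False), False
```

The anonymous mmap is used here only as a convenient way to obtain an aligned buffer from Python; in C the usual choice would be posix_memalign().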

Zero-copy techniques in which data does not pass through the application address space

These techniques avoid copying data between buffers in the kernel address space and buffers in the user application address space during a transfer. Sometimes the application does not need to access the data while it is being transferred, so copying the data from the Linux page cache into the user process buffer can be avoided entirely and the data can be handled within the page cache. In some cases this kind of zero copy achieves good performance. The main system calls of this kind on Linux are mmap(), sendfile(), and splice().

Using mmap() instead of read() reduces the number of CPU copies. When the application calls mmap(), the data is copied by DMA into a kernel buffer, and that buffer is shared between the application and the operating system. The kernel and the application's address space therefore no longer need to copy data between each other. When write() is then called, the data is copied from the kernel buffer to the socket buffer and from there to the protocol engine.
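A minimal sketch of the mmap-then-write pattern, assuming a connected socket; the function name send_file_mmap is made up for this example. The file is mapped read-only and the mapping is passed to the socket directly, so no separate user-space buffer is filled first.

```python
import mmap
import os
import socket


def send_file_mmap(path, sock):
    """Send a file over a connected socket using mmap() instead of read()."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The memoryview exposes the mapped pages without copying them
            # into a Python bytes object; it must be released before the
            # mapping is closed, which the nested `with` guarantees.
            with memoryview(mapped) as view:
                sock.sendall(view)
    return size
```

The kernel still copies the data from the page cache into the socket buffer, so this removes one CPU copy rather than all of them.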

This approach also suits cases where the transmitted data does not need to be processed by the operating system kernel, or is passed on directly without being processed by the application. mmap() can also be used with sockets, but only with raw sockets. For a traditional C/S (client/server) online game architecture it is of little use.
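Of the system calls named in this section, sendfile() goes one step further than mmap(): the application never touches the data at all. The sketch below uses Python's os.sendfile wrapper with the Linux argument order (out_fd, in_fd, offset, count); the function name send_file_sendfile is an assumption made for this example.

```python
import os
import socket


def send_file_sendfile(path, sock):
    """Send a file over a connected socket with sendfile()."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # The kernel moves the bytes from the page cache to the socket;
            # no user-space buffer is involved.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break  # unexpected end of file
            offset += sent
    return offset
```

A single system call per chunk replaces the read() and send() pair, which also cuts the number of context switches.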

Zero-copy techniques that optimize data transfer between the application address space and the kernel

These techniques optimize the transfer of data between the Linux page cache and the user process buffer. They focus on handling the copy between the user process buffer and the operating system page cache more flexibly. This approach keeps the traditional way of communicating but makes it more flexible. On Linux it relies mainly on copy-on-write.

Copy-on-write is a common optimization strategy in computer programming. The basic idea is this: if several applications need to access the same piece of data at the same time, each of them can be given a pointer to that data. From each application's point of view it owns a copy of the data, and only when one of them needs to modify its copy is the data actually copied into that application's address space. If an application never modifies the data, the data never has to be copied into its address space. Some implementations of string in the C++ STL use a similar strategy.
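One place where copy-on-write can be observed from user space is a private file mapping. The sketch below maps the same file twice with mmap.ACCESS_COPY (a private, copy-on-write mapping) and writes through one mapping only. The function name cow_demo and the variable names are assumptions made for this example.

```python
import mmap


def cow_demo(path):
    """Write through one private mapping and report who sees the change."""
    with open(path, "rb") as f:
        writer = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
        reader = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
        try:
            # Until this write, both mappings can share the same pages.
            # The write gives `writer` its own private copy of the page.
            writer[0:1] = b"X"
            seen_by_writer = writer[:5]
            seen_by_reader = reader[:5]
        finally:
            writer.close()
            reader.close()
    with open(path, "rb") as f:
        on_disk = f.read(5)  # the file itself is never modified
    return seen_by_writer, seen_by_reader, on_disk
```

For a file that starts with "hello", only the writing mapping sees "Xello"; the second mapping and the file on disk still read "hello", because the copy is made only for the party that modifies the data.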
