2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains the basic process of data copying. The methods introduced here are simple, fast and practical, so let's walk through them.
Basic process of data copy
On a Linux system, cache and memory capacity is limited, so most data is stored on disk.
For Web servers, it is often necessary to read data from disk to memory and then transmit it to the user through the network card:
That data flow is only the big picture, so let's look at several ways it can be carried out:
① CPU-only mode
As shown above:
When the application needs to read disk data, it calls read(), which switches from user mode to kernel mode. The read() system call is ultimately carried out by the CPU.
The CPU issues an I/O request to the disk; on receiving it, the disk begins preparing the data.
Once the disk has placed the data in the disk buffer, it raises an I/O interrupt to tell the CPU that the data is ready.
After receiving the interrupt from the disk controller, the CPU copies the data. When the copy is done, read() returns and execution switches from kernel mode back to user mode.
② CPU&DMA mode
CPU time is precious, and having the CPU do chores is a waste of resources.
Direct memory access (DMA) is a mechanism by which hardware devices access memory independently, bypassing the CPU.
DMA therefore frees the CPU to some extent: the chores are handed to the hardware, which improves CPU efficiency.
Hardware that currently supports DMA includes network cards, sound cards, graphics cards, disk controllers and so on.
With DMA involved, the process changes somewhat:
The main change is that the CPU no longer interacts directly with the disk. Instead, the DMA controller interacts with the disk and copies the data from the disk buffer to the kernel buffer. The remaining steps are similar.
Key point: both CPU-only mode and CPU&DMA mode involve multiple redundant data copies and multiple switches between kernel mode and user mode.
Let's continue with the detailed process by which a Web server reads file data from the local disk and sends it to the user over the network.
Normal mode data interaction
A single complete data exchange involves several parts: the system call (syscall), CPU, DMA, network card, disk and so on.
The system call is the bridge between the application and the kernel, and every call/return pair causes two switches:
Calling the syscall switches from user mode to kernel mode.
When the syscall returns, execution switches from kernel mode back to user mode.
Take a look at the complete schematic diagram of the data copying process:
The process of reading data:
The application needs to read disk data and calls read(), switching from user mode to kernel mode. This is the first state switch.
The DMA controller copies the data from disk to the kernel buffer, which is the first DMA copy.
The CPU copies the data from the kernel buffer to the user buffer, which is the first CPU copy.
After the CPU copy completes, read() returns, switching from kernel mode back to user mode. This is the second state switch.
The process of writing data:
The application needs to write data to the network card and calls write(), switching from user mode to kernel mode. This is the first state switch.
The CPU copies the data from the user buffer to the socket buffer in the kernel, which is the first CPU copy.
The DMA controller copies the data from the socket buffer to the network card, which is the first DMA copy.
After the copy completes, write() returns, switching from kernel mode back to user mode. This is the second state switch.
To sum up:
The reading process involves 2 state switches, 1 DMA copy, and 1 CPU copy.
The writing process involves 2 state switches, 1 DMA copy, and 1 CPU copy.
The traditional mode is therefore inefficient: one read plus one write costs 4 state switches and 4 copies, 2 of them CPU copies. This is where zero-copy technology comes in.
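The traditional read-then-write path summarized above can be sketched in Python roughly as follows. This is a minimal sketch, not the article's own code: the names `copy_with_read_write` and `demo_read_write` are hypothetical, and a local `socket.socketpair()` stands in for a real client connection, so no network card is involved in the demo.

```python
import os
import socket
import tempfile


def copy_with_read_write(in_fd: int, out_sock: socket.socket, chunk_size: int = 4096) -> int:
    """Relay a file to a socket through a user-space buffer; return bytes sent."""
    total = 0
    while True:
        # read(): the data lands in a user-space bytes object
        # (kernel buffer -> user buffer is the CPU copy on the read side).
        chunk = os.read(in_fd, chunk_size)
        if not chunk:  # an empty result means end of file
            break
        # sendall(): the same bytes are copied back into the kernel
        # (user buffer -> socket buffer is the CPU copy on the write side).
        out_sock.sendall(chunk)
        total += len(chunk)
    return total


def demo_read_write(payload: bytes = b"hello zero-copy\n" * 64) -> bytes:
    """Write payload to a temp file, relay it, and return what the peer received."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)  # rewind before reading
        sender, receiver = socket.socketpair()  # assumption: stands in for a client
        with sender, receiver:
            copy_with_read_write(fd, sender)
            sender.shutdown(socket.SHUT_WR)  # signal end of stream to the peer
            received = bytearray()
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return bytes(received)
```

Every loop iteration pays for two system calls, and every byte passes through the `chunk` object in user space even though the program never modifies it.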
Zero copy technology
Cause of occurrence
We can see that if the application does not modify the data, copying it from the kernel buffer to the user buffer and then from the user buffer back to the kernel buffer is redundant.
Both copies require the CPU and involve multiple switches between user mode and kernel mode, which adds to the CPU's burden.
We need to cut the redundant copies and free the CPU, and that is the goal of zero-copy technology.
Solution idea
Current implementations of zero-copy technology include mmap+write, sendfile, sendfile+DMA collection, splice and so on.
① mmap mode
mmap is a memory-mapped file mechanism provided by Linux. It maps the address of the kernel's read buffer to a buffer address in user space, so that the kernel buffer and the user buffer are shared.
This eliminates the CPU copy between kernel space and user space, but one CPU copy remains inside kernel space, from the kernel buffer to the socket buffer.
mmap has some advantages for large file transfers, but small files may cause fragmentation, and when several processes operate on the same file a signal that triggers a core dump may be raised (for example SIGBUS, if another process truncates the mapped file).
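The mmap+write approach can be sketched with Python's standard `mmap` module. This is an illustrative sketch under assumptions: `send_with_mmap` and `demo_mmap` are hypothetical names, and a local `socket.socketpair()` again stands in for a client connection.

```python
import mmap
import os
import socket
import tempfile


def send_with_mmap(in_fd: int, out_sock: socket.socket) -> int:
    """Map the file into the process and send it without a read() copy."""
    size = os.fstat(in_fd).st_size
    if size == 0:  # a zero-length file cannot be mapped
        return 0
    with mmap.mmap(in_fd, size, access=mmap.ACCESS_READ) as mapped:
        # The memoryview exposes the mapped pages directly, so no bytes
        # object is built in user space; the kernel still copies the data
        # into the socket buffer when sendall() runs.
        with memoryview(mapped) as view:
            out_sock.sendall(view)
    return size


def demo_mmap(payload: bytes = b"hello zero-copy\n" * 64) -> bytes:
    """Write payload to a temp file, send it via mmap, return what was received."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()  # make sure the data is visible through the mapping
        sender, receiver = socket.socketpair()  # assumption: stands in for a client
        with sender, receiver:
            send_with_mmap(f.fileno(), sender)
            sender.shutdown(socket.SHUT_WR)
            received = bytearray()
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return bytes(received)
```

The program still makes two kinds of system call (one to map, one to send), which matches the point that this approach removes a copy but not the state switches.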
② sendfile mode
mmap+write brings some improvement, but the state switches caused by the system calls are not reduced.
The sendfile system call was introduced in version 2.1 of the Linux kernel. It establishes a transfer channel between two files.
With sendfile, a single call does the work that read+write or mmap+write did before, so there are two fewer state switches. Because the data never passes through the user buffer, the application cannot modify it.
As the figure shows, the application only needs to call sendfile: 2 state switches, 1 CPU copy, and 2 DMA copies.
However, sendfile still performs one CPU copy from the kernel buffer to the socket buffer, and this can be optimized further.
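On Linux, Python exposes this system call as `os.sendfile`, so the single-call transfer can be sketched as below. The helper names `send_with_sendfile` and `demo_sendfile` are hypothetical, and the `socket.socketpair()` peer is an assumption standing in for a client connection.

```python
import os
import socket
import tempfile


def send_with_sendfile(in_fd: int, out_sock: socket.socket) -> int:
    """Send a whole file to a socket with sendfile; return bytes sent."""
    size = os.fstat(in_fd).st_size
    offset = 0
    while offset < size:
        # One system call moves data from the file to the socket inside
        # the kernel; nothing is copied into a user-space buffer.
        sent = os.sendfile(out_sock.fileno(), in_fd, offset, size - offset)
        if sent == 0:  # nothing more could be sent
            break
        offset += sent
    return offset


def demo_sendfile(payload: bytes = b"hello zero-copy\n" * 64) -> bytes:
    """Write payload to a temp file, send it with sendfile, return what arrived."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        sender, receiver = socket.socketpair()  # assumption: stands in for a client
        with sender, receiver:
            send_with_sendfile(f.fileno(), sender)
            sender.shutdown(socket.SHUT_WR)
            received = bytearray()
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return bytes(received)
```

Note that the payload never appears as a Python object on the sending side, which is exactly why the application has no chance to modify it.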
③ sendfile+DMA collection (scatter/gather)
The Linux 2.4 kernel optimized the sendfile system call, but the optimization requires the cooperation of the hardware DMA controller.
The upgraded sendfile no longer copies the data itself into the socket buffer. It records only descriptive information about the data in the kernel buffer (its address, offset and so on).
The DMA controller then copies the data from the kernel buffer straight to the network card according to the address and offset recorded in the socket buffer, which eliminates the one remaining CPU copy in kernel space.
This method has 2 state switches, 0 CPU copies, and 2 DMA copies. It still cannot modify the data, it requires hardware-level DMA support, and sendfile can only copy file data to a socket descriptor, so it has some limitations.
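The counts quoted so far can be tabulated to see what each mode saves. This is only a bookkeeping sketch: the `CopyCost` type and `savings` helper are hypothetical, the read+write row adds the read and write totals given earlier, and the mmap+write row is inferred from the text (state switches unchanged, one CPU copy removed, DMA copies assumed unchanged).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyCost:
    state_switches: int
    cpu_copies: int
    dma_copies: int


# Per-transfer costs for sending one file to the network.
COSTS = {
    "read+write": CopyCost(4, 2, 2),               # read totals + write totals
    "mmap+write": CopyCost(4, 1, 2),               # inferred, see lead-in
    "sendfile": CopyCost(2, 1, 2),                 # stated in the text
    "sendfile+DMA collection": CopyCost(2, 0, 2),  # stated in the text
}


def savings(mode: str, baseline: str = "read+write") -> CopyCost:
    """Return how many switches and copies a mode saves against the baseline."""
    base, cur = COSTS[baseline], COSTS[mode]
    return CopyCost(
        base.state_switches - cur.state_switches,
        base.cpu_copies - cur.cpu_copies,
        base.dma_copies - cur.dma_copies,
    )
```

For example, `savings("sendfile+DMA collection")` shows 2 fewer state switches and 2 fewer CPU copies than read+write, with the 2 DMA copies unchanged.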
④ splice mode
The splice system call was introduced in Linux 2.6 (kernel 2.6.17). It needs no hardware support and is no longer limited to sockets, so it can achieve zero-copy transfer of data between two ordinary files.
splice can set up a pipe between the kernel buffer and the socket buffer to transfer the data, avoiding the CPU copy between the two.
splice has a limitation of its own: one of its two file descriptor arguments must be a pipe.
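Because one end of every splice call must be a pipe, a file-to-socket transfer takes two splice steps with a pipe in the middle. The sketch below uses `os.splice`, which is available on Linux in Python 3.10 and later. The names `splice_file_to_socket` and `demo_splice` are hypothetical, and the `socket.socketpair()` peer is an assumption standing in for a client connection.

```python
import os
import socket
import tempfile


def splice_file_to_socket(in_fd: int, out_sock: socket.socket, chunk: int = 65536) -> int:
    """Move a file to a socket through a pipe using splice; return bytes moved."""
    size = os.fstat(in_fd).st_size
    pipe_r, pipe_w = os.pipe()  # the pipe required by splice
    try:
        offset = 0
        while offset < size:
            # Step 1: file -> pipe, reading from an explicit file offset.
            moved = os.splice(in_fd, pipe_w, min(chunk, size - offset), offset_src=offset)
            if moved == 0:
                break
            offset += moved
            # Step 2: pipe -> socket, draining everything just queued.
            pending = moved
            while pending > 0:
                pending -= os.splice(pipe_r, out_sock.fileno(), pending)
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
    return offset


def demo_splice(payload: bytes = b"hello zero-copy\n" * 64) -> bytes:
    """Write payload to a temp file, splice it to a socket, return what arrived."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        sender, receiver = socket.socketpair()  # assumption: stands in for a client
        with sender, receiver:
            splice_file_to_socket(f.fileno(), sender)
            sender.shutdown(socket.SHUT_WR)
            received = bytearray()
            while True:
                data = receiver.recv(4096)
                if not data:
                    break
                received += data
    return bytes(received)
```

The same two-step pattern works between two ordinary files by replacing the socket descriptor with the destination file's descriptor.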
At this point, you should have a deeper understanding of the basic process of data copying. The best way to consolidate it is to try it out in practice.
© 2024 shulou.com SLNews company. All rights reserved.