What is the implementation of zero copy in Java?


2025-07-01 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01 Report

This article explains how zero copy is implemented in Java. It first reviews the traditional IO path and the operating system concepts behind zero copy, then covers the Linux mechanisms (mmap + write, sendfile, and sendfile with DMA scatter/gather) and the Java NIO APIs built on them.

1. What is zero copy?

Taken literally, zero copy has two parts, "zero" and "copy":

Copy: the transfer of data from one storage area to another.

Zero: the count is 0, that is, the data is copied zero times.

Together, zero copy means that data does not need to be copied from one storage area to another.

In practice, zero copy means that when a computer performs an IO operation, the CPU does not need to copy data from one storage area to another, which reduces context switches and CPU copy time. It is an IO optimization technique.

2. The execution process of traditional IO

If you work on server-side development, you have probably implemented file download many times. Suppose you write a web program and the front end requests a file: the server's task is to send the file on the server host's disk out through the connected socket. The key implementation code is as follows:

while ((n = read(diskfd, buf, BUF_SIZE)) > 0) write(sockfd, buf, n);

The traditional IO process consists of a read and a write.

read: data is read from disk into the kernel buffer and then copied to the user buffer.

write: data is first copied from the user buffer to the socket buffer and finally sent to the network card device.

The flow is as follows:

The user application process calls the read function, initiating an IO call to the operating system, and the context switches from user mode to kernel mode (switch 1).

The DMA controller reads data from the disk into the kernel buffer.

The CPU copies the data from the kernel buffer to the user application buffer, the context switches from kernel mode to user mode (switch 2), and the read function returns.

The user application process initiates an IO call through the write function, and the context switches from user mode to kernel mode (switch 3).

The CPU copies the data from the user buffer to the socket buffer.

The DMA controller copies the data from the socket buffer to the network card device, the context switches from kernel mode to user mode (switch 4), and the write function returns.

As the flow shows, the traditional IO read and write process involves four context switches (four switches between user mode and kernel mode) and four data copies (two CPU copies and two DMA copies). What is a DMA copy? Let's review the operating system concepts involved in zero copy.
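For comparison with the Java APIs discussed later, the traditional read/write loop can be sketched in Java as follows. This is a minimal illustration, not code from the original article: the class name TraditionalCopy, the method copy, and the buffer size are hypothetical. The byte array plays the role of the user buffer, so every chunk passes through user space on its way from the file to the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the traditional (non zero-copy) path.
class TraditionalCopy {
    static final int BUF_SIZE = 8192; // assumed buffer size, chosen for illustration

    // Reads the file chunk by chunk into a user-space buffer and writes each
    // chunk to the output. Returns the total number of bytes copied.
    static long copy(Path source, OutputStream out) throws IOException {
        byte[] buf = new byte[BUF_SIZE]; // the "user buffer" from the steps above
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int n;
            while ((n = in.read(buf)) != -1) { // read: kernel buffer -> user buffer
                out.write(buf, 0, n);          // write: user buffer -> destination
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("traditional", ".txt");
        Files.write(tmp, "hello zero copy".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for a socket
        long copied = copy(tmp, sink);
        System.out.println(copied + " bytes copied");
        Files.delete(tmp);
    }
}
```

In a real server the OutputStream would come from a socket; an in-memory sink is used here only so the sketch runs on its own.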

3. Review of zero copy related knowledge points

3.1 Kernel space and user space

Applications running on a computer must go through the operating system to perform certain privileged operations, such as reading and writing disk files or managing memory. Because these operations are dangerous, applications are not allowed to perform them directly; they are left to the underlying operating system.

Therefore, the operating system divides the memory space of each process into user space and kernel space. Kernel space is the protected memory area accessed by the operating system kernel, while user space is the memory area accessed by user applications. On a 32-bit operating system, for example, each process gets a 4 GB (2^32 bytes) address space.

Kernel space: mainly provides process scheduling, memory allocation, access to hardware resources, and similar functions.

User space: the space provided to individual application processes; it has no direct access to kernel space resources. If an application needs kernel space resources, it must go through a system call: the process switches from user space to kernel space and switches back to user space after the operation completes.

3.2 What are user mode and kernel mode

If a process is running in kernel space, it is said to be in kernel mode (kernel state).

If a process is running in user space, it is said to be in user mode (user state).

3.3 What is context switching

What is the CPU context?

CPU registers are small but extremely fast memory built into the CPU. The program counter stores the location of the instruction the CPU is executing, or of the next instruction to be executed. The CPU depends on both before it can run any task, so together they are called the CPU context.

What is CPU context switching?

It means saving the CPU context (the CPU registers and program counter) of the previous task, loading the context of the new task into the registers and program counter, and finally jumping to the location the program counter points to in order to run the new task.

Generally speaking, context switching means that the kernel (the core of the operating system) switches processes or threads on the CPU. The transition of a process from user mode to kernel mode is made through a system call, and the CPU context is also switched during a system call.

The user mode instruction location held in the CPU registers is saved first. Next, to execute kernel mode code, the CPU registers are updated to the location of the kernel mode instructions. Finally, execution jumps to kernel mode to run the kernel task.

3.4 Virtual memory

Modern operating systems use virtual memory, that is, virtual addresses instead of physical addresses. Using virtual memory has two benefits:

The virtual memory space can be much larger than the physical memory space.

Multiple virtual addresses can point to the same physical address.

Because multiple virtual addresses can point to the same physical address, the virtual addresses of kernel space and user space can be mapped to the same physical address. This reduces the number of data copies during IO.
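The idea that several virtual addresses can share the same physical memory can be observed from Java, under some assumptions. The sketch below is not from the original article: the class SharedMappingDemo and its method are hypothetical names, and it assumes a platform such as Linux where two shared mappings of the same file are backed by the same page cache pages. The Java specification does not guarantee that a write through one mapping is visible through another, so treat this as an illustration rather than portable behavior.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: two mappings (two virtual address ranges) of one file.
class SharedMappingDemo {
    static final int REGION_SIZE = 4096; // assumed mapping size for illustration

    // Maps the same file region twice, writes one byte through the first
    // mapping, and returns the byte read back through the second mapping.
    static byte writeThenReadThroughOtherMapping(Path file, byte value) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer first = ch.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
            MappedByteBuffer second = ch.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
            first.put(0, value);  // write via the first virtual address range
            return second.get(0); // read via the second virtual address range
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("shared-mapping", ".bin");
        tmp.toFile().deleteOnExit(); // a mapped file may not be deletable right away
        byte seen = writeThenReadThroughOtherMapping(tmp, (byte) 42);
        System.out.println("read through second mapping: " + seen);
    }
}
```

No explicit copy takes place between the two buffers; where the platform shares the underlying pages, the second mapping sees the write because both address ranges refer to the same physical memory.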

3.5 DMA technology

DMA stands for Direct Memory Access. A DMA controller is essentially an independent chip on the motherboard that allows peripheral devices and memory to transfer IO data directly, without the participation of the CPU.

Let's take a look at the IO process and what DMA does.

The user application process calls the read function, initiating an IO call to the operating system, then enters the blocked state and waits for the data to return.

After receiving the call, the CPU issues the IO request to the DMA controller.

After receiving the IO request, the DMA controller sends the request to the disk.

The disk puts the data into the disk controller buffer and notifies the DMA controller.

The DMA controller copies the data from the disk controller buffer to the kernel buffer.

The DMA controller signals to the CPU that the data has been read and hands the work back to the CPU, which is responsible for copying the data from the kernel buffer to the user buffer.

The user application process switches from kernel mode to user mode and leaves the blocked state.

As you can see, the job of the DMA controller is clear: it forwards IO requests and copies data on behalf of the CPU. Why is it needed?

The main reason is efficiency. While the DMA controller handles the transfer, the CPU is free to do other things, which improves CPU utilization. Put plainly, the CPU is too busy, so it found a helper (the DMA controller) to do part of the copying work, freeing the CPU for other tasks.

4. Several ways to implement zero copy

Zero copy does not mean that the data is never copied; it means reducing the number of switches between user mode and kernel mode and the number of CPU copies. There are several ways to implement zero copy:

mmap + write

sendfile

sendfile with DMA scatter/gather

4.1 Zero copy implemented by mmap+write

The function prototype of mmap is as follows:

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

addr: specifies the virtual memory address of the mapping

length: the length of the mapping

prot: the protection mode of the mapped memory

flags: specifies the type of mapping

fd: the file descriptor to map

offset: the file offset

In the review section above, we saw that virtual memory can map virtual addresses in kernel space and user space to the same physical address, which reduces the number of data copies. mmap relies on this feature: it maps the read buffer in the kernel to a buffer in user space, and all IO is completed in the kernel.

The zero-copy process implemented by mmap+write is as follows:

The user process initiates an IO call to the operating system kernel through mmap, and the context switches from user mode to kernel mode.

The CPU uses the DMA controller to copy the data from the hard disk to the kernel buffer.

The context switches from kernel mode to user mode, and the mmap call returns.

The user process initiates an IO call to the operating system kernel through write, and the context switches from user mode to kernel mode.

The CPU copies the data from the kernel buffer to the socket buffer.

The CPU uses the DMA controller to copy the data from the socket buffer to the network card, the context switches from kernel mode to user mode, and the write call returns.

As you can see, zero copy implemented with mmap + write involves 4 context switches between user space and kernel space and 3 data copies: 2 DMA copies and 1 CPU copy.

mmap maps the address of the kernel read buffer to the address of the user buffer, so the kernel buffer and the application buffer are shared. This saves one CPU copy. In addition, because the user process memory is virtual and merely maps to the read buffer of the kernel, it can save half the memory space.

4.2 Zero copy implemented by sendfile

sendfile is a system call introduced after Linux kernel version 2.1. Its API is as follows:

ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

out_fd: the file descriptor to be written to, a socket descriptor.

in_fd: the file descriptor to read from; it must refer to a real file, not a socket or a pipe.

offset: specifies the position at which to start reading the file. If NULL, data is read starting at the current file offset of in_fd.

count: the number of bytes to transfer between out_fd and in_fd.

sendfile transfers data between two file descriptors. It operates inside the operating system kernel and avoids copying data between the kernel buffer and the user buffer, so it can be used to implement zero copy.

The zero-copy process implemented by sendfile is as follows:


The user process initiates a sendfile system call, and the context switches from user mode to kernel mode (switch 1).

The DMA controller copies the data from the hard disk to the kernel buffer.

The CPU copies the data from the kernel read buffer to the socket buffer.

The DMA controller asynchronously copies the data from the socket buffer to the network card.

The context switches from kernel mode to user mode (switch 2), and the sendfile call returns.

As you can see, zero copy implemented with sendfile involves 2 context switches between user space and kernel space and 3 data copies: 2 DMA copies and 1 CPU copy. Can the number of CPU copies be reduced to zero? Yes, with sendfile plus DMA scatter/gather.

4.3 Zero copy implemented by sendfile + DMA scatter/gather

Since Linux 2.4, sendfile has been optimized with SG-DMA technology: a scatter/gather operation is added to the DMA copy, which lets the DMA controller read the data directly from the kernel buffer into the network card. With this feature, one more CPU copy is saved.

The zero-copy process implemented by sendfile+DMA scatter/gather is as follows:

The user process initiates a sendfile system call, and the context switches from user mode to kernel mode (switch 1).

The DMA controller copies the data from the hard disk to the kernel buffer.

The CPU sends descriptor information for the data in the kernel buffer (including the memory address and offset of the kernel buffer) to the socket buffer.

Using this descriptor information, the DMA controller copies the data directly from the kernel buffer to the network card.

The context switches from kernel mode to user mode (switch 2), and the sendfile call returns.

As you can see, zero copy implemented with sendfile + DMA scatter/gather involves 2 context switches between user space and kernel space and 2 data copies, both of which are DMA copies. This is true zero copy: the CPU never moves the data, and all of it is transferred through DMA.

5. Zero copy methods provided by Java

Java NIO support for mmap

Java NIO support for sendfile

5.1 Java NIO support for mmap

Java NIO has a MappedByteBuffer class that can be used for memory mapping. On Linux, it is backed by the mmap API of the kernel.

A small mmap demo:

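import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapTest {
    public static void main(String[] args) {
        // try-with-resources closes both channels even if an exception is thrown
        try (FileChannel readChannel = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"),
                     StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            // map the whole file; a read-only mapping cannot extend past the end of the file
            MappedByteBuffer data = readChannel.map(FileChannel.MapMode.READ_ONLY, 0, readChannel.size());
            // data transfer; write may be partial, so loop until the buffer is drained
            while (data.hasRemaining()) {
                writeChannel.write(data);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}

5.2 Java NIO support for sendfile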

FileChannel's transferTo() / transferFrom() can use the sendfile() system call under the hood on Linux. The open source project Kafka uses it, so when an interviewer asks why Kafka is so fast, you can mention zero copy via sendfile.

@Override
public long transferFrom(FileChannel fileChannel, long position, long count) throws IOException {
    return fileChannel.transferTo(position, count, socketChannel);
}

A small sendfile demo:

import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SendFileTest {
    public static void main(String[] args) {
        // try-with-resources closes both channels even if an exception is thrown
        try (FileChannel readChannel = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"),
                     StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            long len = readChannel.size();
            long position = readChannel.position();
            // data transfer
            readChannel.transferTo(position, len, writeChannel);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
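The sendfile demo calls transferTo once. Because transferTo may transfer fewer bytes than requested, a more defensive variant loops until the whole file has been sent. The sketch below is one possible way to write that loop, not code from the original article: the class TransferAll and the method transferAll are hypothetical names, and it assumes a blocking target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper that repeats transferTo until the whole file is sent.
class TransferAll {
    // Transfers the full content of source to target and returns the number
    // of bytes transferred. Assumes target is in blocking mode.
    static long transferAll(FileChannel source, WritableByteChannel target) throws IOException {
        long position = 0;
        long size = source.size();
        while (position < size) {
            long sent = source.transferTo(position, size - position, target);
            if (sent <= 0) {
                break; // nothing more could be transferred, for example the file shrank
            }
            position += sent;
        }
        return position;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("transfer-src", ".txt");
        Path dst = Files.createTempFile("transfer-dst", ".txt");
        Files.write(src, "hello sendfile".getBytes(StandardCharsets.UTF_8));
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            System.out.println(transferAll(in, out) + " bytes transferred");
        }
        Files.delete(src);
        Files.delete(dst);
    }
}
```

The target here is another file so that the sketch runs on its own; in a server the target would usually be a SocketChannel, which is the case where the platform is most likely to use sendfile.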

