2025-04-02 Update, from SLTechnology News & Howtos (Shulou, shulou.com)
This article looks at how mmap and Direct Buffer work in Java network programming. The content is kept concise and easy to follow; I hope the detailed walkthrough below gives you something to take away.
Basic concepts of mmap
mmap is a method of memory-mapping a file: a file (or other object) is mapped into the process's address space, establishing a one-to-one correspondence between the file's disk address and a range of virtual addresses in the process's virtual address space. Once the mapping exists, the process can read and write that region of memory through an ordinary pointer, and the system automatically writes dirty pages back to the file on disk. File I/O is thus performed without calling read(), write(), or other system-call functions. Conversely, modifications made to the region from kernel space are directly visible in user space, which also makes it possible to share a file between processes. As shown in the following figure:
![](https://images0.cnblogs.com/blog2015/571793/201507/200501092691998.png)
As the figure shows, a process's virtual address space is made up of multiple virtual memory areas. A virtual memory area is a homogeneous interval in the process's virtual address space, that is, a contiguous address range with the same characteristics. The text segment (code segment), data segment, BSS segment, heap, stack, and memory-mapping region shown above are each separate virtual memory areas. The memory-mapping region occupies the otherwise free address space between the heap and the stack.
The Linux kernel uses the vm_area_struct structure to represent an independent virtual memory area. Because each virtual memory area has different contents and behavior, a process uses multiple vm_area_struct structures to describe its different types of areas. The vm_area_struct structures are linked together in a list or a tree so the process can locate them quickly, as shown in the following figure:
![](https://images0.cnblogs.com/blog2015/571793/201507/200501434261629.png)
The vm_area_struct structure contains the start and end addresses of the area along with other related information, as well as a vm_ops pointer leading to all the operations (system-call handlers) available for this area. In this way, whatever information a process needs for any operation on a virtual memory area can be obtained from vm_area_struct. What the mmap function does is create a new vm_area_struct structure and connect it to the physical disk address of the file; the specific steps follow in the next section.
Mmap memory mapping principle
Generally speaking, the implementation process of mmap memory mapping can be divided into three stages:
(1) the process starts the mapping process and creates a virtual mapping area for the mapping in the virtual address space
1. The process calls the library function mmap in user space. Prototype: void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset)
2. Find a free continuous virtual address that meets the requirements in the virtual address space of the current process.
3. Assign a vm_area_struct structure for this virtual zone, and then initialize each domain of this structure.
4. Insert the newly created virtual zone structure (vm_area_struct) into the process's virtual address area linked list or tree
(2) call the kernel space system call function mmap (different from the user space function) to realize the one-to-one mapping between the physical address of the file and the virtual address of the process.
5. After the new virtual address area has been allocated for the mapping, the kernel uses the file pointer being mapped to find the corresponding file descriptor in the file descriptor table, and through it links to the file's kernel file structure (struct file) in the "open file set". Each file structure maintains all the information related to the opened file.
6. Through the file structure of the file, the kernel links to the file_operations module and calls the kernel function mmap. Its prototype is int mmap(struct file *filp, struct vm_area_struct *vma), which differs from the user-space library function.
7. The kernel mmap function locates the physical address of the file disk through the virtual file system inode module.
8. The page table is established by remap_pfn_range function, that is, the mapping relationship between file address and virtual address area is realized. At this point, this virtual address does not have any data associated with the main memory.
(3) the process initiates access to this mapping space, throws a page fault exception, and copies the file contents to physical memory (main memory).
Note: the first two stages only create the virtual area and complete the address mapping; they copy no file data into main memory. The actual file read happens only when the process initiates a read or write operation.
9. The read or write operation of the process accesses the mapped address of the virtual address space. By querying the page table, it is found that this address is not on the physical page. Because only address mapping has been established, the real hard disk data has not been copied to memory, so a page fault exception is thrown.
10. After a series of judgments are made to determine that there is no illegal operation, the kernel initiates the request paging process.
11. The paging process first looks for the memory pages that need to be accessed in the swap cache space (swap cache). If not, call the nopage function to load the missing pages from disk into main memory.
12. After that, the process can read or write to this piece of main memory. If the write operation changes its content, the system will automatically write back the dirty page to the corresponding disk address after a certain period of time, that is, the process of writing to the file will be completed.
Note: the modified dirty page is not immediately updated back to the file, but there is a delay. You can call msync () to force synchronization so that the written content can be saved to the file immediately.
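The three stages above have a direct analogue in Java: FileChannel.map() creates the mapping (stages 1 and 2), the first access to the buffer faults the pages in (stage 3), and MappedByteBuffer.force() plays the role of msync(). A minimal sketch; the class name is mine, and the behavior follows the java.nio documentation:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    static String demo() throws IOException {
        Path path = Files.createTempFile("mmap-demo", ".bin");
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // map() corresponds to the user-space mmap() call: it creates the
            // virtual memory area; no file data is copied at this point.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            // The first put() touches the page and goes through the page-fault path.
            buf.put("hello mmap".getBytes(StandardCharsets.US_ASCII));
            // force() plays the role of msync(): flush dirty pages to disk now.
            buf.force();
        }
        byte[] onDisk = Files.readAllBytes(path);
        return new String(onDisk, 0, 10, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // hello mmap
    }
}
```

Note that without force(), the writes would still reach the file eventually via the kernel's delayed writeback, exactly as the note above describes.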
The difference between mmap and regular file operations
For readers not familiar with the Linux file system, see my earlier blog post, "Looking at the file reading and writing process from the kernel file system". First, let's briefly review the call flow of a regular file operation (calling functions such as read/fread):
1. The process initiates a request to read the file.
2. The kernel looks up the process's file descriptor table to locate the file's entry in the kernel's open file set, and from there finds the inode of the file.
3. Through the inode's address_space, the kernel checks whether the requested file page is already cached in the page cache. If it is, the contents of the page are returned directly.
4. If it is not, the kernel locates the file's disk address through the inode and copies the data from disk into the page cache. The read of the page is then re-initiated, and the data in the page cache is delivered to the user process.
To sum up: to improve read/write efficiency and protect the disk, regular file operations use the page-cache mechanism. Reading a file therefore first copies the file page from disk into the page cache; and because the page cache lives in kernel space and cannot be addressed directly by the user process, the page must then be copied again into the corresponding user-space memory. Getting file contents to the process thus costs two data copies. The same holds for writes: the buffer to be written cannot be reached directly from kernel space, so it must first be copied into the corresponding kernel-space memory and then written back to disk (delayed writeback), again requiring two data copies.
When a file is manipulated with mmap, creating the new virtual memory area and establishing the mapping between the file's disk address and the virtual memory area are two steps that involve no file copying at all. Later, when an access finds the data missing from memory, the page-fault mechanism transfers it from disk through the established mapping into the user-visible part of memory with only one data copy.
In short, a regular file operation needs two data copies: disk to page cache, then page cache to user main memory. mmap needs only one: disk to user main memory. Put bluntly, the key point of mmap is that it lets data move directly between user space and kernel space, eliminating the round trip between the two spaces. That is why mmap is more efficient.
Summary of advantages of mmap
As can be seen from the above discussion, the advantages of mmap are as follows:
1. File reads skip the extra copy through the page cache, reducing the number of data copies, and ordinary memory reads/writes replace I/O system calls, which improves file-access efficiency.
2. The efficient interaction between user space and kernel space is realized. The respective modification operations of the two spaces can be directly reflected in the mapped area, so that they can be captured by each other's space in time.
3. Provide the way to share memory and communicate with each other between processes. Whether it is a parent-child process or an unrelated process, you can map your user space to the same file or anonymously to the same area. Thus, through the respective changes to the mapping area, the purpose of inter-process communication and inter-process sharing can be achieved.
At the same time, if processes A and B both map region C: when A reads C for the first time, the page fault copies the file page from disk into memory; when B later reads the same page of C, it also takes a page fault, but no copy from disk is needed; the file data already resident in memory is used directly.
4. It can be used for efficient large-scale data transfer. Insufficient memory is one constraint on large-data processing, and the usual workaround is to supplement memory with disk space. But that causes heavy file I/O, which badly hurts efficiency. mmap mapping solves this well: in effect, mmap is useful whenever disk space needs to stand in for memory.
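Advantage 3 can be sketched in Java. Two mappings of the same file region (standing in here for two processes; a writable FileChannel mapping has shared semantics, so separate processes mapping the same file behave the same way) see each other's writes immediately. The class name and values are my own:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedMapDemo {
    static int demo() throws IOException {
        Path path = Files.createTempFile("shared-map", ".bin");
        try (FileChannel a = FileChannel.open(path,
                 StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel b = FileChannel.open(path,
                 StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Two independent mappings of the same file region; in a real IPC
            // scenario these would live in two different processes.
            MappedByteBuffer m1 = a.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer m2 = b.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            m1.putInt(0, 42);    // "process A" writes through its mapping
            return m2.getInt(0); // "process B" observes it immediately
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // 42
    }
}
```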
Details of mmap usage
1. One key point when using mmap is that the size of the mapped area must be an integral multiple of the physical page size (page_size, typically 4096 bytes). The reason is that the minimum granularity of memory is the page, and the mapping between the process virtual address space and memory is also done per page; to match memory operations, mmap's mapping from disk to the virtual address space must be per page as well.
2. The kernel can track the size of the underlying objects (files) mapped by memory, and the process can legally access those bytes that are within the current file size and within the memory mapping area. That is, if the size of the file is expanding all the time, as long as the data is within the scope of the mapping area, the process can legally obtain it, regardless of the size of the file when the mapping is established. See "case three" for details.
3. After the mapping is established, the mapping still exists even if the file is closed. Because it maps the address of the disk, not the file itself, it has nothing to do with the file handle. At the same time, the effective address space available for inter-process communication is not completely limited by the size of the file being mapped, because it is mapped by page.
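Point 3 can be observed directly from Java: a MappedByteBuffer returned by FileChannel.map() remains valid after the channel (and its file descriptor) is closed, because it references the mapping, not the file handle. A minimal sketch; the class name is mine:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapOutlivesFileDemo {
    static int demo() throws IOException {
        Path path = Files.createTempFile("map-close", ".bin");
        MappedByteBuffer buf;
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.putInt(0, 7);
        } // the channel, and with it the file descriptor, is closed here
        // The mapping is still valid: it references the memory mapping, not the fd.
        return buf.getInt(0);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // 7
    }
}
```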
With the above knowledge, let's take a look at what happens if the size is not an integral multiple of the page:
Case 1: the size of a file is 5000 bytes, and the mmap function maps 5000 bytes to virtual memory, starting from the beginning of a file.
Analysis: since a physical page is 4096 bytes, although the mapped file is only 5000 bytes, the corresponding process virtual address area must cover whole pages. So after mmap executes, the area actually mapped into virtual memory is 8192 bytes, with bytes 5000-8191 zero-filled. The mapping relationship is shown in the following figure:
![](https://images0.cnblogs.com/blog2015/571793/201507/200521495513717.png)
At this point:
(1) Reads/writes of the first 5000 bytes (0-4999) operate on the file's contents and behave normally.
(2) Reads of bytes 5000-8191 return all zeros. Writes to bytes 5000-8191 do not raise an error, but what is written is never carried back to the original file.
(3) Reads/writes at offset 8192 or beyond raise a SIGSEGV error.
Case 2: the size of a file is 5000 bytes. The mmap function maps 15000 bytes to virtual memory from the starting position of a file, that is, the mapping size exceeds the size of the original file.
Analysis: since the file is 5000 bytes, it corresponds, as in case 1, to two physical pages. Both physical pages are legally readable and writable, but writes beyond byte 5000 are not reflected in the original file. Because the program asked to map 15000 bytes while the file occupies only two physical pages, bytes 8192-14999 cannot be read or written; operating on them raises an exception. As shown in the following figure:
![](https://images0.cnblogs.com/blog2015/571793/201507/200522381763096.png)
At this point:
(1) The process can read/write the first 5000 mapped bytes normally; written changes are reflected in the original file after some delay.
(2) For bytes 5000-8191, the process can read and write without error, but the content is 0 before being written, and writes are not reflected in the file.
(3) For bytes 8192-14999, the process cannot read or write them; doing so raises a SIGBUS error.
(4) For bytes at offset 15000 and beyond, the process cannot read or write them; doing so causes a SIGSEGV error.
Case 3: a file's initial size is 0, and mmap maps 1000 × 4K, i.e. 1000 physical pages, about 4 MB of space; mmap returns the pointer ptr.
Analysis: if the file is read and written at the beginning of the mapping, because the file size is 0, there is no legal physical page correspondence, as in case 2, a SIGBUS error will be returned.
However, if the file size is increased before each read/write through ptr, then operations on ptr within the file size are legal. For example, after the file is extended by 4096 bytes, ptr can legally operate on the range from ptr through (char *)ptr + 4095. As long as the file keeps growing within the 1000-page mapping range, ptr can be used over the correspondingly enlarged range.
In this way, it is convenient to expand the file space at any time and write the file at any time, without causing space waste.
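Java's FileChannel.map() behaves somewhat differently from raw mmap() in case 3: per its documentation, a writable mapping that extends past end-of-file grows the file up front, so the SIGBUS scenario cannot arise. A small sketch (class name is mine):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapBeyondEofDemo {
    static long demo() throws IOException {
        Path path = Files.createTempFile("map-grow", ".bin"); // size 0, as in case 3
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Unlike raw mmap(), FileChannel.map() grows the file immediately
            // when a writable mapping extends past EOF, so every mapped page
            // is backed by the file from the start.
            ch.map(FileChannel.MapMode.READ_WRITE, 0, 1000L * 4096);
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // 4096000
    }
}
```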
This article is from: https://www.jianshu.com/p/007052ee3773
Out-of-heap memory
Out-of-heap memory is defined relative to in-heap memory. In-heap memory is the Java process memory managed by the JVM: the objects we normally create in Java live there and are subject to the JVM's memory management, i.e. its garbage collection mechanism. Out-of-heap memory, then, is a memory region that exists outside the JVM's control.
Before explaining DirectByteBuffer, two pieces of background on Java reference types are needed, because DirectByteBuffer frees its out-of-heap memory through phantom references (PhantomReference).
PhantomReference is the weakest of all the "weak" reference types. Unlike soft and weak references, a phantom reference cannot be used to reach the target object: calling get() never yields a strong reference. Looking at the source, get() is overridden to always return null.
What, then, is a phantom reference actually for? Phantom references are mainly used to track an object's progress through garbage collection: by checking whether the reference queue contains the phantom reference corresponding to an object, you can determine whether it is about to be collected. It is not meant for obtaining a reference to the target object; rather, before the target object is reclaimed, its reference is put into a ReferenceQueue, which is what makes tracking the object's collection possible.
For the implementation and principles of Java reference types, see the earlier articles "Reference and ReferenceQueue explained" and "A brief overview of Java reference types".
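The two properties described above, get() always returning null and the enqueue-on-collection signal, can be demonstrated directly. A minimal sketch; the class name is mine, and note that GC timing is never guaranteed, hence the request-plus-timeout pattern:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomDemo {
    // get() on a phantom reference is hard-wired to return null:
    // it can never be used to resurrect the referent.
    static boolean getAlwaysReturnsNull() {
        ReferenceQueue<Object> q = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), q);
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(getAlwaysReturnsNull()); // true

        // Tracking collection: once the referent is unreachable and the
        // collector has processed it, the reference appears in the queue.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object target = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(target, queue);
        target = null;   // drop the last strong reference
        System.gc();     // request collection (not guaranteed)
        Reference<?> enqueued = queue.remove(2000); // wait up to 2 s
        System.out.println(enqueued == ref);        // usually true after GC runs
    }
}
```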
On the kernel state and user state of linux
Kernel mode: controls the computer's hardware resources and provides the environment in which upper-level applications run, e.g. socket I/O or file reads and writes.
User mode: the space in which upper-level applications run; their execution must rely on resources the kernel provides.
System call: the interface the kernel exposes so that upper-level applications can access those resources.
So calling a native method through JNI is in fact one way of switching from user mode to kernel mode, using the functionality the operating system provides via system calls.
Q: Why does a user process (in user mode) have to invoke kernel-mode resources, i.e. operating-system services, through system calls (or, in Java, through JNI)?
A: Intel CPUs provide four privilege levels, Ring0 through Ring3, with Ring0 the highest and Ring3 the lowest. Linux runs user mode at Ring3 and kernel mode at Ring0. Code at Ring3 cannot access the Ring0 address space, including its code and data. Therefore user mode has no right to operate on kernel-mode resources directly: it can only switch into kernel mode via a system call, and switch back to user mode once the operation completes.
DirectByteBuffer-Direct buffering
DirectByteBuffer is an important class used by Java to implement out-of-heap memory, through which we can create, use and destroy out-of-heap memory.
The DirectByteBuffer object itself still lives in the Java heap, where it can be managed and manipulated directly by the JVM like any other object.
Unsafe.allocateMemory(size), used inside DirectByteBuffer, is a native method that allocates out-of-heap memory through C's malloc. The allocated memory belongs to the operating system; it is not part of Java memory and is outside the JVM's control, so DirectByteBuffer must have some means of addressing and operating on that out-of-heap memory.
There is an address attribute in the parent class Buffer of DirectByteBuffer:
```java
// Used only by direct buffers
// NOTE: hoisted here for speed in JNI GetDirectBufferAddress
long address;
```
The address field is used only by direct buffers. It is hoisted into Buffer, rather than declared in DirectByteBuffer, to speed up the JNI function GetDirectBufferAddress.
address holds the base address of the allocated out-of-heap memory.
After unsafe.allocateMemory(size) allocates the out-of-heap memory, it returns the base address of that memory, which is assigned to the address field. All subsequent operations on the out-of-heap memory then go through this address via JNI.
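From application code, all of this is reached through the public factory ByteBuffer.allocateDirect(), which constructs a DirectByteBuffer; reads and writes then address the native memory behind the scenes. A small sketch with a class name and values of my own:

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    static long demo() {
        // allocateDirect() constructs a DirectByteBuffer: the native memory
        // is off-heap, only the small wrapper object lives on the Java heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);
        buf.putLong(0, 0xCAFEBABEL);
        // Reads go straight to the native memory behind the buffer's address.
        return buf.getLong(0);
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(demo()));              // cafebabe
        System.out.println(ByteBuffer.allocateDirect(16).isDirect()); // true
    }
}
```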
As noted earlier, kernel mode has the highest privilege in Linux, so in kernel-mode scenarios the operating system can access any memory region, including the memory of the Java heap.
Q: so why doesn't the operating system directly access the memory area in the Java heap?
A: Because the memory region accessed by a JNI method must be a fixed, stable region: the address handed to native code points at specific memory. If that address pointed into the Java heap and the JVM performed a GC while the operating system was accessing it, the GC might move the data (collectors typically mark-compact: mark the reclaimable space, free it, then compact live objects to open up larger contiguous free space for new objects), and moving data out from under a JNI call would corrupt what the caller sees. Therefore memory used by JNI calls must not be subject to GC.
Q: Given that memory used by JNI calls cannot undergo GC, how is this solved?
A: Approach ①: copy data between in-heap and out-of-heap memory (the JVM guarantees no GC happens during the copy). For example, reading data from a file into in-heap memory, i.e. FileChannelImpl.read(HeapByteBuffer), actually performs the file I/O into out-of-heap memory first, and then copies from out-of-heap memory into in-heap memory; that is how the file data reaches the heap.
```java
static int read(FileDescriptor var0, ByteBuffer var1, long var2,
                NativeDispatcher var4) throws IOException {
    if (var1.isReadOnly()) {
        throw new IllegalArgumentException("Read-only buffer");
    } else if (var1 instanceof DirectBuffer) {
        return readIntoNativeBuffer(var0, var1, var2, var4);
    } else {
        // allocate temporary out-of-heap memory
        ByteBuffer var5 = Util.getTemporaryDirectBuffer(var1.remaining());

        int var7;
        try {
            // the file I/O operation reads data into the out-of-heap buffer
            int var6 = readIntoNativeBuffer(var0, var5, var2, var4);
            var5.flip();
            if (var6 > 0) {
                // copy data from out-of-heap memory into in-heap memory
                var1.put(var5);
            }
            var7 = var6;
        } finally {
            // release the temporary out-of-heap buffer (which may invoke
            // DirectBuffer.cleaner().clean())
            Util.offerFirstTemporaryDirectBuffer(var5);
        }
        return var7;
    }
}
```
Conversely, a write operation first copies the in-heap data to out-of-heap memory, and the operating system then writes the data from out-of-heap memory to the file.
Approach ②: use out-of-heap memory directly, e.g. DirectByteBuffer. Here memory (native memory) is allocated directly outside the heap to store the data, and the program reads/writes it directly through JNI. Since the data is written straight to out-of-heap memory, no JVM-controlled heap memory is allocated for it, and there is no in-heap/out-of-heap copy. When performing an I/O operation, only the out-of-heap memory address needs to be passed to the JNI I/O functions.
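The two approaches can be seen side by side from user code: reading into a heap buffer goes through the hidden temporary direct buffer shown in the read() source above, while reading into a direct buffer targets its native memory directly. Both yield the same bytes, which is all this hedged sketch (class name mine) verifies; the copy difference is a JDK-internal detail:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadPathDemo {
    static String demo() throws IOException {
        Path path = Files.createTempFile("read-path", ".bin");
        Files.write(path, "abcdef".getBytes());
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            // Heap buffer: internally the channel reads into a temporary
            // direct buffer first, then copies into this array-backed buffer.
            ByteBuffer heap = ByteBuffer.allocate(6);
            ch.read(heap, 0);

            // Direct buffer: the native read targets this buffer's own
            // off-heap memory, so the extra copy is skipped.
            ByteBuffer direct = ByteBuffer.allocateDirect(6);
            ch.read(direct, 0);

            heap.flip();
            direct.flip();
            byte[] a = new byte[6], b = new byte[6];
            heap.get(a);
            direct.get(b);
            return new String(a) + ":" + new String(b);
        } finally {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // abcdef:abcdef
    }
}
```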
Source-code interpretation: creation and recovery of DirectByteBuffer
Out-of-heap memory allocation:
```java
DirectByteBuffer(int cap) {                   // package-private
    super(-1, 0, cap, cap);
    boolean pa = VM.isDirectMemoryPageAligned();
    int ps = Bits.pageSize();
    long size = Math.max(1L, (long) cap + (pa ? ps : 0));
    // record both the total allocated size (rounded up by page)
    // and the actual requested capacity
    Bits.reserveMemory(size, cap);

    long base = 0;
    try {
        // allocate out-of-heap memory via unsafe.allocateMemory,
        // which returns the base address of the out-of-heap memory
        base = unsafe.allocateMemory(size);
    } catch (OutOfMemoryError x) {
        Bits.unreserveMemory(size, cap);
        throw x;
    }
    unsafe.setMemory(base, size, (byte) 0);
    if (pa && (base % ps != 0)) {
        // Round up to page boundary
        address = base + ps - (base & (ps - 1));
    } else {
        address = base;
    }
    // build a Cleaner object to track garbage collection of this
    // DirectByteBuffer, so that when the DirectByteBuffer is collected,
    // the out-of-heap memory is freed as well
    cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
    att = null;
}
```
The Bits.reserveMemory(size, cap) method:
```java
static void reserveMemory(long size, int cap) {
    if (!memoryLimitSet && VM.isBooted()) {
        maxMemory = VM.maxDirectMemory();
        memoryLimitSet = true;
    }
    // optimist!
    if (tryReserveMemory(size, cap)) {
        return;
    }
    final JavaLangRefAccess jlra = SharedSecrets.getJavaLangRefAccess();
    // retry while helping enqueue pending Reference objects
    // which includes executing pending Cleaner(s) which includes
    // Cleaner(s) that free direct buffer memory
    while (jlra.tryHandlePendingReference()) {
        if (tryReserveMemory(size, cap)) {
            return;
        }
    }
    // trigger VM's Reference processing
    System.gc();
    // a retry loop with exponential back-off delays
    // (this gives VM some time to do it's job)
    boolean interrupted = false;
    try {
        long sleepTime = 1;
        int sleeps = 0;
        while (true) {
            if (tryReserveMemory(size, cap)) {
                return;
            }
            if (sleeps >= MAX_SLEEPS) {
                break;
            }
            if (!jlra.tryHandlePendingReference()) {
                try {
                    Thread.sleep(sleepTime);
                    sleepTime <<= 1;
                    sleeps++;
                } catch (InterruptedException e) {
                    interrupted = true;
                }
            }
        }
        // no luck
        throw new OutOfMemoryError("Direct buffer memory");
    } finally {
        if (interrupted) {
            // don't swallow interrupts
            Thread.currentThread().interrupt();
        }
    }
}
```
The default maximum direct memory (VM.maxDirectMemory()) is resolved inside the JVM by JVM_MaxDirectMemory:
```cpp
JVM_ENTRY_NO_ENV(jlong, JVM_MaxDirectMemory(void))
  JVMWrapper("JVM_MaxDirectMemory");
  size_t n = Universe::heap()->max_capacity();
  return convert_size_t_to_jlong(n);
JVM_END
```
In the case where we use the CMS GC, this is the value we set with -Xmx; that value minus the size of one survivor space is the default out-of-heap memory size.
Off-heap memory recovery
Cleaner is a subclass of PhantomReference and maintains a doubly linked list through its own next and prev fields. As a PhantomReference, its role is to track the garbage-collection process; it has no effect on the object's collection itself.
So cleaner = Cleaner.create (this, new Deallocator (base, size, cap)); is used to track the garbage collection process of the currently constructed DirectByteBuffer object.
When the DirectByteBuffer object's reference moves from the pending state to being enqueued, Cleaner's clean() is triggered, and clean() frees the out-of-heap memory through unsafe.
Although Cleaner never calls Reference.clear(), its clean() method calls remove(this), which takes the current Cleaner off the Cleaner linked list. So once clean() has executed, the Cleaner is an object nothing references, i.e. one that GC can collect.
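Since Java 9, the same pattern is available as a public API, java.lang.ref.Cleaner, which replaces the internal sun.misc.Cleaner for application code. A hedged sketch (class names and the AtomicBoolean "resource" are mine); note the cleanup action must not capture the tracked object itself, exactly as Deallocator holds only (base, size, cap) and never this:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

public class CleanerDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    static class NativeResource implements AutoCloseable {
        final AtomicBoolean freed = new AtomicBoolean(false);
        private final Cleaner.Cleanable cleanable;

        NativeResource() {
            // Capture only the flag, never `this`, or the object
            // would stay reachable and the Cleaner would never fire.
            AtomicBoolean f = this.freed;
            cleanable = CLEANER.register(this, () -> f.set(true));
        }

        @Override
        public void close() {
            cleanable.clean(); // deterministic release path; runs at most once
        }
    }

    static boolean demo() {
        NativeResource r = new NativeResource();
        r.close(); // explicit release, like DirectBuffer's cleaner().clean()
        return r.freed.get(); // GC would have triggered it eventually anyway
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true
    }
}
```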
The thunk is the Runnable passed to Cleaner.create(), here the Deallocator, whose run() method calls unsafe.freeMemory(address) and then un-reserves the accounted quota via Bits.unreserveMemory.
Reclaim out-of-heap memory by configuring parameters
We can also use -XX:MaxDirectMemorySize to specify the maximum out-of-heap memory size. When usage reaches this threshold, System.gc() is called to perform a full GC, reclaiming any unused out-of-heap memory.
Why use out-of-heap memory
Reducing garbage-collection pauses
A full GC means a complete collection: the garbage collector scans all allocated in-heap memory, so the impact of such a collection on a Java application is proportional to the heap size, and an oversized heap hurts application performance. Out-of-heap memory, by contrast, is managed directly by the operating system (not the virtual machine). Using it keeps the in-heap footprint small and reduces the impact of garbage collection on the application.
In some scenarios it also improves the program's I/O performance, by omitting the step of copying data from in-heap memory to out-of-heap memory.
When to use out-of-heap memory
Out-of-heap memory suits objects with medium or long life cycles. Short-lived objects are reclaimed at YGC time and never become the large, long-lived objects that burden FGC, so they gain nothing from being moved off-heap.
Direct file copying, or I/O operations in general: using out-of-heap memory directly avoids the copy from user memory to system memory, because an I/O operation is communication between kernel memory and the device; the program never talks to the peripheral directly.
A pool combined with out-of-heap memory can also be used, reusing out-of-heap buffers for short-lived objects involved in I/O operations (Netty uses this approach).
Out-of-heap memory VS memory pool
Memory pool: mainly used for two kinds of objects: ① short-lived objects with simple structure, where reusing them in a pool raises the CPU cache hit rate and thus improves performance; ② large blocks of data containing many duplicate objects, where pooling reduces garbage-collection time.
Out-of-heap memory: like a memory pool, it reduces garbage-collection time, but it applies to exactly the opposite kind of object. Memory pools typically suit short-lived mutable objects, while objects with medium or long life cycles are exactly what out-of-heap memory is for.
Characteristics of out-of-heap memory
Good scalability for large memory
The improvement in the garbage collection pause can be clearly felt.
Can be shared between processes to reduce replication between virtual machines
Some problems with out-of-heap memory
Recovery and leakage of out-of-heap memory, as discussed in the source-code analysis above.
Data-structure problems: the biggest drawback of out-of-heap memory is that your data structures become less natural. If a structure is complex, it has to be serialized, and serialization itself costs performance. A further issue is that, with much more memory usable, you may start to worry about the speed of virtual memory (that is, the hard disk) affecting you.
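The "serialize it yourself" point can be made concrete: in off-heap memory there are no objects, only bytes, so a record becomes a fixed layout of fields at computed offsets. A hedged sketch with a made-up record layout (class name, field names, and offsets are all mine):

```java
import java.nio.ByteBuffer;

public class OffHeapRecordDemo {
    // A hypothetical fixed-width record: | long id | int score | int flags |
    static final int REC_SIZE = 16;
    static final int OFF_ID = 0, OFF_SCORE = 8, OFF_FLAGS = 12;

    static int demo() {
        // Room for 100 records in one off-heap block; no per-record objects.
        ByteBuffer buf = ByteBuffer.allocateDirect(REC_SIZE * 100);
        // "Serialize" record #3 by writing its fields at computed offsets.
        int base = 3 * REC_SIZE;
        buf.putLong(base + OFF_ID, 1234L);
        buf.putInt(base + OFF_SCORE, 99);
        buf.putInt(base + OFF_FLAGS, 0b101);
        // Reading it back is plain offset arithmetic, no object graph.
        return buf.getInt(base + OFF_SCORE);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 99
    }
}
```

The flip side, as the text says, is that anything richer than fixed-width fields (strings, nesting, references) forces you to design and maintain this encoding yourself.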
The above is how to analyze mmap and Direct Buffer in Java network programming. I hope you have picked up some useful knowledge or skills from it.