Shulou (Shulou.com), SLTechnology News&Howtos, updated 2025-02-27
Many newcomers are unclear about the bug that can drive up a JVM's physical memory consumption. This article walks through it in detail; readers facing this kind of problem should find something useful in it.
Overview
Recently we were investigating a JVM problem for a customer (JDK 1.8.0_191-b12): a system kept getting killed by the OS, and the cause was a memory leak. In the process, we stumbled on another JVM bug that can cause a large amount of physical memory to be consumed. We reported it to the community and received a quick response; the fix is expected in an upcoming OpenJDK 8 release (the problem also exists in JDK 11).
PS: the customer's own problem was eventually solved too. It turned out to be a design flaw in C2 that caused a large amount of memory to be used with no safety bound.
Finding the thread that consumes the memory
Next, let's share how this bug was discovered. First, we asked the customer to track the process in real time. When memory usage rose noticeably, /proc/<pid>/smaps showed many 64MB allocations, with the Rss almost fully consumed:
```
7fd690000000-7fd693f23000 rw-p 00000000 00:00 0
Size:              64652 kB
Rss:               64652 kB
Pss:               64652 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     64652 kB
Referenced:        64652 kB
Anonymous:         64652 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me nr sd
7fd693f23000-7fd694000000 ---p 00000000 00:00 0
Size:                884 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: mr mw me nr sd
```
We then traced system calls with the strace command and matched them against the virtual addresses above, finding the corresponding mmap call:
```
[pid 71] mmap(0x7fd690000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fd690000000
```
The thread executing the mmap is thread 71. Dumping the threads with jstack shows that the corresponding thread is actually C2 CompilerThread0:
"C2 CompilerThread0" # 39 daemon prio=9 os_prio=0 tid=0x00007fd8acebb000 nid=0x47 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Finally, grepping the strace output confirmed that this thread was indeed allocating a large amount of memory, more than 2GB in total.
The classic 64MB problem
The 64MB pattern is a classic one. The JVM itself contains no logic that allocates 64MB blocks in bulk, so we can rule out an allocation with JVM-specific meaning. It is in fact glibc's malloc at work: since version 2.10, glibc has provided an arena mechanism to make memory allocation more efficient, and by default each arena on 64-bit is 64MB. The 64MB figure comes from the following computation, where sizeof(long) is 8:
```c
#define DEFAULT_MMAP_THRESHOLD_MAX (4 * 1024 * 1024 * sizeof(long))
#define HEAP_MAX_SIZE (2 * DEFAULT_MMAP_THRESHOLD_MAX)

p2 = (char *) MMAP(aligned_heap_area, HEAP_MAX_SIZE, PROT_NONE, MAP_NORESERVE);
```
The maximum number of arenas a process can create is 8 × cores on 64-bit and 2 × cores on 32-bit:
```c
#define NARENAS_FROM_NCORES(n) ((n) * (sizeof(long) == 4 ? 2 : 8))

{
  int n = __get_nprocs();
  if (n >= 1)
    narenas_limit = NARENAS_FROM_NCORES(n);
  else
    /* We have no information about the system.  Assume two cores.  */
    narenas_limit = NARENAS_FROM_NCORES(2);
}
```
The point of this allocation mechanism is to cope with multi-threaded workloads: each core gets a few 64MB cache blocks of its own, so threads can allocate memory more efficiently without taking a lock. Once the arena limit is reached, allocation falls back to the slower main_arena.
The number of these 64MB blocks can be capped with the environment variable MALLOC_ARENA_MAX. When we set it to 1, the 64MB blocks disappeared and allocations were instead served from one large area, main_arena, confirming that the parameter takes effect.
An accidental discovery
Back to the question of why a C2 thread would consume more than 2GB: while tracing C2, I accidentally found a piece of code that can cause heavy memory consumption. It is in the nmethod::metadata_do method of nmethod.cpp. Note, however, that if this path were the culprit we would not see the allocations coming from C2 threads but from the VMThread, because the code below is mainly executed by the VMThread.
```cpp
void nmethod::metadata_do(void f(Metadata*)) {
  address low_boundary = verified_entry_point();
  if (is_not_entrant()) {
    low_boundary += NativeJump::instruction_size;
    // %%% Note:  On SPARC we patch only a 4-byte trap, not a full NativeJump.
    // (See comment above.)
  }
  {
    // Visit all immediate references that are embedded in the instruction stream.
    RelocIterator iter(this, low_boundary);
    while (iter.next()) {
      if (iter.type() == relocInfo::metadata_type) {
        metadata_Relocation* r = iter.metadata_reloc();
        // In this metadata, we must only follow those metadatas directly embedded in
        // the code.  Other metadatas (oop_index>0) are seen as part of
        // the metadata section below.
        assert(1 == (r->metadata_is_immediate()) +
                    (r->metadata_addr() >= metadata_begin() &&
                     r->metadata_addr() < metadata_end()),
               "metadata must be found in exactly one place");
        if (r->metadata_is_immediate() && r->metadata_value() != NULL) {
          Metadata* md = r->metadata_value();
          if (md != _method) f(md);
        }
      } else if (iter.type() == relocInfo::virtual_call_type) {
        // Check compiledIC holders associated with this nmethod
        CompiledIC *ic = CompiledIC_at(&iter);
        if (ic->is_icholder_call()) {
          CompiledICHolder* cichk = ic->cached_icholder();
          f(cichk->holder_metadata());
          f(cichk->holder_klass());
        } else {
          Metadata* ic_oop = ic->cached_metadata();
          if (ic_oop != NULL) {
            f(ic_oop);
          }
        }
      }
    }
  }
  // ... (rest of the method elided)
}

inline CompiledIC* CompiledIC_at(RelocIterator* reloc_iter) {
  assert(reloc_iter->type() == relocInfo::virtual_call_type ||
         reloc_iter->type() == relocInfo::opt_virtual_call_type,
         "wrong reloc. info");
  CompiledIC* c_ic = new CompiledIC(reloc_iter);
  c_ic->verify();
  return c_ic;
}
```
Notice the line CompiledIC *ic = CompiledIC_at(&iter). CompiledIC is a ResourceObj; such objects are allocated from the C heap (via malloc), but the allocations are tied to the current thread's resource area. If a ResourceMark is declared somewhere in the code, the resource area's current position is recorded when it executes. When the thread later allocates and the thread-associated memory is insufficient, new memory is malloc'ed and tracked; otherwise existing memory is reused. When the ResourceMark's destructor runs, it rewinds to the saved position, and subsequent allocations by that thread reuse the memory from that position onward. Note that the memory blocks mentioned here have nothing to do with the 64MB blocks discussed above.
Because this call sits inside the while loop, it is executed many times, so memory that should become reusable after one iteration never is, and new memory may be allocated continuously. This can make the process's physical memory consumption far exceed -Xmx.
The fix is simple: declare ResourceMark rm; before CompiledIC *ic = CompiledIC_at(&iter);.
The problem mainly occurs in scenarios that perform frequent, large-scale Class Retransform or Class Redefine operations. If your system runs that kind of agent, keep an eye on this issue.
After discovering the problem we submitted a patch to the community, and then found it had in fact already been fixed in JDK 12, just not in earlier versions. Once the issue was submitted, the community responded quickly, and the fix may land in OpenJDK 1.8.0-212.
Finally, a brief word on the customer's original problem. The C2 threads' excessive consumption was caused by a very large method being compiled; the compilation itself needs a lot of memory, which produced the sudden spike. So here is a piece of advice: don't write overly large methods. If such a method is also called frequently, the result can be truly painful.