What are the important and difficult points that programmers must master in Java virtual machines?

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article shares the important and difficult points about the Java virtual machine that programmers should master. It is meant as a practical reference.

Concept

Virtual machine: a software simulation of a complete computer system, with full hardware functionality, running in a completely isolated environment. It is a software implementation of a physical machine. Commonly used virtual machines include VMware, VirtualBox, and the Java Virtual Machine (JVM for short).

Java virtual machine implementations: Sun HotSpot VM, BEA JRockit VM, IBM J9 VM, Azul VM, Apache Harmony, Google Dalvik VM, Microsoft JVM …

Startup process

Basic architecture

The Java compiler (javac) compiles source code (.java) into bytecode (.class), which is run by the JRE. The JRE is built around the Java virtual machine (JVM): the JVM loads and analyzes the bytecode, then interprets or compiles it and executes it.

JVM consists of three main subsystems:

1. Class loader subsystem

2. Runtime data area (memory)

3. Execution engine

Class loader subsystem

Class loading consists of loading, linking (verification, preparation, and optionally resolution), and initialization. ClassLoader and its subclasses are responsible for class loading.

Loading: find the bytecode file (for example on disk) and read it in through I/O

Linking: perform the verification, preparation, and (optional) resolution steps

Verification: check the correctness of the bytecode file

Preparation: allocate memory for the static variables of the class and assign them default values

Resolution: convert symbolic references to direct references; the class loader also loads the other classes that the class references

Initialization: set the static variables of the class to their specified values and execute the static code blocks
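The difference between preparation and initialization can be observed from ordinary code. The following is a minimal sketch, not a description of JVM internals: the class names ClassInitDemo and Config are hypothetical and exist only for illustration. It shows that reading a compile-time constant does not initialize a class, while the first read of an ordinary static field runs the static block.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: observing when class initialization actually happens. */
final class ClassInitDemo {
    /** Records events in the order they occur. */
    static final List<String> EVENTS = new ArrayList<>();

    /** Hypothetical class used only for illustration. */
    static final class Config {
        // Compile-time constant: javac inlines it, so reading it
        // does not trigger initialization of Config.
        static final int CONSTANT = 42;

        // Ordinary static field: holds the default 0 after preparation
        // and receives 7 only during initialization.
        static int value = 7;

        static {
            // Runs once, as part of initialization, after "value = 7".
            EVENTS.add("Config static block, value=" + value);
            value = 8;
        }
    }

    static List<String> run() {
        EVENTS.clear();
        EVENTS.add("before first use");
        int c = Config.CONSTANT;      // no initialization yet
        EVENTS.add("read constant " + c);
        int v = Config.value;         // first active use: initialization runs here
        EVENTS.add("read value " + v);
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call, the static block event appears between the constant read and the field read. Later calls in the same JVM skip it, because a class is initialized only once.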

Class loader architecture

1. Bootstrap class loader: responsible for loading the core class library of the JRE, such as rt.jar and charsets.jar under the JRE directory.

2. Extension class loader: responsible for loading the JAR packages in the JRE extension directory (ext).

3. System class loader: responsible for loading the classes and packages on the classpath.

4. User-defined class loader: responsible for loading classes and packages from user-defined paths.

Class loading mechanism (parent delegation)

Class loading combines full responsibility with delegation. Full responsibility: when a ClassLoader loads a class, the classes that this class depends on and references are also loaded by the same ClassLoader, unless another ClassLoader is specified explicitly. Delegation: the loader first asks its parent class loader to find the target class, and only if the parent cannot find it does the loader search its own path and load the class itself.
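The delegation chain can be inspected at run time. The sketch below is illustrative only: LoaderChainDemo and its methods are hypothetical names. It walks the parent links of a loader, and it creates an empty child loader to show that a request for a core class is answered by a parent rather than by the child. On HotSpot the bootstrap loader is represented as null, which the code assumes.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: walking the class loader hierarchy and observing delegation. */
final class LoaderChainDemo {

    /** Lists loader class names from the given loader up to the bootstrap loader. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // The bootstrap loader has no Java object; getParent() returns null for it.
        chain.add("bootstrap (represented as null)");
        return chain;
    }

    /** Loads a class through a child loader that has no search path of its own. */
    static Class<?> loadViaChild(String className) {
        ClassLoader child = new ClassLoader(ClassLoader.getSystemClassLoader()) { };
        try {
            // loadClass delegates to the parent first, as described above.
            return child.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        chainOf(LoaderChainDemo.class.getClassLoader()).forEach(System.out::println);
        // Same Class object as String.class: the child never defined its own copy.
        System.out.println(loadViaChild("java.lang.String") == String.class);
    }
}
```

Because the parent is always asked first, a user-defined loader cannot accidentally replace core classes such as java.lang.String.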

Runtime data area

Heap (Java heap)

Created when the virtual machine starts, the heap stores object instances. Almost all objects (including the string constant pool) are allocated on the heap. When there is no room left for a new object and the heap cannot be expanded, an OutOfMemoryError is thrown. The heap is also the main area managed by the garbage collector. The maximum and minimum heap sizes are set with the -Xmx and -Xms parameters, respectively. Shared by all threads.
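The effect of the heap size parameters can be checked from inside a program. A minimal sketch using the standard Runtime API follows; the class name HeapInfoDemo is hypothetical, and the printed numbers depend on the JVM and its flags.

```java
/** Sketch: reading the heap limits of the running JVM. */
final class HeapInfoDemo {

    /** Upper bound the heap may grow to (roughly what -Xmx sets). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap memory currently reserved by the JVM (starts near -Xms). */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the committed heap that is currently unused. */
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max       = " + maxHeapBytes() / mb + " MB");
        System.out.println("committed = " + committedHeapBytes() / mb + " MB");
        System.out.println("free      = " + freeHeapBytes() / mb + " MB");
    }
}
```

Running it once with default settings and once with, for example, -Xms64m -Xmx128m makes the relationship between the flags and the reported values visible.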

Stack (Java stack)

The memory model for Java method execution: every method call creates a stack frame (which stores the local variable table, operand stack, dynamic link, method exit information, and so on). Private to each thread.

The JVM specification defines two error conditions for this area:

1. If the stack depth requested by a thread is greater than the depth the virtual machine allows, a StackOverflowError is thrown.

2. If the virtual machine stack can be expanded dynamically, an OutOfMemoryError is thrown when enough memory cannot be obtained. The stack size is set with the JVM parameter -Xss, and the stack size determines how deep method calls can nest.
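The link between stack size and call depth can be seen with unbounded recursion. The sketch below is illustrative; StackDepthDemo is a hypothetical name, and the depth it reports varies with the JVM, the platform, and the -Xss value.

```java
/** Sketch: counting how many frames fit on the stack before it overflows. */
final class StackDepthDemo {
    private static int depth;

    /** Recurses without a base case; each call pushes one more stack frame. */
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Returns the number of frames created before StackOverflowError was thrown. */
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack has unwound by the time we get here, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Running it with a larger -Xss value typically yields a larger count, since each thread's stack then has room for more frames.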

Native method stack

Serves the native methods that the virtual machine executes; otherwise its specification is similar to that of the Java stack. Each virtual machine implementation is free to implement this area as it sees fit. Private to each thread.

PC register (program counter)

Stores the address of the bytecode instruction to be executed. Branches, loops, jumps, exception handling, and thread resumption all rely on the pc register. Private to each thread.

If the thread is executing a Java method, the pc register holds the address of the instruction being executed. If it is executing a native method, the value of the pc register is undefined.

Metaspace

Metaspace replaces the permanent generation. It is similar in nature: both are implementations of the method area defined in the JVM specification. The difference is that metaspace does not live in virtual machine memory but uses native memory. Metaspace is used heavily, and an OutOfMemoryError can occur there as well.

Metaspace expands dynamically. The default -XX:MetaspaceSize value of 21MB acts as a high water mark. Once usage reaches it, a Full GC is triggered and unused classes are unloaded (a class can be unloaded once its class loader is no longer alive), and the high water mark is reset. The new value depends on how much metaspace the GC released: if little space was released, the high water mark is raised; if a lot was released, it is lowered.

Execution engine

The execution engine reads the bytecode in the runtime data area and executes it instruction by instruction.

(1) Interpreter: the interpreter starts interpreting bytecode quickly but executes it slowly, one instruction at a time.

(2) JIT compiler: the JIT compiler makes up for the interpreter's shortcomings. The execution engine runs bytecode through the interpreter, and when it finds code that is executed repeatedly, it hands that code to the JIT compiler, which compiles the bytecode into native code. The native code is then used directly for repeated method calls, which improves performance.

The components of the JIT compiler are:

Intermediate code generator: generates intermediate code.

Code optimizer: optimizes the intermediate code generated above.

Target code generator: generates machine code (native code).

Profiler: a special component responsible for finding hotspots (methods that are called many times).

(3) Garbage collector: collects and deletes unreferenced objects. A program can call System.gc() to request garbage collection, but execution is not guaranteed.

Native method interface (JNI): JNI interacts with the native method libraries and provides the native libraries that the execution engine needs.

Native method library: the collection of native libraries required by the execution engine.

Garbage collection (GC: Garbage Collection)

1. How is garbage identified, and how is it decided whether an object can be reclaimed?

Reference counting: add a counter to each object; increment it when a reference to the object is created and decrement it when a reference expires. An object whose counter is 0 can be reclaimed. Disadvantage: it cannot handle circular references.

Root search algorithm: also known as reachability analysis. Starting from a set of "GC Roots" objects, the collector follows references downward; the path it travels is called a reference chain. An object that is not connected to any GC root by a reference chain can be reclaimed. (Objects that can serve as GC Roots: objects referenced from the virtual machine stack, objects referenced by static class fields in the method area, objects referenced by constants in the method area, and objects referenced by JNI in the native method stack.)
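The two approaches can be compared on a toy object graph. The sketch below is a simplified model, not how HotSpot is implemented; ReachabilityDemo, Node, and the sample graph are hypothetical. The graph has a root that references a, which references b, plus an isolated cycle between c and d.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: reference counting versus reachability analysis on a toy graph. */
final class ReachabilityDemo {

    /** A toy object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Marks every node reachable from the roots by following references. */
    static Set<String> reachable(List<Node> roots) {
        Set<String> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n.name)) {      // first visit only
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Counts incoming references per node, as a reference counter would. */
    static Map<String, Integer> refCounts(List<Node> all, List<Node> roots) {
        Map<String, Integer> counts = new HashMap<>();
        for (Node n : all) counts.put(n.name, 0);
        for (Node r : roots) counts.merge(r.name, 1, Integer::sum);
        for (Node n : all) {
            for (Node target : n.refs) counts.merge(target.name, 1, Integer::sum);
        }
        return counts;
    }

    /** Sample graph: root -> a -> b, and an unreachable cycle c <-> d. */
    static List<Node> sampleGraph() {
        Node root = new Node("root"), a = new Node("a"), b = new Node("b");
        Node c = new Node("c"), d = new Node("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        return Arrays.asList(root, a, b, c, d);
    }

    static Set<String> demoReachable() {
        List<Node> all = sampleGraph();
        return reachable(all.subList(0, 1));   // only "root" is a GC root
    }

    static Map<String, Integer> demoRefCounts() {
        List<Node> all = sampleGraph();
        return refCounts(all, all.subList(0, 1));
    }

    public static void main(String[] args) {
        System.out.println("reachable:  " + demoReachable());
        System.out.println("ref counts: " + demoRefCounts());
    }
}
```

In this model c and d each keep a count of 1 because they reference each other, so a reference counter would never free them, while reachability analysis leaves both unmarked and therefore reclaimable.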

2. The Java heap is the main area where GC collects garbage. There are two types of GC: Minor GC and Full GC (or Major GC).

Minor GC: triggered when the young generation (Young Gen) runs out of space. Since most Java objects do not need to survive for long, the young generation is where GC runs most frequently, so the copying algorithm is used.

Full GC: performed when the old generation (Old Gen) runs out of space or metaspace reaches its high water mark. Because the old generation holds large and long-lived objects, it occupies a lot of memory and is collected less efficiently, so the mark-sweep algorithm is used.

GC algorithm

By collection strategy, the algorithms are divided into mark-sweep, mark-compact, and copying.

1. Mark-sweep algorithm: it has two phases, "mark" and "sweep". First the objects that can be reclaimed are marked; once marking is complete, the memory occupied by all marked objects is reclaimed in one pass. Drawbacks: 1. Efficiency is not high. 2. It produces a large amount of memory fragmentation (too much fragmentation may make it impossible to find enough contiguous memory when a large object is allocated later, which triggers another GC early).

2. Mark-compact algorithm: it has two phases, "mark" and "compact". First the objects that can be reclaimed are marked; once marking is complete, the surviving objects are moved toward one end, and the memory beyond the boundary is then cleared directly.

3. Copying algorithm: divide the memory into two equal areas and use only one at a time. GC traverses the area in use and copies the live objects to the other area. Because only live objects are processed, the copying cost is relatively small, and memory is laid out compactly after the copy, so there is no fragmentation problem. Drawbacks: 1. Memory utilization is low, since only half of the space is usable at a time. 2. When the object survival rate is high, efficiency drops.
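The three strategies can be contrasted on a toy heap. This is a simplified model under stated assumptions, not a real collector: ToyHeapDemo is a hypothetical name, the heap is an int array in which a positive value is a live object, a negative value is an object already marked as garbage, and 0 is a free slot. The marking phase is assumed to have run already.

```java
import java.util.Arrays;

/** Sketch: sweep, compact, and copy on a toy heap of int slots. */
final class ToyHeapDemo {

    /** Mark-sweep: garbage slots become free in place, so holes stay scattered. */
    static int[] sweep(int[] heap) {
        int[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] < 0) out[i] = 0;
        }
        return out;
    }

    /** Mark-compact: live objects slide to the front, everything after is cleared. */
    static int[] compact(int[] heap) {
        int[] out = heap.clone();
        int next = 0;
        for (int i = 0; i < out.length; i++) {
            if (out[i] > 0) out[next++] = out[i];
        }
        Arrays.fill(out, next, out.length, 0);
        return out;
    }

    /**
     * Copying: the first half is the area in use, the second half is the
     * empty reserve. Live objects are copied into the reserve and the
     * first half is released as a whole.
     */
    static int[] copy(int[] heap) {
        int half = heap.length / 2;
        int[] out = new int[heap.length];   // all slots start free
        int next = half;
        for (int i = 0; i < half; i++) {
            if (heap[i] > 0) out[next++] = heap[i];
        }
        return out;
    }

    /** Length of the longest run of free slots, i.e. the largest allocatable object. */
    static int largestFreeRun(int[] heap) {
        int best = 0, current = 0;
        for (int slot : heap) {
            current = (slot == 0) ? current + 1 : 0;
            best = Math.max(best, current);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, -2, 3, -4, 5, -6, 0, 0};
        System.out.println("swept:     " + Arrays.toString(sweep(heap)));
        System.out.println("compacted: " + Arrays.toString(compact(heap)));
        System.out.println("copied:    " + Arrays.toString(copy(new int[] {1, -2, 3, -4, 0, 0, 0, 0})));
    }
}
```

After sweeping, the sample heap has six free slots but the largest contiguous run is only three; after compacting, the run is five. The copying variant also ends up compact, at the price of keeping half of the heap in reserve.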

By how the heap is partitioned, the algorithms are divided into incremental collection and generational collection.

1. Incremental collection: a real-time garbage collection algorithm, that is, garbage is collected while the application runs, which in theory solves the pause problem of traditional generational collection. Incremental collection divides the heap into a series of memory blocks and uses only some of them at first; during garbage collection, the surviving objects in the used blocks are moved into the unused blocks. This makes it possible to collect while the application keeps running, avoiding the traditional generational pattern of pausing to collect only after the whole space has been used.

2. Generational collection: (the default in commercial virtual machines) memory is divided by object lifetime into the young generation, the old generation, and metaspace, and objects with different lifetimes are reclaimed with different algorithms.

By threading model, the algorithms are divided into serial collection, parallel collection, and concurrent collection.

1. Serial collection: a single thread performs garbage collection, which is easy to implement and efficient. Drawbacks: 1. It cannot take advantage of multiple processors. 2. User threads must be paused.

2. Parallel collection: multiple threads perform garbage collection, which is fast and efficient. In theory, the more CPUs there are, the greater the advantage of a parallel collector. Drawback: user threads must be paused.

3. Concurrent collection: garbage collection threads and user threads work at the same time, so user threads do not need to be paused while garbage is collected.

GC collector

Garbage collection algorithm is the theoretical basis of memory collection, and garbage collector is the concrete implementation of memory collection.

1. The Serial collector targets the young generation and is the most basic and oldest collector. It is single-threaded, and all user threads must be paused while it works. It uses the copying algorithm.

The Serial Old collector targets the old generation and uses the mark-compact algorithm. It is simple and efficient, but it pauses user threads while it runs.

2. The ParNew collector is a multithreaded version of Serial: a young-generation collector that runs the copying algorithm on multiple threads (parallel collector, response time first).

3. Parallel Scavenge is a multithreaded young-generation collector that uses the copying algorithm (parallel collector, throughput first). Its throughput and pause time can be controlled, where throughput = time running user code / (time running user code + garbage collection time).

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector (parallel collector). It uses multiple threads and the mark-compact algorithm.

4. The CMS (Concurrent Mark Sweep) collector is an old-generation collector that aims for the shortest possible collection pause. It is a concurrent collector and uses the mark-sweep algorithm.

5. G1: its young-generation collection is similar to ParNew and uses the copying algorithm; collection begins when the young generation reaches a certain occupancy. Its old-generation collection is similar to CMS, except that it uses the mark-compact algorithm.

G1 is therefore both a parallel and a concurrent collector that can make full use of multi-CPU, multi-core environments, and it can establish a predictable pause time model.

Compared with the CMS collector, the G1 collector has the following characteristics:

1. Space integration: the G1 collector uses the mark-compact algorithm and does not produce memory fragmentation. Large objects are allocated directly into Humongous regions, which hold short-lived giant objects so that they do not go straight into the old generation, avoiding much of the overhead of Full GC; allocating them does not trigger the next GC early for lack of contiguous space. (A Full GC is triggered when there are no free regions for young-generation copying or for objects promoted to the old generation, or when there are no contiguous regions for a giant object. This should be avoided.)

2. Predictable pauses: reducing pause time is a concern shared by G1 and CMS, but in addition to pursuing low pauses, G1 can build a predictable pause time model. It lets users specify that within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection, which almost reaches the level of the garbage collectors in the Real-Time Specification for Java (RTSJ).

3. G1 divides the Java heap into a number of independent regions (Region) of equal size. It keeps the concepts of young and old generations, but they are no longer physically separated; each is a (not necessarily contiguous) set of Regions.

Common combinations of collectors

Idea of JVM performance tuning

Understand the GC log

[GC [PSYoungGen: 8192K->1000K(9216K)] 16004K->14604K(29696K), 0.0317424 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]

[GC [PSYoungGen: 919K->1016K(9216K)] 22796K->20780K(29696K), 0.0314567 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]

[Full GC [PSYoungGen: 819K->819K(9216K)] [ParOldGen: 20435K->20435K(20480K)] 28627K->28627K(29696K), [Metaspace: 8469K->8469K(1056768K)], 0.1307495 secs] [Times: user=0.50 sys=0.00, real=0.13 secs]

[Full GC [PSYoungGen: 8192K->8192K(9216K)] [ParOldGen: 20437K->20437K(20480K)] 28629K->28629K(29696K), [Metaspace: 8469K->8469K(1056768K)], 0.1240311 secs] [Times: user=0.42 sys=0.00, real=0.12 secs]
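Each entry follows the pattern "region: used before -> used after (capacity)", followed by the same three figures for the whole heap. As a rough illustration of how to read these numbers, the sketch below extracts them with regular expressions. GcLogLineDemo and its methods are hypothetical helpers written for this article's sample log, not part of any JDK tool, and the patterns are only assumed to fit this older PrintGCDetails style of output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: pulling before/after/capacity figures out of a GC log entry. */
final class GcLogLineDemo {

    /** First entry of the sample log above. */
    static final String SAMPLE =
        "[GC [PSYoungGen: 8192K->1000K(9216K)] 16004K->14604K(29696K), 0.0317424 secs]";

    // "<region>: <before>K-><after>K(<capacity>K)", tolerant of stray spaces.
    private static final Pattern REGION = Pattern.compile(
        "(\\w+):\\s*(\\d+)\\s*K\\s*->\\s*(\\d+)\\s*K\\s*\\((\\d+)\\s*K\\)");

    // Whole-heap figures: the unlabeled triple that follows a closing bracket.
    private static final Pattern HEAP = Pattern.compile(
        "\\]\\s*(\\d+)\\s*K\\s*->\\s*(\\d+)\\s*K\\s*\\((\\d+)\\s*K\\)");

    /** Returns {before, after, capacity} in KB for the named region, or null if absent. */
    static long[] parseRegion(String line, String region) {
        Matcher m = REGION.matcher(line);
        while (m.find()) {
            if (m.group(1).equals(region)) {
                return triple(m, 2);
            }
        }
        return null;
    }

    /** Returns {before, after, capacity} in KB for the whole heap, or null if absent. */
    static long[] parseHeap(String line) {
        Matcher m = HEAP.matcher(line);
        return m.find() ? triple(m, 1) : null;
    }

    private static long[] triple(Matcher m, int firstGroup) {
        return new long[] {
            Long.parseLong(m.group(firstGroup)),
            Long.parseLong(m.group(firstGroup + 1)),
            Long.parseLong(m.group(firstGroup + 2))
        };
    }

    public static void main(String[] args) {
        long[] young = parseRegion(SAMPLE, "PSYoungGen");
        long[] heap = parseHeap(SAMPLE);
        System.out.println("young gen freed " + (young[0] - young[1]) + "K");
        System.out.println("whole heap freed " + (heap[0] - heap[1]) + "K");
    }
}
```

For the first entry, the young generation dropped by 7192K while the whole heap dropped by only 1400K; the difference is the amount promoted to the old generation. In the Full GC entries nothing is reclaimed at all, which is the pattern that precedes an OutOfMemoryError.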

Common exceptions

StackOverflowError: stack overflow

OutOfMemoryError: Java heap space (insufficient heap space)

OutOfMemoryError: GC overhead limit exceeded (GC takes more than 98% of the time and reclaims less than 2% of the heap)

GC parameters

Stack Settings

-Xss: stack size of each thread

-Xms: initial heap size, 1/64 of physical memory by default

-Xmx: maximum heap size, 1/4 of physical memory by default

-Xmn: young generation size

-XX:NewSize: initial young generation size

-XX:NewRatio: the default of 2 means the young generation is 1/2 the size of the old generation, that is, 1/3 of the whole heap

-XX:SurvivorRatio: the default of 8 means one survivor space is 1/8 the size of Eden, that is, 1/10 of the young generation

-XX:MaxMetaspaceSize: maximum allowed size of metaspace, unlimited by default; the JVM expands metaspace dynamically
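The NewRatio and SurvivorRatio arithmetic can be worked through with concrete numbers. The sketch below is only the ratio arithmetic described above; GenerationSizingDemo is a hypothetical name, and a real JVM also aligns sizes and may resize the generations adaptively, so actual values can differ.

```java
/** Sketch: generation sizes implied by NewRatio and SurvivorRatio. */
final class GenerationSizingDemo {

    /** Young generation = heap / (NewRatio + 1), since old : young = NewRatio : 1. */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Old generation is whatever the young generation does not take. */
    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    /** One survivor space = young / (SurvivorRatio + 2), since Eden : S0 : S1 = SurvivorRatio : 1 : 1. */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Eden is the young generation minus the two survivor spaces. */
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    public static void main(String[] args) {
        long heap = 3000;                      // illustrative heap size in MB
        long young = youngMb(heap, 2);         // default NewRatio
        System.out.println("young    = " + young + " MB");
        System.out.println("old      = " + oldMb(heap, 2) + " MB");
        System.out.println("eden     = " + edenMb(young, 8) + " MB");
        System.out.println("survivor = " + survivorMb(young, 8) + " MB each");
    }
}
```

With a 3000 MB heap and the defaults, the young generation is 1000 MB (1/3 of the heap), Eden is 800 MB, and each survivor space is 100 MB (1/8 of Eden, 1/10 of the young generation).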

Garbage collection statistics

-XX:+PrintGC

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Xloggc:filename

Collector Settings

-XX:+UseSerialGC: use the serial collector

-XX:+UseParallelGC: use the parallel collector

-XX:+UseParallelOldGC: use the parallel collector for the old generation

-XX:+UseParNewGC: use the parallel collector for the young generation

-XX:+UseConcMarkSweepGC: use the CMS concurrent collector

-XX:+UseG1GC: use the G1 collector

-XX:ParallelGCThreads: number of threads used for garbage collection

Parallel Collector Settings

-XX:ParallelGCThreads: number of threads (CPUs) the parallel collector uses when collecting

-XX:MaxGCPauseMillis: maximum pause time for parallel collection

-XX:GCTimeRatio: ratio of garbage collection time to program running time; for a value of n, the formula is 1 / (1 + n)

CMS Collector Settings

-XX:+UseConcMarkSweepGC: use the CMS concurrent collector

-XX:+CMSIncrementalMode: enable incremental mode; suitable for single-CPU machines

-XX:ParallelGCThreads: number of threads (CPUs) used when the concurrent collector's young-generation collection runs in parallel

-XX:CMSFullGCsBeforeCompaction: number of CMS collections after which the memory is compacted

-XX:+CMSClassUnloadingEnabled: allow class metadata to be collected

-XX:+UseCMSInitiatingOccupancyOnly: start a CMS collection only when the occupancy threshold is reached

-XX:ParallelCMSThreads: number of CMS threads

-XX:CMSInitiatingOccupancyFraction: occupancy percentage at which a CMS collection is triggered

-XX:+UseCMSCompactAtFullCollection: whether the CMS collector defragments memory after a collection completes
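The GCTimeRatio formula ties back to the throughput definition given for Parallel Scavenge: throughput = user time / (user time + GC time). A small worked sketch follows; ThroughputDemo is a hypothetical name and the input numbers are made up for illustration.

```java
/** Sketch: throughput and the GC time share implied by -XX:GCTimeRatio. */
final class ThroughputDemo {

    /** Throughput = user time / (user time + GC time). */
    static double throughput(double userSeconds, double gcSeconds) {
        return userSeconds / (userSeconds + gcSeconds);
    }

    /** Share of total time GC may use for -XX:GCTimeRatio=n, that is, 1 / (1 + n). */
    static double gcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 s of user code and 1 s of GC gives 99% throughput.
        System.out.println("throughput  = " + throughput(99, 1));
        // GCTimeRatio=99 asks for at most 1% GC time, the same target.
        System.out.println("gc fraction = " + gcFraction(99));
        // GCTimeRatio=19 allows up to 5% GC time.
        System.out.println("gc fraction = " + gcFraction(19));
    }
}
```

The two quantities are complementary: a GCTimeRatio of n corresponds to a throughput target of n / (1 + n).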

G1 Collector Settings

-XX:+UseG1GC: use the G1 collector

-XX:ParallelGCThreads: number of GC worker threads

-XX:G1HeapRegionSize: region size (1MB to 32MB, must be a power of 2); by default the whole heap is divided into 2048 regions

-XX:GCTimeRatio: throughput target, an integer from 0 to 100 (default 9); for a value of n, the system spends no more than 1 / (1 + n) of its time on garbage collection

-XX:MaxGCPauseMillis: target pause time (default 200ms)

-XX:G1NewSizePercent: initial young generation size (default 5%)

-XX:G1MaxNewSizePercent: maximum young generation size

-XX:TargetSurvivorRatio: target survivor space occupancy (default 50%)

-XX:MaxTenuringThreshold: maximum tenuring threshold (default 15)

-XX:InitiatingHeapOccupancyPercent: IHOP threshold, the share of the whole heap occupied by the old generation (default 45%); mixed collection starts when it is exceeded

-XX:G1HeapWastePercent: percentage of the heap that may be wasted (default 5%)

-XX:G1MixedGCCountTarget: target number of mixed collections in a cycle (default 8)

Performance analysis and monitoring tools

jps: virtual machine process status tool

jstat: virtual machine statistics monitoring tool

jinfo: virtual machine configuration information tool

jmap: memory map tool

jhat: virtual machine heap dump snapshot analysis tool

jstack: stack trace tool

JConsole: Java monitoring and management console

VisualVM: troubleshooting tool
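Some of the information these tools display is also available inside a running program through the standard java.lang.management API. The sketch below is a minimal illustration; GcBeansDemo is a hypothetical name, and the collector and pool names it prints depend on which collector the JVM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

/** Sketch: reading collector and memory pool information in-process. */
final class GcBeansDemo {

    /** Names of the collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Names of the memory pools (heap generations, metaspace, code cache, ...). */
    static List<String> memoryPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("pools: " + memoryPoolNames());
    }
}
```

Starting the program with different collector flags, such as -XX:+UseSerialGC or -XX:+UseG1GC, changes the names that are listed, which is a quick way to confirm which collector combination is in effect.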
