2025-02-23 Update From: SLTechnology News&Howtos
This article analyzes the CPU cache and the JVM memory model.
In learning Java, the JVM memory model is very important. Here we first study the computer's memory model, which will help us understand JVM memory.
I. Computer storage media
Figure 1: simplified storage media for computers
(1) several types of storage media in the computer:
a. From bottom to top: hard disk, main memory, cache, registers.
b. Cache access time: on the order of 1 ns.
c. Main-memory read time: about 100 ns.
d. Hard-disk read time: on the order of milliseconds.
e. In general, the cache is about 10 to 100 times faster than main memory, and main memory is roughly 1 to 100,000 times faster than the hard disk.
(2) Mechanical hard disk access time analysis:
a. Seek time: the time it takes to move the read/write head to the correct track. The average seek time is about 3-15 ms.
b. Rotational latency: the time for the disk to rotate until the sector holding the requested data passes under the read/write head. It depends on the rotation speed and is usually taken as half the time of one full revolution. For example, a 7200 rpm disk has an average rotational latency of about 60 * 1000 / 7200 / 2 = 4.17 ms, while a 15000 rpm disk has an average rotational latency of 2 ms.
c. Data transfer time: the time to transfer the requested data, equal to the data size divided by the transfer rate. IDE/ATA interfaces reach about 133 MB/s and SATA II about 300 MB/s. Transfer time is usually much smaller than the first two components and can be ignored in rough calculations.
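The rotational-latency arithmetic above can be checked with a few lines of Java. This is a minimal sketch of the half-revolution rule; the class and method names are illustrative, not from the original text.

```java
// Sketch: average rotational latency from spindle speed (rpm), using the
// half-revolution rule described above. Names are illustrative.
public class DiskLatency {

    // One revolution takes 60,000 / rpm milliseconds; on average we wait
    // half a revolution before the target sector passes under the head.
    static double avgRotationalLatencyMs(int rpm) {
        return 60_000.0 / rpm / 2;
    }

    public static void main(String[] args) {
        System.out.printf("7200 rpm:  %.2f ms%n", avgRotationalLatencyMs(7200));   // ~4.17 ms
        System.out.printf("15000 rpm: %.2f ms%n", avgRotationalLatencyMs(15000));  // 2.00 ms
    }
}
```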
(3) extended understanding: sequential reading and random reading
It takes time for a mechanical hard disk's head to move to the correct track. In random reads and writes the head keeps moving, so the time is spent seeking and performance is low. For mechanical disks, sequential read/write performance is therefore very good, while random read/write performance is very poor.
(4) locality principle and disk pre-reading
Because of the nature of the storage medium, hard-disk access is far slower than main-memory access; with the added cost of mechanical movement, the disk is often several orders of magnitude slower than main memory. To improve efficiency, disk I/O must therefore be minimized. Since sequential disk reads are efficient (no seek time, only a little rotational delay), read-ahead can improve I/O efficiency for programs with good locality. A disk rarely reads strictly on demand: even if only one byte is needed, it reads ahead, loading a run of data starting from that position into memory in order. This rests on the famous principle of locality in computer science: when one piece of data is used, the data near it is usually used soon afterwards.
Note: this background also underlies the read/write efficiency of high-performance MySQL and is worth studying further.
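The locality principle can be observed from ordinary Java code: traversing a two-dimensional array in row order touches consecutive addresses (cache- and read-ahead-friendly), while column order strides across memory. The class below is an illustrative sketch, not from the original text; the array size is arbitrary.

```java
// Sketch: the effect of locality. Both methods compute the same sum, but
// row-major traversal visits consecutive addresses while column-major
// traversal strides across them, which is typically measurably slower.
public class LocalityDemo {

    static long sumRowMajor(int[][] m) {
        long s = 0;
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < m[i].length; j++)
                s += m[i][j];          // consecutive addresses: good locality
        return s;
    }

    static long sumColMajor(int[][] m) {
        long s = 0;
        for (int j = 0; j < m[0].length; j++)
            for (int i = 0; i < m.length; i++)
                s += m[i][j];          // strided access: poor locality
        return s;
    }

    public static void main(String[] args) {
        int n = 4096;                  // ~64 MB of int data, larger than any CPU cache
        int[][] m = new int[n][n];
        for (int[] row : m) java.util.Arrays.fill(row, 1);

        long t1 = System.nanoTime();
        long a = sumRowMajor(m);
        long t2 = System.nanoTime();
        long b = sumColMajor(m);
        long t3 = System.nanoTime();

        System.out.println("row-major: " + (t2 - t1) / 1_000_000 + " ms (sum " + a + ")");
        System.out.println("col-major: " + (t3 - t2) / 1_000_000 + " ms (sum " + b + ")");
    }
}
```

On typical hardware the row-major pass is several times faster, even though both loops do identical arithmetic.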
II. CPU cache
(1) several questions about cache
a. What is the CPU cache: a small, fast memory that sits between the CPU and main memory.
b. Why a CPU cache is needed:
1. Its capacity is much smaller than main memory, but it is much faster.
2. The cache exists mainly to bridge the gap between CPU speed and memory read/write speed: the CPU is so much faster than memory that it would otherwise spend a long time waiting for data to arrive or to be written back. The cache holds only a small part of memory, but it is the part the CPU is about to access. When the CPU needs data, it can often find it in the cache first, which speeds up reads.
c. Cache model
CPU -> first-level (L1) cache -> second-level (L2) cache -> third-level (L3) cache -> main memory
d. Cache consistency under multithreading:
Cache coherence protocols. The best known is Intel's MESI protocol, which keeps the copies of shared variables held in each cache consistent. Its core idea: when a CPU writes data and finds the variable is shared (that is, copies also exist in other CPUs' caches), it signals the other CPUs to mark their cache lines for that variable invalid, so that when another CPU next reads the variable it must fetch it from memory again.
(2) Multi-CPU caches and multi-threaded cache reads and writes
1. Each cpu has its own private cache.
2. Accessing data cached on another cpu is slow. When cpu1 accesses data cached on cpu2, a copy of that data is transferred into cpu1's cache.
3. Multiple cpu caches may hold copies of the same cache line.
4. When a cpu modifies cache-line data, it issues an RFO (Request For Ownership) to acquire exclusive rights to the line and marks it; threads on other cpus can then no longer operate on that line.
5. If the value of a variable copied into a cpu cache changes, the hardware forces the corresponding cache line in every cpu referencing the data to be refreshed, keeping the cache-line data consistent.
6. Cache updates are done not at the granularity of individual variables but at the granularity of the cache line. If a cache line contains n variables and the value of one of them changes, the remaining values in that line must also be reloaded.
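Point 6 is the root of what is usually called false sharing. The sketch below, with illustrative names and iteration counts not taken from the original text, puts two independent counters either on what is likely the same cache line or, via manual padding, on different lines; the padded version usually runs faster because each write no longer invalidates the other CPU's copy. Note the JVM may reorder fields, so manual padding is not guaranteed; the JDK-internal @Contended annotation is the reliable mechanism.

```java
// Sketch: false sharing. Two threads increment independent counters; when the
// counters likely share a cache line, each write invalidates the other CPU's
// copy (the RFO traffic described above), which usually slows both threads.
public class FalseSharingDemo {

    static class Adjacent {
        volatile long a;               // a and b likely share one 64-byte line
        volatile long b;
    }

    static class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7;  // padding to push b onto another line
        volatile long b;                   // (JVM field reordering may defeat this)
    }

    static long timeMs(Runnable r1, Runnable r2) throws InterruptedException {
        Thread t1 = new Thread(r1), t2 = new Thread(r2);
        long start = System.nanoTime();
        t1.start(); t2.start();
        t1.join(); t2.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        final int N = 10_000_000;
        Adjacent s = new Adjacent();
        Padded p = new Padded();
        long adjacent = timeMs(() -> { for (int i = 0; i < N; i++) s.a++; },
                               () -> { for (int i = 0; i < N; i++) s.b++; });
        long padded   = timeMs(() -> { for (int i = 0; i < N; i++) p.a++; },
                               () -> { for (int i = 0; i < N; i++) p.b++; });
        System.out.println("adjacent fields: " + adjacent + " ms, padded: " + padded + " ms");
    }
}
```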
Figure 2: multi-CPU cache read and write
(3) Multi-CPU read/write process:
1. A cache line holds the variables x and y, with copies saved in several cpu caches (cpu1, cpu2, ...).
2. cpu1 modifies x; cpu2 modifies y.
3. cpu1 acquires the right to proceed first and modifies x successfully.
4. The new value of x is flushed to main memory.
5. The cache line holding x and y in cpu2 is marked invalid.
6. cpu2's cache line reloads the latest x and y values from memory.
7. cpu2 acquires the right to proceed and modifies y successfully.
8. cpu1's cache line then repeats steps 4-6 for the write to y.
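In Java, the invalidate-and-reload behavior in the steps above surfaces through the volatile keyword: a volatile write is published to memory and stale cached copies are invalidated, so a reading thread observes the new value. A minimal illustrative sketch (class and field names are assumptions, not from the original text):

```java
// Sketch: visibility across threads. The volatile write to `flag` is flushed
// and other caches' copies are invalidated, so the spinning reader sees it.
// Without volatile the reader could, in principle, spin forever on a stale copy.
public class VisibilityDemo {

    static volatile boolean flag = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!flag) { /* spin until the write becomes visible */ }
            System.out.println("reader observed flag = true");
        });
        reader.start();

        Thread.sleep(100);   // let the reader start spinning on the old value
        flag = true;         // volatile write: invalidates cached copies
        reader.join();
        System.out.println("done");
    }
}
```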
III. Understanding the JVM memory model:
Figure 3: JVM memory model
By analogy with the computer's memory model, you can imagine that the JVM's memory model also needs something like a main memory, caches, and a CPU processor.
JVM runtime data areas
. Thread-shared areas: method area and heap
. Thread-private areas: each thread has its own stack, program counter (PC register), and native method stack
(a). Heap: objects, arrays, and instance variables created at runtime with new are stored here. Heap memory allocation can be further subdivided: because of the JVM's garbage-collection algorithms, the heap is split into young, old, and permanent generations.
Young generation: divided into Eden and two Survivor areas; its size can be configured with the -Xmn parameter.
Old generation: objects that survive multiple young GCs are moved to the old generation.
Permanent generation: some large objects that are not collected by GC are stored in the permanent generation.
So the heap corresponds to the computer's main memory, where all object data is stored.
(b). Method area: stores Java class meta-information, data structures, class information, and initialized static variables. The method area can be viewed as the computer's third-level cache, since program execution depends on fetching class information.
(c). Native method stack: call information for native methods; also comparable to the CPU's third-level cache.
(d). Stack memory: each thread runs with its own stack space, which mainly stores object references, some primitive-type variables, and address information. Stack memory is thus comparable to the first-level cache.
1. Each thread has its own stack frames; threads do not interfere with each other.
2. Primitive values and object handles (references) are stored in stack memory.
3. Access is faster than heap memory, second only to registers.
4. Stack overflow: deep recursion leaves too little memory to create a new stack frame, raising StackOverflowError.
5. Stack memory exhaustion: when there is not enough memory to create new threads, an OutOfMemoryError is raised.
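Point 4 is easy to reproduce: unbounded recursion keeps pushing stack frames until no new frame fits. A minimal sketch (names are illustrative; the depth reached depends on the -Xss stack-size setting):

```java
// Sketch: exhausting a thread's stack with unbounded recursion. Each call
// pushes a new frame; when no frame fits, the JVM raises StackOverflowError.
public class StackDemo {

    static int depth = 0;

    static void recurse() {
        depth++;
        recurse();   // never returns normally
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```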
(e). Program counter: records the JVM instruction the current thread is executing; comparable to the CPU processor.
Corrections to anything wrong here are welcome.
Figure 4. JVM memory model
The above is an analysis of the CPU cache and the JVM memory model; hopefully you have learned something from it.