Analysis of the Memory Layout of Java Objects, with Examples

2025-01-17 Update From: SLTechnology News&Howtos shulou

This article explains the memory layout of Java objects through worked examples. The content is detailed but easy to follow, and the experiments are quick to reproduce. Let's take a look.

The code in this article was run on JDK 1.8.0_261, 64-bit HotSpot.

1. Overview of object memory structure

Before looking at how an object is laid out in memory, let's briefly review how an object is created:

1. The JVM loads the class file of the object's class into the method area.

2. The JVM finds the entry point, the main method, pushes a stack frame for it, and executes the code that creates the object.

3. The reference to the object is allocated in the stack frame of the main method, the object itself is created on the heap, and the reference on the stack points to the object on the heap.

So once an object is instantiated it lives in heap memory, and it consists of three parts, as shown in the following figure:

A brief description of each part:

Object header: stores run-time state information about the object and a pointer to the metadata of the class the object belongs to. If the object is an array, the header also stores the array length.

Instance data: stores the actual data of the object, that is, the value of each field, including the fields inherited from a parent class. The order in which fields are stored depends on the size of each data type and on the allocation policy of the virtual machine.

Alignment padding: a 64-bit JVM requires the size of every object to be a multiple of 8 bytes, so when the size of an object is not a multiple of 8 bytes, padding bytes are added to it. The diagram draws the padding with a dotted line because padding is not always present; this is explained later, when object sizes are calculated.

2. A brief introduction to JOL

Before studying the memory structure of objects, let's introduce the tool we will use. The OpenJDK project provides JOL (Java Object Layout), a tool for inspecting the memory layout of objects. Add it to a Maven project with the following coordinates:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.14</version>
</dependency>

Use the methods provided by JOL to print information about the JVM from your code:

System.out.println(VM.current().details());

The output shows that we are running a 64-bit JVM with pointer compression turned on, and that objects are aligned to 8 bytes by default. How to view the memory layout of an object with JOL is shown in detail in the examples that follow. Let's now turn to the memory layout itself.
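If JOL is not on the classpath, roughly the same two facts can be read from the running JVM itself. The following is only a sketch under assumptions: it relies on the HotSpot-specific `HotSpotDiagnosticMXBean` from `com.sun.management`, so it is not portable to other JVMs, and the class name `VmFlagsSketch` and the helper `flag` are made up for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class VmFlagsSketch {
    /** Returns the current value of a HotSpot flag, or "unknown" if this JVM does not expose it. */
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // For example IllegalArgumentException when the flag does not exist on this JVM
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Assumption: a 64-bit HotSpot JVM, as used throughout this article
        System.out.println("UseCompressedOops      = " + flag("UseCompressedOops"));
        System.out.println("ObjectAlignmentInBytes = " + flag("ObjectAlignmentInBytes"));
    }
}
```

The values printed depend on the JVM and its startup parameters, so no fixed output is shown here.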

3. Object header

First, look at the parts that make up the object header. The structure differs between ordinary objects and array objects: only an array object has an array length part.

The mark word occupies 8 bytes of the object header. With pointer compression turned on, which is the default, the klass pointer occupies 4 bytes, and the array length of an array object occupies another 4 bytes. With this basic structure in mind, take an empty object without any fields as an example and look at its memory layout. Create the User class:

public class User {
}

Use JOL to view the memory layout of the object:

// requires: import org.openjdk.jol.info.ClassLayout;
public static void main(String[] args) {
    User user = new User();
    // View the memory layout of the object
    System.out.println(ClassLayout.parseInstance(user).toPrintable());
}

Run the code and look at the output. The columns have the following meaning:

OFFSET: the offset of the entry, in bytes

SIZE: the memory it occupies, in bytes

TYPE: the type as defined in the class

DESCRIPTION: a description of the entry; "object header" marks the object header and "alignment" marks alignment padding

VALUE: the value currently stored in memory

The object occupies 16 bytes in total. The 8-byte mark word plus the 4-byte klass pointer gives 12 bytes, which is not a multiple of 8, so 4 bytes of padding are added:

8B (mark word) + 4B (klass pointer) + 0B (instance data) + 4B (padding)
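The arithmetic above can be written as a small helper. This is a minimal sketch, not JOL's or HotSpot's implementation; the class name `ObjectSizeSketch` and the method `alignUp` are hypothetical names chosen for this example.

```java
public class ObjectSizeSketch {
    /** Rounds size up to the next multiple of alignment (alignment must be a power of two). */
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    public static void main(String[] args) {
        long markWord = 8, klassPointer = 4, instanceData = 0;
        long unpadded = markWord + klassPointer + instanceData; // 12 bytes
        long total = alignUp(unpadded, 8);                      // 16 bytes
        // prints "padding = 4B, total = 16B"
        System.out.println("padding = " + (total - unpadded) + "B, total = " + total + "B");
    }
}
```

A size that is already a multiple of 8 is returned unchanged, which is why padding is not a fixed part of every object.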

This gives an intuitive picture of the simplest possible object, an empty object without fields. On this basis, let's look more closely at the individual parts of the object header.

3.1 Mark Word

The mark word in the object header is 64 bits wide and stores the object's own run-time data.

3.1.1 Lock upgrade based on the mark word

Before JDK 6, locking with the synchronized keyword always used a heavyweight lock. Heavyweight locks serialize the execution of threads and make the CPU switch frequently between user mode and kernel mode. As synchronized was optimized, the concept of lock upgrade was introduced, together with biased locks, lightweight locks, and heavyweight locks. In the mark word, the lock flag occupies 2 bits and the biased-lock flag occupies 1 bit. Together, these last 3 bits identify the lock state of the object and determine what the remaining bits of the mark word store.

The lock upgrade process based on the mark word is as follows:

1. When the lock object has just been created and no thread competes for it, the object is in the lock-free state. In the layout of the empty object printed above, once byte order (endianness) is taken into account, the last 8 bits are 00000001, which means unlocked and not biasable. The reason is that the JDK enables biased locking only after a delay of 4 seconds, so only objects created more than 4 seconds after the JVM starts are biasable. The delay can be removed with a JVM parameter:

-XX:BiasedLockingStartupDelay=0

With this parameter set, the last three bits are 101, which means the lock of the object is not held by any thread and the object is biasable.

2. As long as there is no thread competition, the first thread that acquires the lock writes its own thread ID into the mark word of the object through CAS. When the same thread acquires the lock again, it only compares its own thread ID with the one in the mark word; if they match, it obtains the lock directly. The lock stays biased towards that thread, which means that a biased lock is never released actively.

Use the following code to test the same thread acquiring the lock repeatedly:

public static void main(String[] args) {
    User user = new User();
    synchronized (user) {
        System.out.println(ClassLayout.parseInstance(user).toPrintable());
    }
    System.out.println(ClassLayout.parseInstance(user).toPrintable());
    synchronized (user) {
        System.out.println(ClassLayout.parseInstance(user).toPrintable());
    }
}

The output shows that when a thread locks, unlocks, and re-locks the object, the mark word does not change: the thread pointer in the biased lock always points to the same thread.

3. When two or more threads acquire the lock alternately, without contending for it at the same time, the biased lock is upgraded to a lightweight lock. At this stage a thread tries to acquire the lock by spinning on CAS, which avoids the cost of switching between user mode and kernel mode that blocking the thread would incur. The test code is as follows:

public static void main(String[] args) throws InterruptedException {
    User user = new User();
    synchronized (user) {
        System.out.println("--MAIN--:" + ClassLayout.parseInstance(user).toPrintable());
    }
    Thread thread = new Thread(() -> {
        synchronized (user) {
            System.out.println("--THREAD--:" + ClassLayout.parseInstance(user).toPrintable());
        }
    });
    thread.start();
    thread.join();
    System.out.println("--END--:" + ClassLayout.parseInstance(user).toPrintable());
}

The lock state changes as follows:

The main thread locks the user object first, and this first lock is a biased lock (101)

After the main thread releases the lock, the child thread locks the user object, and the biased lock is upgraded to a lightweight lock (00)

After the lightweight lock is released, no thread competes for the user object, and it returns to the unlocked, non-biasable state (001). If a thread later tries to lock the user object, it gets a lightweight lock directly rather than a biased lock

4. When two or more threads contend for the same object at the same time, the lightweight lock is upgraded to a heavyweight lock so that useless spinning does not burn CPU. The pointer in the mark word now points to the start address of the monitor object. The test code is as follows:

// requires: import java.util.concurrent.TimeUnit;
public static void main(String[] args) {
    User user = new User();
    new Thread(() -> {
        synchronized (user) {
            System.out.println("--THREAD1--:" + ClassLayout.parseInstance(user).toPrintable());
            try {
                TimeUnit.SECONDS.sleep(2);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }).start();
    new Thread(() -> {
        synchronized (user) {
            System.out.println("--THREAD2--:" + ClassLayout.parseInstance(user).toPrintable());
            try {
                TimeUnit.SECONDS.sleep(2);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }).start();
}

The output shows that when two threads compete for the lock on the user object at the same time, it is upgraded to a heavyweight lock (10).
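The bit patterns seen in the four steps above (001, 101, 00, 10) can be decoded mechanically. The sketch below is an illustration only: `LockStateSketch` and `decode` are hypothetical names, and the 11 pattern, which this article does not discuss, is labelled from HotSpot's convention of using it as a GC mark.

```java
public class LockStateSketch {
    /** Maps the lowest three bits of a mark word to the lock states described above. */
    static String decode(long markWord) {
        int lockBits = (int) (markWord & 0b11);        // 2-bit lock flag
        boolean biased = ((markWord >>> 2) & 1) == 1;  // 1-bit biased-lock flag
        switch (lockBits) {
            case 0b01: return biased ? "biased" : "unlocked";
            case 0b00: return "lightweight";
            case 0b10: return "heavyweight";
            default:   return "marked for GC"; // 11: assumption, not covered in the text above
        }
    }

    public static void main(String[] args) {
        long[] samples = {0b001, 0b101, 0b000, 0b010};
        for (long sample : samples) {
            System.out.println(Long.toBinaryString(sample) + " -> " + decode(sample));
        }
    }
}
```

Only the last three bits are inspected, so the function can be fed a full 64-bit mark word as printed by JOL once the bytes have been put into the right order.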

3.1.2 Additional information

Other important information in the mark word:

Hashcode: in the unlocked state the hash code is computed lazily; it is calculated and written to the mark word only when hashCode() is called for the first time. The following code verifies this:

public static void main(String[] args) {
    User user = new User();
    // print the memory layout
    System.out.println(ClassLayout.parseInstance(user).toPrintable());
    // compute the hashCode
    System.out.println(user.hashCode());
    // print the memory layout again
    System.out.println(ClassLayout.parseInstance(user).toPrintable());
}

The output shows that before hashCode() is called, the 31 hash bits are all 0. After the call, once byte order (endianness) is taken into account, the bits are:

1011001001101100011010010101101

Converted from binary to decimal, this is the hash 1496724653. Note that the mark word is written only by the non-overridden Object.hashCode() method or by System.identityHashCode(Object); the result of a user-defined hashCode() method is not stored there.
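The binary-to-decimal conversion is easy to check with the standard library. A minimal sketch follows; the class name `HashBitsSketch` and the method `fromBits` are made up for this example.

```java
public class HashBitsSketch {
    /** Parses a string of hash bits, most significant bit first, into the hash value. */
    static int fromBits(String bits) {
        return Integer.parseInt(bits, 2);
    }

    public static void main(String[] args) {
        String bits = "1011001001101100011010010101101"; // the 31 hash bits quoted above
        int hash = fromBits(bits);
        // prints "31 bits -> 1496724653"
        System.out.println(bits.length() + " bits -> " + hash);
        // Converting back yields the same bit string
        System.out.println(Integer.toBinaryString(hash).equals(bits));
    }
}
```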

You may notice that once the object is locked, the mark word has no room left for the hash code. In that case the hash code is moved to the ObjectMonitor of the heavyweight lock.

Epoch: the epoch (timestamp) of the biased lock

Generation age (age): during garbage collection, the age of an object is increased by 1 each time it survives a Young GC. The 4 bits of this field give a maximum age of 15, which is why an object is moved to the old generation once its age reaches 15. The age threshold can be changed at startup with the parameter:

-XX:MaxTenuringThreshold

If the threshold is set to a value greater than 15, the JVM reports an error at startup
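Why 4 bits cap the age at 15 can be shown with a few bit operations. The sketch assumes that the age field sits directly above the three lock bits (bits 3 to 6 of the mark word on 64-bit HotSpot); `AgeBitsSketch` and its members are hypothetical names used only here.

```java
public class AgeBitsSketch {
    static final int AGE_SHIFT = 3;                    // skip 2 lock bits + 1 biased-lock bit (assumed layout)
    static final int AGE_BITS = 4;                     // width of the age field
    static final long AGE_MASK = (1L << AGE_BITS) - 1; // 0b1111 = 15, the largest storable age

    /** Extracts the generation age from a mark word value. */
    static int age(long markWord) {
        return (int) ((markWord >>> AGE_SHIFT) & AGE_MASK);
    }

    public static void main(String[] args) {
        System.out.println("max age = " + AGE_MASK);       // prints "max age = 15"
        System.out.println("age = " + age(0b0101_001L));   // prints "age = 5"
    }
}
```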

3.2 Klass Pointer

The klass pointer points to the class metadata in the method area; the virtual machine uses it to determine which class the object is an instance of. A 64-bit JVM supports pointer compression, and the size of the klass pointer depends on whether pointer compression is enabled:

When pointer compression is turned off, the klass pointer occupies 8B (64 bits)

When pointer compression is turned on, the klass pointer occupies 4B (32 bits)

In versions after JDK 6, pointer compression is enabled by default. It can be turned on or off with a startup parameter:

# enable pointer compression
-XX:+UseCompressedOops
# disable pointer compression
-XX:-UseCompressedOops

Taking the User class as an example again, turn off pointer compression and check the memory layout of the object once more.

The object still occupies 16 bytes, but its composition has changed: the 8-byte mark word plus the 8-byte klass pointer already satisfies the alignment requirement, so no padding is needed.

8B (mark word) + 8B (klass pointer) + 0B (instance data) + 0B (padding)

3.2.1 Principle of pointer compression

Now that we know what pointer compression does, let's look at how it is implemented. Without pointer compression, the memory address of an object is represented with 64 bits, so the range of addresses that can be described is:

0 ~ 2^64 - 1

With pointer compression turned on, a pointer has 4 bytes, or 32 bits, and can represent 2^32 distinct values. If these values were real addresses, then, because the smallest unit the CPU can address is one byte, they would cover only 4GB of memory. That is far from enough. However, as mentioned before, Java objects are aligned to 8 bytes by default, so every object occupies a multiple of 8 bytes. This means the JVM does not need to store the real memory address of an object; it can store a mapped address instead, which works like the index of an 8-byte slot.

The mapping itself is simple. Because every object is 8-byte aligned, the last 3 bits of its address are always 0, so these 3 bits can be dropped when the pointer is stored, and the value that remains fits into 32 bits. This compresses the pointer from 8 bytes to 4 bytes. When the pointer is used, appending three 0 bits to the compressed value restores the real address.

After compression, each value of the 32-bit pointer stands for one 8-byte unit, which is equivalent to enlarging the original address space 8 times. With 8-byte alignment, 32 bits can therefore address up to 2^32 * 8 = 32GB of memory, and the address range is:

0 ~ (2^32 - 1) * 8
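The mapping can be sketched as a pair of shift operations. This is a simplified model, not HotSpot's code: it assumes a heap that starts at address 0 (a real JVM may also add a heap base address), and `CompressedPointerSketch`, `encode`, and `decode` are hypothetical names.

```java
public class CompressedPointerSketch {
    static final int SHIFT = 3; // log2(8): objects are 8-byte aligned

    /** Drops the three always-zero low bits so that the address fits into 32 bits. */
    static int encode(long address) {
        if ((address & 7) != 0 || (address >>> SHIFT) > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("address is unaligned or out of range");
        }
        return (int) (address >>> SHIFT);
    }

    /** Restores the real address by appending three zero bits. */
    static long decode(int compressed) {
        return (compressed & 0xFFFFFFFFL) << SHIFT; // mask: treat the int as unsigned
    }

    public static void main(String[] args) {
        long highest = ((1L << 32) - 1) * 8; // (2^32 - 1) * 8, the top of the range above
        int compressed = encode(highest);
        System.out.println(Long.toHexString(decode(compressed))); // prints "7fffffff8"
    }
}
```

The mask in `decode` matters because Java's `int` is signed; without it, compressed values above 2^31 would decode to negative addresses.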

Since at most 32GB can be addressed, pointer compression stops working when the configured maximum heap size reaches or exceeds this value. Configure the JVM startup parameter:

-Xmx32g

With this setting, pointer compression is disabled and pointers are 8 bytes long again. What if a business scenario needs more than 32GB of memory? The addressable range can be extended further by changing the default alignment. Here the alignment is changed to 16 bytes:

-XX:ObjectAlignmentInBytes=16 -Xmx32g

The output shows that the compressed pointer occupies 4 bytes and that the object is padded and aligned to 16 bytes. By the calculation above, pointer compression now stops working only when the maximum heap size is configured as 64GB.
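The relationship between alignment and addressable heap size is a single multiplication. A small sketch, with the hypothetical names `AlignmentRangeSketch` and `addressableBytes`:

```java
public class AlignmentRangeSketch {
    /** Bytes addressable by a 32-bit compressed pointer when objects are aligned to the given width. */
    static long addressableBytes(int alignment) {
        return (1L << 32) * alignment;
    }

    public static void main(String[] args) {
        for (int alignment : new int[] {8, 16}) {
            long gb = addressableBytes(alignment) >> 30; // bytes -> GB
            // prints "8-byte alignment -> 32 GB" and "16-byte alignment -> 64 GB"
            System.out.println(alignment + "-byte alignment -> " + gb + " GB");
        }
    }
}
```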

A brief summary of pointer compression:

Pointer compression combines address mapping with alignment padding to extend the range of memory that a 32-bit pointer can address.

Pointer compression saves memory and improves the addressing efficiency of the program.

It is best not to set the heap size above 32GB; beyond that, pointer compression is disabled and memory is wasted.

Pointer compression applies not only to the klass pointer in the object header, but also to fields of reference types and to the elements of arrays of reference types

3.3 Array length

If an object is an array, the object header has an extra slot that holds the length of the array and occupies 4 bytes (32 bits). Test with the following code:

public static void main(String[] args) {
    User[] user = new User[2];
    // View the memory layout of the object
    System.out.println(ClassLayout.parseInstance(user).toPrintable());
}

From top to bottom, the memory structure is:

8-byte mark word

4-byte klass pointer

4-byte array length with the value 2, meaning the array has two elements

With pointer compression turned on, each reference occupies 4 bytes, so the two elements of the array occupy 8 bytes.

Note that when pointer compression is turned off, a segment of alignment padding follows the array length.

By calculation:

8B (mark word) + 8B (klass pointer) + 4B (array length) + 16B (instance data) = 36B

The size has to be aligned to 8 bytes, and here the JVM inserts the 4 bytes of padding between the array length and the instance data, for a total of 40 bytes.
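Both array cases can be reproduced with a short calculation. The sketch models the JDK 8 layout described in this section, under the assumption that the header is padded so that the elements start on an 8-byte boundary; `ArraySizeSketch` and `referenceArraySize` are hypothetical names.

```java
public class ArraySizeSketch {
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    /** Size in bytes of a reference array such as User[length]. */
    static long referenceArraySize(int length, boolean compressedPointers) {
        long pointer = compressedPointers ? 4 : 8;
        // mark word + klass pointer + array length, padded to an 8-byte boundary
        long header = alignUp(8 + pointer + 4, 8);
        return alignUp(header + length * pointer, 8);
    }

    public static void main(String[] args) {
        System.out.println(referenceArraySize(2, true));  // prints 24 (16B header + 8B elements)
        System.out.println(referenceArraySize(2, false)); // prints 40 (20B header + 4B padding + 16B elements)
    }
}
```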

4. Instance data

Instance data holds the information the object actually stores: the contents of the fields of the various data types defined in the code. If there is an inheritance relationship, the subclass also contains the fields inherited from the parent class.

Basic data types:

Type            Bytes
byte, boolean   1
char, short     2
int, float      4
long, double    8

Reference data types:

A reference occupies 8 bytes when pointer compression is turned off, and 4 bytes when pointer compression is turned on.

4.1 Field reordering

Add some fields of basic data types to the User class:

public class User {
    int id, age, weight;
    byte sex;
    long phone;
    char local;
}

The printed layout shows that the order of the fields in memory differs from the order in which they are defined in the class. The JVM reorders fields of primitive types to keep memory aligned. The rules are:

Fields are arranged by the size of their data type, from largest to smallest

Fields of the same size are placed next to each other

If a field is L bytes long, its offset (OFFSET) must be a multiple of L, that is, nL for some integer n

The first two rules are easy to understand. Here is an example to explain the third:

Because a long is 8 bytes, its offset must be 8n. The object header occupies the first 12 bytes, so the smallest possible offset of a long field is 16. The printed layout shows that when the size of the object header is not a multiple of 8 bytes (it is 8n+4 bytes), the gap is filled with fields of 4, 2, and 1 bytes, from largest to smallest. To distinguish this from alignment padding it can be called pre-padding. If the size is still not a multiple of 8 bytes after pre-padding, alignment padding is added as well. Pre-padding is the one case in which the field order breaks the first rule above.

So in the layout above, a 4-byte int is used for pre-padding, and the remaining fields follow the first rule, from largest to smallest. If we delete the three int fields and look at the memory layout again:

The char and byte fields are moved in front of the long field as pre-padding, followed by 1 byte of alignment padding.
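The three rules plus pre-padding can be simulated in a few lines. This is a simplified model written for this article, not HotSpot's actual layout algorithm: it handles only primitive fields of a single class, and `FieldLayoutSketch`, `Field`, and `layout` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FieldLayoutSketch {
    /** Hypothetical descriptor: a field name and its size in bytes (1, 2, 4 or 8). */
    static final class Field {
        final String name;
        final int size;
        Field(String name, int size) { this.name = name; this.size = size; }
    }

    static int alignUp(int value, int alignment) {
        return (value + alignment - 1) & ~(alignment - 1);
    }

    /** Returns field name -> offset, in placement order, under the simplified rules. */
    static Map<String, Integer> layout(int headerSize, List<Field> declared) {
        List<Field> remaining = new ArrayList<>(declared);
        // Rules 1 and 2: largest first; the sort is stable, so equal sizes stay adjacent
        remaining.sort(Comparator.comparingInt((Field f) -> -f.size));
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int offset = headerSize;
        if (!remaining.isEmpty() && remaining.get(0).size == 8) {
            // Pre-padding: fill the gap before the next 8-byte boundary with smaller fields
            int gapEnd = alignUp(headerSize, 8);
            for (Iterator<Field> it = remaining.iterator(); it.hasNext(); ) {
                Field f = it.next();
                if (f.size <= gapEnd - offset && offset % f.size == 0) {
                    offsets.put(f.name, offset);
                    offset += f.size;
                    it.remove();
                }
            }
        }
        // Rule 3: each remaining field goes to the next offset that is a multiple of its size
        for (Field f : remaining) {
            offset = alignUp(offset, f.size);
            offsets.put(f.name, offset);
            offset += f.size;
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<Field> user = Arrays.asList(
                new Field("id", 4), new Field("age", 4), new Field("weight", 4),
                new Field("sex", 1), new Field("phone", 8), new Field("local", 2));
        // prints {id=12, phone=16, age=24, weight=28, local=32, sex=34}
        System.out.println(layout(12, user));
    }
}
```

With the three int fields removed, the same function places local at 12, sex at 14, and phone at 16, which leaves the single padding byte at offset 15 described above.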

4.2 Classes with a parent class

When a class has a parent class, the overall principle is that the fields defined in the parent class come before the fields defined in the subclass

public class A {
    int i1, i2;
    long l1, l2;
    char c1, c2;
}

public class B extends A {
    boolean b1;
    double d1, d2;
}

If the parent class needs trailing padding, shorter fields of the subclass may be moved forward into that gap, but the overall principle that subclass fields come after parent-class fields still holds.

public class A {
    int i1, i2;
    long l1;
}

public class B extends A {
    int i1, i2;
    long l1;
}

The layout shows that a shorter field of the subclass is moved forward to fill the trailing gap of the parent class.

The leading padding of the parent class is inherited by the subclass

public class A {
    long l;
}

public class B extends A {
    long l2;
    int i1;
}

On its own, class B would be exactly 8-byte aligned and need no alignment padding. Once B extends A, it inherits the leading padding of class A, so alignment padding is also needed at the end of class B.
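The two sizes compared in the last example can be worked out by hand. The sketch below is only that arithmetic written as code, assuming a 12-byte header and that the fields of B start where the fields of A end; `InheritedPaddingSketch` and its methods are hypothetical names.

```java
public class InheritedPaddingSketch {
    static final int HEADER = 12; // mark word (8B) + compressed klass pointer (4B)

    static int alignUp(int value, int alignment) {
        return (value + alignment - 1) & ~(alignment - 1);
    }

    /** class B { long l2; int i1; } without a parent class. */
    static int sizeOfStandaloneB() {
        int i1Offset = HEADER;                   // 12: the int fills the gap after the header
        int l2Offset = alignUp(i1Offset + 4, 8); // 16
        return alignUp(l2Offset + 8, 8);         // 24, no padding needed
    }

    /** class A { long l; } and class B extends A { long l2; int i1; }. */
    static int sizeOfBExtendingA() {
        int lOffset = alignUp(HEADER, 8);        // 16: 4 bytes of padding at offset 12
        int endOfA = lOffset + 8;                // 24
        int l2Offset = alignUp(endOfA, 8);       // 24
        int i1Offset = l2Offset + 8;             // 32
        return alignUp(i1Offset + 4, 8);         // 36 rounded up to 40
    }

    public static void main(String[] args) {
        System.out.println(sizeOfStandaloneB()); // prints 24
        System.out.println(sizeOfBExtendingA()); // prints 40
    }
}
```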

4.3 Reference data types

The examples above only discuss the ordering of basic data types. What happens when there are reference types? Add reference-type fields to the User class:

public class User {
    int id;
    String firstName;
    String lastName;
    int age;
}

The layout shows that by default fields of basic data types come before fields of reference types. This order can be changed with a JVM startup parameter:

-XX:FieldsAllocationStyle=0

Run the code again and the reference-type fields are placed first:

The values of FieldsAllocationStyle mean:

0: references to ordinary objects are placed first, followed by fields of basic data types

1: the default; fields of basic data types are placed first, followed by references to ordinary objects

4.4 Static variables

Building on the above, add a static variable to the class:

public class User {
    int id;
    static byte local;
}

The result shows that the static variable is not part of the object's memory layout and does not count towards the object's size, because a static variable belongs to the class rather than to any single object.
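The same distinction is visible through reflection, without JOL. A minimal sketch, in which `InstanceFieldsSketch` and `instanceFields` are hypothetical names and the nested `User` class mirrors the one above:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class InstanceFieldsSketch {
    static class User {
        int id;
        static byte local;
    }

    /** Names of the declared fields that are stored in every instance, that is, the non-static ones. */
    static List<String> instanceFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip static fields (they belong to the class) and compiler-generated fields
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(instanceFields(User.class)); // prints [id]
    }
}
```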

5. Alignment padding

In HotSpot's automatic memory management system, the start address of an object must be a multiple of 8 bytes, which means the size of an object must also be a multiple of 8 bytes. If the instance data does not end on such a boundary, the gap is filled with alignment padding. The padding bits are only placeholders and carry no meaning.

The earlier examples already cover alignment padding well, so here are just a few additions:

When pointer compression is enabled and the class has fields of type long or double, a gap forms between the object header and the instance data. To save space, shorter fields are moved into this gap by default. This behaviour can be turned on or off with a JVM parameter:

# on
-XX:+CompactFields
# off
-XX:-CompactFields

With the option turned off, the test shows that the shorter fields are no longer moved forward:

The section on pointer compression mentioned that the alignment width can be changed. This is also done with a JVM parameter:

-XX:ObjectAlignmentInBytes

The default alignment width is 8. The value can be changed to another power of 2; 8-byte and 16-byte alignment are the typical choices. The test below changes it to 16-byte alignment:

In the example above, after the change to 16 bytes the fields in the last row occupy only 6 bytes, so 10 bytes of padding are added. In general, changing the alignment parameter is not recommended: an alignment width that is too large can waste memory.
