
Example Analysis of StringTable in JVM

Updated 2025-01-16. From: SLTechnology News&Howtos (shulou), Internet Technology


This article works through examples of the StringTable (the string constant pool) in the JVM: how String is stored, how concatenation and intern() behave, and how the pool is garbage collected.

StringTable

Basic characteristics of String

String: a string, written between a pair of double quotes ("").

String s1 = "Nemo"; // defined as a literal

String s2 = new String("Nemo"); // defined with new

String is declared final and cannot be subclassed.

String implements the Serializable interface, so strings can be serialized.

String implements the Comparable interface, so strings can be compared for ordering.

In JDK 8 and earlier, String stores its character data internally in a private final char[] value.

JDK 9 changed this to a byte[].

Why did JDK9 change the structure?

Official statement: http://openjdk.java.net/jeps/254

The current implementation of the String class stores characters in a char array, using two bytes (16 bits) for each character. Data gathered from many different applications indicates that strings are a major component of heap usage and that most String objects contain only Latin-1 characters. Such characters require only one byte of storage, so half of the space in the internal char arrays of these String objects goes unused.

We propose to change the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding-flag field. The new String class stores characters encoded either as ISO-8859-1/Latin-1 (one byte per character) or as UTF-16 (two bytes per character), depending on the contents of the string. The encoding flag indicates which encoding is used.

Conclusion: String no longer stores its data in a char[]. It uses a byte[] plus an encoding flag, which saves space.

// before (JDK 8 and earlier)
private final char[] value;

// after (JDK 9 and later)
private final byte[] value;

String-related classes such as StringBuffer and StringBuilder were changed in the same way.
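The space saving described in JEP 254 can be illustrated from the outside by encoding the same text both ways. This is only a sketch: the class and method names are made up for illustration, and the code measures encoded byte counts through the public getBytes API rather than inspecting the private value field.

```java
import java.nio.charset.StandardCharsets;

class CompactStringSketch {

    // Bytes needed when every character fits in one byte (Latin-1).
    static int latin1Bytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed with two bytes per char (UTF-16, no byte-order mark).
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "Nemo";
        System.out.println(latin1Bytes(text)); // 4
        System.out.println(utf16Bytes(text));  // 8
    }
}
```

For a Latin-1-only string the one-byte encoding needs half the bytes, which is the saving the byte[] plus encoding-flag layout is aiming at.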

The immutability of String

String represents an immutable sequence of characters.

When a string variable is assigned a new value, a new memory region is used for that value; the original value is not overwritten.

When an existing string is concatenated with another, a new memory region is allocated for the result; the original value is not modified.

When replace() is called on a String to change a character or substring, a new memory region is likewise allocated for the result; the original value is not modified.

When a string is assigned from a literal (as opposed to new), the string value lives in the string constant pool.

Code

/**
 * String immutability
 *
 * @author: Nemo
 */
public class StringTest1 {

    public static void test1() {
        // Defined as literals: "abc" is stored in the string constant pool
        String s1 = "abc";
        String s2 = "abc";
        System.out.println(s1 == s2);
        s1 = "hello";
        System.out.println(s1 == s2);
        System.out.println(s1);
        System.out.println(s2);
        System.out.println("-");
    }

    public static void test2() {
        String s1 = "abc";
        String s2 = "abc";
        // Modifying s2 creates a new object; the original "abc" is unchanged
        s2 += "def";
        System.out.println(s1);
        System.out.println(s2);
        System.out.println("-");
    }

    public static void test3() {
        String s1 = "abc";
        String s2 = s1.replace('a', 'm');
        System.out.println(s1);
        System.out.println(s2);
    }

    public static void main(String[] args) {
        test1();
        test2();
        test3();
    }
}

Running result

true
false
hello
abc
-
abc
abcdef
-
abc
mbc

Interview question

/**
 * Interview question
 *
 * @author: Nemo
 */
public class StringExer {
    String str = new String("good");
    char[] ch = {'t', 'e', 's', 't'};

    public void change(String str, char[] ch) {
        str = "test ok"; // rebinds only the local parameter; ex.str is untouched
        ch[0] = 'b';     // mutates the array shared with the caller
    }

    public static void main(String[] args) {
        StringExer ex = new StringExer();
        ex.change(ex.str, ex.ch);
        System.out.println(ex.str);
        System.out.println(ex.ch);
    }
}

Output result

good
best

The string constant pool does not store two strings with the same content.

The String pool is a fixed-size hash table (the StringTable) with a default size of 1009. If a very large number of strings is put into the pool, hash collisions become severe and the bucket chains grow long; the direct effect of long chains is that String.intern() becomes much slower.

Use -XX:StringTableSize to set the size of the StringTable.

In JDK 6, the StringTable size is fixed at 1009, so performance drops quickly once the constant pool holds many strings. There is no restriction on the StringTableSize value.

In JDK 7, the default StringTable size is 60013. There is no restriction on the StringTableSize value.

In JDK 8, 1009 is the smallest value that StringTableSize can be set to.
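Why a small table hurts can be shown with a toy model of a chained hash table. This is not the JVM's StringTable, which is native code with its own hash function; the sketch below assumes String.hashCode() as the hash and a hypothetical helper longestChain that only counts how many keys land in each bucket.

```java
class BucketChainSketch {

    // Spreads `count` distinct strings over `buckets` chains and returns the
    // length of the longest chain. Toy model only: uses String.hashCode().
    static int longestChain(int count, int buckets) {
        int[] chainLengths = new int[buckets];
        int longest = 0;
        for (int i = 0; i < count; i++) {
            String key = "key" + i;
            int index = Math.floorMod(key.hashCode(), buckets);
            chainLengths[index]++;
            longest = Math.max(longest, chainLengths[index]);
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println("1009 buckets:  " + longestChain(100_000, 1009));
        System.out.println("60013 buckets: " + longestChain(100_000, 60013));
    }
}
```

With 100,000 keys and 1009 buckets, at least one chain must hold 100 or more entries regardless of the hash function, so every lookup in that chain walks a long list. A larger table keeps the chains short.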

Memory allocation of String

Java has eight primitive data types plus the rather special type String. To make these types faster and more memory-efficient at run time, each is given a constant pool.

A constant pool is similar to a cache provided at the Java system level. The constant pools of the eight primitive types are managed by the system, while the String constant pool is special. There are two main ways to use it.

String objects declared directly in double quotes are stored directly in the constant pool.

For example: String info = "atguigu.com";

For String objects that are not declared in double quotes, use the intern() method provided by String.
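The two ways of getting a string into the pool can be compared directly. This is a small sketch with a made-up class name; it relies only on the documented behavior of string literals and String.intern().

```java
class InternBasicsSketch {

    // True when both references point at the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "atguigu.com";          // way 1: literal, taken from the pool
        String copy = new String("atguigu.com"); // a separate object on the heap

        System.out.println(sameObject(literal, copy));          // false
        System.out.println(sameObject(literal, copy.intern())); // true (way 2: intern())
        System.out.println(literal.equals(copy));               // true
    }
}
```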

In Java 6 and earlier, the string constant pool was stored in the permanent generation.

In Java 7, Oracle engineers made a major change to the string pool logic: the string constant pool was relocated to the Java heap.

All strings are now stored in the heap like other ordinary objects, so tuning an application only requires resizing the heap.

The string constant pool was already widely used, but this change is a good reason to reconsider using String.intern() in Java 7.

In Java 8, the permanent generation is replaced by metaspace; string constants remain on the heap.

Why was the StringTable moved from the permanent generation to the heap?

Official description: https://www.oracle.com/technetwork/java/javase/jdk7-relnotes-418459.html#jdk7changes

In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change results in more data residing in the main Java heap and less data in the permanent generation, so heap sizes may need to be adjusted. Most applications will see only relatively small differences in heap usage because of this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.

Reasons:

The permanent generation is small by default.

The permanent generation is garbage collected infrequently.

Basic operation of String

The Java Language Specification requires that identical string literals contain the same sequence of Unicode characters (constants containing the same sequence of code points) and refer to the same String instance.

String concatenation

Concatenating a constant with a constant yields a result in the constant pool; this is a compile-time optimization.

The constant pool never contains two constants with the same content.

As soon as one of the operands is a variable, the result is in the heap. Variable concatenation is implemented with StringBuilder.

If intern() is called on the concatenation result, a string that is not yet in the constant pool is put into the pool, and the address of the pooled object is returned.

public static void test1() {
    String s1 = "a" + "b" + "c"; // folded to "abc" at compile time and taken from the constant pool
    String s2 = "abc";           // "abc" is in the constant pool; its address is returned directly
    // The .java source is compiled to .class, and the .class is what executes
    System.out.println(s1 == s2);      // true, both refer to the pooled "abc"
    System.out.println(s1.equals(s2)); // true
}

public static void test2() {
    String s1 = "javaEE";
    String s2 = "hadoop";
    String s3 = "javaEEhadoop";
    String s4 = "javaEE" + "hadoop";
    String s5 = s1 + "hadoop";
    String s6 = "javaEE" + s2;
    String s7 = s1 + s2;

    System.out.println(s3 == s4); // true
    System.out.println(s3 == s5); // false
    System.out.println(s3 == s6); // false
    System.out.println(s3 == s7); // false
    System.out.println(s5 == s6); // false
    System.out.println(s5 == s7); // false
    System.out.println(s6 == s7); // false

    String s8 = s6.intern();
    System.out.println(s3 == s8); // true
}

From the results above:

If a variable appears on either side of the concatenation operator, the result is equivalent to a new String() in heap space whose content is the concatenation result.

Calling intern() checks whether the string constant pool already contains the value "javaEEhadoop". If it does, the address of the pooled string is returned; otherwise the string is added to the pool first.

Underlying principle

Under the hood, the concatenation operation uses StringBuilder.

Implementation details of s1 + s2:

StringBuilder s = new StringBuilder();

s.append(s1);

s.append(s2);

s.toString(); // similar to new String("ab")

StringBuilder is used from JDK 5 onward; before JDK 5, StringBuffer was used.
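The steps above can be written out as ordinary Java to see why the result lands on the heap rather than in the pool. This is a sketch with a hypothetical helper name; note also that javac for JDK 9 and later compiles string concatenation through invokedynamic instead of an explicit StringBuilder, although the result is still a new heap object.

```java
class ConcatSketch {

    // Hand-written equivalent of what `s1 + s2` expands to on JDK 5 to 8.
    static String concatLikeCompiler(String s1, String s2) {
        StringBuilder s = new StringBuilder();
        s.append(s1);
        s.append(s2);
        return s.toString(); // allocates a new String; nothing is added to the pool
    }

    public static void main(String[] args) {
        String joined = concatLikeCompiler("a", "b");
        System.out.println(joined.equals("ab"));   // true: same content
        System.out.println(joined == "ab");        // false: different objects
        System.out.println(joined.intern() == "ab"); // true: intern() returns the pooled literal
    }
}
```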

String, StringBuffer, and StringBuilder compared:

String: immutable. Every operation on a String produces a new String object, which is inefficient and wastes a lot of limited memory.

StringBuffer: mutable and thread-safe; suited to manipulating strings from multiple threads. Operations on the string it holds do not produce a new object. Each StringBuffer object has a certain buffer capacity: while the string fits within the capacity, no new storage is allocated; when the string outgrows the capacity, the capacity is increased automatically.

StringBuilder: mutable and not thread-safe, but faster; suited to manipulating strings from a single thread.

Note that when the operands are ordinary variables, concatenation creates a new StringBuilder, but when they are declared final and initialized with constants, the result is taken from the constant pool. When both sides of the concatenation operator are string constants or constant references, the compiler still applies its compile-time optimization. In other words, a variable declared final becomes a constant (just as a final class cannot be subclassed and a final method cannot be overridden).

In development, use final wherever it can be used.

public static void test4() {
    final String s1 = "a";
    final String s2 = "b";
    String s3 = "ab";
    String s4 = s1 + s2; // both operands are constants, so this is folded to "ab" at compile time
    System.out.println(s3 == s4);
}

Running result

true

Performance comparison between concatenation and append

public static void method1(int highLevel) {
    String src = "";
    for (int i = 0; i < highLevel; i++) {
        src += "a"; // each iteration creates a StringBuilder object and a String object
    }
}

public static void method2(int highLevel) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < highLevel; i++) {
        sb.append("a");
    }
}

Time consumed by method 1: 4005ms, time consumed by method 2: 7ms

Conclusion:

Appending with StringBuilder's append() is far more efficient than String concatenation.

Benefits

With append(), only one StringBuilder object is created from start to finish.

With String concatenation, many StringBuilder objects are created, plus a String object for every toString() call.

Because more StringBuilder and String objects are created, more memory is used, and any GC that runs takes additional time.

Room for improvement

The no-argument StringBuilder constructor starts with a default capacity of 16. When that capacity is exceeded, a larger array is allocated and the existing contents are copied into it. Starting with a larger capacity reduces the number of expansions.

So in practice, if the total length of the strings to be appended is known not to exceed some limit, pass that limit to the constructor as the initial capacity.
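The capacity advice can be checked with the public capacity() method. The sketch below uses a made-up helper, repeat, that presizes the builder to the known final length so that no expansion is needed while appending.

```java
class BuilderCapacitySketch {

    // Builds a string of `count` copies of `c` with a presized builder.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count); // initial capacity = final length
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity());     // 16
        System.out.println(new StringBuilder(1000).capacity()); // 1000
        System.out.println(repeat('a', 5));                      // aaaaa
    }
}
```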

Use of intern()

intern() is a native method; it calls into underlying C code.

The string pool is initially empty and is maintained privately by the String class. When intern() is called, if the pool already contains a string equal to this String object as determined by equals(Object), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.

For String objects that are not declared in double quotes, use the intern() method provided by String: it checks whether the string constant pool already contains the current string and, if not, puts the current string into the pool.

For example:

String myInfo = new String("I love atguigu").intern();

That is, if intern() is called on any string, the instance returned must be the same one as the string instance that appears directly as a constant. Therefore, the following expression must be true:

("a" + "b" + "c").intern() == "abc"

Put simply, an interned string ensures that there is only one copy of that string in memory, which saves memory and speeds up string operations. Note that this value is stored in the string intern pool (String Intern Pool).

Space efficiency test of intern()

Testing shows a large difference between using intern() and not using it.

/**
 * Use intern() to test execution efficiency
 *
 * @author: Nemo
 */
public class StringIntern2 {
    static final int MAX_COUNT = 1000 * 10000;
    static final String[] arr = new String[MAX_COUNT];

    public static void main(String[] args) {
        // Sample values: the original list is garbled in the source; any small set of repeated values works
        Integer[] data = new Integer[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_COUNT; i++) {
            arr[i] = new String(String.valueOf(data[i % data.length])).intern();
        }
        long end = System.currentTimeMillis();
        System.out.println("time spent: " + (end - start));

        // Keep the process alive so memory usage can be inspected
        try {
            Thread.sleep(1000000);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Conclusion: when a program holds a large number of strings, especially many duplicated ones, using intern() saves memory.

A large website has to keep a great many strings in memory. On a social networking site, for example, many users store the same values, such as Beijing or Haidian District. If intern() is called on all of these strings, memory usage drops significantly.
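The saving can be made visible on a small scale by counting distinct object identities rather than measuring heap usage. This is a sketch with hypothetical helper names (build, distinctObjects); it uses an identity-based set so that two equal strings held in different objects are counted twice.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class InternDedupSketch {

    // Creates `count` strings with only ten distinct contents ("0" to "9").
    static String[] build(int count, boolean intern) {
        String[] values = new String[count];
        for (int i = 0; i < count; i++) {
            String fresh = new String(String.valueOf(i % 10));
            values[i] = intern ? fresh.intern() : fresh;
        }
        return values;
    }

    // Counts how many different objects the array refers to (by identity, not equals).
    static int distinctObjects(String[] values) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(identities, values);
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(build(1000, false))); // 1000
        System.out.println(distinctObjects(build(1000, true)));  // 10
    }
}
```

Without intern() the array keeps 1000 separate String objects alive; with intern() it refers to only ten, and the temporary copies become eligible for garbage collection.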

Interview question: how many objects does new String("ab") create?

/**
 * How many objects does new String("ab") create?
 * The bytecode shows that it is two objects.
 *
 * @author: Nemo
 */
public class StringNewTest {
    public static void main(String[] args) {
        String str = new String("ab");
    }
}

Compiling this and looking at the bytecode gives:

 0 new #2
 3 dup
 4 ldc #3
 6 invokespecial #4
 9 astore_1
10 return

Two objects are involved:

One object is created in heap space by the new keyword.

The other is the object in the string constant pool, loaded by ldc.

How many objects does new String("a") + new String("b") create?

/**
 * How many objects does new String("a") + new String("b") create?
 *
 * @author: Nemo
 */
public class StringNewTest {
    public static void main(String[] args) {
        String str = new String("a") + new String("b");
    }
}

The bytecode is:

 0 new #2
 3 dup
 4 invokespecial #3
 7 new #4
10 dup
11 ldc #5
13 invokespecial #6
16 invokevirtual #7
19 new #4
22 dup
23 ldc #8
25 invokespecial #6
28 invokevirtual #7
31 invokevirtual #9
34 astore_1
35 return

Six objects are created:

Object 1: new StringBuilder()

Object 2: new String("a")

Object 3: "a" in the constant pool

Object 4: new String("b")

Object 5: "b" in the constant pool

Object 6: the toString() call, which is roughly equivalent to new String("ab")

Calling toString() does not put "ab" into the constant pool (only "a" and "b" are there), because no "ab" literal is declared. By contrast, new String("ab") guarantees that "ab" is in the constant pool.

Use of intern(): JDK 6 versus JDK 7

In JDK 6

String s = new String("1"); // the "1" constant is put into the constant pool, and the new object is put on the heap
s.intern();                 // tries to put the string into the constant pool; no effect, because "1" is already there
String s2 = "1";
System.out.println(s == s2); // false

String s3 = new String("1") + new String("1"); // s3 holds the address of a heap object equivalent to new String("11")
// after the line above, "11" does not yet exist in the string constant pool
s3.intern(); // puts "11" into the string constant pool
             // JDK 6: a new "11" object is created in the pool, with a new address
             // JDK 7: no new "11" is created; the pool stores the address of the heap object
String s4 = "11"; // s4 holds the address of the "11" entry added to the pool by the previous line
System.out.println(s3 == s4); // false

Output result

false
false

Why are the objects different?

String s = new String("1"); puts the "1" constant into the constant pool and the new object on the heap.

At String s2 = "1"; the constant pool is checked, "1" is found there, and a reference to the pooled object is returned without creating a new object.

For s2, the creation process is as described above: before creating an object, the JVM searches the string pool to see whether the string already exists. If it does, a reference to it is returned; otherwise the string is created first and then a reference is returned.

For s, the creation process has one more step. In addition to what happens for s2, the new keyword creates a separate String object on the heap and returns a reference to it, which is stored in s.

One reference is to the object created by new, which is an address in heap space.

The other comes from literal assignment, which is the object in the constant pool and therefore a constant pool address. The two are obviously not the same.

If the code is changed as follows, the result is true:

String s = new String("1");
s = s.intern();
String s2 = "1";
System.out.println(s == s2); // true

For the second half: s3 holds the address of a heap object equivalent to new String("11"), and after that line runs there is still no "11" in the constant pool. After s3.intern() runs, "11" exists in the constant pool, and s4 then uses the address stored in the pool.

Why is s3 == s4 false in the end?

Because in JDK 6 a new "11" object is created in the pool, with a new address, and s4 holds that new address.

In JDK 7, no new object is created in the pool; the pool entry points to the existing object on the heap.

In JDK 7

String s = new String("1");
s.intern();
String s2 = "1";
System.out.println(s == s2); // false

String s3 = new String("1") + new String("1");
s3.intern();
String s4 = "11";
System.out.println(s3 == s4); // true

Extension

String s3 = new String("1") + new String("1");
String s4 = "11"; // "11" is created in the constant pool here
s3.intern();      // finds "11" already in the pool and does nothing
System.out.println(s3 == s4); // false

Moving the s4 line up by one line changes the outcome: the final result is false.

Summary

The behavior of String.intern() can be summarized as follows.

In JDK 1.6, intern() tries to put the string object into the string pool (string constant pool).

If the pool already contains the string, nothing is added, and the address of the existing pooled object is returned.

If not, the object is copied, the copy is put into the string pool, and the address of the pooled copy is returned.

From JDK 1.7 onward, intern() tries to put the string object into the string pool.

If the pool already contains the string, nothing is added, and the address of the existing pooled object is returned.

If not, the reference to the object is copied into the string pool, and that reference is returned.
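The JDK 7 and later rule can be observed with a string whose content has never been pooled. This is a sketch that assumes it runs on JDK 7 or later; the class name, the helper name, and the "sketch-" prefix are made up, and System.nanoTime() is used only to build content that is very unlikely to be in the pool already.

```java
class InternReferenceSketch {

    // True when intern() hands back the very object it was called on.
    static boolean internReturnsSameObject(String heapString) {
        return heapString.intern() == heapString;
    }

    public static void main(String[] args) {
        // Built at run time, so no equal literal exists in the pool beforehand.
        String first = new StringBuilder("sketch-").append(System.nanoTime()).toString();
        String second = new String(first); // equal content, different heap object

        // JDK 7+: the pool records a reference to `first` itself.
        System.out.println(internReturnsSameObject(first));
        // The pool already holds `first`, so `second` is not added.
        System.out.println(internReturnsSameObject(second));
        System.out.println(second.intern() == first);
    }
}
```

On JDK 7 and later the three lines are expected to print true, false, true. Under the JDK 6 rule described above, the first line would be false, because the pool would hold a copy rather than a reference to the heap object.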

Exercise:

In JDK 6, intern() creates a new string "ab" in the string constant pool.

In JDK 8, no new "ab" is created in the string constant pool; instead, the address of the heap object is copied into the string pool.

So in JDK 6 the results above are:

true
false

In JDK 8 they are:

true
true

For the following question, the behavior is the same in JDK 6 and JDK 8.


Garbage collection of StringTable

/**
 * String garbage collection
 * -Xms15m -Xmx15m -XX:+PrintStringTableStatistics -XX:+PrintGCDetails
 *
 * @author: Nemo
 */
public class StringGCTest {
    public static void main(String[] args) {
        for (int i = 0; i < 1000000; i++) {
            String.valueOf(i).intern();
        }
    }
}

Although one million strings are interned, after the run only a little over 60,000 string objects remain in the StringTable, because the rest were garbage collected.

String deduplication in G1

Official description: http://openjdk.java.net/jeps/192

Note that duplication here refers to data in the heap, not in the constant pool, because the constant pool never holds duplicates.

String str1 = new String("hello");

String str2 = new String("hello");

Deduplication applies to heap objects like these two.

Description

Background: measurements on many Java applications (both large and small) gave the following results:

String objects account for 25% of the live heap data set.

Duplicate String objects account for 13.5% of the live heap data set.

The average length of a String object is 45 characters.

Memory is the bottleneck for many large-scale Java applications. Measurements show that in these applications almost 25% of the live data in the Java heap consists of String objects, and that almost half of those String objects are duplicates, meaning:

string1.equals(string2) == true. Duplicate String objects on the heap are simply wasted memory. This project implements automatic, continuous deduplication of duplicate String objects in the G1 garbage collector to avoid that waste.

Implementation

When the garbage collector runs, it visits the live objects on the heap. Each visited object is checked to see whether it is a String object that is a candidate for deduplication.

If so, a reference to the object is inserted into a queue for later processing. A deduplication thread runs in the background and processes the queue. Processing a queue element means removing it from the queue and then trying to deduplicate the String object it references.

A hash table records all the distinct char arrays used by String objects. During deduplication, the hash table is checked to see whether an identical char array already exists on the heap.

If it does, the String object is adjusted to reference that array, which releases its reference to the original array; the original array is eventually reclaimed by the garbage collector.

If the lookup fails, the char array is inserted into the hash table so that it can be shared later.
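The lookup-or-insert step described above can be modeled in a few lines. This is a toy model, not G1's implementation: the Str class stands in for a String with a replaceable backing array, and a HashMap keyed by content stands in for the collector's internal hash table. All names are hypothetical. In a real JVM the feature is enabled with -XX:+UseG1GC -XX:+UseStringDeduplication.

```java
import java.util.HashMap;
import java.util.Map;

class DedupSketch {

    // Stand-in for a String object whose backing array can be swapped.
    static final class Str {
        char[] value;

        Str(String content) {
            this.value = content.toCharArray();
        }
    }

    // Stand-in for the deduplication hash table: content -> shared array.
    private final Map<String, char[]> table = new HashMap<>();

    // Returns true when str was redirected to an array that was already recorded.
    boolean deduplicate(Str str) {
        String key = new String(str.value);
        char[] shared = table.putIfAbsent(key, str.value);
        if (shared == null || shared == str.value) {
            return false; // lookup failed (array now recorded) or already shared
        }
        str.value = shared; // the old array becomes garbage
        return true;
    }

    public static void main(String[] args) {
        DedupSketch dedup = new DedupSketch();
        Str str1 = new Str("hello");
        Str str2 = new Str("hello");
        System.out.println(str1.value == str2.value); // false
        System.out.println(dedup.deduplicate(str1));  // false
        System.out.println(dedup.deduplicate(str2));  // true
        System.out.println(str1.value == str2.value); // true
    }
}
```

The String objects themselves stay separate; only their backing arrays are shared, which is why deduplication is invisible to code that compares strings with == or equals().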

