Does substring in Java really cause memory leaks?

2025-01-16 Update From: SLTechnology News&Howtos

This article looks at whether substring in Java really causes memory leaks. It goes into some detail and is meant as a reference for interested readers.

In Java development, String is a type that practically every program uses, and its substring method, which extracts part of a string, is one we call often. There has been discussion in forums and communities abroad about whether substring in Java 6 causes memory leaks, to the point that it was filed as an official Java bug, and Java 7 reimplemented the method for this reason. You may be wondering how substring could cause a memory leak at all. Let's take that question and look at whether substring really leaks memory, and how the so-called leak comes about.

Basic introduction

The substring method has two overloads. The first takes a single parameter, the index at which the extraction starts.

public String substring(int beginIndex)

For example, with the method above, "unhappy".substring(2) returns "happy".

The second overload takes both a start index and an end index.

public String substring(int beginIndex, int endIndex)

With this method, "smiles".substring(1, 5) returns "mile".
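The two overloads can be tried out with a minimal sketch like the one below. The class and helper names (SubstringOverloads, fromIndex, range) are made up for illustration; only String.substring itself comes from the JDK.

```java
final class SubstringOverloads {
    // One-argument overload: from beginIndex to the end of the string.
    static String fromIndex(String s, int beginIndex) {
        return s.substring(beginIndex);
    }

    // Two-argument overload: beginIndex is inclusive, endIndex is exclusive.
    static String range(String s, int beginIndex, int endIndex) {
        return s.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(fromIndex("unhappy", 2)); // happy
        System.out.println(range("smiles", 1, 5));   // mile
        // The result length is always endIndex - beginIndex.
        System.out.println(range("smiles", 1, 5).length()); // 4
    }
}
```

Note that the end index is exclusive, so the result length is endIndex - beginIndex.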

With this introduction we have a basic idea of what substring does, which is enough to follow the rest of the article.

Preparation

Because the problem occurs only in Java 6, you need to switch to Java 6 if you are running a different version.

Switching in the terminal (Mac)

Check the Java version

$ java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

Switch to 1.6

export JAVA_HOME=$(/usr/libexec/java_home -v 1.6)

On Ubuntu, use update-alternatives --config java; on Fedora, use alternatives --config java.

If you are using Eclipse, right-click the project, choose Properties > Java Compiler, and set the version for that project specifically.
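Whichever way you switch, it can help to confirm from inside the program which runtime is actually in use. The sketch below reads the standard java.version system property; the class name JavaVersionCheck and the helper isJava6 are made up for illustration, and the check assumes the "1.6.x" version format that Java 6 reports.

```java
final class JavaVersionCheck {
    // Java 6 reports its version as "1.6.x"; later releases report "1.7.x", "1.8.x", "9", "17.0.2", ...
    static boolean isJava6(String version) {
        return version.equals("1.6") || version.startsWith("1.6.");
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("Running on Java " + version);
        if (!isJava6(version)) {
            System.out.println("Not Java 6: the OutOfMemoryError in this article is not expected here.");
        }
    }
}
```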

Reproducing the problem

Here is the code used to reproduce the problem in the official Java bug report.

public class TestGC {
    // Each TestGC instance owns a string of 100000 characters.
    private String largeString = new String(new byte[100000]);
    String getString() {
        // Keep only the first two characters of the large string.
        return this.largeString.substring(0, 2);
    }
    public static void main(String[] args) {
        java.util.ArrayList list = new java.util.ArrayList();
        for (int i = 0; i < 1000000; i++) {
            TestGC gc = new TestGC();
            list.add(gc.getString());
        }
    }
}

Run the code above on Java 6 (Java 7 and 8 do not throw) and it fails with java.lang.OutOfMemoryError: Java heap space. This means there is not enough heap memory left to create objects, so the JVM throws.

Some will say this is because every iteration creates a TestGC object: although only a two-character string is added to the ArrayList, each object also stores something as large as largeString, so an OOM is inevitable.

That explanation is wrong. Consider the following code, where only the getString method is changed.

public class TestGC {
    private String largeString = new String(new byte[100000]);
    String getString() {
        //return this.largeString.substring(0,2);
        return new String("ab");
    }
    public static void main(String[] args) {
        java.util.ArrayList list = new java.util.ArrayList();
        for (int i = 0; i < 1000000; i++) {
            TestGC gc = new TestGC();
            list.add(gc.getString());
        }
    }
}

Running this version does not cause an OOM, because what we hold is 1000000 "ab" string objects, while the TestGC objects (including their largeString) are released by Java's garbage collector. So memory does not run out here.

So what exactly causes the memory leak? To answer that, we only need to look at how the method is implemented.

Inside the Java 6 implementation

The String class has these three fields:

value: the character array that stores the actual contents of the string

offset: the position in value at which this string starts

count: the number of characters in the string

The Java 6 implementation of substring

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

The constructor called by the method above

// Package private constructor which shares value array for speed.
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

Reading the code above, it all becomes clear.

When we call substring on string a to get string b, the operation only sets b's offset and count. b reuses the value character array that a already has; no new character array is created for b alone.

For example, suppose we have a 1G string a and call substring(0,2) to get a string b of only two characters. If b outlives a, or a is manually set to null, then garbage collection reclaims a but not b, and the 1G of memory stays occupied because b still holds a reference to the 1G character array.
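The sharing described above cannot be observed directly on a modern JDK, so here is a toy model of it. SharedSlice is a hypothetical class written for this illustration, not the JDK's String; it only mimics the value, offset, and count fields and the non-copying constructor quoted earlier.

```java
final class SharedSlice {
    final char[] value; // backing array, possibly shared with other slices
    final int offset;   // index in value where this slice starts
    final int count;    // number of characters in this slice

    SharedSlice(char[] value) {
        this(0, value.length, value);
    }

    // Like the package-private String constructor in Java 6: stores the array without copying it.
    private SharedSlice(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    SharedSlice substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        // Only offset and count change; the backing array is reused.
        return new SharedSlice(offset + beginIndex, endIndex - beginIndex, value);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedSlice a = new SharedSlice(new char[100000]);
        SharedSlice b = a.substring(0, 2);
        System.out.println(b.count);            // 2
        System.out.println(b.value.length);     // 100000
        System.out.println(b.value == a.value); // true
    }
}
```

As long as b is reachable, the 100000-element array is reachable too, even after a itself has been collected.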

At this point you should understand why the code above runs out of memory.

Shared content character array

In fact, having the string produced by substring share the content array with the original string is a good design: it avoids copying the character array every time substring is called. As the source comment says, the content array is shared for speed. For the problem in this example, though, sharing the array works against us.

How to solve it

For the uncommon case described earlier, where only two characters are taken from a 1G string, use the following code so that no reference to the 1G string's content array is kept.

String littleString = new String(largeString.substring(0,2));

The constructor below copies the array when the source string's content array is longer than the string itself, so the new string gets a character array that holds only the source string's contents.

public String(String original) {
    int size = original.count;
    char[] originalValue = original.value;
    char[] v;
    if (originalValue.length > size) {
        // The array representing the String is bigger than the new
        // String itself. Perhaps this constructor is being called
        // in order to trim the baggage, so make a copy of the array.
        int off = original.offset;
        v = Arrays.copyOfRange(originalValue, off, off+size);
    } else {
        // The array representing the String is the same
        // size as the String, so no point in making a copy.
        v = originalValue;
    }
    this.offset = 0;
    this.count = size;
    this.value = v;
}
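The copy-or-reuse decision in that constructor can be isolated in a small sketch. TrimCopy and trimmedValue are hypothetical names used only here; the logic follows the constructor quoted above, and Arrays.copyOfRange is the same JDK call it uses.

```java
import java.util.Arrays;

final class TrimCopy {
    // Returns the array a trimmed copy would hold: a fresh copy when the backing
    // array is larger than the string, otherwise the backing array itself.
    static char[] trimmedValue(char[] value, int offset, int count) {
        if (value.length > count) {
            return Arrays.copyOfRange(value, offset, offset + count);
        }
        return value;
    }

    public static void main(String[] args) {
        char[] large = new char[100000];
        char[] small = trimmedValue(large, 0, 2);
        System.out.println(small.length);   // 2
        System.out.println(small == large); // false: the large array is no longer referenced

        char[] exact = {'a', 'b'};
        System.out.println(trimmedValue(exact, 0, 2) == exact); // true: nothing to trim
    }
}
```

Once the small copy is the only array held, the large array can be collected as soon as nothing else refers to it.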

Java 7 implementation

In Java 7, the implementation of substring abandons the shared content array. Every substring (other than the whole string itself) gets a copy of the relevant part of the array, so each string holds only its own contents.

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

The constructor called by substring copies the content character array.

public String(char value[], int offset, int count) {
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count < 0) {
        throw new StringIndexOutOfBoundsException(count);
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}

Is it really a memory leak?

We know that substring can cause memory problems in some cases, but is this called a memory leak?

Personally, I do not think this should be regarded as a memory leak. The string b produced by substring does hold a reference to the content array of the original string a, but once both a and b have been collected, the character array can be garbage collected as well.

Which version is better?

Java 7's changes to substring received mixed feedback.

Personally, I prefer the Java 6 implementation: substring shares the content character array, which is faster and avoids allocating new memory. The memory problem described in this article can occur, but there are ways to work around it.

The Java 7 implementation avoids the problem in this article without requiring anything special from the programmer, but every substring call now pays for an array copy that the Java 6 implementation did not need. That makes this implementation seem a bit "bad."

Value of the problem

Although this issue appears in Java 6 and has been fixed in Java 7, that does not mean we can ignore it, and Java 7's reimplementation has itself been severely criticized.

The problem is still worth studying, especially the optimization of sharing the content character array. I hope it offers some help and ideas for your own future designs and implementations.

Affected methods

Both trim and subSequence call substring internally, so the change in the substring implementation between Java 6 and Java 7 indirectly affects these methods as well.
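To make the dependency concrete, here is a rough sketch of how such methods can delegate to substring. The helpers trimLike and subSequenceLike are hypothetical and written for this illustration; they are not the JDK source, only a simplified picture of the same idea.

```java
final class AffectedMethodsSketch {
    // Strips leading and trailing characters <= ' ' and returns the rest via substring.
    static String trimLike(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && s.charAt(start) <= ' ') {
            start++;
        }
        while (start < end && s.charAt(end - 1) <= ' ') {
            end--;
        }
        return (start > 0 || end < s.length()) ? s.substring(start, end) : s;
    }

    // A sub-sequence of a String is simply the corresponding substring.
    static CharSequence subSequenceLike(String s, int beginIndex, int endIndex) {
        return s.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimLike("  smiles  ") + "]"); // [smiles]
        System.out.println(subSequenceLike("smiles", 1, 5));    // mile
    }
}
```

Because the result comes from substring in both cases, whatever substring does with the backing array (share it or copy it) carries over to these methods.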

References

The following three articles are all well written, but each has a slight problem, which I point out below. Keep these in mind when reading them.

The substring() Method in JDK 6 and JDK 7: the string concatenation this article suggests as a fix for the Java 6 problem is not recommended. For the reasons, see Java Details: String concatenation.

How SubString method works in Java - Memory Leak Fixed in JDK 1.7: this article contains a conceptual error. The new string does not prevent the old string from being collected; it prevents the old string's content character array from being collected. Keep this in mind when reading.

JDK-4513622 : (str) keeping a substring of a field prevents GC for object: one test in this bug report is slightly flawed because it uses a string literal rather than new String(), which ignores the string constant pool. See the note below for details.

Note

In the code above that reproduces the problem:

String getString() {
    //return this.largeString.substring(0,2);
    return new String("ab");
}

It is best not to write it the way shown below. The JVM has a string constant pool, so "ab" does not create a new string each time and all variables refer to the same object, whereas new String() creates a new object every time.

String getString() {
    return "ab";
}
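The difference between the two forms can be checked with reference comparison. This is a small sketch; the class name StringPoolNote and its helper methods are made up for illustration.

```java
final class StringPoolNote {
    // Returns the pooled literal: every call yields the same object.
    static String literal() {
        return "ab";
    }

    // Creates a distinct String object on every call.
    static String fresh() {
        return new String("ab");
    }

    public static void main(String[] args) {
        System.out.println(literal() == literal());        // true: one pooled object
        System.out.println(fresh() == fresh());            // false: two separate objects
        System.out.println(fresh().equals(literal()));     // true: same contents
        System.out.println(fresh().intern() == literal()); // true: intern() returns the pooled object
    }
}
```

With the literal form, the list in the test would hold 1000000 references to a single object, so it would not be a fair comparison against the substring version.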
