
How to reduce the cost of Java garbage collection

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article walks through five practical tips for reducing the cost of garbage collection in Java, each illustrated with a short example.

Tip #1: Predict collection capacities

Most standard Java collections, as well as custom and extended implementations (such as Trove and Google's Guava), use arrays underneath (either primitive or object-based). Because an array's size is fixed once it has been allocated, adding elements to a collection will in many cases force it to allocate a new, larger array to replace the old one.

Even when no initial size is provided, most collection implementations try to optimize this reallocation and amortize its cost. However, you get the best results by providing the size when constructing the collection.

Let's analyze the following code as a simple example:

    public static <T> List<T> reverse(List<? extends T> list) {
        List<T> result = new ArrayList<>();
        for (int i = list.size() - 1; i >= 0; i--) {
            result.add(list.get(i));
        }
        return result;
    }

This method allocates a new list, then fills it with the items of another list in reverse order.

The costly part, and the point to optimize, is the line that adds items to the new list. On every addition, the list has to make sure that its underlying array has enough room for the new item. If there is a free slot, it simply stores the new item there. If not, it allocates a new underlying array, copies the contents of the old array into it, and then adds the new item. The result is multiple array allocations, and all the discarded old arrays eventually have to be reclaimed by the GC.

We can avoid these redundant allocations by telling the collection how many items it is going to hold when we construct it.

    public static <T> List<T> reverse(List<? extends T> list) {
        List<T> result = new ArrayList<>(list.size());
        for (int i = list.size() - 1; i >= 0; i--) {
            result.add(list.get(i));
        }
        return result;
    }

The code above uses ArrayList's constructor to reserve enough room for list.size() items. The allocation happens once, at construction, so the list never has to reallocate memory during the iteration.
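To get a feel for how many backing arrays the unsized version throws away, the growth policy can be simulated. The sketch below assumes the policy used by OpenJDK's ArrayList (a default capacity of 10 that grows by roughly 50% whenever it runs out); other implementations may behave differently, and the class and method names are made up for illustration.

```java
final class ArrayGrowthSketch {
    /**
     * Counts how many backing arrays get allocated while appending
     * `elements` items one at a time, starting from `initialCapacity`.
     * ASSUMPTION: growth is capacity + capacity / 2, as in OpenJDK's ArrayList.
     */
    static int allocationsFor(int elements, int initialCapacity) {
        int capacity = initialCapacity;
        int allocations = 1; // the first backing array
        while (capacity < elements) {
            // Grow by about 50%, but always by at least one slot.
            capacity = Math.max(capacity + (capacity >> 1), capacity + 1);
            allocations++; // the old array becomes garbage, a new one is allocated
        }
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println(allocationsFor(1000, 10));   // default-sized list
        System.out.println(allocationsFor(1000, 1000)); // presized list
    }
}
```

Under these assumptions, filling a default-sized list with 1,000 items takes 13 array allocations, 12 of which become garbage, while the presized list needs exactly one.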

Guava's collection utilities go a step further, letting you initialize a collection with either an exact number of expected items or an estimate.

    List<T> result = Lists.newArrayListWithCapacity(list.size());
    List<T> result = Lists.newArrayListWithExpectedSize(list.size());

The first is for cases where we know exactly how many items the collection is going to hold, while the second allocates some extra room to allow for a wrong estimate.

Tip #2: Process streams directly

When dealing with data streams, such as reading data from a file or downloading data from a network, the following code is very common:

    byte[] fileData = readFileToByteArray(new File("myfile.txt"));

The resulting byte array could then be parsed into an XML document, a JSON object, or a Protocol Buffers message, to name a few popular options.

This is unwise when dealing with large files or files of unpredictable size, because the JVM throws an OutOfMemoryError when it cannot allocate a buffer large enough to hold the whole file.

Even if the size of the data is manageable, using the above pattern can still incur huge overhead when it comes to garbage collection because it allocates a very large area in the heap to store file data.

A better approach is to pass the appropriate InputStream (a FileInputStream in this example) directly to the parser, instead of reading the whole file into a byte array first. All major open source libraries expose APIs that accept an input stream directly, for example:

    FileInputStream fis = new FileInputStream(fileName);
    MyProtoBufMessage msg = MyProtoBufMessage.parseFrom(fis);
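The same idea works with the parsers that ship with the JDK. As a minimal sketch, the hypothetical helper below uses the standard StAX API (javax.xml.stream) to count elements while reading from an InputStream, so the document is never held in memory as one large byte array. The class name, method name, and the choice of counting elements are assumptions made for illustration.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StreamingXmlCount {
    /** Counts start tags named `elementName`, pulling events straight from the stream. */
    static int countElements(InputStream in, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            int count = 0;
            while (reader.hasNext()) {
                // Only the current event is materialized, not the whole document.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
    }
}
```

In real use the argument would typically be a FileInputStream or a network stream, opened in a try-with-resources block so that it is closed once parsing finishes.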

Tip #3: Use immutable objects

Immutability has so many advantages that there is no need to list them all here. One of them, however, affects garbage collection and rarely gets the attention it deserves.

An immutable object is one whose fields (in this example, specifically its reference-type fields) cannot be modified after the object has been constructed, such as:

    public class ObjectPair {
        private final Object first;
        private final Object second;

        public ObjectPair(Object first, Object second) {
            this.first = first;
            this.second = second;
        }

        public Object getFirst() {
            return first;
        }

        public Object getSecond() {
            return second;
        }
    }

Instantiating the class above yields an immutable object: all of its fields are marked final and cannot be changed after construction.

Immutability implies that every object referenced by an immutable container was created before the container's construction completed. In GC terms, the container is at least as young as the youngest reference it holds. This means that during a young-generation collection, the GC can skip immutable objects that live in older generations, because they cannot reference anything in the generation being collected.

Fewer objects to scan means fewer memory pages to scan, and fewer memory pages to scan means shorter GC cycles, which in turn means shorter GC pauses and better overall throughput.
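One practical consequence is that "changing" an immutable object means building a new one. The sketch below uses a hypothetical generic ImmutablePair (not part of the original example) to show the pattern: an existing instance never starts pointing at a newer object, only the freshly created instance does.

```java
final class ImmutablePair<A, B> {
    private final A first;
    private final B second;

    ImmutablePair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A getFirst() {
        return first;
    }

    B getSecond() {
        return second;
    }

    /** Returns a new pair with a different first element; this instance is left untouched. */
    ImmutablePair<A, B> withFirst(A newFirst) {
        return new ImmutablePair<>(newFirst, second);
    }
}
```

The `withFirst` name follows a common convention for such copy-on-change methods; it is an illustrative choice, not a requirement.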

Tip #4: Be wary of string concatenation

Strings are probably the most commonly used non-primitive data structure in any JVM-based application. However, their hidden overhead and their convenience make them an easy culprit for a large memory footprint.

The problem obviously does not lie with string literals, but with strings that are allocated and built at run time. Let's take a quick look at an example of building a string dynamically:

    public static <T> String toString(T[] array) {
        String result = "[";
        for (int i = 0; i < array.length; i++) {
            result += (array[i] == array ? "this" : array[i]);
            if (i < array.length - 1) {
                result += ",";
            }
        }
        result += "]";
        return result;
    }

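This method takes an array and returns a string representation of it. It looks harmless, but it is disastrous in terms of object allocation.

It is hard to see past the syntactic sugar, but this is roughly what javac generated behind the scenes before Java 9 (newer compilers use a different mechanism, but the loop still allocates a new string on every concatenation):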

    public static <T> String toString(T[] array) {
        String result = "[";
        for (int i = 0; i < array.length; i++) {
            StringBuilder sb1 = new StringBuilder(result);
            sb1.append(array[i] == array ? "this" : array[i]);
            result = sb1.toString();
            if (i < array.length - 1) {
                StringBuilder sb2 = new StringBuilder(result);
                sb2.append(",");
                result = sb2.toString();
            }
        }
        StringBuilder sb3 = new StringBuilder(result);
        sb3.append("]");
        result = sb3.toString();
        return result;
    }

Strings are immutable, which means that a concatenation never modifies them; it allocates a new string each time instead. On top of that, the compiler uses the standard StringBuilder class to carry out these concatenations. This is a problem because every iteration implicitly allocates both a temporary string and a temporary StringBuilder just to help build the final result.

The best way to avoid this is to use a StringBuilder explicitly and append to it directly, instead of using the built-in concatenation operator ("+"). Here is an example:

    public static <T> String toString(T[] array) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < array.length; i++) {
            sb.append(array[i] == array ? "this" : array[i]);
            if (i < array.length - 1) {
                sb.append(",");
            }
        }
        sb.append("]");
        return sb.toString();
    }

Here, we allocate a single StringBuilder at the beginning of the method. From that point on, all the strings and array elements are appended to that one StringBuilder, which is converted to a string only once, by toString(), and then returned.
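This tip combines naturally with Tip #1, because a StringBuilder is itself backed by an array that has to grow when it fills up. The sketch below presizes the builder from a rough estimate; the class name and the figure of 8 characters per element are assumptions chosen purely for illustration, and a wrong estimate only costs some extra growth or some unused space.

```java
final class ArrayFormatter {
    // ASSUMPTION: about 8 characters per element is a guess, not a measured value.
    private static final int ESTIMATED_CHARS_PER_ELEMENT = 8;

    static <T> String toString(T[] array) {
        // Reserve room for the brackets plus the estimated element text up front.
        StringBuilder sb = new StringBuilder(2 + array.length * ESTIMATED_CHARS_PER_ELEMENT);
        sb.append('[');
        for (int i = 0; i < array.length; i++) {
            Object element = array[i];
            // Guard against an array that contains itself.
            sb.append(element == array ? "this" : element);
            if (i < array.length - 1) {
                sb.append(',');
            }
        }
        return sb.append(']').toString();
    }
}
```

Appending single characters as `char` rather than as one-character strings is a small additional choice; both forms produce the same output.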

Tip #5: Use specialized primitive collections

Java's standard collection library is convenient and generic, allowing collections to be used with semi-static type binding. This is great if you want, for example, a set that holds only strings (Set<String>) or a map from one object type to another.

The real problem begins when we want a list of ints or a map with values of type double. Because generics do not support primitive types, the alternative is to use the wrapper types instead, for example List<Integer>.

This is wasteful, because an Integer is a full-fledged object: a 12-byte object header plus a 4-byte int field, for a total of 16 bytes per Integer object. That is four times the space needed to store the same number of primitive ints! An even bigger problem is that, because each Integer is a real object instance, the garbage collector has to account for it during collection.
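The arithmetic can be spelled out in a few lines. The 16-versus-4 comparison covers the Integer objects themselves; a List<Integer> additionally holds a reference to each one. The sketch below adds that reference to the estimate. All constants are assumptions that hold for a typical 64-bit HotSpot JVM with compressed references and are not guaranteed elsewhere; array headers and the JDK's cache of small Integer values are ignored.

```java
final class BoxedFootprintEstimate {
    // ASSUMPTIONS: typical 64-bit HotSpot with compressed references.
    static final long INTEGER_OBJECT_BYTES = 16; // 12-byte header + 4-byte int field
    static final long REFERENCE_BYTES = 4;       // one slot in the list's backing Object[]
    static final long INT_BYTES = 4;             // one slot in an int[]

    /** Estimated bytes for `elements` boxed values held in an array-backed list. */
    static long boxedBytes(long elements) {
        return elements * (INTEGER_OBJECT_BYTES + REFERENCE_BYTES);
    }

    /** Estimated bytes for the same number of values in a primitive int[]. */
    static long primitiveBytes(long elements) {
        return elements * INT_BYTES;
    }
}
```

For one million elements this gives about 20,000,000 bytes for the boxed list against 4,000,000 bytes for the primitive array, and the boxed version also leaves the collector with a million extra objects to trace.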

To tackle this problem, we at Takipi use the excellent Trove collection library. Trove gives up some of the generality of generics in favor of specialized, memory-efficient collections of primitive types. For example, instead of the wasteful Map<Integer, Double>, Trove offers a specialized alternative in the form of TIntDoubleMap:

    TIntDoubleMap map = new TIntDoubleHashMap();
    map.put(5, 7.0);
    map.put(-1, 9.999);
    ...

Trove's underlying implementation uses primitive arrays, so no boxing (int -> Integer) or unboxing (Integer -> int) takes place while manipulating the collection, and no wrapper objects are stored, because the data is kept as primitive values.
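To make "backed by a primitive array" concrete, here is a deliberately tiny, hypothetical int list. It is not Trove's implementation and omits most of what a real collection needs; it only shows that values go in and come out as plain ints, with a single int[] as the only heap object behind them.

```java
import java.util.Arrays;

final class IntArrayList {
    private int[] values;
    private int size;

    IntArrayList(int initialCapacity) {
        // Keep at least one slot so that doubling always makes progress.
        values = new int[Math.max(initialCapacity, 1)];
    }

    void add(int value) {
        if (size == values.length) {
            // Same grow-and-copy step as any array-backed collection.
            values = Arrays.copyOf(values, values.length * 2);
        }
        values[size++] = value; // stored as a primitive, no Integer is created
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return values[index]; // returned as a primitive, no unboxing
    }

    int size() {
        return size;
    }
}
```

As with Tip #1, passing a realistic initial capacity avoids the intermediate arrays that doubling would otherwise leave behind.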

