What is the maximum length of String

Updated: 2025-02-24. From: SLTechnology News&Howtos (shulou), Development

This article explains in detail what the maximum length of a Java String is and how that limit is derived.

Preface

Does the question "What is the length limit of a String?" sound boring? That was my first reaction too.

Digging into it, however, shows that the limit itself matters less than the many topics the question ties together, which makes it an almost perfect question. No wonder similar questions come up in senior-level interviews.

So as we trace the limits on String length, keep in mind that the conclusion is less important than the analysis and the background knowledge it draws on: the underlying implementation of String, the range of the int type, the Java Virtual Machine Specification, the source code of the Java compiler, and so on.

Tracing the String source code

To find the length limit of the String class, start with its source code. JDK 8, currently the most widely used version, serves as the example here. The underlying implementation of String changed in JDK 9 and later; see the article "JDK9's new round of optimization of String strings".

As we all know, the String class provides a length() method. Can we learn the maximum length of a String directly from this method?

    /**
     * Returns the length of this string.
     * The length is equal to the number of Unicode
     * code units in the string.
     *
     * @return the length of the sequence of characters represented by this
     *         object.
     */
    public int length() {
        return value.length;
    }

The documentation does not state a maximum length, but the return type gives a clue. It is int, so the range of int is one of the limits.

If you already know that the largest int value is 2^31 - 1, good. If not, check the corresponding wrapper class Integer:

    public final class Integer extends Number implements Comparable<Integer> {
        /**
         * A constant holding the minimum value an {@code int} can
         * have, -2^31.
         */
        @Native public static final int MIN_VALUE = 0x80000000;

        /**
         * A constant holding the maximum value an {@code int} can
         * have, 2^31-1.
         */
        @Native public static final int MAX_VALUE = 0x7fffffff;

        // ...
    }

Both the values and the comments of MIN_VALUE and MAX_VALUE give the range of int. On this basis, the maximum length of a String should be:

2^31 - 1 = 2147483647
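
The bound can be checked directly. The following is a minimal sketch; the class name IntRangeDemo and the helper maxReportableLength are made up for illustration and are not part of the JDK.

```java
// Minimal sketch: the int return type of String.length() bounds the reported length.
// IntRangeDemo and maxReportableLength are hypothetical names used only for this example.
final class IntRangeDemo {
    /** Largest value an int, and therefore String.length(), can report. */
    static int maxReportableLength() {
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Compute 2^31 - 1 in long arithmetic so the shift does not overflow int.
        long twoPow31MinusOne = (1L << 31) - 1;

        System.out.println(maxReportableLength());                    // 2147483647
        System.out.println(twoPow31MinusOne == Integer.MAX_VALUE);    // true
        System.out.println(Integer.toHexString(Integer.MAX_VALUE));   // 7fffffff
        System.out.println(Integer.toHexString(Integer.MIN_VALUE));   // 80000000
    }
}
```

The hexadecimal output matches the 0x7fffffff and 0x80000000 constants in the Integer source.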

Going back to the length method, we see that the value of length is obtained through value, while value is implemented as an array of char in JDK8:

    public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
        /** The value is used for character storage. */
        private final char value[];

        // ...
    }

Inside the JVM (in running memory), a Java char is a UTF-16 code unit and occupies two bytes. To get the memory footprint, the value calculated above must therefore be multiplied by 2.

The calculation is then:

    2^31 - 1 = 2147483647 (16-bit Unicode characters)
    2147483647 * 2 = 4294967294 (bytes)
    4294967294 / 1024 = 4194303.998046875 (KB)
    4194303.998046875 / 1024 = 4095.9999980926513671875 (MB)
    4095.9999980926513671875 / 1024 = 3.99999999813735485076904296875 (GB)
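
The arithmetic above can be reproduced exactly with BigDecimal. This is only a sketch of the estimate: it counts the character data at two bytes per char and ignores object headers and other JVM overhead. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Minimal sketch of the size estimate, assuming 2 bytes per char (JDK 8 char[] storage).
// StringMemoryEstimate, bytesForChars and gigabytes are hypothetical names.
final class StringMemoryEstimate {
    private static final BigDecimal ONE_KB = BigDecimal.valueOf(1024);

    /** Bytes needed for charCount UTF-16 code units at two bytes each. */
    static long bytesForChars(long charCount) {
        return charCount * 2L;
    }

    /** Exact conversion from bytes to GB; division by a power of two always terminates. */
    static BigDecimal gigabytes(long bytes) {
        return BigDecimal.valueOf(bytes).divide(ONE_KB).divide(ONE_KB).divide(ONE_KB);
    }

    public static void main(String[] args) {
        long bytes = bytesForChars(Integer.MAX_VALUE);
        System.out.println(bytes);            // 4294967294
        System.out.println(gigabytes(bytes)); // just under 4 GB
    }
}
```

BigDecimal is used instead of double so that every intermediate digit is exact rather than rounded.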

In other words, the character data of a maximum-length string occupies about 4 GB. Yet if you declare a string literal 100000 characters long, the compiler reports the following error:

error: constant string too long

Wasn't the limit 2.1 billion? Why does 100000 fail? This error comes from a compile-time limit.

Compile-time limit of the string constant pool

Readers familiar with the JVM know that a string declared as a literal is stored as a constant in the constant pool of the class file after compilation.

String s = "New Horizon of Program";

The constant pool limits the length of a String. Each entry in the constant pool has its own type, and Unicode strings, encoded in (modified) UTF-8, are represented there as CONSTANT_Utf8 entries.

In the Java Virtual Machine Specification, a String constant is defined by the CONSTANT_String_info structure:

    CONSTANT_String_info {
        u1 tag;
        u2 string_index;
    }

The specification requires that the value of the string_index item be a valid index into the constant pool, and that the constant pool entry at that index be a CONSTANT_Utf8_info (§4.4.7) structure.

Moving on to the definition of CONSTANT_Utf8_info:

    CONSTANT_Utf8_info {
        u1 tag;
        u2 length;
        u1 bytes[length];
    }

Here length gives the number of bytes in the bytes[] array and has type u2. The definition of u2 can also be found in the Java Virtual Machine Specification:

u2 denotes an unsigned two-byte quantity. One byte is 8 bits, so two bytes are 16 bits, and the largest value a u2 can represent is 2^16 - 1 = 65535.
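
As a small illustration of that range, the sketch below decodes two bytes as a big-endian unsigned 16-bit value, which is how multi-byte items are stored in a class file. The names U2Demo and readU2 are hypothetical.

```java
// Minimal sketch: decoding an unsigned two-byte (u2) value stored big-endian.
// U2Demo and readU2 are hypothetical names used only for this example.
final class U2Demo {
    /** Combines two bytes (high byte first) into an unsigned value in the range 0..65535. */
    static int readU2(byte[] twoBytes) {
        // Mask with 0xFF because Java bytes are signed.
        return ((twoBytes[0] & 0xFF) << 8) | (twoBytes[1] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(readU2(new byte[] {(byte) 0xFF, (byte) 0xFF})); // 65535
        System.out.println((1 << 16) - 1);                                 // 65535
    }
}
```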

This gives the second limit: the constant pool format of the class file stipulates that the encoded length of a string constant cannot exceed 65535 bytes.
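
A two-byte length prefix with the same consequence can be observed with DataOutputStream.writeUTF, which writes a modified UTF-8 string preceded by a two-byte length and throws UTFDataFormatException when the encoded form is longer than 65535 bytes. The sketch below uses it as a stand-in for the class-file format, not as the compiler's actual code path; Utf8LengthLimitDemo, encodedSize and repeat are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.util.Arrays;

// Minimal sketch: a u2 length prefix caps the encoded string at 65535 bytes.
// All names in this class are hypothetical and exist only for this example.
final class Utf8LengthLimitDemo {
    /** Returns the written size in bytes (2-byte prefix included), or -1 if the string is too long. */
    static int encodedSize(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (UTFDataFormatException tooLong) {
            return -1; // encoded form does not fit into a two-byte length
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
        return buffer.size();
    }

    /** Builds a string of n copies of c (works on JDK 8, which lacks String.repeat). */
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(encodedSize(repeat('8', 65535)));      // 65537
        System.out.println(encodedSize(repeat('8', 65536)));      // -1
        // U+4E2D needs three bytes, so the limit is reached after 21845 characters.
        System.out.println(encodedSize(repeat('\u4e2d', 21845))); // 65537
        System.out.println(encodedSize(repeat('\u4e2d', 21846))); // -1
    }
}
```

The second pair of calls shows that the limit counts encoded bytes, so strings with non-ASCII characters reach it with fewer characters.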

Now try to declare a string literal exactly 65535 characters long:

String s = "8888...8888"; // 65535 '8' characters in total

The compiler reports the same error. Why?

The answer given here also comes from the Java Virtual Machine Specification (Section 4.7.3):

It describes a workaround for an early design bug: if the code of a method is exactly 65535 bytes long and ends with a 1-byte instruction, that instruction cannot be protected by an exception handler, so the maximum length of the array is limited to 65534.

If you look at the source code of the Java compiler (javac), you can find this limit implemented in the Gen class:

    /** Check a constant value and report if it is a string that is
     *  too large.
     */
    private void checkStringConstant(DiagnosticPosition pos, Object constValue) {
        if (nerrs != 0 || // only complain about a long string once
            constValue == null ||
            !(constValue instanceof String) ||
            ((String) constValue).length() < Pool.MAX_STRING_LENGTH)
            return;
        log.error(pos, "limit.string");
        nerrs++;
    }

The definition of Pool.MAX_STRING_LENGTH is as follows:

    public class Pool {
        public static final int MAX_STRING_LENGTH = 0xFFFF;

        // ...
    }

Declare a string literal of length 65534 instead and it compiles normally. Because the check requires the length to be strictly less than MAX_STRING_LENGTH, the maximum length of a string literal at compile time is 65534.
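
The boundary can be seen in isolation with a few lines that mirror the comparison in checkStringConstant. This is a sketch that takes the constant to be 0xFFFF (65535); LiteralLengthCheck and passesJavacCheck are hypothetical names, not javac API.

```java
// Minimal sketch mirroring the length comparison in javac's Gen.checkStringConstant.
// LiteralLengthCheck and passesJavacCheck are hypothetical names; the constant is
// assumed to be 0xFFFF, matching Pool.MAX_STRING_LENGTH.
final class LiteralLengthCheck {
    static final int MAX_STRING_LENGTH = 0xFFFF; // 65535

    /** True if a literal of this length is strictly shorter than the limit. */
    static boolean passesJavacCheck(int literalLength) {
        return literalLength < MAX_STRING_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(passesJavacCheck(65534));  // true
        System.out.println(passesJavacCheck(65535));  // false
        System.out.println(passesJavacCheck(100000)); // false
    }
}
```

The strict less-than comparison is what turns a 65535 limit into a largest accepted length of 65534.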

We know that Java distinguishes compile time from run time, so is there a length limit at run time?

Run-time length limit

At run time, the limit on String shows up mainly in the constructors of String. One of them is:

    public String(char value[], int offset, int count) {
        // ...
    }

Here the parameter count, an int, bounds the length of the string. The calculation is the same as before: convert to bits first, then to GB:

(2^31 - 1) * 16 / 8 / 1024 / 1024 / 1024 ≈ 4 GB

That is, the run time can in theory support strings of about 4 GB; a longer string cannot be created. JDK 9 optimized the storage of String: a byte array replaces the char array, which halves the space needed for strings that contain only Latin-1 characters.
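
The halving can be illustrated by comparing encoded sizes. The sketch below does not inspect the internal fields of String; it only compares one byte per character (ISO-8859-1) with two bytes per character (UTF-16), which is the ratio the JDK 9 change exploits. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Minimal sketch: one byte per Latin-1 character versus two bytes per UTF-16 code unit.
// This compares encodings only; it does not read the internal storage of String.
// CompactStringEstimate, latin1Bytes and utf16Bytes are hypothetical names.
final class CompactStringEstimate {
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    static int utf16Bytes(String s) {
        // UTF_16BE is used so that no byte-order mark is added to the output.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(latin1Bytes(s)); // 5
        System.out.println(utf16Bytes(s));  // 10
    }
}
```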

Of course, the 4 GB figure assumes that the JVM can actually allocate that much memory.
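
To see that the 65534 limit applies only to literals, a longer string can be built at run time. This is a minimal sketch using the String(char[], int, int) constructor; RuntimeLengthDemo and build are hypothetical names, and the heap figure printed depends on how the JVM was started.

```java
import java.util.Arrays;

// Minimal sketch: strings longer than 65534 characters are fine when built at run time.
// RuntimeLengthDemo and build are hypothetical names used only for this example.
final class RuntimeLengthDemo {
    /** Builds a string of the given length through the String(char[], int, int) constructor. */
    static String build(char c, int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, c);
        return new String(chars, 0, length);
    }

    public static void main(String[] args) {
        String s = build('8', 100_000);
        System.out.println(s.length()); // 100000

        // The practical ceiling is the heap the JVM may use, not only the int range.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.println("max heap (bytes): " + maxHeapBytes);
    }
}
```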

Summary

From the analysis above we can conclude: first, a string literal at compile time cannot be longer than 65534; second, at run time a string cannot be longer than 2^31 - 1, and the roughly 4 GB it would occupy cannot exceed the maximum memory available to the virtual machine.
