
What is the limit of String length

2025-02-24 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

This article looks at the question "what is the limit of String length?" Many developers are unsure of the answer in day-to-day work, so the editor has gone through the relevant material and put together a simple walkthrough. Let's work through it together.

String

To understand the length limit of String, we first need to know how String stores its characters. In JDK 8 and earlier, String keeps its characters in an array of type char (JDK 9 and later use a byte array internally, but the array-based reasoning below is the same).

The array field that String uses as its storage container.

Since String is backed by an array, does that array have a length limit? Yes, it does. Before getting to that, let's look at the method in String that returns the length.

Length method in the String class
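As a rough illustration of the shape being discussed, the sketch below uses a hypothetical class, SimpleString, whose characters live in an array and whose length() simply returns that array's length as an int. It is a simplified stand-in written for this article, not the JDK source.

```java
// Hypothetical, simplified stand-in for java.lang.String (not the JDK source).
final class SimpleString {
    // Backing storage: one array slot per UTF-16 code unit.
    private final char[] value;

    SimpleString(char[] chars) {
        this.value = chars.clone(); // defensive copy keeps the object immutable
    }

    // The length is the array length, which is why its type is int.
    int length() {
        return value.length;
    }

    char charAt(int index) {
        return value[index];
    }

    public static void main(String[] args) {
        SimpleString s = new SimpleString("hello".toCharArray());
        System.out.println(s.length()); // prints 5
    }
}
```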

From this, we can see that the return type is int. An array in Java can be created with an explicit length; if no length is given, it is determined by the number of elements in the initializer:

int[] arr1 = new int[10]; // define an array of length 10

int[] arr2 = {1, 2, 3, 4, 5}; // the length of this array is 5

The int type has a limited range in Java. Looking at the source of Integer, the wrapper class for int, we can see that the maximum value is 2^31 - 1, so an array length lies in the range 0 to 2^31 - 1 = 2147483647. Since each char takes 2 bytes, a char array of that length would occupy about 4 GB. (Some JVM implementations cap array lengths slightly below this value.)

Value range of Integer
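The arithmetic can be checked with a few lines of Java. This is only a sketch: the class IntRangeDemo and the helper charArrayBytes are names invented for this example, and the size estimate ignores the array object header.

```java
class IntRangeDemo {
    // Approximate payload size of a char[] of the given length
    // (2 bytes per char; the object header is ignored).
    static long charArrayBytes(int length) {
        return (long) length * Character.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);                   // prints 2147483647
        System.out.println(Integer.MAX_VALUE == (1L << 31) - 1); // prints true

        long bytes = charArrayBytes(Integer.MAX_VALUE);
        System.out.println(bytes);                               // prints 4294967294
        // Divide by 2^30 to express the size in GiB: just under 4.
        System.out.println(bytes / (1024.0 * 1024 * 1024));
    }
}
```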

With that in mind, let's try to verify this claim with code.

Define a string literally

Above is a string of 100000 characters that I defined as a literal. When I compile it, the compiler reports an error saying the constant string is too long. Weren't we supposed to be able to store about 2.1 billion characters? Why does it fail at only 100000?

This comes down to a limit in the class file format defined by the Java virtual machine specification. When a string is defined as a literal, the compiler (javac) stores it in the constant pool of the class file at compile time, and the class file format limits the size of string entries in that constant pool. Next, let's take a look at what the specification says.

Screenshot of java virtual machine specification

In the constant pool, every cp_info entry has the same general format: it starts with a single-byte "tag" item that indicates the type of the cp_info entry. The content of the info[] array that follows depends on the value of tag.

Java virtual machine specification manual constant type table

We can see that the String type is represented by CONSTANT_String. Let's take a look at how CONSTANT_String is defined.

The u2 string_index defined here must be a valid index into the constant pool, and the entry at that index must be a CONSTANT_Utf8_info structure. What we need to note is the length item defined in that structure, which is also a u2 and gives the number of bytes in the encoded string. Let's look at the figure below.

In the class file, u2 denotes an unsigned 2-byte quantity. One byte is 8 bits, so 2 bytes are 16 bits, and the largest value 2 bytes can hold is 2^16 - 1 = 65535. The definitions of u1 and u2 in the class file format are summarized as follows:

Here is the summary from the Java virtual machine specification.

1. Data types used for class file contents

The specification defines its own set of data types to represent the contents of a class file: u1, u2, and u4, which represent unsigned 1-, 2-, and 4-byte quantities, respectively.

Each class file consists of a stream of 8-bit bytes; all 16-bit, 32-bit, and 64-bit quantities are built from 2, 4, and 8 consecutive 8-bit bytes, respectively.
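To get a feel for the u2 limit, the sketch below relies on DataOutputStream.writeUTF, which writes a 2-byte length followed by modified UTF-8 bytes, the same kind of layout as the length and bytes items of CONSTANT_Utf8_info. The class U2LimitDemo and its helpers are names made up for this example, and this is an analogy at run time, not how javac itself performs the check.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class U2LimitDemo {
    // Largest value an unsigned 2-byte quantity can hold.
    static final int U2_MAX = (1 << 16) - 1; // 65535

    // Builds a string made of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // True if s fits in a record whose byte length is stored in 2 bytes.
    static boolean fitsInU2(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s); // throws if the encoded form exceeds 65535 bytes
            return true;
        } catch (UTFDataFormatException tooLong) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(U2_MAX);                            // prints 65535
        System.out.println(fitsInU2(repeat('a', 65535)));      // prints true
        System.out.println(fitsInU2(repeat('a', 65536)));      // prints false
        // U+4E2D takes 3 bytes in modified UTF-8: 21845 * 3 = 65535.
        System.out.println(fitsInU2(repeat('\u4e2d', 21845))); // prints true
        System.out.println(fitsInU2(repeat('\u4e2d', 21846))); // prints false
    }
}
```

The last two lines show that the limit is counted in encoded bytes, so text outside the ASCII range reaches it with far fewer characters.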

2. Valid range of an exception handler

The values of the start_pc and end_pc items indicate the range in the code[] array within which the exception handler is active.

start_pc must be a valid index into the code[] array of the opcode of an instruction. end_pc must either be a valid index into the code[] array of the opcode of an instruction, or be equal to code_length, the length of the code[] array. The value of start_pc must be less than the value of end_pc.

The exception handler is active while the program counter is in the range [start_pc, end_pc). That is, if x is an address within the handler's valid range, then start_pc ≤ x < end_pc.

The fact that end_pc itself lies outside the handler's valid range is a historical design mistake in the Java virtual machine: if the code of a method is exactly 65535 bytes long and ends with a 1-byte instruction, that instruction cannot be protected by an exception handler. A compiler can work around this bug by limiting the maximum length of the code[] array of any method, instance initialization method, or class initialization method to 65534.
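The half-open range [start_pc, end_pc) can be expressed as a one-line check. This is a minimal sketch; HandlerRangeDemo and covers are hypothetical names, not part of any JVM API.

```java
class HandlerRangeDemo {
    // True when pc lies in the half-open interval [startPc, endPc):
    // startPc is included, endPc is excluded.
    static boolean covers(int startPc, int endPc, int pc) {
        return startPc <= pc && pc < endPc;
    }

    public static void main(String[] args) {
        System.out.println(covers(0, 10, 0));  // prints true
        System.out.println(covers(0, 10, 9));  // prints true
        System.out.println(covers(0, 10, 10)); // prints false: end_pc is exclusive
    }
}
```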

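Note: the two points I consider most important are highlighted above. The first is that a u2 can represent values in the range [0, 65535]. The second describes a workaround that caps the usable size at 65534. Strictly speaking, that passage is about the code[] array rather than the constant pool, but string literals end up with the same figure: javac rejects a string constant of 65535 characters or more, so the longest literal it accepts has 65534 characters. The u2 length counts encoded bytes, so a literal containing non-ASCII characters hits the limit sooner. Note also that this limit applies only at compile time; a string built by concatenation at run time can go beyond this range.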
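The run-time case is easy to try. In this sketch (RuntimeLengthDemo and build are names chosen for illustration), a string well past 65535 characters is assembled in a loop, and length() reports the full size.

```java
class RuntimeLengthDemo {
    // Builds a string of the requested length at run time, one char at a time.
    static String build(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = build(100_000);
        System.out.println(s.length()); // prints 100000
    }
}
```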

Next, let's do a small experiment: build a string of length 65534 and see whether it compiles.

First, I build a 65534-character string with a for loop and print it to the console. An online character-count tool found through Baidu confirms that it is indeed 65534 characters long, as shown below:

Then I copy the characters and assign them to a string as a literal. The counter in the lower right corner confirms that there are indeed 65534 characters, and when we compile and run it, it succeeds.
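The same experiment can be repeated without copying and pasting by generating the source in memory and handing it to the compiler through the standard javax.tools API. This is a sketch under a few assumptions: it must run on a JDK (a plain JRE has no system compiler), the class names LiteralLimitCheck, StringSource, and Holder are invented for the example, and the compiled class is written to a temporary directory that is not cleaned up.

```java
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;

class LiteralLimitCheck {
    // In-memory source file, so no .java file has to be written to disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // True if javac accepts a class containing a string literal of this length.
    static boolean literalCompiles(int length) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler: run on a JDK, not a JRE");
        }
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        String source = "class Holder { static final String S = \""
                + new String(chars) + "\"; }";
        try {
            Path outDir = Files.createTempDirectory("literal-limit");
            StringWriter diagnostics = new StringWriter(); // compiler messages land here
            return compiler.getTask(
                    diagnostics, null, null,
                    Arrays.asList("-d", outDir.toString()),
                    null,
                    Collections.singletonList(new StringSource("Holder", source))
            ).call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(literalCompiles(65534));  // prints true
        System.out.println(literalCompiles(65535));  // prints false
        System.out.println(literalCompiles(100000)); // prints false
    }
}
```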

To sum up:

Q: Is there a limit on the length of a string? How large is it?

A: First, the content of a string is stored in an array (char[] in JDK 8 and earlier). Array lengths and indexes are of type int, and the length() method of the String class also returns int. Looking at the Integer class in the Java source, the maximum int value is 2^31 - 1, so a string can hold at most 2^31 - 1 = 2147483647 characters, which for a char[] is about 4 GB of memory.

However, from the class file format in the Java virtual machine specification and the structure of the String type in the constant pool, we know that the index and the length are defined as u2, an unsigned 2-byte quantity, and the largest value 2 bytes can represent is 2^16 - 1 = 65535.

So the format limit is 65535, but in practice javac accepts a string literal only up to 65534 characters. A longer literal is rejected at compile time, whereas a string built by concatenation or assignment at run time is limited only by the maximum value of int.

That concludes our look at "what is the limit of String length". I hope it has cleared up your doubts. Theory is best learned together with practice, so go and try it yourself!


© 2024 shulou.com SLNews company. All rights reserved.
