Using assembly to peek into the data structure behind String

2025-01-17 Update From: SLTechnology News&Howtos

Anyone familiar with C++, Java, VB, or other programming languages knows String, the data type that represents text. A string is made up of characters and is a core part of every programming language. Many of us simply use it without studying how it works, but there is a lot about strings that is worth digging into.

First, some questions to think about

Have you ever thought about the following questions while working with strings in Swift?

How much memory does a string variable occupy?

What is the difference between the underlying storage of the strings str1 and str2?

If you perform a concatenation on str1 and str2, what happens to their underlying storage?

If you can answer these questions accurately, you already have a good understanding of how Swift strings are stored under the hood.

Second, how much memory does a string variable occupy?

Method 1: MemoryLayout

First, you can measure it with MemoryLayout, which ships with Swift.
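A minimal sketch of such a check, assuming a 64-bit platform and Swift 5. The value given to str1 here is illustrative and borrowed from later in the article.

```swift
// Assumption: 64-bit platform, Swift 5. The value of str1 is illustrative.
let str1 = "0123456789"

// Size, stride, and alignment of the String type itself.
print(MemoryLayout<String>.size)
print(MemoryLayout<String>.stride)
print(MemoryLayout<String>.alignment)

// The same size measured from values. It does not depend on the text length.
print(MemoryLayout.size(ofValue: str1))
print(MemoryLayout.size(ofValue: "a much longer string than the first one"))
```

On the setup this article describes, the reported size is 16 bytes no matter how long the text is, which is why the rest of the article looks at what those 16 bytes contain.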

Method 2: assembly

In addition, we can use a powerful low-level analysis aid, assembly language, to peek into the underlying storage of String. It is just as useful for analyzing the internals of other language features and system libraries.

Examples include how polymorphism works, how generics work, and the internals of Array and enumerations.

Assembly language helps with low-level analysis not only of Swift, but also of C, C++, and OC.

After all, every line of valid code you write is eventually converted to machine instructions (0s and 1s).

Machine instructions correspond to assembly instructions, and each machine instruction can be translated into a corresponding assembly instruction.

Being able to read assembly instructions is equivalent to being able to read machine instructions and to know exactly what the CPU is doing (which registers and which blocks of memory it operates on).

The code in this tutorial runs directly in a command-line (Command Line Tool) project on a Mac.

Therefore, the assembly shown is x64 assembly in AT&T syntax rather than the ARM assembly of a real iOS device. Different kinds of assembly are in fact very similar, except that some instructions have different names.

Like Microsoft's Visual Studio, Xcode has a built-in disassembly feature that makes it easy to view the assembly instructions corresponding to each line of code. The steps to open the disassembly view are as follows:

Put a breakpoint on the code you want to debug (the disassembly view is shown while the debugger is paused at a breakpoint)

Menu: Debug > Debug Workflow > Always Show Disassembly

Run the program and look at the disassembly view

If you have disassembly experience, you can infer from lines 16 and 17 of the assembly that String occupies 16 bytes: the contents of the string str are held in the rax and rdx registers, and rax and rdx are 8 bytes each.
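The two 8-byte halves that travel in rax and rdx can also be looked at without a debugger. The sketch below assumes a 64-bit platform where a String value is 16 bytes, and it reinterprets those bytes as a pair of UInt64 values. The value of str is illustrative, and which half ends up in which register is not something this code shows.

```swift
// Assumption: 64-bit platform, where a String value is 16 bytes.
let str = "0123456789"

// unsafeBitCast requires both types to have the same size, so check first.
precondition(MemoryLayout<String>.size == MemoryLayout<(UInt64, UInt64)>.size)

// Reinterpret the 16 bytes of the String as two 8-byte words.
let words = unsafeBitCast(str, to: (UInt64, UInt64).self)

print("word 0: 0x" + String(words.0, radix: 16))
print("word 1: 0x" + String(words.1, radix: 16))
```

Two 8-byte words add up to the same 16 bytes that MemoryLayout reports.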

There is too much assembly to cover here. Given limited time and space, this article does not explain each assembly instruction in detail; the aim is rather to show why assembly matters.

Third, the underlying storage of strings

Peeking into memory

Earlier, I wrote a tool that can peek into the memory of Swift variables: https://github.com/CoderMJLee/Mems. Now let's use it to see what data is stored in the 16 bytes of a string.

By default, memory data is displayed in groups of 8 bytes.

By passing a parameter, the memory data can instead be displayed in groups of 1 byte.
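For readers without the tool at hand, the two display modes can be approximated with the standard library alone. This is a stand-in sketch, not the Mems API: hexGroups and rawBytes are hypothetical helpers, and a little-endian 64-bit platform is assumed.

```swift
// Hypothetical helper, not part of Mems: formats bytes in fixed-size groups.
func hexGroups(_ bytes: [UInt8], groupSize: Int) -> String {
    precondition(groupSize > 0, "groupSize must be positive")
    var groups: [String] = []
    var index = 0
    while index < bytes.count {
        let end = min(index + groupSize, bytes.count)
        // Assumption: little-endian memory. Print the highest-address byte
        // of each group first so the group reads as one number.
        let digits = bytes[index..<end].reversed().map { byte -> String in
            let text = String(byte, radix: 16)
            return text.count == 1 ? "0" + text : text
        }
        groups.append("0x" + digits.joined())
        index = end
    }
    return groups.joined(separator: " ")
}

// Copies the raw bytes of the String value itself (not of its text).
func rawBytes(of string: String) -> [UInt8] {
    return withUnsafeBytes(of: string) { Array($0) }
}

let str1 = "0123456789"
print(hexGroups(rawBytes(of: str1), groupSize: 8))   // groups of 8 bytes
print(hexGroups(rawBytes(of: str1), groupSize: 1))   // groups of 1 byte
```

The 1-byte grouping is the easier one to read when looking for individual character values.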

The ASCII values of the characters '0' to '9' are 0x30 to 0x39. If you take a closer look at the 16 bytes of str1, what do you find?

The ASCII values of all the characters are stored directly in the 16 bytes of str1.

The 0xa in the last byte, 0xea, is the number of characters: 10 in total.
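The reading of the 16 bytes described above can be written down as a small decoder. This is a sketch under the article's observations only: decodeInline is a hypothetical helper, the 0xe in the high 4 bits is taken from the 0xea example, and bytes 10 to 14 of the sample are assumed to be zero padding.

```swift
// Assumption: up to 15 ASCII bytes stored inline, with the character count
// in the low 4 bits of the last byte. decodeInline is hypothetical.
func decodeInline(_ bytes: [UInt8]) -> (count: Int, text: String)? {
    guard bytes.count == 16, let last = bytes.last else { return nil }
    // The article's example byte 0xea has 0xe in its high 4 bits.
    guard last >> 4 == 0xe else { return nil }
    let count = Int(last & 0x0f)
    let text = String(decoding: bytes[0..<count], as: UTF8.self)
    return (count, text)
}

// The 16 bytes of str1 = "0123456789" (padding bytes assumed to be zero).
let sample: [UInt8] = [0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
                       0x38, 0x39, 0x00, 0x00, 0x00, 0x00, 0x00, 0xea]

if let decoded = decodeInline(sample) {
    print(decoded.count, decoded.text)   // prints "10 0123456789"
}
```

Because one byte is reserved for the count, at most 15 characters fit in the 16 bytes this way.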

Concatenation

When "ABCDE" is appended to str1, the ASCII values of all 15 characters of "0123456789ABCDE" are stored in the 16 bytes of str1, and the 0xf in the last byte, 0xef, is the number of characters: 15 in total.

You can see that the 16 bytes are now full. What happens if you append one more character?
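The experiment can be reproduced with a short program, sketched here with the standard library only. hexBytes is a hypothetical helper and a 64-bit platform is assumed; the bytes printed after the last append depend on the runtime, so no output is listed for them.

```swift
// Hypothetical helper: the raw bytes of the String value as hex text.
func hexBytes(of string: String) -> String {
    let bytes = withUnsafeBytes(of: string) { Array($0) }
    return bytes.map { byte -> String in
        let text = String(byte, radix: 16)
        return text.count == 1 ? "0" + text : text
    }.joined(separator: " ")
}

var str1 = "0123456789"
print(str1.utf8.count, hexBytes(of: str1))   // 10 characters

str1.append("ABCDE")
print(str1.utf8.count, hexBytes(of: str1))   // 15 characters

str1.append("F")
print(str1.utf8.count, hexBytes(of: str1))   // 16 characters
```

Comparing the second and third lines of output shows the point at which the character values stop appearing in the 16 bytes.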

As you can see, the data stored in the 16 bytes of str1 has changed completely, and the ASCII values of the characters are no longer there. What do these 16 bytes mean now?

Where are the ASCII values of all the characters ('0' to '9', 'A' to 'F') stored?

Other cases

What if the string already contains more than 15 characters when it is initialized (before any concatenation)?

I'm sure you can guess the result.

None of the characters' ASCII values appear in these 16 bytes.
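A rough way to check this in code is to test whether the 16 bytes of a String value begin with the string's own text. storesTextInline is a hypothetical helper, and the sketch assumes the 16-byte, 15-character layout described in this article.

```swift
// Hypothetical helper: true when the raw bytes of the String value begin
// with the string's own UTF-8 bytes, that is, the text is stored inline.
func storesTextInline(_ string: String) -> Bool {
    let utf8 = Array(string.utf8)
    // 16 bytes cannot hold more than 15 characters plus the count byte.
    guard utf8.count <= 15 else { return false }
    let raw = withUnsafeBytes(of: string) { Array($0) }
    return Array(raw.prefix(utf8.count)) == utf8
}

let str1 = "0123456789"          // 10 characters
let str2 = "0123456789ABCDEF"    // 16 characters from the start
print(storesTextInline(str1))
print(storesTextInline(str2))
```

For str2 the helper answers false on length alone, which matches the observation that its 16 bytes hold something other than the characters.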

Moreover, these 16 bytes still differ from the 16 bytes of str1, even though both strings contain "0123456789ABCDEF".

If you then perform a concatenation on str2, it is not hard to see that the 16 bytes of str2 change and become somewhat similar to the 16 bytes seen earlier.

How to answer the questions above?

Looking at the printed memory data only goes so far with the questions above; the real answers come from assembly. I reached my conclusions by analyzing the assembly instructions. Because space here is limited and my work keeps me busy, I recorded the detailed analysis of these questions as a video of "more than 2". Interested readers can watch it at 1.5 times the speed.

Link: https://pan.baidu.com/s/1AkS3K1ZKP8zyxhlhLRaBkA extraction code: kzrk

The video may be a little difficult for readers without an assembly background. It is best to watch it when your head is clear.

After watching the video, I hope everyone can really feel the importance of assembly language, and not stay forever at the level of writing high-level language code and indulging in syntactic sugar.

Fourth, in closing

We have done all this, of course, not just to peek at the internals of strings. Like data structures and algorithms, assembly is essential for taking your programming career to the next level.

In the field of programming, strings are like a single planet in the universe, small yet great. There are still many things waiting for us to explore, and more and more areas will depend on programming in the future. Software develops quickly, and those who stop learning fall behind and are left behind by the times. For programmers, only constant exploration and learning makes it possible to move freely in this field.

If you want to improve your abilities, get a promotion, get a raise, and break through bottlenecks, be sure to learn more about assembly, data structures, and algorithms. If you want an in-depth understanding, welcome × × ×. There you will find not only opportunities to talk face to face with experts in the programming world, but also free programming tips and advice on upgrading your skills. We look forward to making progress with you.
