This article walks through the underlying computer knowledge that Java programmers should understand, analyzed from a practical, professional point of view. I hope you get something out of it.
Many of us programmers dream of becoming an expert. It is easy to be immersed in frameworks and to believe that the framework and the application layer are everything, but that is a mistake. In today's industry, being able to build applications is only the baseline; understanding the principles underneath is what lets you go further, and computer fundamentals are the top priority. Follow along as I introduce the basics of how computers work.
CPU
Not sure what the CPU really is? Let's start there.
It is no exaggeration to say that the CPU, whose full name is Central Processing Unit, is the hardest-core component in your computer. The CPU is what makes a computer a computer, although it is not the computer by itself: the relationship between the CPU and the computer is like that between the brain and the human body. The CPU's core job is to take instructions from a program or application and perform calculations. This process can be divided into three key stages: fetch, decode, and execute. The CPU fetches an instruction from the system's main memory, decodes what the instruction actually asks for, and then the relevant parts of the CPU execute it.
CPU internal processing process
The following figure shows how a typical program runs (using C as the example). Understanding this flow is the basis and premise for mastering how programs work.
In this process, the CPU is responsible for interpreting and running the program once it has been translated into machine language.
The CPU mainly consists of two parts: the control unit and the arithmetic logic unit (ALU).
Control unit: fetches instructions from memory and decodes them
Arithmetic logic unit (ALU): handles arithmetic and logic operations
The CPU is the heart and brain of a computer. Both it and memory are electronic components made up of many transistors. It receives data input, executes instructions, and processes information, and it communicates with input/output (I/O) devices, which send data to and receive data from it.
From a functional point of view, the interior of the CPU consists of four parts: registers, a controller, an arithmetic unit, and a clock, which are connected by electrical signals.
Registers temporarily store instructions, data, and addresses; you can think of them as a small, fast kind of memory. Depending on the CPU, there are roughly 20 to 100 registers inside it. The controller reads instructions and data from memory into the registers and controls the computer according to the results of executing those instructions. The arithmetic unit performs calculations on the data that has been read into the registers. The clock sends out the clock signal that paces the CPU. To a program, the CPU effectively looks like a collection of registers.
Of these four parts of the CPU, we programmers mainly need to understand the registers; the other three rarely need direct attention. Why? Because programs describe the CPU's work in terms of registers.
For different types of CPU, the kinds and number of internal registers, and the range of values they can hold, differ. By function, however, registers can be divided into the following categories.
Accumulator register: stores the data being operated on and the results of calculations
Flag register: reflects the state of the processor and characteristics of operation results, and helps control instruction execution
Program counter: stores the address of the next instruction to execute
Base address register: stores the starting address of an area of data memory
Index register: stores an address relative to the base address register
General register: stores arbitrary data
Instruction register: stores the instruction currently being executed; used internally by the CPU, programmers cannot read or write it
Stack register: stores the starting address of the stack area
There is only one each of the program counter, accumulator register, flag register, instruction register, and stack register, while there are generally multiple of the other kinds of registers.
The following is an explanation of each register
Program counter
The program counter (Program Counter) is used to store the address of the unit where the next instruction is located.
When the program starts, the initial value of the PC is the address of the program's first instruction. During sequential execution, the controller first fetches an instruction from memory at the address indicated by the program counter, then decodes and executes it, and the value of the PC is increased by 1 so that it points to the next instruction to be executed.
Let's take a look at the execution process of the program counter in detail based on an example.
This is an addition operation. When the program starts, after compilation, the operating system copies the program from the hard disk into memory. The program in the example adds 123 and 456 and outputs the result to the display.
Address 0100 is the starting position of the program. After an operating system such as Windows copies the program from the hard disk into memory, it sets the program counter to the starting address 0100 and begins execution. After each instruction is executed, the value of the program counter increases by 1 (or, more precisely, points to the address of the next instruction). The CPU then reads the instruction at the address given by the program counter and executes it; in other words, the program counter controls the flow of the program.
Conditional branching and loop mechanism
Control flow in a high-level language falls into three kinds: sequential execution, conditional branching, and looping. Sequential execution runs instructions in address order. A conditional branch executes an instruction at some other address depending on a condition. A loop repeatedly executes instructions at the same addresses.
Sequential execution is the simple case: the program counter just increases by 1 for each instruction executed. Conditional branches and loops, however, can set the program counter to an arbitrary address, so the program can go back to an earlier address and repeat the same instructions, or jump ahead to any instruction.
The following is a conditional branch as an example to illustrate the execution of the program (the loop is also similar)
The program starts the same way as in the sequential case. The CPU begins executing at address 0100 and proceeds sequentially through 0100 and 0101, with the PC increasing by 1 each time. When the instruction at address 0102 is executed, it checks whether the value stored at address 0106 is greater than 0; if so, it jumps to the instruction at address 0104, the value is output to the display, and the program ends, skipping the instruction at 0103. This is just like an if () check in our programs: if the condition is not met, the corresponding instructions are skipped. So the PC does not always simply advance by 1; it is loaded with the address of the next instruction to execute.
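To make the idea concrete, here is a minimal C sketch (not the article's figure; the addresses and the tested value are hypothetical) showing how a conditional branch steers the program counter: if the condition holds, the PC is loaded with a jump target, otherwise it just moves on to the next instruction.

#include <stdio.h>

/* Toy model of a conditional branch: the PC either advances or is loaded
 * with a jump target, depending on the tested value. */
int main(void) {
    int value = 1;        /* the value the branch instruction tests (hypothetical) */
    int pc    = 0x0102;   /* hypothetical address of the branch instruction        */

    if (value > 0) {
        pc = 0x0104;      /* branch taken: PC is loaded with the target address    */
    } else {
        pc = pc + 1;      /* branch not taken: PC advances to the next instruction */
    }
    printf("next instruction at %04X\n", pc);
    return 0;
}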
Flag register
Conditional branches and loops use jump instructions, which decide whether to jump based on the current state. We mentioned the flag register above: it records whether the result of the most recent operation in the accumulator was positive, negative, or zero.
While the CPU runs, the flag register is set automatically according to the result of the current operation. Whether the result is positive, zero, or negative is recorded in three bits of the flag register: when the first, second, or third of these bits is 1, the result was positive, zero, or negative, respectively.
The way the CPU does this is interesting. Suppose the value XXX in the accumulator is compared with the value YYY in a general register. Behind the comparison, the CPU actually performs a subtraction, and whether the result is positive, zero, or negative is saved to the flag register. A positive result means XXX is greater than YYY, zero means they are equal, and negative means XXX is smaller than YYY. A comparison in a program is really a subtraction inside the CPU.
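A small C sketch of this idea (the values 150 and 100 are arbitrary examples): the sign of the subtraction result plays the role the flag register plays inside the CPU.

#include <stdio.h>

/* "Compare" is subtraction under the hood: the sign/zero of (x - y)
 * corresponds to the flag register bits the jump instruction inspects. */
int main(void) {
    int x = 150, y = 100;       /* arbitrary example values                     */
    int diff = x - y;           /* what a compare instruction computes inside   */

    if (diff > 0)       printf("x is greater than y\n");
    else if (diff == 0) printf("x equals y\n");
    else                printf("x is less than y\n");
    return 0;
}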
Function calling mechanism
Next we look at the function call mechanism. Even in programs written in high-level languages, calling a function is ultimately achieved by setting the program counter to the address where the function is stored. Unlike a plain jump, however, a function call must return to where it came from, so a simple jump instruction alone is not enough. Here is an example of how a function call works.
In the figure, the variables a and b are assigned 123 and 456 respectively, and calling MyFun(a, b) causes a jump. The addresses in the figure are the run-time addresses after the C source has been compiled into machine language; because one line of C usually becomes several lines of machine language, the addresses are scattered. After MyFun finishes, the program returns to the instruction following the call to MyFun, and the CPU continues executing from there.
The two important instructions for calling and returning from a function are the call instruction and the return instruction. Before setting the function's entry address into the program counter, the call instruction stores the address of the instruction to be executed after the call in an area of main memory called the stack. When the function finishes, the return instruction at the function's exit places the address saved on the stack back into the program counter. Before MyFun is called, the address 0154 is pushed onto the stack; after MyFun finishes, 0154 is put back into the program counter.
The call procedure is as follows
In high-level languages, the processing of a function call is converted into a call instruction, and the processing at the end of the function is converted into a return instruction.
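Below is a toy C model of the call/return mechanism described above, not real machine code: "call" pushes the return address onto a stack and loads the function's entry address into the program counter, "return" pops the saved address back into it. The return address 0154 is the one from the article's figure; the entry address 0x0260 is a hypothetical value for illustration.

#include <stdio.h>

int stack[16];                    /* toy stack area in "main memory" */
int sp = 0;                       /* stack pointer                    */

int main(void) {
    int return_addr = 0x0154;     /* instruction after the call (from the figure)   */
    int entry       = 0x0260;     /* hypothetical entry address of MyFun            */

    /* call: push the return address, then jump to the function */
    stack[sp++] = return_addr;
    int pc = entry;
    printf("executing MyFun at %04X\n", pc);

    /* return: pop the saved address back into the program counter */
    pc = stack[--sp];
    printf("back at %04X\n", pc);
    return 0;
}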
Implement an array through address and index
Next, let's take a look at the base address register and the index register. Through these two registers, we can divide specific areas on the main memory to achieve similar array operations.
First, hexadecimal: the addresses of the computer's memory run from 00000000 to FFFFFFFF. With a single 32-bit register, any address in this range can be referenced. However, if you want to treat a particular region of memory like an array and step through it, it is more convenient to use two registers.
For example, we use two registers (the base address register and the index register) to represent the value of memory
This representation is very similar to an array. An array is a data structure in which elements of the same length are laid out contiguously in memory. The whole array is referred to by the array name, and individual elements are selected by an index, for example a[0] to a[4], where the 0 to 4 inside the brackets is the array subscript.
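A minimal C sketch of the same idea: an element is reached as "base address + index", which is exactly the division of labor between the base address register and the index register.

#include <stdio.h>

/* Array access as base + index: the pointer plays the base address register,
 * the loop variable plays the index register. */
int main(void) {
    int a[5] = {10, 20, 30, 40, 50};
    int *base = a;                          /* base address of the data area */

    for (int i = 0; i < 5; i++) {           /* i acts as the index register  */
        printf("a[%d] = %d (address %p)\n", i, *(base + i), (void *)(base + i));
    }
    return 0;
}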
CPU instruction execution process
The CPU of almost all von Neumann computers works in five stages: instruction fetch, instruction decode, execute, memory access (operand fetch), and write back.
Instruction fetch: the instruction is read from memory into a register in the CPU; the program counter holds the address of the next instruction.
Instruction decode: once the fetch completes, the instruction decoder splits and interprets the fetched instruction according to the predefined instruction format, identifying the instruction type and how to obtain its operands.
Execute: after decoding, the CPU carries out the operations the instruction specifies, realizing its function.
Memory access: if the instruction needs data from memory, the operand's address in main memory is obtained from the instruction's address field and the operand is read in for the operation.
Write back (WB): as the final stage, the result produced in the execute stage is written back to some form of storage, most often an internal CPU register so that subsequent instructions can access it quickly.

Memory
The CPU and memory are like an inseparable pair: without memory, the CPU cannot execute program instructions and the computer loses its meaning; with only memory and no CPU, instructions cannot be executed and the computer cannot run.
So what is memory? How do memory and CPU interact? Now let's introduce it.
What is memory?
Memory (Memory) is one of the most important components in the computer, and it is the bridge between the program and CPU.
All programs in the computer run in memory, so memory has a great influence on the computer. Memory is also called main memory. Its job is to hold the data the CPU is operating on and the data exchanged with the hard disk and other external storage. Whenever the computer is running, the CPU transfers the data it needs into main memory for computation and, when the computation is finished, transfers the results back out. The performance of main memory therefore also affects how stably the computer runs.
Physical structure of memory
The inside of memory is made up of various IC circuits. There are many kinds, but they mainly fall into three types of memory.
Random access memory (RAM): the most important type; data can be both read from it and written to it. When the machine is powered off, the information in RAM is lost.
Read-only memory (ROM): ROM can only be read, not written, but its data is not lost when the machine loses power.
Cache: caches are also common. They are divided into level 1 (L1), level 2 (L2), and level 3 (L3) caches, sit between memory and the CPU, and can be read and written faster than memory. When the CPU writes data to memory, the data is also written to the cache; when the CPU needs to read data, it reads it directly from the cache. Of course, if the required data is not in the cache, the CPU reads it from memory instead.
A memory IC is a complete package with pins for the power supply, address signals, data signals, and control signals, which are used to select addresses and read or write data. Below is a simplified schematic of the IC pins.
In the figure, VCC and GND are the power pins, A0-A9 are the address signal pins, D0-D7 are the data signal pins, and RD and WR are the control signal pins; I have distinguished them with different colors. After connecting power to VCC and GND, signals of 0 or 1 can be transmitted on the other pins; in most cases, +5V means 1 and 0V means 0.
We know memory is used to store data, so how much data can this memory IC hold? D0-D7 are the data signals, meaning 8 bits = 1 byte of data can be input or output at a time. A0-A9 are ten address signals, so addresses 0000000000 to 1111111111 can be specified, that is, 2 to the power of 10 = 1024 addresses. Each address holds 1 byte of data, so the capacity of this memory IC is 1 KB.
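The capacity arithmetic from the text, written out as a tiny C program:

#include <stdio.h>

/* 10 address pins give 2^10 = 1024 addresses; each address holds 1 byte,
 * so the IC stores 1024 bytes = 1 KB. */
int main(void) {
    int address_pins = 10;
    int addresses = 1 << address_pins;      /* 2 to the power of 10 */
    printf("%d addresses x 1 byte = %d KB\n", addresses, addresses / 1024);
    return 0;
}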
Read and write process of memory
Let's focus on the process of reading and writing data in memory IC. Let's look at a model for writing and reading data to and from memory IC.
To describe the process in detail, suppose we want to write 1 byte of data into the memory IC. The process goes like this:
First, connect +5V to VCC and 0V to GND, use A0-A9 to specify the address where the data will be stored, put the data value on the data signals D0-D7, and then set WR (write) to 1. After these operations, the data is written into the memory IC. To read data, you only need to specify the address through the address signals A0-A9 and then set RD (read) to 1. RD and WR are called control signals; when both WR and RD are 0, neither write nor read can be performed.

A real-world model of memory
To make this easier to remember, we can map the memory model onto the real world: memory is very much like a building. In this building, each floor can store one byte of data, and the floor number is the address. Here is a diagram combining the memory model with the building.
We also know that the data in a program is not just numeric values; it has data types, and from memory's point of view a data type means a number of bytes (a number of floors occupied). Even though memory is physically read and written one byte at a time, a program can read and write a specific number of bytes at a time by specifying a data type.
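A quick C illustration of how the data type decides how many bytes are accessed at a time; the exact sizes printed depend on the platform, which is part of the point.

#include <stdio.h>

/* The data type determines how many bytes a program reads or writes in one go,
 * even though memory is physically accessed byte by byte. */
int main(void) {
    printf("char  : %zu byte(s)\n", sizeof(char));
    printf("short : %zu byte(s)\n", sizeof(short));
    printf("int   : %zu byte(s)\n", sizeof(int));
    printf("double: %zu byte(s)\n", sizeof(double));
    return 0;
}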
Binary system
We all know that at the lowest level a computer uses binary for its data, so why do computers use binary? And what exactly is a binary number? Going a step further, how does binary handle addition, subtraction, multiplication and division? Let's take a look.
What is a binary number
So what is a binary number? To illustrate, let's first convert the binary number 00100111 into decimal. To convert a binary number to decimal, multiply the value in each position by that position's bit weight and sum the results.
That is, 00100111 converted to decimal is 39. This 39 is not the digits 3 and 9 written side by side; it is 3 * 10 + 9 * 1, where 10 and 1 are the bit weights. By the same reasoning, the bit weights of the binary example above are, from high to low, the 7th, 6th, 5th, 4th, 3rd, 2nd, 1st and 0th powers of 2. The bit weight is also called a power: the highest bit carries 2 to the 7th power, the next 2 to the 6th power, and so on. Binary carries over at 2; this 2 is the base (radix), just as the base of decimal is 10. In every case, the exponent of a bit weight is the digit's position number minus 1: the first position's weight uses exponent 1 - 1 = 0, the second uses 2 - 1 = 1, and so on.
So the binary numbers we are talking about are numbers expressed with the digits 0 and 1; the base is 2, and the value is the sum of each digit multiplied by its bit weight. "Value" here usually means the decimal value, so for 39 it is 3 * 10 + 9 * 1 = 39.
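The worked example from the text as a tiny C program: each digit of 00100111 is combined with its bit weight, giving 39.

#include <stdio.h>

/* Convert 00100111 to decimal by accumulating digit * bit weight. */
int main(void) {
    int digits[8] = {0, 0, 1, 0, 0, 1, 1, 1};    /* 00100111, high bit first */
    int value = 0;

    for (int i = 0; i < 8; i++) {
        value = value * 2 + digits[i];           /* same as summing digit * 2^position */
    }
    printf("00100111 in decimal = %d\n", value); /* prints 39 */
    return 0;
}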
The relationship between shift operation and multiplication and division
Now that we know what binary is, let's look at binary arithmetic. As with decimal numbers, addition, subtraction, multiplication and division all apply to binary, as long as you remember to carry at every 2. Binary operations are also something peculiar to computer programs, so they are worth understanding.
First, the shift operation: shifting moves every bit of a binary value to the left or to the right, as shown in the figure below.
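A small C example of the connection between shifting and multiplication/division: shifting left by n multiplies by 2^n, and for non-negative values shifting right by n divides by 2^n (discarding the remainder).

#include <stdio.h>

/* Left shift multiplies by powers of 2; right shift of a non-negative value divides. */
int main(void) {
    int x = 39;                             /* 00100111 */
    printf("%d << 1 = %d\n", x, x << 1);    /* 78,  i.e. 39 * 2 */
    printf("%d << 2 = %d\n", x, x << 2);    /* 156, i.e. 39 * 4 */
    printf("%d >> 1 = %d\n", x, x >> 1);    /* 19,  i.e. 39 / 2 */
    return 0;
}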
Complement number
We did not discuss shifting to the right just now because the high-order bits vacated by a right shift can be filled with either 0 or 1. To know when to fill with 0 and when with 1, we first need to understand how binary numbers represent negative values.
When a binary number represents a signed value, the highest bit is used as the sign bit: a sign bit of 0 indicates a positive number and a sign bit of 1 indicates a negative number. So how should -1 be expressed in binary? Many people assume that because 1 is 0000 0001 and the highest bit is the sign bit, -1 must be 1000 0001. But is that really correct?
There is no subtraction in the computer's world: when the computer subtracts, it is actually adding. For example, the computer treats 100 - 50 as 100 + (-50). To make this work, negative numbers are represented using the binary complement (two's complement), which expresses a negative value using only the machinery of positive numbers.
To get the complement, invert every bit of the binary number and then add 1 to the result. Remember this rule for now; we will verify it below.
Concretely: take the binary representation of the value, invert each bit (0 -> 1, 1 -> 0), and then add 1 to the inverted number. The result is the complement.
Although the way complements are obtained is not intuitively obvious, it is logically rigorous. For example, let's look at 1 - 1, first using the representation 1000 0001 proposed above for -1 (never mind whether it is correct; let's just try the calculation).
Strangely, 1 - 1 comes out as 130 rather than 0, so we can conclude that using 1000 0001 to represent -1 is simply wrong.
So what is the correct representation? We have in fact already given the answer above: it is 1111 1111. Let's prove it.
1 - 1 is really 1 + (-1). Inverting 0000 0001 and adding 1 gives 1111 1111; adding that to 1 produces the nine-bit value 1 0000 0000, which overflows. The computer simply ignores the overflowed bit, i.e. it throws away the highest 1, leaving 0000 0000. That is 0, so the result is correct, and 1111 1111 does represent -1.
So to represent a negative number in binary, find its complement: invert the bits of the original value and then add 1 to the result.
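The complement rule from the text, checked in C. Note that the arithmetic is done on 8-bit patterns by casting back to unsigned char, which plays the role of discarding the overflow bit.

#include <stdio.h>

/* Invert every bit, then add 1: applied to 0000 0001 this yields 1111 1111,
 * the 8-bit pattern for -1. Adding it back to 1 overflows to 0000 0000. */
int main(void) {
    unsigned char one = 0x01;                              /* 0000 0001 */
    unsigned char minus_one = (unsigned char)(~one + 1);   /* 1111 1111 */

    printf("complement of 1 = 0x%02X\n", (unsigned)minus_one);           /* 0xFF */
    printf("1 + (-1) as bytes = 0x%02X\n",
           (unsigned)(unsigned char)(one + minus_one));                  /* 0x00 */
    return 0;
}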
The difference between arithmetic right shift and logical right shift
Now that we understand the complement, let's reconsider right shifts. The high-order bits vacated after a right shift can be filled with either 0 or 1.
When a binary number is shifted right as a signed value, the value of the sign bit before the shift (0 or 1) must be copied into the vacated high bits; this is called an arithmetic right shift. If the value is a negative number represented as a complement, filling the high bits with 1 is what makes right shifts correctly compute divisions by 2, 4, 8 and so on. If the value is positive, the vacated bits are simply filled with 0.
Let's look at an example: shift -4 two bits to the right. Here is the shift diagram.
As the figure shows, with a logical right shift, shifting -4 right by two bits gives 63, which is obviously not a quarter of -4, so a logical shift cannot be used here. With an arithmetic right shift, the result is -1, which is exactly a quarter of -4, so an arithmetic right shift is the one to use.
So we can conclude: when shifting left, whether the value is treated as a bit pattern or as a number, we only need to fill the vacated low bits with 0; when shifting right, we must decide between a logical right shift and an arithmetic right shift depending on the situation.
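A C sketch of the two kinds of right shift. In C, right-shifting a negative signed value is implementation-defined, but common compilers perform an arithmetic shift; right-shifting an unsigned value is always a logical shift.

#include <stdio.h>

/* -4 is 1111 1100 in 8 bits. Shifting it right by two as a signed value
 * typically gives -1 (arithmetic shift); shifting the same bit pattern
 * as an unsigned value gives 0x3F = 63 (logical shift), as in the text. */
int main(void) {
    signed char   s = -4;                      /* 1111 1100             */
    unsigned char u = 0xFC;                    /* same bit pattern      */

    printf("arithmetic: -4   >> 2 = %d\n", s >> 2);   /* typically -1  */
    printf("logical   : 0xFC >> 2 = %d\n", u >> 2);   /* 63            */
    return 0;
}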
Next, sign extension. The purpose of sign-extending data is to widen its number of bits while keeping the value unchanged, to satisfy instructions' requirements on operands (for example a dividend that must be longer than the divisor) or simply to lengthen data so as to reduce error in later calculations.
Take 8-bit binary as an example: sign extension means converting the value into a 16-bit or 32-bit binary number while keeping the value unchanged. Converting the positive 8-bit number 0111 1111 to 16 bits easily gives the correct result 0000 0000 0111 1111. But what about a value represented as a complement, such as 1111 1111? It is simply written as 1111 1111 1111 1111.
In other words, whether the value is positive or a negative number represented as a complement, you just fill the high bits with copies of the sign bit: 0 for a positive number, 1 for a negative one.
The relationship between memory and disk
As we all know, a computer has five basic components: the arithmetic unit, the controller, memory, and input and output devices. From the storage point of view, "memory" splits into main memory and the disk. We introduced main memory above; now let's look at the disk and the relationship between disk and memory.
The program cannot run without reading into memory
The main storage components of a computer are memory and disk. A program stored on disk must be loaded into memory before it can run; it cannot run directly from the disk, because the CPU, which parses and executes the program's contents, reads program instructions by having the program counter specify memory addresses.
Disk cache
As mentioned above, disk and memory cooperate closely. Every time memory needs data from the disk, it often turns out to be the same content that was read before, so something should be responsible for keeping the content that is read frequently. We routinely use caching when writing software, and the hardware level is no exception: disks have caches too, called the disk cache.
The disk cache stores data that has been read from the disk in memory, so that the next time the same content is needed it is read from the cache rather than from the physical disk. Like any technique, it exists to solve a problem: the disk cache greatly improves the speed of disk access.
Virtual memory
Virtual memory is the second place where memory and disk interact. Virtual memory means using part of the disk as if it were memory. It mirrors the disk cache: the disk cache is pretend disk (actually memory), while virtual memory is pretend memory (actually disk).
Virtual memory is a memory-management technique: it lets an application believe it has a contiguous, complete address space, while in reality that space is usually split into multiple physical fragments, with part of it kept on disk and swapped in and out as needed.
Thanks to virtual memory, programs can still run when physical memory runs out; for example, a 10 MB program can run with only 5 MB of memory left. Because the CPU can only execute programs that are loaded into memory, parts of virtual memory must be swapped with parts of physical memory as the program runs.
How to swap virtual memory and memory
There are two ways to implement virtual memory: paging and segmentation. Windows uses paging. Paging divides the running program into pages of a fixed size, without regard to the program's structure, and swaps memory in and out one page at a time. Reading a page's contents from disk into memory is called page-in, and writing a page's contents from memory out to disk is called page-out.
The page size on Windows is 4 KB, so an application is split into 4 KB pages, pages are kept on disk, and they are swapped in page by page.
To implement virtual memory, Windows keeps files on disk for it (page files). These files are created and managed by Windows, match the size of the virtual memory, and are usually 1-2 times the size of physical memory.
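As a small, Windows-only sketch, you can ask the operating system for its paging granularity; on typical Windows systems this reports 4096 bytes, i.e. the 4 KB page size mentioned above.

#include <stdio.h>
#include <windows.h>

/* Query the system page size via the Win32 API. */
int main(void) {
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    printf("page size: %lu bytes\n", (unsigned long)info.dwPageSize);
    return 0;
}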
Physical structure of disk
Earlier we introduced the physical structure of the CPU and memory; now let's look at the physical structure of the disk, that is, the form in which a disk stores data.
A disk is used by dividing its physical surface into many small areas. There are two ways of dividing: the variable-length method and the sector method. The former divides the surface into areas of variable length, the latter into areas of fixed length. Hard disks and floppy disks used by Windows generally use the sector method. In the sector method, the concentric rings into which the surface is divided are called tracks, and each track is divided into sectors of a fixed storage size.
The sector is the smallest unit of physical reads and writes to the disk; on disks used by Windows a sector is usually 512 bytes. Windows, however, reads and writes the disk logically in clusters, each an integral multiple of a sector. Depending on the disk's capacity, one cluster can be 512 bytes (1 cluster = 1 sector), 1 KB (1 cluster = 2 sectors), 2 KB, 4 KB, 8 KB, 16 KB, or 32 KB (1 cluster = 64 sectors): the larger the disk, the larger the cluster.
Compression algorithm
We have all compressed and decompressed files. When a file is too large, we compress it to reduce the space it takes up. For example, WeChat limits uploaded files to 100 MB; if a folder is over the limit, compressing it can bring it under 100 MB so that it can be uploaded.
In addition, when we save the photos taken by the camera to the computer, we will also use the compression algorithm for file compression, the file compression format is generally JPEG.
So what is the compression algorithm? How is the compression algorithm defined? Before we know the algorithm, we need to know how the file is stored.
File storage
A file is the form in which data is stored on a medium such as a disk. The most basic unit of data stored in a program file is the byte, which is why file sizes are expressed as xxx KB, xxx MB and so on: files are stored in units of bytes (B = byte).
A file is a collection of byte data. One byte (8 bits) can represent 256 different values, 0000 0000 to 1111 1111 in binary. If the data stored in a file is text, it is a text file; if it is graphics, it is an image file. In either case, the bytes of the file are stored contiguously.
Definition of compression algorithm
The collection of files described above is actually a collection of bytes of data, so we can define the compression algorithm.
A compression algorithm is an algorithm for data compression; it involves two steps, compression and restoration (decompression).
In essence, it reduces the space a file's bytes occupy without changing the information the file carries.
According to the definition of compression algorithm, we can divide it into different types:
Lossy and lossless
Lossless compression: the compressed data can be reconstructed without distortion, restoring the original data exactly. It is used where data accuracy is strictly required, such as compressing executables and ordinary files, disk compression, and also multimedia data. Its compression ratio is relatively low. Examples: differential coding, RLE, Huffman coding, LZW coding, arithmetic coding.
Lossy compression: there is distortion; the original data cannot be recovered exactly, and the reconstructed data is only an approximation of the original. It is used where exact accuracy is not required, typically multimedia data, and achieves much higher compression. Examples: predictive coding, perceptual audio coding, fractal compression, wavelet compression, JPEG/MPEG.
Symmetrical property
If encoding and decoding take roughly the same amount of work and time, the method is symmetric, and most compression algorithms are symmetric. There are also asymmetric ones, generally hard to encode but easy to decode, such as Huffman coding and fractal coding. Coding used in cryptography is the opposite: easy to encode but very hard to decode.
Between frames and within frames
Video coding uses both intra-frame and inter-frame methods. Intra-frame coding is completed independently within a single frame, just like still-image compression, for example JPEG; inter-frame coding refers to the preceding and following frames during encoding and decoding and exploits the temporal redundancy between frames, for example MPEG.
Real-time performance
In some multimedia applications, it is necessary to process or transmit data in real time (such as live digital recording and video recording, playing MP3/RM/VCD/DVD, video / audio on demand, network live broadcast, videophone, video conference). The coding and decoding generally requires a delay of ≤ 50 ms. This requires simple / fast / efficient algorithms and high-speed / complex CPU/DSP chips.
Graded treatment
Some compression algorithms can simultaneously deal with multimedia data with different resolutions, different transmission rates and different quality levels, such as JPEG2000 and MPEG-2/4.
These concepts are somewhat abstract, mainly to let you understand the classification of compression algorithms. Let's analyze the characteristics, advantages and disadvantages of several commonly used compression algorithms.
Several commonly used compression algorithms

The mechanism of the RLE algorithm
Now let's look at file compression mechanisms properly. First, let's try to compress AAAAAABBCDDEEEEEF, a text file of 17 half-width characters. Although this text has little practical meaning, it is well suited to describing the RLE compression mechanism.
Since half-width characters (that is, plain English characters) are stored in a file as 1 byte each, the size of the file above is 17 bytes, as shown in the picture.
So, how can we compress this file? Any compression algorithm will do, as long as it makes the file smaller than 17 bytes.
One obvious form of compression, which you may have thought of already, is to merge runs of the same character and record how many times each character repeats. Compressed this way, the file looks like this.
From the figure we can see that the 17 characters AAAAAABBCDDEEEEEF have been compressed into the 12 characters A6B2C1D2E5F1, a compression ratio of 12 / 17, about 70%.
A compression method like this, which represents the file's content as data * repeat count, is called the RLE (Run Length Encoding) algorithm. RLE is a good compression method and is often used to compress fax images and the like; since an image file is ultimately also a collection of byte data, it can be compressed with RLE.
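A minimal run-length encoder in the spirit of the text: each run of a character is written out as "character + repeat count". This is a sketch of the idea, not a complete file compressor (counts longer than 9 would need more careful handling in a real format).

#include <stdio.h>
#include <string.h>

/* Print the RLE form of a string: each run becomes character + run length. */
void rle_compress(const char *in) {
    size_t n = strlen(in);
    for (size_t i = 0; i < n; ) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i]) run++;
        printf("%c%zu", in[i], run);
        i += run;
    }
    printf("\n");
}

int main(void) {
    rle_compress("AAAAAABBCDDEEEEEF");   /* prints A6B2C1D2E5F1 */
    return 0;
}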
Huffman algorithm and Morse coding
Let's introduce another compression algorithm: the Huffman algorithm. Before getting into it, you must drop the assumption that every half-width character has to be stored as 1 byte (8 bits). Let's look at the basic idea of the Huffman algorithm.
A text file is made up of different characters, and different characters appear with different frequencies. For example, in some text file A might appear about 100 times while Q is used only 3 times.
The key idea of the Huffman algorithm is to represent frequently occurring data with fewer than 8 bits, while rarely used data may be represented with more than 8 bits. If both A and Q are represented with 8 bits, this part of the file takes 100 * 8 + 3 * 8 = 824 bits; if A uses 2 bits and Q uses 10 bits, it takes 2 * 100 + 10 * 3 = 230 bits.
Note, however, that in the end the file is still saved on disk in units of 8-bit bytes.
The Huffman algorithm is fairly involved, so before digging in let's take a small detour and look at Morse code. You have probably seen it in American TV series or war films: Morse code is often used to convey information in wartime communications, like the following.
Here is an example of Morse code, where we write 1 for a short signal (dot) and 11 for a long signal (dash).
Morse code generally assigns the shortest codes to the characters that appear most frequently in ordinary text. As the table shows, if the bit pattern for a dot is 1 and for a dash is 11, then E (a single dot) can be represented by 1, while C (dash dot dash dot) needs the 9 bits 110101101.
In real Morse code, if the length of a dot is 1, the length of a dash is 3 and the gap within a character is 1, where length means the duration of the sound. If we rewrite the earlier AAAAAABBCDDEEEEEF example in Morse code, we also need a separator between characters; here we use 00 to mark it.
So the text AAAAAABBCDDEEEEEF becomes A * 6 + B * 2 + C * 1 + D * 2 + E * 5 + F * 1 + 16 character separators = 4 bits * 6 + 8 bits * 2 + 9 bits * 1 + 6 bits * 2 + 1 bit * 5 + 8 bits * 1 + 2 bits * 16 = 106 bits, about 14 bytes.
So the compression ratio with Morse code is 14 / 17, about 82%; the efficiency is not particularly impressive.
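The same arithmetic, reproduced as a short C program using the per-character code lengths and counts listed above:

#include <stdio.h>

/* Code lengths (bits) for A-F under the dot=1 / dash=11 convention,
 * character counts in AAAAAABBCDDEEEEEF, plus 2-bit separators between
 * the 17 characters (16 separators). */
int main(void) {
    int length[6] = {4, 8, 9, 6, 1, 8};   /* A, B, C, D, E, F */
    int count[6]  = {6, 2, 1, 2, 5, 1};
    int bits = 0;

    for (int i = 0; i < 6; i++) bits += length[i] * count[i];
    bits += 2 * 16;                       /* character separators */
    printf("%d bits = about %d bytes\n", bits, (bits + 7) / 8);  /* 106 bits, ~14 bytes */
    return 0;
}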
Implementation of Huffman algorithm with binary Tree
As just mentioned, Morse code decides the length of each character's code from how often that character appears in everyday text. In such a system, however, the coding is not optimal for a text like AAAAAABBCDDEEEEEF.
The Huffman algorithm, by contrast, builds the best coding system for each file to be compressed and performs the compression using that system. Which codes (the Huffman codes) are used therefore differs from file to file, and a file compressed with the Huffman algorithm stores both the Huffman coding information and the compressed data.
Next, let's work out codes for the characters A-F in AAAAAABBCDDEEEEEF, following the principle that the more frequent a character is, the fewer bits its code should use. Sorted by frequency from high to low, the results, together with a candidate coding scheme, are as follows.
Character  Frequency  Code (scheme)  Bits
A          6          0              1
E          5          1              1
B          2          10             2
D          2          11             2
C          1          100            3
F          1          101            3
In the coding scheme above, as the frequency decreases, the number of bits in a character's code increases, from 1 bit to 2 bits to 3 bits. However, this coding system has a problem: you cannot tell what the 3-bit sequence 100 means. Is it the three codes 1, 0, 0, representing E, A, A? Or 10 followed by 0, representing B, A? Or 100, representing C?
In the Huffman algorithm, the coding system is built with the help of a Huffman tree, which yields codes that can be decoded unambiguously even without separator symbols between characters. Building a Huffman tree is somewhat involved; the figure below shows the construction process.
A natural tree grows leaves from its root, while a Huffman tree is built from the leaves toward the root.
Huffman tree can increase the compression ratio.
With a Huffman tree, the more frequently a piece of data appears, the fewer bits its code has; that is the core idea. As step 2 of the figure shows, when branches are joined we start from the least frequent data, which means rarely occurring data passes through more branches on the way to the root, and more branches mean more bits in the code.
Now let's look at the compression ratio achieved by the Huffman tree. Using the codes from the figure, AAAAAABBCDDEEEEEF becomes 000000000000 100100 110 101101 0101010101 111, 40 bits in total = 5 bytes. The data was 17 bytes before compression and an impressive 5 bytes afterwards, a compression ratio of 5 / 17, about 29%.
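Below is a small C sketch of encoding with a fixed Huffman code table. The specific codes (A=00, E=01, B=100, D=101, C=110, F=111) are an assumption read off the bit string above; an actual Huffman implementation would derive them by building the tree from the character frequencies.

#include <stdio.h>
#include <string.h>

/* Look up the (assumed) Huffman code for a character. */
const char *code_for(char c) {
    switch (c) {
    case 'A': return "00";
    case 'E': return "01";
    case 'B': return "100";
    case 'D': return "101";
    case 'C': return "110";
    case 'F': return "111";
    default:  return "";
    }
}

int main(void) {
    const char *text = "AAAAAABBCDDEEEEEF";
    int bits = 0;
    for (const char *p = text; *p; p++) {
        printf("%s", code_for(*p));
        bits += (int)strlen(code_for(*p));
    }
    printf("\n%d bits = %d bytes\n", bits, (bits + 7) / 8);   /* 40 bits = 5 bytes */
    return 0;
}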
For reference, the Huffman tree approach can be used as a compression algorithm on any type of data:
File type    Before compression  After compression  Compression ratio
Text file    14862 bytes         4119 bytes         28%
Image file   96062 bytes         9456 bytes         10%
EXE file     24576 bytes         4652 bytes         19%

Reversible and non-reversible compression
Finally, let's take a look at the data form of the image file. The purpose of the use of image files is usually to output image data to monitors, printers and other devices. Commonly used image formats are: BMP, JPEG, TIFF, GIF format and so on.
BMP: an image format produced with the Paint program that comes with Windows.
JPEG: the image data format commonly used by digital cameras and the like.
TIFF: an image format that includes "tags" in the file so that the nature of the data can be identified quickly.
GIF: a format developed in the United States, limited to at most 256 colors.
Image files can use the RLE and Huffman algorithms described earlier, but in most cases image files do not require the data to be restored to exactly its pre-compression state; some loss of data is acceptable. Compression that can be fully restored to the pre-compression state is called reversible (lossless) compression, and compression that cannot is called non-reversible (lossy) compression.
Generally speaking, JPEG files use non-reversible compression, so some image information is blurred after restoration; GIF uses reversible compression.
Operating system environment
A program's behavior depends on its running environment; we can say running environment = operating system + hardware. The operating system is itself software, made up of a series of instructions. Rather than introduce operating systems in depth here, we will mainly look at how the hardware is identified.
We have all played games. What do you need to do before playing one? You need to check whether your laptop or desktop is actually capable of running it. Here is the configuration listed for one game.
The main configurations in the figure are as follows
Operating system version: the system environment the application runs in. There are three main operating system families on the market: Windows, Linux and Unix. Almost all large games we play run on Windows, so Windows is the home of gaming. Windows itself is further divided into 32-bit and 64-bit operating systems, which are not compatible with each other.
Processor: the processor means the CPU, your computer's computing power, roughly the number of instructions it can execute per second. If your computer feels sluggish, it is probably because the CPU's computing power is insufficient. To go deeper, see the blogger's other article on the CPU hardware knowledge programmers need.
Graphics card: the graphics card handles graphics output, so it is also known as the graphics processing unit (GPU). It matters a great deal: in a game I used to play, turning the graphics up to the highest setting (so the image becomes clearer) made the game stutter, precisely because the graphics card could not keep up.
Memory: memory here means main memory, the storage the application uses while running to hold its instructions and data; its size also affects how fast your computer runs. To go deeper, see the blogger's other article on memory.
Storage space: storage space means the disk space the application occupies once installed. As the figure shows, this game requires more than 5 GB of free storage; in practice we usually leave a generous margin for installing games.
From the point of view of a program's running environment, the CPU type is a particularly important parameter. For a program to run at all, the minimum CPU requirements must be met.
A CPU can only interpret its own machine language, and different CPUs interpret different machine languages. A program in machine language is called native code. A program a developer writes in a high-level language such as C is just a text file; a text file (setting aside encoding issues) can be displayed and edited in any environment, and we call it source code. Compiling the source code produces the native code.
The following figure reflects this process.
The Windows operating system hides hardware differences other than the CPU.
A computer's hardware consists not only of the CPU but also of the memory that stores program instructions and data, and peripherals such as the keyboard, monitor, hard disk, and printer.
In Windows software, keyboard input, display output and so on do not send instructions directly to the hardware; instead, instructions are sent to Windows, which performs the operation. As a result, programmers do not have to care about differences in memory layouts or I/O addresses. Windows operates the hardware, and software controls the hardware by asking Windows to do so.
Differences of API among different operating systems
Next, let's take a look at the types of operating systems. For the same type of computer, there are many choices for the type of operating system that can be installed.
For example, an AT-compatible PC can run not only Windows but also Linux, FreeBSD (another Unix-family operating system), and other systems.
Of course, application software must be developed specifically for each operating system type. Just as different CPU types mean different machine languages, different operating system types mean different ways for an application to pass instructions to the operating system.
The way an application passes instructions to the operating system is called the API (Application Programming Interface). The APIs of Windows and Linux are each a set of functions that any application can call. Because the APIs of different operating systems differ, porting the same application to another operating system requires rewriting the parts that use the API.
Keyboard input, mouse input, display output, file input and output and other functions that interact with peripheral devices are all provided through API.
This is why Windows applications cannot run directly on Linux: the APIs are too different.
Under the same type of operating system, the API is almost identical regardless of the hardware. However, because different CPU types have different machine languages, the native code still differs.
History of operating system function
The operating system is itself a kind of software, and like any new thing it did not appear out of thin air; it has a historical background.
In the era before operating systems, there were no ready-made programs at all; people controlled the computer through switches and buttons, a very tedious process.
So someone developed a monitor program whose only functions were loading and running programs; this is the prototype of the operating system. By starting the monitor program first, programmers could load programs into memory and run them as needed. It was still cumbersome, but far less work than developing with nothing at all.
As time went on, people found that many programs written on top of the monitor program shared common parts, such as reading text from the keyboard or writing data to the display. Writing the same handling for every new application was a waste of time.
So the basic input and output parts were added to the monitor program. This is how the first operating systems were born.
Similar ideas spread, and more programs were added alongside the monitor: hardware control programs, programming language processors (assemblers, compilers, interpreters), and various utilities. The result is an operating system not far from what we have today; in other words, an operating system is really a collection of many programs.
Characteristics of Windows operating system
The Windows operating system has the largest number of users in the world. As a senior user of the Windows operating system, do you know the characteristics of the Windows operating system? Here are some features of the Windows operating system
There are 32-bit and 64-bit versions of the Windows operating system
It provides system calls through an integrated set of API functions
It provides a graphical user interface (GUI)
It realizes printed output through WYSIWYG; WYSIWYG stands for What You See Is What You Get, meaning the graphics and text shown on the display can be printed on the printer as-is
It provides multitasking, that is, several tasks can be run at the same time
It provides network functions and database functions
It implements automatic device-driver setup through Plug and Play
These are some of the features that are meaningful to programmers. Here are some of these features.
32-bit operating system
"32-bit" here refers to the data size that is handled most efficiently: the basic unit of data processing in 32-bit Windows is 32 bits. This differs from earlier 16-bit operating systems such as MS-DOS; handling 32 bits of data takes two operations on a 16-bit operating system but only one on a 32-bit one, so applications on Windows generally handle data most naturally in 32-bit units.
For example, when handling integer data in C there are several choices: the 8-bit char type, the 16-bit short type, and the 32-bit long type. On 32-bit Windows, using the 32-bit long throughout costs only extra memory and disk space and has little impact on performance.
Most operating systems on the market today are 64-bit, and the same reasoning applies to them.
Provide system calls through the set of API functions
Windows provides system calls through a set of functions called the API. The API is the interface between applications and the operating system; its full name is Application Programming Interface.
The API of the current mainstream 32-bit version of Windows is called the Win32 API. The name exists to distinguish it from APIs for other environments, such as the earlier 16-bit Win16 API and the later Win64 API.
The API is provided through multiple DLL files, and each API function is itself a function written in C, so the API is easiest to use from a C environment. For example, the MessageBox() API function lives in the DLL file user32.dll supplied by Windows.
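A minimal Windows-only example of calling that API from C: MessageBoxA is provided by user32.dll and is invoked like an ordinary C function (the text and caption strings here are just placeholders).

#include <windows.h>

/* Show a simple message box through the Win32 API. */
int main(void) {
    MessageBoxA(NULL, "Hello from the Win32 API", "Sample", MB_OK);
    return 0;
}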
Provides a user interface using GUI
GUI (Graphical User Interface) means a graphical user interface operated by clicking windows, icons and other visual elements shown on the display. For contrast, Linux can be used in two ways: a minimal setup controlled entirely from the command line, and a desktop setup controlled by pointing and clicking in a graphical interface.
Realize printout through WYSIWYG
WYSIWYG means that what appears on the monitor can be printed as-is on the printer. In Windows, the monitor and the printer are treated as equivalent graphics output devices, which is what makes WYSIWYG possible.
WYSIWYG makes programmers' lives much easier. Originally, displaying on the screen and printing on the printer required writing separate programs; in Windows, thanks to WYSIWYG, both can essentially be handled by one program.
Provide multitasking
Multitasking means running multiple applications at the same time. Windows achieves multitasking through time slicing: multiple programs take turns running, switching at short intervals. From the user's point of view the programs appear to run simultaneously; underneath, CPU time is divided into slices, which is also the core of multithreading and multitasking.
Provide network and database functions
In Windows, network functionality is provided as a standard feature; database (database server) functionality is sometimes added on.
Although network and database functions are not strictly indispensable parts of an operating system, they sit very close to it, so they are called middleware rather than applications; the name means they occupy the middle layer between the operating system and applications. The operating system and middleware together are called system software. Applications can use not only the operating system's functions but also those of the middleware.
Applications can use operating systems and middleware
Compared with the operating system, which cannot easily be replaced once installed, middleware can in principle be swapped as needed. In practice, however, replacing the middleware often forces changes to the applications as well, so from that point of view it is not so easy to replace either.
Automatic setting of device driver through plug and play
Plug and Play refers to the mechanism by which a newly connected device can be used right away: when a new device is attached to the computer, the operating system automatically installs and configures the driver used to control it.
A device driver is part of the operating system and provides the basic input and output functions for a piece of hardware. Drivers for the essential hardware in a computer, such as the keyboard, mouse, monitor, and disk devices, are generally installed together with the operating system.
Sometimes DLL files are installed along with a device driver. These DLL files expose an API for the newly added hardware, and through that API applications that use the new hardware can be built.
Assembly language and native code
As we discussed in the previous article, the computer CPU can only run native code (machine language) programs, and the code written in high-level languages such as C needs to be compiled by the compiler and converted into native code before it can be interpreted and executed by CPU.
However, the readability of the native code is very poor, so it is necessary to replace the local code with a language that can be read directly, that is, in each local code, there is an English abbreviation indicating its function, such as adding the abbreviation of add (addition) to the local code of addition operation, and adding the abbreviation of cmp (compare) to the local code of comparison operator. These signs that represent specific native code instructions through abbreviations are called mnemonics, and languages that use mnemonics are called assembly language. In this way, by reading the assembly language, you can also understand the meaning of the native code.
However, even source code written in assembly language must eventually be converted into native code to run. The program responsible for this conversion is called an assembler, and the conversion process itself is called assembly. In terms of converting source code into native code, assemblers and compilers play the same role.
The source code written in assembly language corresponds to the native code one by one. As a result, native code can in turn be converted into code written in assembly language. The process of converting native code into assembly code is called disassembly, and the program that performs disassembly is called disassembler.
One-to-one conversion between native code and assembly language
Even source code written in C is compiled into native code for a particular CPU, so if you disassemble that native code you can obtain assembly-language source and investigate its contents. However, turning native code back into C source code (decompiling) is much harder than disassembling it into assembly code, because there is no one-to-one correspondence between C code and native code.
Output the source code of assembly language through the compiler
We mentioned above that native code can be disassembled into assembly code, but is that the only way to obtain assembly code? Obviously not: source code written in C can also be compiled directly into assembly code by the compiler, so let's give it a try.
First, some preparation is needed: download the Borland C++ 5.5 compiler. For convenience, readers can download it directly from my Baidu network disk (link: https://pan.baidu.com/s/19LqVICpn5GcV88thD2AnlA password: hz1u).
After downloading, you need to configure it. A configuration guide is available here (https://wenku.baidu.com/view/22e2f418650e52ea551898ad.html); the tutorial is complete and you can follow it step by step. Now let's start the compilation process.
First, use a text editor such as Windows notepad to write the following code
// A function that returns the sum of its two parameters
int AddNum(int a, int b) {
    return a + b;
}

// A function that calls the AddNum function
void MyFunc() {
    int c;
    c = AddNum(123, 456);
}
After writing it, save the file as Sample4.c (the extension of a C source file is usually .c). The program above takes two input parameters and returns their sum.
Open a command prompt under the Windows operating system, change to the folder where you saved Sample4.c, and enter
bcc32 -c -S Sample4.c
bcc32 is the command that starts Borland C++; the -c option means compile only, without linking, and the -S option specifies that assembly-language source code should be generated.
As a result of compilation, an assembly language source code named Sample4.asm is generated in the current directory.
The extension of an assembly-language source file is usually .asm. Let's open Sample4.asm with an editor and take a look at its contents.
.386p
ifdef ??version
if ??version GT 500H
.mmx
endif
endif
model flat
ifndef ??version
?debug macro
endm
endif
?debug S "Sample4.c"
?debug T "Sample4.c"
_TEXT segment dword public use32 'CODE'
_TEXT ends
_DATA segment dword public use32 'DATA'
_DATA ends
_BSS segment dword public use32 'BSS'
_BSS ends
DGROUP group _BSS,_DATA
_TEXT segment dword public use32 'CODE'
_AddNum proc near
?live1@0:
; int AddNum(int a, int b) {
push ebp
mov ebp,esp
; return a + b;
@1:
mov eax,dword ptr [ebp+8]
add eax,dword ptr [ebp+12]
; }
@3:
@2:
pop ebp
ret
_AddNum endp
_MyFunc proc near
?live1@48:
; void MyFunc() {
push ebp
mov ebp,esp
; int c;
; c = AddNum(123, 456);
@4:
push 456
push 123
call _AddNum
add esp,8
; }
@5:
pop ebp
ret
_MyFunc endp
_TEXT ends
public _AddNum
public _MyFunc
?debug D "Sample4.c" 20343 45835
end
In this way, the compiler successfully converts the C language into assembly code.
Pseudo instructions that are not converted to native code
It may feel difficult for readers who see the assembly code for the first time, but it is actually relatively simple, and it may be even simpler than the C language. In order to facilitate reading the source code of the assembly code, there are several main points to pay attention to.
The source code of an assembly program consists of instructions that are converted into native code (the opcodes described later) and pseudo instructions addressed to the assembler. Pseudo instructions tell the assembler (the conversion program) about the structure of the program and how to assemble it; they are not themselves converted into native code. The following pseudo instructions are extracted from the program above:
_TEXT segment dword public use32 'CODE'
_TEXT ends
_DATA segment dword public use32 'DATA'
_DATA ends
_BSS segment dword public use32 'BSS'
_BSS ends
DGROUP group _BSS,_DATA
_AddNum proc near
_AddNum endp
_MyFunc proc near
_MyFunc endp
_TEXT ends
end
The part enclosed by the segment directive and ends gives a name to a collection of the instructions and data that make up the program; this is called a segment definition. The word segment means a region: in this program a segment definition is a grouping of instructions or data, and a program is made up of multiple segment definitions.
The beginning of the code above defines three segments named _TEXT, _DATA and _BSS. _TEXT is the segment definition for instructions, _DATA is the segment definition for data that has been initialized (has an initial value), and _BSS is the segment definition for data that has not been initialized. These names are a Borland C++ convention and are assigned automatically by the Borland C++ compiler, so the order of the segment definitions is always _TEXT, _DATA, _BSS, which also guarantees continuity in memory.
_TEXT segment dword public use32 'CODE'
_TEXT ends
_DATA segment dword public use32 'DATA'
_DATA ends
_BSS segment dword public use32 'BSS'
_BSS ends
A segment is used to distinguish or divide a range or area. The segment directive in assembly language indicates the beginning of a segment definition, and the ends directive indicates its end. A segment definition is a continuous memory space.
The group directive bundles the _BSS and _DATA segments into a group named DGROUP:
DGROUP group _BSS,_DATA
The _TEXT segment and _TEXT ends surrounding _AddNum and _MyFunc indicate that both belong to the _TEXT segment definition:
_TEXT segment dword public use32 'CODE'
_TEXT ends
As a result, even if instructions and data are mixed in the source code, they are translated into regular native code after compilation and assembly.
The parts enclosed by _AddNum proc and _AddNum endp, and by _MyFunc proc and _MyFunc endp, represent the ranges of the AddNum function and the MyFunc function, respectively.
_AddNum proc near
_AddNum endp
_MyFunc proc near
_MyFunc endp
After compilation, function names are prefixed with an underscore (_); this is a Borland C++ rule, so the AddNum function written in C is handled internally under the name _AddNum. The part enclosed by the proc and endp directives represents the range of a procedure; in assembly language, the counterpart of a C function is called a procedure.
The end directive at the end indicates the end of the source code.
The syntax of assembly language is opcode + operand.
In assembly language, one line corresponds to one CPU instruction. The syntactic structure of an assembly instruction is opcode + operand(s); some instructions consist of an opcode only, with no operand.
The opcode indicates the action of the instruction, and the operand indicates the object the instruction acts on. An opcode and its operands together read like an English sentence: the opcode is the verb and the operands are its objects. For example, in the sentence Give me money, Give is the opcode and me and money are the operands. When an assembly instruction has multiple operands, they are separated by commas, as in Give me,money.
Which opcodes are available is determined by the type of CPU. The functions of the main opcodes are summarized further below.
Native code must be loaded into memory before it can run; the instructions and data that make up the native code are stored in memory. When the program runs, the CPU reads those instructions and data from memory and processes them in the registers inside the CPU.
If you are not yet familiar with the relationship between the CPU and memory, please read the author's other article on the CPU hardware knowledge programmers need to know.
Registers are storage areas inside the CPU. They are used not only for temporary storage but also for computation. The main categories and roles of the x86-series registers are shown in the following figure.
Instruction parsing
Here is the analysis of the instructions in CPU
The most commonly used mov instruction
The most frequently used instruction is mov, which stores data into registers or memory. The two operands of the mov instruction specify the destination and the source of the data, respectively.
An operand can be a register, a constant, a label (attached to an address), or any of these enclosed in square brackets ([]). If the operand is not enclosed in square brackets, the value itself is processed; if it is enclosed in square brackets, the value is interpreted as a memory address, and the data at that memory address is read or written. Let's look at the code snippet below:
mov ebp,esp
mov eax,dword ptr [ebp+8]
In mov ebp,esp, the value in the esp register is stored directly into ebp; that is, if the value of the esp register is 100, then the value of the ebp register also becomes 100.
In the mov eax,dword ptr [ebp+8] instruction, the value of the ebp register plus 8 is interpreted as a memory address. If the value of the ebp register is 100, the eax register receives the value stored at address 100 + 8. dword ptr (double word pointer) simply means that 4 bytes of data are read from the specified memory address.
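As a rough C analogy (my own sketch, not something the compiler generates), the difference between the two operand forms can be pictured like this, with the registers modeled as plain variables and memory as a byte array:

#include <string.h>

unsigned char memory[256];   /* pretend this is main memory            */
unsigned int esp_reg = 100;  /* pretend contents of esp, ebp and eax   */
unsigned int ebp_reg;
unsigned int eax_reg;

void mov_sketch(void) {
    /* mov ebp,esp : no brackets, so the value itself is copied */
    ebp_reg = esp_reg;

    /* mov eax,dword ptr [ebp+8] : brackets, so ebp+8 is treated as an
       address and the 4 bytes stored there are read into eax */
    memcpy(&eax_reg, &memory[ebp_reg + 8], 4);
}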
Push and pop the stack
When a program runs, it is allocated a data space in memory called the stack. The stack is last-in-first-out: as data is stored it piles up from the bottom of the area (large addresses) toward the top (small addresses), and it is read back out from the top down.
The stack is an area for temporarily storing data; its distinguishing feature is that data is stored and read with the push and pop instructions. Storing data is called pushing onto the stack, and reading it out is called popping off the stack. On 32-bit x86 CPUs, one push or pop moves 32 bits (4 bytes) of data.
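To make the push/pop behavior concrete, here is a minimal sketch in C (an illustration under my own simplified assumptions: the stack is an array of 4-byte values and esp is an index that moves toward smaller addresses on push):

#include <stdio.h>

static unsigned int stack_area[64];  /* pretend stack memory                    */
static int esp = 64;                 /* stack pointer: starts past the high end */

static void push(unsigned int value) {
    esp -= 1;                        /* push: move esp toward smaller addresses */
    stack_area[esp] = value;         /* ...then store the value                 */
}

static unsigned int pop(void) {
    unsigned int value = stack_area[esp];  /* pop: read the value at esp        */
    esp += 1;                              /* ...then move esp back up          */
    return value;
}

int main(void) {
    push(456);
    push(123);
    printf("%u %u\n", pop(), pop()); /* prints "123 456": last in, first out    */
    return 0;
}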
The calling mechanism of function
Now let's analyze the function-calling mechanism, using the C code above as an example. We start with the assembly for the part where the MyFunc function calls the AddNum function; the stack plays a major role in a function call. Below is the assembly for the MyFunc function:
_MyFunc proc near
push ebp        ; store the value of the ebp register on the stack (1)
mov ebp,esp     ; store the value of the esp register in the ebp register (2)
push 456        ; push 456 onto the stack (3)
push 123        ; push 123 onto the stack (4)
call _AddNum    ; call the AddNum function (5)
add esp,8       ; add 8 to the value of the esp register (6)
pop ebp         ; read the value on the stack back into the ebp register (7)
ret             ; end the MyFunc function and return to the caller (8)
_MyFunc endp
The processing in (1), (2), (7) and (8) applies to every C function; we will explain it later when we look at what the AddNum function does. For now, pay attention to (3)-(6), which are essential to understanding the function-calling mechanism.
(3) and (4) push the parameters passed to the AddNum function onto the stack. Although the C source writes the call as AddNum(123, 456), the values go onto the stack in the order 456, 123; that is, the later argument is pushed first. This is a C-language convention.
(5) The call instruction makes the program flow jump to the address of the AddNum function. In assembly language, a function name stands for the memory address where the function is located. After the AddNum function finishes, the flow must come back to line (6); when the call instruction runs, the memory address of the line following the call (that is, line (6)), which is the address to return to after the call, is automatically pushed onto the stack. That value is popped off the stack by the ret instruction at the end of the AddNum function, and the flow returns to line (6).
(6) discards the two parameters (456 and 123) stored on the stack. This could be done with two pop instructions, but adding 8 to the esp register is more efficient (one instruction does it). When values are pushed to and popped from the stack, the unit is 4 bytes, so adding 8 (4 bytes x 2) to the esp register, which manages the stack address, has the same effect as running the pop instruction twice. The data actually remains in memory, but once the value of the esp register is moved back above the location where the data was stored, the data is effectively destroyed.
When I was compiling the Sample4.c file, I got the message in the following figure
The message says that the variable c is defined in MyFunc but never used. This is in fact a compiler optimization: because the variable c that stores the return value of AddNum is never used afterwards, the compiler judges it to be meaningless and does not generate the corresponding assembly code for it.
The following figure shows the change of stack memory before and after calling the function AddNum
Internal processing of function
Above, we analyzed the whole flow of Sample4.c through its assembly code. Now let's focus on the source code of the AddNum function and analyze how parameters are received and how the return value is returned.
_AddNum proc near
push ebp                      ; (1)
mov ebp,esp                   ; (2)
mov eax,dword ptr [ebp+8]     ; (3)
add eax,dword ptr [ebp+12]    ; (4)
pop ebp                       ; (5)
ret                           ; (6)
_AddNum endp
The value of the ebp register is put into the stack in (1) and out of the stack in (5), which is mainly to restore the contents of the ebp register used in the function to the state before the function call.
(2) assign the value of the esp register responsible for managing the stack address to the ebp register. This is because parameters in square brackets in mov instructions are not allowed to specify esp registers. Therefore, the method of reading and writing the contents of the stack not directly through esp, but with ebp registers is used here.
(3) uses [ebp+8] to refer to the first parameter, 123, stored on the stack, and reads it into the eax register. In this way the contents of the stack can be referenced without using the pop instruction. eax is chosen from among the registers because it is the accumulator, the register responsible for arithmetic.
(4) The add instruction adds the second parameter to the current value of the eax register and stores the result in eax; [ebp+12] refers to the second parameter, 456. In C, the return value of a function must be returned through the eax register; this, too, is a convention. In other words, function parameters are passed via the stack, and the return value is returned via a register.
(6) After the ret instruction runs, the return address stored on the stack is automatically popped, and the program flow jumps back to the line following call _AddNum in MyFunc (line (6)). The following figure shows how the state of the stack changes at the entry and exit of the AddNum function.
Global and local variables
Now that we are familiar with assembly language, let's look at global and local variables. Variables defined outside a function are called global variables, and variables defined inside a function are called local variables. Global variables can be used from any function, whereas local variables can only be used inside the function in which they are defined. Next, let's look at the difference between global and local variables through assembly language.
The following C language code defines local variables and global variables, and assigns values to each variable. Let's take a look at the source code first.
// define global variables with initial values
int a1 = 1;
int a2 = 2;
int a3 = 3;
int a4 = 4;
int a5 = 5;

// define global variables without initial values
int b1, b2, b3, b4, b5;

// define the function
void MyFunc() {
    // define local variables
    int c1, c2, c3, c4, c5, c6, c7, c8, c9, c10;

    // assign values to the local variables
    c1 = 1;
    c2 = 2;
    c3 = 3;
    c4 = 4;
    c5 = 5;
    c6 = 6;
    c7 = 7;
    c8 = 8;
    c9 = 9;
    c10 = 10;

    // assign the local variables to the global variables
    a1 = c1;
    a2 = c2;
    a3 = c3;
    a4 = c4;
    a5 = c5;
    b1 = c6;
    b2 = c7;
    b3 = c8;
    b4 = c9;
    b5 = c10;
}
The code above is brute-force, but that's fine: it makes the assembly easy to analyze. The assembly code produced by Borland C++ is shown below. The full output is fairly long, so we only use part of it here (the order of the segment definitions below has been rearranged and some comments deleted).
_DATA segment dword public use32 'DATA'
align 4
_a1 label dword
dd 1
align 4
_a2 label dword
dd 2
align 4
_a3 label dword
dd 3
align 4
_a4 label dword
dd 4
align 4
_a5 label dword
dd 5
_DATA ends
_BSS segment dword public use32 'BSS'
align 4
_b1 label dword
db 4 dup (?)
align 4
_b2 label dword
db 4 dup (?)
align 4
_b3 label dword
db 4 dup (?)
align 4
_b4 label dword
db 4 dup (?)
align 4
_b5 label dword
db 4 dup (?)
_BSS ends
_TEXT segment dword public use32 'CODE'
_MyFunc proc near
push ebp
mov ebp,esp
add esp,-20
push ebx
push esi
mov eax,1
mov edx,2
mov ecx,3
mov ebx,4
mov esi,5
mov dword ptr [ebp-4],6
mov dword ptr [ebp-8],7
mov dword ptr [ebp-12],8
mov dword ptr [ebp-16],9
mov dword ptr [ebp-20],10
mov dword ptr [_a1],eax
mov dword ptr [_a2],edx
mov dword ptr [_a3],ecx
mov dword ptr [_a4],ebx
mov dword ptr [_a5],esi
mov eax,dword ptr [ebp-4]
mov dword ptr [_b1],eax
mov edx,dword ptr [ebp-8]
mov dword ptr [_b2],edx
mov ecx,dword ptr [ebp-12]
mov dword ptr [_b3],ecx
mov eax,dword ptr [ebp-16]
mov dword ptr [_b4],eax
mov edx,dword ptr [ebp-20]
mov dword ptr [_b5],edx
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
_MyFunc endp
_TEXT ends
Compiled programs are classified into groups called segment definitions.
Initialized global variables are collected into a segment definition named _DATA:
_DATA segment dword public use32 'DATA'
...
_DATA ends
Uninitialized global variables are collected into a segment definition named _BSS:
_BSS segment dword public use32 'BSS'
...
_BSS ends
The assembly code enclosed by the segment definition named _TEXT (a name defined by Borland C++) is the code of the function:
_TEXT segment dword public use32 'CODE'
_MyFunc proc near
...
_MyFunc endp
_TEXT ends
Before we analyze the assembly code above, let's take a look at more assembly instructions. This table is a continuation of some of the above opcodes and their functions.
add A,B : adds the values of A and B and assigns the result to A
call A : calls the function A
cmp A,B : compares A and B; the result of the comparison is automatically stored in the flag register
inc A : adds 1 to the value of A
jge label : used together with cmp; jumps to the label line when the result is greater or equal
jl label : used together with cmp; jumps to the label line when the result is less
jle label : used together with cmp; jumps to the label line when the result is less or equal
jmp label : jumps to the label line unconditionally
mov A,B : assigns the value of B to A
pop A : reads a value from the stack and stores it in A
push A : stores the value of A on the stack
ret (no operand) : returns processing to the call source
xor A,B : performs an XOR of A and B and stores the result in A
Let's first look at the _DATA segment definition. The line _a1 label dword defines the label _a1.
A label represents a position relative to the start of its segment definition. Because _a1 is at the very beginning of the _DATA segment, its relative position is 0. _a1 corresponds to the global variable a1; prefixing compiled function and variable names with an underscore (_) is a Borland C++ rule. dd 1 means that 4 bytes of memory are allocated and the initial value 1 is stored there. dd (define double word) means two 2-byte words, i.e. 4 bytes.
In Borland C++ the int type is 4 bytes long, so the assembler converts int a1 = 1; into _a1 label dword followed by dd 1. Similarly, the labels _a2 to _a5, corresponding to the other global variables, are defined here, and their initial values 2 to 5 are stored in their respective 4-byte areas.
Next, the _BSS segment definition. The labels _b1 to _b5, corresponding to the global variables b1-b5, are defined here. db 4 dup (?) means that a 4-byte area is allocated but its value is not yet determined (the ? expresses this). db (define byte) denotes a 1-byte memory area, so db 4 dup (?) reserves 4 bytes.
Note: don't confuse db 4 dup (?) with dd 4. db 4 dup (?) is four 1-byte areas with undetermined values, whereas dd 4 is a single double-word (4-byte) area holding the value 4.
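As a quick sanity check of the sizes these directives correspond to, here is a small sketch of mine (assuming a typical 32-bit target where int is 4 bytes, as it is under Borland C++):

#include <stdio.h>

int main(void) {
    printf("sizeof(int)  = %u\n", (unsigned)sizeof(int));   /* 4 -> one dd, or db 4 dup (?) */
    printf("sizeof(char) = %u\n", (unsigned)sizeof(char));  /* 1 -> one db                  */
    return 0;
}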
Temporarily securing memory space for local variables
We know that local variables are stored temporarily in registers or on the stack. The stack is used to hold local variables inside a function; after the function call finishes, the values of those local variables are discarded, and the registers may be reused for other purposes. So local variables exist only temporarily, in registers or on the stack, while the function is being processed.
Why does the code above define 10 local variables? To show that local variables are placed not only on the stack but also in registers: to secure the space needed for c1-c10, registers are used while any are free, and the stack is used once there are not enough registers.
Let's continue analyzing the code. The _TEXT segment definition represents the scope of the MyFunc function. The memory needed by the local variables defined in MyFunc is allocated in registers as far as possible. You might think that using high-performance registers in place of ordinary memory is a waste of resources, but the compiler does not see it that way: as long as a register is free, the compiler will use it. Because registers are accessed much faster than memory, using them directly makes processing efficient. Putting local variables in registers is the result of the Borland C++ compiler's optimization.
The following part of the listing assigns local variables to registers:
mov eax,1
mov edx,2
mov ecx,3
mov ebx,4
mov esi,5
Simply defining a local variable is not enough; only when a local variable is assigned a value does it actually get allocated to a register or to memory. The code above corresponds to assigning the values 1 to 5 to the five local variables c1-c5. eax, edx, ecx, ebx and esi are names of 32-bit x86 registers; which register is used is up to the compiler.
Of the registers in an x86 CPU, a program can operate on only a dozen or so, and at any moment only a few of them are free. Therefore, when the number of local variables exceeds the number of available registers, the compiler falls back to the stack to store the remaining local variables.
In the code above, after registers have been allocated to the local variables c1-c5, no more registers are available, so the remaining five local variables c6-c10 are allocated space on the stack, as the following code shows.
mov dword ptr [ebp-4],6
mov dword ptr [ebp-8],7
mov dword ptr [ebp-12],8
mov dword ptr [ebp-16],9
mov dword ptr [ebp-20],10
The add esp,-20 at the function entry subtracts 20 from the value of the esp register (the stack pointer), which points at the stack data. This reserves, on the stack, the space needed for the five int local variables c6-c10 (4 bytes x 5 = 20 bytes).
The mov ebp,esp instruction assigns the value of the esp register to the ebp register. This is done so that, at the function exit, mov esp,ebp can restore the esp register to its original value, releasing the stack space that was reserved; the local variables that lived on the stack then disappear. This is the stack clean-up process. In the case of registers, local variables simply disappear when the registers are reused for other purposes, as shown in the following figure.
mov dword ptr [ebp-4],6
mov dword ptr [ebp-8],7
mov dword ptr [ebp-12],8
mov dword ptr [ebp-16],9
mov dword ptr [ebp-20],10
These five lines store values into the reserved stack space. Because the value of the esp register was saved into the ebp register by mov ebp,esp before the stack space was reserved, the forms [ebp-4], [ebp-8], [ebp-12], [ebp-16] and [ebp-20] can be used to split the 20 bytes of reserved stack memory into five 4-byte areas.
For example, mov dword ptr [ebp-4],6 stores the 4-byte value 6 into the first 4-byte slot of the reserved memory space, counted down from the location indicated by the ebp register.
Processing of loop control statements
Everything above was sequential processing, so now let's analyze loop flow and see how the flow control of C programs, such as the for loop and the if conditional branch, is realized. We will again look at the code and its compiled result to follow the program's control flow.
// define the MySub function
void MySub() {
    // do nothing
}

// define the MyFunc function
void MyFunc() {
    int i;
    for (i = 0; i < 10; i++) {
        // call MySub ten times
        MySub();
    }
}

The code above uses the local variable i as the loop counter and calls the MySub function ten times. The main part of the compiled assembly code is as follows:

xor ebx,ebx ; clear the register to 0
@4:
call _MySub ; call the MySub function
inc ebx ; add 1 to the value of the ebx register
cmp ebx,10 ; compare the value of the ebx register with 10
jl short @4 ; if it is less than 10, jump to @4

The for statement in C performs a loop by specifying, inside its parentheses, the initial value of the loop counter (i = 0), the loop continuation condition (i < 10) and the loop counter update (i++). The corresponding assembly code implements this with a comparison instruction (cmp) and a jump instruction (jl).

Let's walk through the code above. The only local variable used in MyFunc is i, and i is allocated to the ebx register. The i = 0 inside the for statement is converted into xor ebx,ebx. The xor instruction performs an XOR operation between its first and second operands and stores the result in the first operand.

Because both operands here are specified as ebx, this becomes an XOR of the same value: no matter what the register currently holds, the result is 0. We could get the same result with mov ebx,0, but the xor instruction is faster, and the compiler applies this optimization automatically.

XOR is the exclusive-or operation. Its rule is: if the two bits a and b differ, the result is 1; if they are the same, the result is 0. XOR of a value with itself is therefore always 0. For example, XORing 01010101 with 01010101 compares each bit position, and since every bit is the same, the result is 0.

After the ebx register has been initialized, the call instruction calls the _MySub function. After returning from _MySub, the inc ebx instruction adds 1 to the value of ebx, which corresponds to i++ (++ means adding 1 to the current value).

It helps here to know the difference between i++ and ++i:
i++ uses the current value first, and only afterwards performs the + 1 operation on i
++i performs the + 1 operation first, and only afterwards uses the resulting value

The cmp on the line after inc compares the values of its first and second operands. cmp ebx,10 corresponds to i < 10 in C: it compares the value of the ebx register with 10. In assembly language the result of a comparison is stored in the CPU's flag register, but a program cannot reference the flag register directly. So how is the comparison result used?

Assembly language has several jump instructions that decide whether to jump based on the value of the flag register. The jl on the last line, for example, decides whether to jump based on the flags set by cmp ebx,10; jl means jump on less than. So if i is smaller than 10, the flow jumps back to the instruction at @4 and continues.

To deepen understanding, the meaning of this assembly code can also be rewritten in C:

i ^= i;
L4:
MySub();
i++;
if (i < 10) goto L4;

The first line, i ^= i, is an XOR of i with itself. The MySub() call sits under the L4 label, i is then incremented, and as long as the value of i is smaller than 10 the code keeps jumping back to L4 and calling MySub().

The processing of conditional branches

Conditional branches are handled in much the same way as loops, again using the cmp instruction and jump instructions. Below is conditional-branch code written in C:

// define the MySub1 function
void MySub1() {
    // do nothing
}

// define the MySub2 function
void MySub2() {
    // do nothing
}

// define the MySub3 function
void MySub3() {
    // do nothing
}

// define the MyFunc function
void MyFunc() {
    int a = 123;
    // call different functions depending on the condition
    if (a > 100) {
        MySub1();
    } else if (a < 50) {
        MySub2();
    } else {
        MySub3();
    }
}
This is very simple C code implementing a conditional judgment. Compiling it with Borland C++ gives the following result:
_MyFunc proc near
push ebp
mov ebp,esp
mov eax,123        ; store 123 in the eax register
cmp eax,100        ; compare the value of the eax register with 100
jle short @8       ; if it is less than or equal to 100, jump to the @8 label
call _MySub1       ; call the MySub1 function
jmp short @11      ; jump to the @11 label
@8:
cmp eax,50         ; compare the value of the eax register with 50
jge short @10      ; if it is greater than or equal to 50, jump to the @10 label
call _MySub2       ; call the MySub2 function
jmp short @11      ; jump to the @11 label
@10:
call _MySub3       ; call the MySub3 function
@11:
pop ebp
ret
_MyFunc endp
The code above uses three kinds of jump instructions: jle (jump on less or equal), which jumps when the compared value is smaller or equal; jge (jump on greater or equal), which jumps when it is greater or equal; and jmp, which jumps unconditionally. Each conditional jump is preceded by a cmp instruction that performs the comparison; together they form the main logic of the assembly code above.
Comparing the assembly code above with the C source should give you a new appreciation of how the program actually runs. The knowledge gained from reading assembly also helps in understanding features of Java and other high-level languages: for example, methods marked with the native keyword in Java have their underlying implementation written in C, and some of Java's syntactic sugar only becomes apparent when you look at the lower-level code it compiles to. In some cases this also helps in tracking down the cause of a bug.
The programming methods we have learned above are all serial processing, so what are the characteristics of serial processing?
One of the biggest features of serial processing is to concentrate on only one thing and do another after one thing is done.
Computers support multithreading, and the core of multithreading is CPU switching, as shown in the following figure
Let's take a practical example. Let's look at a piece of code.
// define a global variable
int counter = 100;

// define MyFunc1()
void MyFunc1() {
    counter *= 2;
}

// define MyFunc2()
void MyFunc2() {
    counter *= 2;
}
The code above is a C program that updates the value of counter. Both MyFunc1() and MyFunc2() double the value of counter and assign the result back to counter. Suppose we use multiple threads and the MyFunc1 and MyFunc2 functions are called at roughly the same time. In that case the global variable counter should become 100 * 2 * 2 = 400; yet if you run multiple threads, you will sometimes find that counter ends up as 200. Without understanding how the program actually runs, it is hard to see why this happens.
Converting the code above into assembly language gives the following:
mov eax,dword ptr [_counter]    ; read the value of counter into the eax register
add eax,eax                     ; double the value of the eax register
mov dword ptr [_counter],eax    ; store the value of the eax register back into counter
In a multithreaded program, processing can be switched to another thread after any single line of the assembly code above. So suppose the MyFunc1 function has read the value 100 from counter but has not yet written the doubled value 200 back, and at that moment the MyFunc2 function also reads 100 from counter: the final result will then be 200 instead of 400.
To avoid this bug, we can lock the update so that thread switching is effectively forbidden for the duration of the function, or of the single C statement, that touches counter, or use some other thread-safe mechanism to protect it.
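As a sketch of that second option, here is how the update could be protected with a mutex using POSIX threads (my own illustration; the article's code does not use pthreads, and the lock name is hypothetical):

#include <pthread.h>

int counter = 100;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each doubling is done while holding the lock, so the read-modify-write
   sequence (read counter, double it, write it back) can no longer be
   interleaved with the other thread's sequence. */
void MyFunc1(void) {
    pthread_mutex_lock(&counter_lock);
    counter *= 2;
    pthread_mutex_unlock(&counter_lock);
}

void MyFunc2(void) {
    pthread_mutex_lock(&counter_lock);
    counter *= 2;
    pthread_mutex_unlock(&counter_lock);
}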
Almost nobody writes whole programs in assembly language these days, because developing in high-level languages such as C and Java is far more productive than writing assembly. Still, experience with assembly language remains valuable: with its help we can better understand how the computer actually runs our programs.
The above is the underlying computer knowledge that Java programmers must understand. If you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.