Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the hashing algorithm of the block chain?

2025-02-21 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)05/31 Report--

This article introduces the relevant knowledge of "what is the hashing algorithm of blockchain". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

1. Hash is an encryption algorithm

A hash function (Hash Function), also known as a hash or hash function. The hash function is a public function that maps a message M of any length to a short and fixed-length value H (M), which is called a hash value, a hash value (Hash Value), a hash value, or a message digest (Message Digest). It is an one-way cryptosystem, that is, an irreversible mapping from plaintext to ciphertext, with only encryption process and no decryption process.

Its function expression is: hull H (m)

Regardless of the digital format of the input and the size of the file, the output is a fixed-length bit string. Take the Sh356 algorithm used by Bitcoin as an example. No matter what data file the input is, the output is 256bit.

Each bit is a bit 0 or 1256bit is 256 strings of 0 or 1 binary digits. If expressed in hexadecimal numbers, how many bits are there?

16 equals 2 to the fourth power, so each hexadecimal number can represent 4 digits of bit. So, 256bit bit is represented as a hexadecimal number, and of course 256divided by 4 equals 64 bits.

So the hash you usually see looks like this:

00740f40257a13bf03b40f54a9fe398c79a664bb21cfa2870ab07888b21eeba8 .

This is a hash value randomly copied from btc.com. If you are not sure, you can count it. Is it 64 bits?

2. The characteristics of Hash function.

The Hash function has the following characteristics.

Easy to compress: for any size of input x, the length of the Hash value is very small, in practical application, the length of the Hash value generated by the function H is fixed.

Easy to calculate: for any given message, it is easier to calculate its hash value.

Unidirectional: for a given hash value, it is difficult to find the inverse of Hash that makes it computationally infeasible. Given a hash function H and a hash value H (M), M is not feasible in calculation. That is, the output from the hash cannot reverse the original value of the input. This is the basis for the security of hash functions.

Collision resistance: the ideal Hash function is collision-free, but it is difficult to do so in the design of practical algorithms.

There are two kinds of collision resistance: one is weak collision resistance, that is, it is not computationally feasible to find another message for a given message, and the other is strong collision resistance, that is, for any pair of different messages. makes it computationally infeasible.

* * High sensitivity: * * this is from a bit point of view, which means that a change in the input of 1 bit will cause a change in the bit of 1 bit. Any change in message M results in a change in the hash value H (M). That is, if the input is slightly different, the output after the hash operation must be different.

3. Hash algorithm

Convert URL A to the number 1. URL B, convert to the number 2.

A URL X, converted into the number N, according to the number N as the subscript, you can quickly find out the information of the URL X. The conversion process is the hashing algorithm.

For example, there are ten thousand songs here. I will give you a new song X and ask you to confirm whether this song is within those ten thousand songs.

There is no doubt that comparing ten thousand songs one by one is very slow. But if there is a way to condense the data of each ten thousand songs into one number (called a hash code) and get ten thousand numbers, then use the same algorithm to calculate the code of the new song X. see if the code of song X is in the previous ten thousand numbers, you can know whether song X is in those ten thousand songs.

As an example, if you are asked to organize those 10,000 songs, a simple hash algorithm is to use the number of bytes of the hard disk occupied by the song as a hash code. In this way, you can order 10,000 songs by size, and then meet a new song. Just see if the new song has the same number of bytes as one of the existing 10,000 songs. You'll know if the new song is within those 10,000 songs.

A reliable hashing algorithm should meet the following requirements:

For a given data M, it is easy to calculate the hash value XroomF (M).

It is difficult to inversely calculate M from X

It's hard to find M and N to make F (N) = F (M)

It is mentioned earlier that the hash function has the property of anti-collision, which means that someone can find one odd and one even to make the hash results consistent, but this is not feasible in calculation.

First of all, if the messages in large space are compressed into small space, there must be collisions. Suppose the hash length is fixed at 256 bits, and if the order is 1, 2, … 2 ^ 256 + 1, these 2 ^ 256 + 1 input values, calculate their hashes one by one, it is certain that you can find two input values so that their hashes are the same. But don't be happy too soon, because you have to have time to figure it out, it's yours.

According to the birthday paradox, if 2 ^ 128 + 1 inputs are randomly selected, there is a 99.8% chance of finding at least one pair of collision inputs. Then for a hash function with a hash value of 256 bits, it takes an average of 2 ^ 128 hashes to find the collision pair. If the computer performs 10000 hash calculations per second, it will take about 10 ^ 27 years to complete 2 ^ 128 hash calculations. In the block chain, the collision resistance of the hash function is used to verify the integrity of blocks and transactions, which can be identified as soon as there is tampering.

It is mentioned earlier that mining requires miners to obtain values less than a given difficulty value through continuous calculation of random numbers. * * difficulty * * is an important reference index for miners when digging, which determines how many hashes miners need to go through before they can produce a legal block. Bitcoin chunks are generated about every 10 minutes, and in order for new chunks to be generated at a basic rate, the difficulty value must be adjusted according to changes in computing power across the network.

The hash function adjusts the difficulty value to ensure that the digging time of each block is about 10 minutes. The difficulty value calculated by the hash function is of great significance to ensure the safety of the block chain system. As several computer scientists in the United States wrote in their book: "Hash ciphers are Swiss Army knives in cryptography, and they have found a place in many unique applications, in order to ensure security." different applications will require different characteristics of hash functions. Facts have proved that it is extremely difficult to determine a series of hash functions to fully achieve provable security. "

Workload proof requires a target value. The formula for calculating the target value (target) of bitcoin workload proof is as follows:

Target value = maximum target value / difficulty value

Where the maximum target value is a constant value:

0x00000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

* * the target value is inversely proportional to the difficulty value. * * the proof of bitcoin workload is that the hash value of the block calculated by the miner must be less than the target value.

We can also simply understand that the process of proof of bitcoin workload is the process of constantly changing block blocks (that is, trying different random values) as input to perform SHA256 hash operations to find a hash value of a specific format (that is, a certain number of leading zeros are required). The more leading zeros are required, the more difficult it is.

4. Give a chestnut to help understand

▌ scene 1, Xiao Xing and Ah stay on the basketball court

Xiao Xing: dork, are you thirsty? would you like to buy some water and buy me a bottle by the way?

Dork: Oh, I don't know your little mind. You're thirsty yourself. You go, I won't go.

Xiao Xing: Hey, let's not talk about these useless things, let's flip a coin, okay? heads you go, tails I go, fair, how about it?

Nerd: all right.

.

▌ scene 2. Xiao Xing is chatting with a fool in real time.

Dork: Xiao Xing, come to my house to play today. On the way here, there is a pizzeria. It's very delicious. By the way, bring some.

Xiao Xing: Oh, why don't you come to my house and bring pizza.

Nerd: Xiao Xing, I can't believe you said that. It seems that we can only flip a coin.

Xiao Xing: damn it, how do I throw this? how do I know if you're up to something?

Nerd: well, that's true, too. How about this.

1. Consider encrypting the result

Nerd: I think of a number, suppose it is A, and then An is multiplied by a number B to get the result C. An is my key. I'll tell you the result C. You guess whether An is odd or even. If you get it right, you win.

Xiao Xing: I can't do that. If you tell me C is 12, I guess An is odd. You can say An is 4 and B is 3. I guess An is an even number. You can say that An is 3 and B is 4. Why don't you tell me what C is, and tell me what B is.

Nerd: then this won't work. Telling you C and B doesn't mean telling you how much An is, and guessing nothing. No, we have to do it another way.

two。 Irreversible encryption

Dork: Xiao Xing, do you think this is all right? I think an A goes through the following process:

1.A+123=B 2.B ^ 2 = C 3. Take the 2 ~ 4 digits in C to form a 3-digit D 4.D/12 to calculate the remainder, and get E.

Nerd: I'll tell you E and the above calculation. Guess whether An is odd or even, and then I'll tell you what An is. You can follow the above calculation process to verify whether I am lying or not.

Xiao Xing: well, let's see, if the A you think is 5, then:

5 "123" 128 128 ^ 2 = 16384 Duan 638 E=638mod12=53

(mod represents the remainder of division)

Xiao Xing: why, that's amazing. A value of A corresponds to a unique value of E, and A can't be calculated according to E. You're such a bitch. Well, that's fair. Anyone who lies can be identified.

Xiao Xing: dork, ask the question.

This encryption method in which some information is lost is called "one-way encryption", also known as the hash algorithm.

5. Common hash algorithm

1. SHA-1 algorithm

The input of SHA-1 is a message with a maximum length of less than 264 bits, the input message is processed in 512-bit packets, and the output is a 160-bit message digest. SHA-1 has the advantages of high speed, easy to implement and wide range of applications. Its algorithm is described as follows.

* * populate the input message: * * after filling, the length module 512 of the message should be congruent with 448. The way to fill it is that the first place is 1 and the rest are 0. The length of the message before it is populated is appended in big-endian to the last 64 bits left in the previous step. This step is necessary, even if the length of the message is already the desired length. The length of the fill ranges from 1 to 512.

Initialization buffer: you can use 160bits to store the initial variable, intermediate summary, and final summary of the Hash function, but you must first initialize and assign a value to each 32-bit initial variable, that is:

Enter the message processing main cycle and process the message block: process the 512-bit message block at a time for a total of 4 rounds of 20 operations per round, as shown in the figure. These four rounds of processing have a similar structure, but the auxiliary functions and constants used in each round are different. The input of each round is the current values A, B, C, D, E of the currently processed message packet and buffer, and the output is still placed in the buffer to replace the old values of A, B, C, D, E. The output of the fourth round is then added to the input CVq of the first round to generate the CVq+1, wherein the addition is the addition of each word in the buffer five-word CVq to the corresponding font module 232in.

Figure the processing flow of a single 512-bit message block

Output: after all the message packets have been processed, the output of the last packet is the resulting message summary value.

The step function of SHA-1 is shown in the figure. It is the most important function of SHA-1 and the most critical part of SHA-1.

The step function of graph SHA-1

Every time SHA-1 runs the step function, the values of A, B, C and D are assigned to the registers B, C, D and E in turn. At the same time, the input values, constants and sub-message blocks of A, B, C, D and E are assigned to An after the step function operation.

Where t is the number of steps, 0 ≤ t ≤ 79 Wt is a 32-bit word derived from the current 512-bit packet, and Kt is the additive constant.

The input of the basic logic function f is 3 32-bit words and the output is a 32-bit word, and its function is expressed as follows.

For the message packet wt exported from each input packet, the first 16 message words wt (0 ≤ t ≤ 15) are the 16 32-bit words corresponding to the message input packet, and the rest wt (0 ≤ t ≤ 79) can be obtained according to the following formula:

Where ROTLs represents the left cyclic shift of s bits, as shown in the figure.

The generation process of 80 message words in figure SHA-1

2. SHA-2 algorithm

For SHA-2 series Hash algorithm, the output length of SHA-2 series hash algorithm can be 224bit, 256bit, 384bit and 512bit, corresponding to SHA-224, SHA-256, SHA-384 and SHA-512 respectively. It also includes two other algorithms: SHA-512/224 and SHA-512/256. It has stronger security strength and more flexible output length than the previous Hash algorithm, in which SHA-256 is a commonly used algorithm. The first four algorithms are briefly described below.

SHA-256 algorithm

The input of SHA-256 algorithm is a message with a maximum length of less than 264 bits, and the output is a 256-bit message digest. The input message is processed in 512-bit packets. The algorithm is described below.

(1) message filling

Add a "1" and several "zeros" to make its length module 512 congruent with 448. Appends a 64-bit length block to the message, whose value is the length of the message before it is populated. As a result, a message packet with a length of 512 integer multiples is generated, and the length of the populated message is up to 264 bits.

(2) initialize link variables

The intermediate and final results of the linked variables are stored in a 256-bit buffer represented by eight 32-bit registers A, B, C, D, E, F, G, and H. the output is still placed in the buffer to replace the old A, B, C, D, E, F, G, H. The first step is to initialize the link variable, which is stored in eight registers A, B, C, D, E, F, G, and H.

The initial link variable is the first 32 bits of the decimal part of the square root of the first eight primes (2, 3, 5, 7, 11, 13, 17, 19).

(3) deal with the main cycle module

The message block is processed in 512-bit packets, with a 64-step loop (as shown in the figure). The input of each round is the message packet of the current processing and the 256-bit buffer A, B, C, D, E, F, G, H of the previous round output. Different message words and constants are used in each step, and how to get them is given below.

Compression function of graph SHA-256

(4) get the final hash value.

After all the 512-bit message block packets are processed, the result of the last packet processing is the final output 256-bit message digest.

Step function is not only the most important function in SHA-256, but also the most critical component in SHA-256. The operation process is shown in the figure.

The step function of graph SHA-256

According to the values of T1 and T2, the registers An and E are updated. The input values of A, B, C, E, F and G are assigned to B, C, D, F, G and H.

The method of obtaining Kt is to take the first 64 primes (2, 3, 5, 5, 7,... The decimal part of the cube root, convert it to binary, and then take the first 64 bits of the 64 numbers as Kt Its function is to provide a 64-bit random string set to eliminate any regularity in the input data.

For the message packet Wt exported by each input packet, the first 16 message words Wt (0 ≤ t ≤ 15) are directly based on the 16 32-bit words corresponding to the message input packet, and the rest are calculated according to the following formula:

The process of generating 64 message words in figure SHA-256

SHA-512 algorithm

SHA-512 is an algorithm with high security performance in SHA-2, which is mainly composed of plaintext filling, message spread function transformation and random number transformation. The initial value and intermediate calculation result are composed of 8 64-bit shift registers. The maximum input length of the algorithm is 2128 bits, and a message digest is generated. The input message is divided into several 1024-bit blocks for processing. The specific parameters are as follows: message digest length is 512 bits; message length is less than 2128 bits; message block size is 1024 bits; message word size is 64 bits; the number of steps is 80 steps. The following figure shows the whole process of processing the message and outputting the message digest, and the specific steps of the process are as follows.

The overall structure of the figure SHA-512

Message padding: fill a "1" and several "zeros" so that the length module 1024 is congruent with 896, and the number of filled bits is 0-1023. The length of the pre-populated message is appended to the end of the populated message with a 128-bit field, and its value is the length of the pre-populated message.

Link variable initialization: the intermediate and final results of the link variable are stored in a 512-bit buffer, which is represented by eight 64-bit registers A, B, C, D, E, F, G, H. The initial link variable is also stored in eight registers A, B, C, D, E, F, G, H, with values of:

The initial link variable is stored in big-endian mode, that is, the most significant bytes of the word are stored in the low address location. The initial link variable is taken from the first 64 bits of the binary representation of the decimal part of the square root of the first eight primes.

Main loop operation: the message is processed in 1024-bit packets, with 80 steps of loop operation. In each iteration, the values of 512-bit buffer A, B, C, D, E, F, G, H are taken as inputs, which are taken from the results of the previous iteration compression, and different message words and constants are used in each step.

Calculate the final hash value: after all the N 1024-bit packets of the message have been processed, the link variable of the Nth iteration compression output is the final hash value.

Step function is the most important part of SHA-512, and its operation process is similar to SHA-256. The calculation equation of each step is as follows. The updated values of B, C, D, F, G and H are the input state values of A, B, C, E, F and G, respectively, and two temporary variables are generated to update the An and E registers.

For each step t in the 80-step operation, a 64-bit message word Wt is used, whose value is derived from the 1024-bit message packet Mi currently being processed, as shown in the figure. The first 16 message words Wt (0 ≤ t ≤ 15) correspond to the 16 32-bit words after the message input packet, and the rest are calculated according to the following formula:

The process of generating 80 message words in figure SHA-512

Among them

In the formula, ROTRn (X) means to move n bits to the right of the 64-bit variable x, and SHRn (X) means to move n bits to the right of the 64-bit variable x.

As can be seen from the figure, in the first 16 steps, the value of Wt is equal to the corresponding 64-bit words in the message packet, while in the remaining 64 steps, the value is calculated from the previous four values, and two of the four values need to be shifted and cyclically shifted.

The method of obtaining Kt is to take the first 80 primes (2, 3, 5, 5, 7,... The decimal part of the cube root, converts it to binary, and then takes the first 64 bits of the 80 numbers as Kt, which provides a set of 64-bit random strings to eliminate any regularity in the input data

SHA-224 and SHA-384

SHA-256 and SHA-512 are very new Hash functions, the former defining a word as 32-bit and the latter defining a word as 64-bit. In fact, the structure of the two is the same, but there are differences in the number of cycles and usage constants. SHA-224 and SHA-384 are the truncated forms of the above two Hash functions, which are calculated with different initial values.

The input message length of SHA-224 is the same as that of SHA-256, which is also less than 264 bits, and the packet size is also 512 bits. Its processing flow is basically the same as that of SHA-256, but there are two differences as follows.

The message digest of SHA-224 is taken from the bit words of seven registers A, B, C, D, E, F and G, while the message digest of SHA-256 is taken from 32 bits of eight registers A, B, C, D, E, F, G and H.

The initial link variable of SHA-224 is different from that of SHA-256 in that it is stored in a high-end format, but the initial link variable is obtained by taking the second 32 bits of the decimal part of the square root of the first 9 to 16 primes (23, 29, 31, 37, 41, 43, 47, 53). The initial link variable of SHA-224 is as follows:

The detailed calculation steps of SHA-224 are the same as those of SHA-256.

The input message length of SHA-384 is the same as that of SHA-512, but also less than 2128 bits, and its packet size is also 1024 bits. The processing flow is basically the same as SHA-512, but there are two differences as follows.

The 384-bit message digest of SHA-384 is taken from 6 64-bit words of A, B, C, D, E, F, while the message digest of SHA-512 is taken from 8 64-bit words of A, B, C, D, E, F, G and H.

SHA-384 's initial link variable is different from SHA-512 's initial link variable. It is also stored in high-end format, but its initial link variable is obtained by taking the first 64 digits of the decimal part of the square root of the first 9 to 16 primes (23, 29, 31, 37, 41, 43, 47, 53). The initial link variable of SHA-384 is as follows:

The detailed calculation steps for SHA-384 are the same as those for SHA-512.

3. SHA-3 algorithm

The SHA-3 algorithm adopts Sponge structure as a whole, which is divided into two stages: absorption and extraction. The core permutation f of SHA-3 acts on a 5 × 5 × 64 three-dimensional matrix. There are 24 rounds in the whole f, and each round includes five links: θ, ρ, π, χ, τ. The five links of the algorithm act on different dimensions of the three-dimensional matrix respectively. The θ link is a linear operation acting on the column; the ρ link is a linear operation acting on each channel; the 64 bits on each channel are cyclically shifted; the π link is a linear operation that moves the elements on each channel as a whole to another channel; the χ link is a nonlinear operation acting on each row, which is equivalent to replacing 5 bits on each line with another 5 bits; and the τ link is an additive constant link.

At present, the security analysis of SHA-3 algorithm in public literature is mainly carried out from the following aspects.

Collision attack, primary image attack and second primary image attack on SHA-3 algorithm.

For the analysis of core permutation of SHA-3 algorithm, this kind of analysis mainly focuses on the distinction between algorithm permutation and random permutation.

Based on the expansion of the difference characteristics of SHA-3 algorithm, the high probability difference chain of SHA-3 permutation is studied, and the difference divider is constructed.

The stereo encryption idea and sponge structure of Keccak algorithm make SHA-3 superior to SHA-2 or even AES. The Sponge function establishes a mapping from an arbitrary length input to an arbitrary length output.

4. RIPEMD160 algorithm

RIPEMD (RACE Integrity Primitives Evaluation Message Digest), that is, the RACE raw integrity check message digest. RIPEMD uses the design principle of MD4 and improves on the algorithm defects of MD4. RIPEMD-128 was first released in 1996, and its performance is similar to that of SHA-1.

RIPEMD-160 is an improvement on RIPEMD-128 and is the most common version of RIPEMD. RIPEMD-160 outputs 160bit hash value, and the violent collision search attack on 160bit Hash function needs 280 calculations, and its computational strength is greatly improved. The design of RIPEMD-160 fully absorbs some properties of MD4, MD5 and RIPEMD-128, so that it has better anti-strong collision ability. It is designed to replace the 128bit Hash functions MD4, MD5, and RIPEMD.

RIPEMD-160 uses a 160-bit cache to store the intermediate results of the algorithm and the final hash value. The cache area consists of five 32-bit registers A, B, C, D and E. The initial value of the register is as follows:

Data is stored in the form of low-order bytes stored on a low address.

The core of the processing algorithm is a compression function module with 10 loops, each of which consists of 16 processing steps. Different original logic functions are used in each loop, and the processing of the algorithm is divided into two different cases, in which five original logic functions are used in the opposite order. Each loop takes the message word of the current packet and 160-bit cache values A, B, C, D, E as input to get a new value. Each loop uses an extra constant, and at the end of the last loop, the calculated results A, B, C, D, E and A', B', C', D', E' and the initial values of linked variables are added to produce the final output. After all the 512-bit packet processing is completed, the resulting 160-bit output is the message digest.

In addition to the 128bit and 160bit versions, there are also 256bit and 320bit versions of the RIPEMD algorithm, which together constitute four members of the RIPEMD family: RIPEMD-128, RIPEMD-160, RIPEMD-256, and RIPEMD-320. The security of the 128bit version has been questioned, and the 256bit and 320bit versions reduce the possibility of accidental collision, but compared with RIPEMD-128 and RIPEMD-160, they do not have a higher level of security, because they only modify the initial parameters and s-box to achieve the output of 256bit and 320bit.

This is the end of the content of "what is the hashing algorithm of blockchain". Thank you for your reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report