What are the functions of digital fingerprinting

2025-07-03 Update From: SLTechnology News&Howtos



When uploading large files, we usually support resumable uploads to improve the user experience. During the upload, the client splits the large file into chunks, uses the MD5 algorithm to compute the hash of each chunk, and then submits each chunk's content together with its hash to the server.

When the server receives the hash of a chunk, it first checks whether that hash already exists. If it does, the chunk has already been uploaded, and the server can return the number of bytes of the large file uploaded so far, so that the client continues with the remaining content.
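The flow above can be sketched roughly as follows. This is a minimal in-memory illustration, not a real upload protocol: the names `fingerprintChunks`, `uploadedHashes` and `resumeOffset` are hypothetical, the chunk size is tiny so the result is easy to follow, and the "server" is just a `Set`.

```javascript
const crypto = require("crypto");

// Hypothetical chunk size for illustration; real uploads usually use several MB.
const CHUNK_SIZE = 4;

// Split a Buffer into fixed-size chunks and compute the MD5 fingerprint of each.
function fingerprintChunks(fileBuffer, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < fileBuffer.length; offset += chunkSize) {
    const data = fileBuffer.subarray(offset, offset + chunkSize);
    const hash = crypto.createHash("md5").update(data).digest("hex");
    chunks.push({ offset, data, hash });
  }
  return chunks;
}

// In-memory stand-in for the server's record of chunks it has already received.
const uploadedHashes = new Set();

// Number of bytes the server already holds, counting from the start of the file.
function resumeOffset(chunks) {
  let bytes = 0;
  for (const chunk of chunks) {
    if (!uploadedHashes.has(chunk.hash)) break;
    bytes += chunk.data.length;
  }
  return bytes;
}

const file = Buffer.from("0123456789");
const chunks = fingerprintChunks(file);
uploadedHashes.add(chunks[0].hash); // pretend the first chunk was uploaded earlier
console.log(resumeOffset(chunks)); // bytes the client can skip when resuming
```

A real implementation would persist the hashes per file and per user, since two different files can contain identical chunks.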

The hash of a chunk can also be called its "digital fingerprint". So what is a "digital fingerprint"? To understand it, we first need to understand what a message digest algorithm is.

1. What is a message digest algorithm

Message digest algorithms are an important branch of cryptographic algorithms. By extracting a fingerprint from the data, they support digital signatures, data integrity checks and similar functions. Because they are irreversible, they are sometimes also used to protect sensitive information. A message digest algorithm is also known as a hash algorithm.

Any message processed by a hash function yields a hash value. The process is called "message digest", the hash value is called a "digital fingerprint", and the algorithm is naturally a "message digest algorithm". In practice, if two digital fingerprints match, the messages are almost certainly identical.

(Image source: https://zh.wikipedia.org/wiki/ hash function)

Message digest algorithms involve no key management or distribution, so they are well suited to distributed networks. They are mainly used in the field of "digital signatures", where they produce a digest of the plaintext. Well-known digest algorithms include MD5, designed by Ron Rivest of RSA, and SHA-1, along with a large number of variants.

1.1 Features of message digest algorithms

No matter how long the input message is, the length of the computed digest is always fixed. For example, MD5 produces a 128-bit digest, SHA-1 produces a 160-bit digest, and the SHA-2 family produces digests of 224, 256, 384 or 512 bits. It is generally believed that the longer the digest, the more secure the algorithm.

The message digest looks "random". Its bits appear randomly jumbled, which can be checked by feeding in a large number of inputs and comparing the outputs. The same input always produces the same output, while different inputs almost always produce different outputs, and the digests pass randomness tests.

The message digest function is a one-way function: it can only compute a digest from a message. It is not possible to recover the message from the digest, or even to learn anything about the original information from it.

With a good digest algorithm, nobody can find a "collision", or finding one is extremely difficult, although collisions are certain to exist (a collision is two different inputs that produce the same digest).
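The fixed-length and determinism properties are easy to check with the Node.js built-in crypto module. The following is a small sketch; `digestHex` is a helper name chosen here for illustration.

```javascript
const crypto = require("crypto");

// Hex digest of `input` under the named algorithm.
function digestHex(algorithm, input) {
  return crypto.createHash(algorithm).update(input).digest("hex");
}

const shortInput = "a";
const longInput = "a".repeat(100000);

for (const algorithm of ["md5", "sha1", "sha256"]) {
  const shortDigest = digestHex(algorithm, shortInput);
  const longDigest = digestHex(algorithm, longInput);
  // Each hex character encodes 4 bits, so both columns show the digest size in bits.
  console.log(algorithm, shortDigest.length * 4, longDigest.length * 4);
}

// The same input always gives the same digest.
console.log(digestHex("md5", shortInput) === digestHex("md5", shortInput));
```

Whatever the input length, each algorithm reports the same digest size for both inputs.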

1.2 Families of message digest algorithms

Message digest algorithms fall into three main families: MD (Message Digest), SHA (Secure Hash Algorithm) and MAC (Message Authentication Code).

The MD family includes MD2, MD4 and MD5. The SHA family includes its representative SHA-1 and the later SHA-2 series (SHA-224, SHA-256, SHA-384 and SHA-512). MAC algorithms combine a hash function from the two families above with a secret key, and include HmacMD5, HmacSHA1, HmacSHA256, HmacSHA384 and HmacSHA512.
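To make the MAC family more concrete, here is a minimal sketch of HmacSHA256 using `crypto.createHmac` from the Node.js crypto module. The keys `key-a` and `key-b` and the helper name `hmacSha256` are made up for the example.

```javascript
const crypto = require("crypto");

// HMAC-SHA256 of `message` under `key`, hex encoded.
function hmacSha256(key, message) {
  return crypto.createHmac("sha256", key).update(message).digest("hex");
}

const message = "The quick brown fox jumps over the lazy dog";
const tagA = hmacSha256("key-a", message); // hypothetical keys
const tagB = hmacSha256("key-b", message);

console.log(tagA.length * 4); // digest size in bits
console.log(tagA === tagB); // the same message under different keys gives different tags
```

Unlike a plain digest, the result can only be reproduced by someone who also holds the key.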

Although many message digest algorithms are listed above, they still cannot meet every application's needs. Building on these algorithms, the RipeMD series (RipeMD128, RipeMD160, RipeMD256, RipeMD320), Tiger, GOST3411 and Whirlpool were derived.

For most front-end developers, the algorithm they encounter most often is MD5. So next, Brother Po will focus on the MD5 algorithm.

2. What is the MD5 algorithm

MD5 (Message Digest Algorithm 5) was developed from MD2, MD3 and MD4 and was proposed by Ron Rivest (RSA) in 1992. It is widely used for data integrity checks, data (message) digests, data signatures and so on. MD2, MD4 and MD5 all produce a 16-byte (128-bit) digest, usually written as a 32-digit hexadecimal string. MD2 is slow but relatively secure. MD4 is very fast but less secure. MD5 is more secure than MD4, at the cost of being slightly slower.

As computing power has grown, more and more weaknesses have been exposed in MD5. A weakness was demonstrated in 1996, and for data that requires a high degree of security, experts generally recommend switching to other algorithms such as SHA-2. In 2004 it was shown that MD5 is not collision resistant, so it is unsuitable for security authentication such as SSL public key authentication or digital signatures.

2.1 MD5 features

Stable and fast operation.

Compressibility: the input can be of any length, and the output length is fixed (128 bits).

Irreversibility: knowing the result of the operation, the original string cannot be obtained by an inverse operation.

Highly discrete: small changes in input can lead to huge differences in calculation results.

2.2 MD5 hash

A 128-bit MD5 hash is usually written as a 32-digit hexadecimal number. The following is the MD5 hash of a 43-byte input consisting only of ASCII letters and spaces:

MD5 ("The quick brown fox jumps over the lazy dog") = 9e107d9d372bb6826bd81d3542a419d6

Even if you make a small change in the original text (such as changing dog to cog, changing only one character), the hash will change dramatically:

MD5 ("The quick brown fox jumps over the lazy cog") = 1055d3e698d289f2af8663725127bd4b
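The two hashes above can be reproduced with the Node.js crypto module. The sketch below also counts how many of the 128 output bits differ between them; `md5Hex` and `differingBits` are helper names chosen for this illustration.

```javascript
const crypto = require("crypto");

function md5Hex(text) {
  return crypto.createHash("md5").update(text).digest("hex");
}

// Count the bits that differ between two equal-length hex digests.
function differingBits(hexA, hexB) {
  const a = Buffer.from(hexA, "hex");
  const b = Buffer.from(hexB, "hex");
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let xor = a[i] ^ b[i];
    while (xor) {
      count += xor & 1;
      xor >>= 1;
    }
  }
  return count;
}

const dog = md5Hex("The quick brown fox jumps over the lazy dog");
const cog = md5Hex("The quick brown fox jumps over the lazy cog");
console.log(dog);
console.log(cog);
console.log(differingBits(dog, cog), "of 128 bits differ");
```

For a well-behaved hash function, roughly half of the output bits are expected to flip when a single input character changes.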

Here are a few more examples of MD5 hashes:

MD5 ("") -> d41d8cd98f00b204e9800998ecf8427e
MD5 ("semlinker") -> 688881f1c8aa6ffd3fcec471e0391e4d
MD5 ("kakuqo") -> e18c3c4dd05aef020946e6afbf9e04ef

3. Uses of the MD5 algorithm

3.1 Preventing tampering

3.1.1 Tamper-proof file distribution

When software installation packages are distributed on the Internet, they may be tampered with, for example by adding a Trojan to the installer. To guard against this, software developers usually use a message digest algorithm such as MD5 to generate a digital fingerprint of the file, so that the recipient can check the file's integrity with off-the-shelf tools.

(Image source: https://en.wikipedia.org/wiki/MD5)

Let's take a practical example. The following is the download page of MySQL Community Server version 8.0.19. The page lists the MD5 digital fingerprint of each software package, as shown in the following figure:

(Image source: https://dev.mysql.com/downloads/mysql/)

After downloading the installation package from the official website, users can run an MD5 verification tool on the downloaded file and compare the resulting fingerprint. If it matches the digital fingerprint published on the official website, the installation package has not been modified and can generally be installed with confidence.
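In Node.js, such a check might look like the sketch below. The function names `hashFile` and `verifyDownload` are hypothetical, and the file is streamed so that a large installer does not have to fit in memory.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Stream a file through a hash and resolve with the hex digest.
function hashFile(path, algorithm = "md5") {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash(algorithm);
    fs.createReadStream(path)
      .on("error", reject)
      .on("data", (chunk) => hash.update(chunk))
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Compare the computed fingerprint with the one published on the download page.
async function verifyDownload(path, publishedHash, algorithm = "md5") {
  const actual = await hashFile(path, algorithm);
  return actual === publishedHash.trim().toLowerCase();
}
```

It would be called with the path of the downloaded file and the fingerprint copied from the download page, for example `verifyDownload(installerPath, publishedMd5).then(console.log)`, where both arguments are placeholders supplied by the user.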

3.1.2 Tamper-proof message transmission

Suppose you need to send an electronic document to a friend over the network. Before sending it, you compute the MD5 of the document's contents to obtain its "digital fingerprint", and send the fingerprint to the other party along with the document. When the other party receives the document, they also hash its contents with MD5 and obtain a "digital fingerprint" of their own. If that fingerprint matches the one you sent, the document was not altered during transmission.

3.2 Information confidentiality

In the early days of the Internet, many websites stored users' passwords in the database in plaintext. This is a serious security risk: if the database is breached, the users' passwords are disclosed. One solution is to stop saving plaintext passwords and instead hash them with a message digest algorithm such as MD5, saving only the result to the database. This avoids storing passwords in clear text and improves the security of the system, but the scheme is still not secure, as we will analyze in detail later.

When a user logs in, the login system computes the MD5 hash of the password the user entered, and then authenticates the user by comparing it with the MD5 "digital fingerprint" stored for that user ID. If authentication passes, the user can log in normally. Storing user passwords as MD5 hashes has at least two benefits:

Protection against internal attacks: because passwords are not saved in clear text in the database, people with system administrator privileges cannot read users' passwords.

Protection against external attacks: if the website database is breached, the attacker obtains only the MD5 hashes of the passwords, not the plaintext passwords.

4. Examples of using the MD5 algorithm

In the Node.js environment, we can use the MD5 implementation provided by the built-in crypto module, or a mainstream third-party MD5 library such as md5, which runs on both the server and the client. Before using it, install the md5 library as follows:

$ npm install md5 --save

4.1 Example of using the crypto module

const crypto = require("crypto");
const msg = "Po Brother";
// Compute the hex-encoded MD5 digest of the input.
function md5(data) {
  const hash = crypto.createHash("md5");
  return hash.update(data).digest("hex");
}
console.log("Node.js Crypto MD5: " + msg + " -> " + md5(msg));

4.2 Example of using the md5 library

const md5 = require("md5");
const msg = "Po Brother";
console.log("MD5 Lib MD5: " + msg + " -> " + md5(msg));

After the above sample code runs, the console prints output of the following form. Both implementations produce the same hash; its exact value depends on the message string.

Node.js Crypto MD5: Po Brother -> 8eec7fcf817f7340b791b32ecdbed570
MD5 Lib MD5: Po Brother -> 8eec7fcf817f7340b791b32ecdbed570

5. Defects of the MD5 algorithm

A hash collision occurs when different inputs produce the same output. With a good hash algorithm, nobody should be able to find a collision, or finding one should be extremely difficult, although collisions must exist.

In 2005, Professor Wang Xiaoyun of Shandong University published an algorithm that can easily construct MD5 collisions. In 2007, building on her work, researchers abroad proposed the "chosen-prefix collision" construction for MD5, and open source libraries for constructing MD5 collisions followed one after another.

In 2009, Xie Tao and Feng Dengguo of the Chinese Academy of Sciences broke MD5's collision resistance with an attack of complexity only 2^20.96, which takes just a few seconds on an ordinary computer.

MD5 collisions are easy to construct, so verifying data integrity with MD5 is unreliable. Given that Google has also successfully constructed a collision for SHA-1 (Secure Hash Algorithm 1), SHA-256 or a stronger algorithm should be used for data integrity checks instead.

5.1 MD5 collision sample

Let's first take a look at a sample of MD5 collisions:

5.1.1 HEX (hexadecimal) sample A1

4dc968ff0ee35c209572d4777b721587 d36fa7b21bdc56b74a3dc0783e7b9518 afbfa200a8284bf36e8e4b55b35f4275 93d849676da0d1555d8360fb5f07fea2

5.1.2 HEX (hexadecimal) sample A2

4dc968ff0ee35c209572d4777b721587 d36fa7b21bdc56b74a3dc0783e7b9518 afbfa202a8284bf36e8e4b55b35f4275 93d849676da0d1d55d8360fb5f07fea2

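The two samples differ in only two bytes (highlighted in a figure in the original article): in the third group, afbfa200 becomes afbfa202, and in the fourth group, d155 becomes d1d5.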

5.2 Verifying the MD5 collision

Let's use Node.js to check whether sample A1 and sample A2 really produce the same MD5 output:

5.2.1 Set the sample data

const sample1 = `
4dc968ff0ee35c209572d4777b721587
d36fa7b21bdc56b74a3dc0783e7b9518
afbfa200a8284bf36e8e4b55b35f4275
93d849676da0d1555d8360fb5f07fea2
`;
const sample2 = `
4dc968ff0ee35c209572d4777b721587
d36fa7b21bdc56b74a3dc0783e7b9518
afbfa202a8284bf36e8e4b55b35f4275
93d849676da0d1d55d8360fb5f07fea2
`;

5.2.2 Define the getHashResult method

const crypto = require("crypto");
// Strip whitespace from the hex string, decode it to bytes, and return the MD5 digest.
function getHashResult(hexString) {
  const hash = crypto.createHash("md5");
  const buffer = Buffer.from(hexString.replace(/\s/g, ""), "hex");
  return hash.update(buffer).digest("hex");
}

5.2.3 Perform collision detection

const sample1Md5 = getHashResult(sample1);
const sample2Md5 = getHashResult(sample2);
// Use strict equality; a single "=" would assign instead of compare.
if (sample1Md5 === sample2Md5) {
  console.log(`MD5 collision occurs: ${sample1Md5}`);
} else {
  console.log("No MD5 collision");
}

After the above code runs successfully, the following results are output in the console:

MD5 collision occurs: 008ee33a9d58b51cfeb425b0959121c9

Other MD5 collision samples have been published for readers who want to explore further. Because verifying data integrity with MD5 is unreliable, Node.js publishes SHA-256 checksums for its release files.

(Image source: https://nodejs.org/download/release/v15.6.0/SHASUMS256.txt.asc)
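To illustrate why SHA-256 is preferred, the following sketch hashes samples A1 and A2 from section 5.1 with both algorithms. The helper name `digestOfHex` is chosen here for illustration.

```javascript
const crypto = require("crypto");

// The two colliding samples A1 and A2 from section 5.1, as hex.
const a1 =
  "4dc968ff0ee35c209572d4777b721587" +
  "d36fa7b21bdc56b74a3dc0783e7b9518" +
  "afbfa200a8284bf36e8e4b55b35f4275" +
  "93d849676da0d1555d8360fb5f07fea2";
const a2 =
  "4dc968ff0ee35c209572d4777b721587" +
  "d36fa7b21bdc56b74a3dc0783e7b9518" +
  "afbfa202a8284bf36e8e4b55b35f4275" +
  "93d849676da0d1d55d8360fb5f07fea2";

// Decode the hex string to bytes and hash it with the named algorithm.
function digestOfHex(algorithm, hexString) {
  return crypto.createHash(algorithm).update(Buffer.from(hexString, "hex")).digest("hex");
}

// The MD5 digests are identical even though the inputs differ.
console.log(digestOfHex("md5", a1) === digestOfHex("md5", a2));
// The SHA-256 digests of the same two inputs are different.
console.log(digestOfHex("sha256", a1) === digestOfHex("sha256", a2));
```

An integrity check based on SHA-256 would therefore tell these two inputs apart, while one based on MD5 would not.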

6. MD5 password security

6.1 Reverse lookup of MD5 hashes

We mentioned earlier that hashing users' passwords with MD5 improves the security of the system. In practice, though, the level of security is still low. Why? Because the same input always produces the same output. Let's look at an example. The string 123456789 is a commonly used password, and its MD5 hash is:

MD5 ("123456789")-> 25f9e794323b453885f5181f1b624d0b

Because the same input always produces the same result, an attacker can deduce the input from the hash. One common cracking method uses a rainbow table. A rainbow table is a precomputed table for inverting cryptographic hash functions, commonly used to crack password hashes. Such tables are typically used to recover fixed-length plaintext passwords drawn from a limited character set. This is a classic space-time tradeoff: a rainbow table uses less computation and more storage than a brute-force attack that computes a hash on every attempt, but less storage and more computation than a simple lookup table with one entry per hash.
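The idea of precomputation can be sketched with the simplest variant mentioned above, a plain lookup table with one entry per hash. Note that this is not a true rainbow table, which stores hash chains to save space. The password list and the name `reverseLookup` are made up for the example.

```javascript
const crypto = require("crypto");

function md5Hex(text) {
  return crypto.createHash("md5").update(text).digest("hex");
}

// Hypothetical short list of common passwords; real tables hold vastly more entries.
const commonPasswords = ["123456", "password", "123456789", "qwerty", "111111"];

// Precompute hash -> plaintext once; afterwards every lookup is a single Map access.
const reverseTable = new Map(commonPasswords.map((pwd) => [md5Hex(pwd), pwd]));

// Return the plaintext for a known hash, or null when the hash is not in the table.
function reverseLookup(hash) {
  return reverseTable.get(hash.toLowerCase()) ?? null;
}

console.log(reverseLookup("25f9e794323b453885f5181f1b624d0b"));
console.log(reverseLookup(md5Hex("a long, unusual passphrase")));
```

The first lookup succeeds because 123456789 is in the list; the second fails because the passphrase was never precomputed.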

Some sites on the Internet, such as cmd5.com, already offer reverse lookup services for MD5 hashes. We can do a simple check with the result of MD5 ("123456789"), as shown in the following figure:

Since 123456789 is a common password, it is not surprising that the site finds the right result. The cmd5 website describes its own service on its pages; interested readers can verify this for themselves.

Now we know that if users' passwords are the same, their MD5 values are the same too. Once a reverse lookup recovers the password behind one MD5 hash, every user with that password is affected. So how do we solve this problem? The answer is to salt the password.

6.2 Salting passwords

Salt, in cryptography, refers to a specific string inserted at a fixed position of the content to be hashed (for example, a password) before hashing. Adding such a string is called "salting". Its purpose is to make the salted hash differ from the unsalted one, which adds security in various application scenarios.

In most cases, the salt does not need to be kept secret. The salt can be a randomly generated string, and its insertion position can also be arbitrary. If the hash needs to be validated later (for example, to verify a password entered by the user), the salt that was used must be recorded. To make this easier to understand, let's look at a simple example.

6.2.1 Example of salting MD5 in Node.js

const crypto = require("crypto");
// Append the salt to the password, then print the MD5 hash of the combination.
function cryptPwd(password, salt) {
  const saltPassword = password + ":" + salt;
  console.log("original password: %s", password);
  console.log("password after salt: %s", saltPassword);
  const md5 = crypto.createHash("md5");
  const result = md5.update(saltPassword).digest("hex");
  console.log("MD5 value of salt password: %s", result);
}
cryptPwd("123456789", "exe");
cryptPwd("123456789", "eft");

After the above sample code runs, the following results are output in the console:

original password: 123456789
password after salt: 123456789:exe
MD5 value of salt password: 3328003d9f786897e0749f349af490ca
original password: 123456789
password after salt: 123456789:eft
MD5 value of salt password: 3c45dd21ba03e8216d56dce8fe5ebabf

These results show that although the original password is the same, different salt values produce very different MD5 hashes. To make cracking even harder, we can generate the salt randomly and increase its length.
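A randomly generated salt could be sketched as follows, keeping the same "password:salt" layout as the cryptPwd example. The names `saltedMd5`, `createRecord` and `verifyPassword` are hypothetical, and the 16-byte salt length is an assumption made for the illustration.

```javascript
const crypto = require("crypto");

// Hash `password` with `salt`, using the "password:salt" layout.
function saltedMd5(password, salt) {
  return crypto.createHash("md5").update(password + ":" + salt).digest("hex");
}

// Build the record to store for a new user: a random per-user salt plus the hash.
function createRecord(password) {
  const salt = crypto.randomBytes(16).toString("hex"); // 128-bit random salt
  return { salt, hash: saltedMd5(password, salt) };
}

// Re-hash the login attempt with the stored salt and compare with the stored hash.
function verifyPassword(attempt, record) {
  return saltedMd5(attempt, record.salt) === record.hash;
}

const alice = createRecord("123456789");
const bob = createRecord("123456789");
console.log(alice.hash === bob.hash); // same password, different salts
console.log(verifyPassword("123456789", alice)); // correct password
console.log(verifyPassword("123456", alice)); // wrong password
```

Because each user gets a different salt, two users with the same password no longer share a hash, and a precomputed table built without the salt is of no use.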

6.3 bcrypt

Hashing and salting do raise the cost for attackers, but today that is far from enough. We need a more secure way to store users' passwords, and that is bcrypt, which is widely used today.

bcrypt is a password hash function designed by Niels Provos and David Mazières on the basis of the Blowfish cipher and presented at USENIX in 1999. The bcrypt algorithm is designed specifically for hashing passwords, so it is deliberately slow, which reduces the number of passwords an attacker can try per second and so mitigates dictionary attacks. bcrypt includes a salt to defend against rainbow table attacks, and it is an adaptive function: its iteration count can be increased over time to resist brute-force cracking as computing power grows.

There is also a file encryption utility named bcrypt. Files encrypted with it can be transferred across all supported operating systems and processors. Its passphrase must be 8 to 56 characters long and is converted internally to a 448-bit key. All of the characters provided are significant, and the stronger the passphrase, the more secure your data will be.

Let's take bcryptjs on the Node.js platform as an example of how to use the bcrypt algorithm to handle users' passwords. First we need to install bcryptjs:

$ npm install bcryptjs --save

6.3.1 Use bcryptjs to process passwords

const bcrypt = require("bcryptjs");
const password = "123456789";
const saltRounds = 10;
// Generate a random salt with the given cost, then hash the string with it.
async function bcryptHash(str, saltRounds) {
  const salt = await bcrypt.genSalt(saltRounds);
  return bcrypt.hash(str, salt);
}
bcryptHash(password, saltRounds).then(console.log);

After the above sample code runs, a result like the following is output in the console:

$2a$10$O1SrEy3KsgN0NQdQjaSU6OxjxDo0jf.j/e2goSwSEu4esz9i58dRm

Clearly, the password 123456789 becomes an unreadable string after bcrypt hashing. Because the salt is generated randomly, each run produces a different string. This completes the first step, hashing the user's login password. The next step is to compare the login password, that is, to make sure that the user can log in normally after entering the correct password.
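The hash string is not really random noise: it packs the algorithm version, the cost factor, the salt and the hash into one value, which is why no separate salt column is needed. The following sketch splits such a string into its parts; `parseBcryptHash` is a hypothetical helper written for this illustration and is not part of bcryptjs.

```javascript
// Split a bcrypt hash string of the form $<version>$<cost>$<22-char salt><31-char hash>.
function parseBcryptHash(hashString) {
  const pattern = /^\$(2[abxy]?)\$(\d{2})\$([.\/A-Za-z0-9]{22})([.\/A-Za-z0-9]{31})$/;
  const match = pattern.exec(hashString);
  if (!match) {
    throw new Error("not a bcrypt hash string");
  }
  const [, version, cost, salt, hash] = match;
  return {
    version,
    cost: Number(cost),
    rounds: 2 ** Number(cost), // the work grows exponentially with the cost factor
    salt,
    hash,
  };
}

const parsed = parseBcryptHash("$2a$10$O1SrEy3KsgN0NQdQjaSU6OxjxDo0jf.j/e2goSwSEu4esz9i58dRm");
console.log(parsed);
```

With saltRounds set to 10, the cost field is 10, so each extra unit of cost doubles the work for both the server and an attacker.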

6.3.2 Use bcryptjs to verify passwords

// Check whether a plaintext string matches a bcrypt hash.
async function bcryptCompare(str, hashed) {
  return bcrypt.compare(str, hashed);
}
bcryptCompare("123456789", "$2a$10$O1SrEy3KsgN0NQdQjaSU6OxjxDo0jf.j/e2goSwSEu4esz9i58dRm").then(console.log);
bcryptCompare("123456", "$2a$10$O1SrEy3KsgN0NQdQjaSU6OxjxDo0jf.j/e2goSwSEu4esz9i58dRm").then(console.log);

After the above sample code runs, the following results are output in the console:

true
false

Because our original password is 123456789, the first call matches, while 123456 obviously does not, which gives the results above.
