How to use Redis distributed lock to make sure it is foolproof

2025-04-29 Update From: SLTechnology News&Howtos

This article explains how to use Redis distributed locks as safely as possible, walking through the common implementations and the problems each one runs into in practice.

I. Background

When shopping on e-commerce sites we often run into high-concurrency scenarios, such as flash sales in an e-commerce app, rushes for limited coupons, and Qunar's train-ticket grabbing system. What these scenarios have in common is a sudden surge in traffic. Even when the system is optimized with rate limiting, asynchronous processing, queuing, and similar techniques, overall concurrency is still several times higher than usual. To avoid concurrency bugs, prevent inventory from being oversold, and give users a good shopping experience, these systems rely on locking.

In a single-process concurrency scenario, you can use the locks provided by the programming language and its libraries, such as the synchronized keyword and the ReentrantLock class in Java, to avoid concurrency problems.
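
As a minimal illustration of the single-process case, the sketch below uses Python's threading.Lock as a stand-in for Java's synchronized or ReentrantLock. The Inventory class, the run_flash_sale helper, and the numbers are hypothetical and exist only for this example.

```python
import threading


class Inventory:
    """Hypothetical in-memory stock counter guarded by a process-local lock."""

    def __init__(self, stock: int) -> None:
        self.stock = stock
        self.sold = 0
        self._lock = threading.Lock()

    def buy(self) -> bool:
        # The check and the decrement must form one critical section;
        # without the lock two threads could both see stock == 1 and oversell.
        with self._lock:
            if self.stock <= 0:
                return False
            self.stock -= 1
            self.sold += 1
            return True


def run_flash_sale(stock: int, buyers: int) -> Inventory:
    """Let `buyers` threads compete for `stock` items and return the result."""
    inventory = Inventory(stock)
    threads = [threading.Thread(target=inventory.buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inventory
```

A lock like this only protects threads inside one process. Once the service runs on several hosts, the lock itself has to live in a store that all of them share.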

In a distributed scenario, however, a distributed lock is needed to synchronize access by threads on different clients to shared code and resources, so that shared data is handled safely.

So what is a distributed lock? A distributed lock is a lock implementation that coordinates access to a shared resource across a distributed system or across different systems. When different systems, or different hosts of the same system, share a resource, mutual exclusion is usually needed to keep them from interfering with one another and to preserve consistency.

A reasonably safe distributed lock generally needs the following properties:

Mutual exclusion. Mutual exclusion is the basic property of a lock: only one thread can hold the lock at a time and execute the critical section.

Timeout release. Releasing the lock on timeout avoids deadlock and prevents needless thread waits and wasted resources, similar to the innodb_lock_wait_timeout parameter of MySQL's InnoDB engine.

Reentrancy. A thread that already holds the lock can acquire it again, so nested calls neither block on themselves nor release the lock before the thread has finished the critical section.

High performance and high availability. Acquiring and releasing the lock should cost as little as possible, and the lock service must be highly available so that the distributed lock does not fail unexpectedly.

As you can see, implementing a distributed lock is not just a matter of locking a resource; it also has to satisfy these additional properties to avoid deadlock, lock failure, and similar problems.

II. Implementations of distributed locks

There are many ways to implement a distributed lock. The common ones are:

Memcached distributed lock

This uses Memcached's add command. The command is atomic and succeeds only if the key does not already exist; a successful add means the thread has acquired the lock.

ZooKeeper distributed lock

ZooKeeper's ephemeral sequential nodes are used to implement the distributed lock and its wait queue. As a framework designed specifically to support distributed applications, ZooKeeper offers some very useful features, such as automatic deletion of ephemeral znodes. ZooKeeper also provides a watch mechanism, which lets a client use the distributed lock much like a local lock: a failed lock attempt blocks until the lock is acquired.

Chubby

Chubby is a coarse-grained distributed lock service implemented by Google. It is somewhat similar to ZooKeeper, though there are many differences. Chubby uses a sequencer mechanism to solve the problem of locks failing because of delayed requests.

Redis distributed lock

A Redis-based distributed lock is implemented much like the Memcached one, using Redis's SETNX command. This command is also atomic, and the set succeeds only if the key does not exist. Redlock is a Redis-based distributed lock algorithm proposed by antirez, the author of Redis, as a safer and more robust mechanism intended to standardize how Redis distributed locks are implemented.

This article discusses several Redis-based implementations of distributed locks and the problems each one has.

III. Redis distributed locks

Using Redis as a distributed lock essentially means that one process occupies the only "pit" in Redis. When other processes try to occupy it and find that someone is already squatting there, they have to give up or wait and retry later.

There are currently two main types of Redis-based distributed lock: one built on a single Redis instance and one built on multiple Redis instances. Whichever is used, it must implement three core elements: locking, unlocking, and lock timeout.

1. Distributed lock based on a single Redis instance

1) Using the SETNX command

The simplest way to lock is to use Redis's SETNX command directly. It sets key to value only if key does not exist; if key already exists, SETNX does nothing. The key uniquely identifies the lock and can be named after the resource the business needs to lock.

For example, to lock an item in a mall's flash sale, the key can be set to lock_resource_id and the value to anything. Once the resource is no longer needed, release the lock by deleting the key with DEL. The whole process is: SETNX lock_resource_id value to acquire the lock, run the business logic, then DEL lock_resource_id to release it.
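
A minimal sketch of this flow, assuming `client` is a redis-py-style client exposing `setnx(name, value)` and `delete(name)`. The function name `with_setnx_lock` is hypothetical.

```python
from typing import Callable


def with_setnx_lock(client, key: str, critical_section: Callable[[], None]) -> bool:
    """Run critical_section under a bare SETNX lock.

    Returns False without running anything if the lock is already taken.
    `client` is assumed to behave like redis-py's `redis.Redis`.
    """
    # SETNX succeeds only when the key does not exist yet.
    if not client.setnx(key, "1"):
        return False
    try:
        critical_section()
    finally:
        # DEL releases the lock. try/finally covers ordinary exceptions, but
        # if the process is killed before this line the key stays forever,
        # because nothing here gives it an expiration time.
        client.delete(key)
    return True
```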

This way of acquiring a lock is very simple, but it has a problem: lock timeout, one of the three core elements mentioned above. If the process holding the lock hits an exception while running its business logic, the DEL command may never be executed, so the lock is never released and the resource stays locked forever.

Therefore, after acquiring the lock with SETNX, you must set an expiration time on the key, so that even if it is not explicitly released the lock is released automatically after a certain period and resources cannot be monopolized for long. Because SETNX cannot set an expiration time, an additional EXPIRE command is needed. The whole process is: SETNX to acquire the lock, EXPIRE to set its timeout, run the business logic, then DEL to release it.

A lock implemented this way still has a serious problem. SETNX and EXPIRE are two separate commands and are not atomic together: if a failure occurs between them, SETNX succeeds but EXPIRE never runs, and the lock becomes "immortal". This brings back the lock timeout problem described above, and other processes can no longer acquire the lock.
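
The gap between the two commands is easier to see in code. The sketch below assumes a redis-py-style `client` with `setnx(name, value)` and `expire(name, time)`; `acquire_two_step` is a hypothetical name.

```python
def acquire_two_step(client, key: str, ttl_seconds: int) -> bool:
    """Acquire with SETNX, then attach a TTL with a separate EXPIRE call.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    """
    if not client.setnx(key, "1"):
        return False
    # Danger window: if the process dies or the connection drops right here,
    # the key exists without a TTL and no other process can acquire the lock.
    client.expire(key, ttl_seconds)
    return True
```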

2) Using the extended SET command

To solve the non-atomicity of SETNX plus EXPIRE, we can use the extended options of Redis's SET command, which perform both steps atomically. The whole process is: SET lock_resource_id value NX EX 10 to acquire the lock, run the business logic, then DEL to release it.

In this SET command:

NX means the SET succeeds only if the key lock_resource_id does not exist. This guarantees that only the first client to request the lock acquires it, and no other client can acquire it until it is released.

EX 10 means the lock expires automatically after 10 seconds; the business can set this value to suit its needs.
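
The single-command acquisition can be sketched as follows, assuming a redis-py-style `client` whose `set` method accepts `nx` and `ex` keyword arguments. The function name `acquire` is hypothetical.

```python
def acquire(client, key: str, value: str, ttl_seconds: int = 10) -> bool:
    """Atomically create the lock key with a TTL: SET key value NX EX ttl.

    `client` is assumed to behave like redis-py's `redis.Redis`, whose
    `set(..., nx=True, ex=...)` returns True on success and None otherwise.
    """
    return bool(client.set(key, value, nx=True, ex=ttl_seconds))
```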

However, this approach still does not completely solve the lock timeout problem:

The lock is released early. If thread A takes too long between locking and unlocking (or is blocked during execution), the lock expires and is released automatically before thread A has finished the critical section. Thread B can then acquire the lock early, so the critical section is no longer executed strictly serially.

The lock is deleted by mistake. When thread A in the case above finishes, it does not know that the lock is now held by thread B, so it goes ahead and executes DEL to release the lock. If thread B has not finished its critical section, thread A has actually released thread B's lock.
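
Both failure modes can be replayed deterministically with a small in-memory stand-in for Redis. Everything in this sketch (TinyStore, its manually advanced clock, the 10-second TTL, the timestamps) is invented for illustration and does not talk to a real server.

```python
class TinyStore:
    """In-memory stand-in for one Redis node with a manually advanced clock."""

    def __init__(self) -> None:
        self.now = 0.0
        self._data = {}  # key -> (value, expires_at)

    def _purge(self, key: str) -> None:
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self.now:
            del self._data[key]

    def set_nx_ex(self, key: str, value: str, ttl: float) -> bool:
        self._purge(key)
        if key in self._data:
            return False
        self._data[key] = (value, self.now + ttl)
        return True

    def get(self, key: str):
        self._purge(key)
        entry = self._data.get(key)
        return entry[0] if entry else None

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def replay_timeout_scenario():
    """Replay the two failure modes and return a log of what happened."""
    store, key, log = TinyStore(), "lock_resource_id", []
    store.set_nx_ex(key, "A", ttl=10)          # t=0: thread A locks for 10 s
    store.now = 12                             # t=12: A is still working
    if store.set_nx_ex(key, "B", ttl=10):      # the key expired, so B gets in
        log.append("B acquired while A is still in the critical section")
    store.delete(key)                          # A finishes and issues a blind DEL
    if store.get(key) is None:
        log.append("A deleted the lock that B holds")
    return log
```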

To avoid these situations, it is recommended not to use Redis distributed locks where execution takes a long time. It is also safer to check the lock before executing DEL, to verify that the current lock is still your own.

Concretely, set the value to a unique random number (or thread ID) when locking, and when unlocking check that the value still matches before deleting the key. This ensures that a lock held by another thread is never released by mistake; the lock is then only ever removed by its owner or by the server when it expires. The whole process is: SET lock_resource_id random_value NX EX 10 to acquire the lock, run the business logic, then delete the key only if its value is still random_value.

But checking the value and deleting the key are two separate operations and are not atomic, so this step needs a Lua script, because Redis executes the commands in a Lua script atomically.
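
A minimal sketch of the token check combined with an atomic release, assuming a redis-py-style `client` whose `eval(script, numkeys, *keys_and_args)` runs a Lua script on the server. The names `RELEASE_SCRIPT`, `new_token`, and `release` are hypothetical.

```python
import uuid

# Compare-and-delete: remove the key only if it still holds our token.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""


def new_token() -> str:
    """Unique value identifying this lock holder (used as the lock's value)."""
    return uuid.uuid4().hex


def release(client, key: str, token: str) -> bool:
    """Release the lock only if `token` still owns it.

    `client` is assumed to behave like redis-py's `redis.Redis`; the GET and
    DEL inside the script execute as one atomic step on the server.
    """
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1
```

The token returned by new_token() is what the holder would pass as the value when acquiring the lock, so that only the same holder can pass the check on release.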

At this point the distributed lock based on a single Redis node is basically complete, but it is a relatively complete solution rather than a perfect one, because it still does not solve the problem of other threads entering after the lock expires while the current thread is still executing.

3) Distributed locks using Redisson

How can we solve the problem of the lock being released early?

One approach is to have the thread that acquired the lock start a timer daemon thread that runs every expireTime/3 and checks whether the lock still exists; if it does, it resets the lock's expiration to expireTime. In other words, the daemon thread "renews" the lock so that it is not released early by expiring.
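
A minimal sketch of such a renewal thread, assuming a redis-py-style `client` with `eval(script, numkeys, *keys_and_args)`. The names `RENEW_SCRIPT`, `renew_once`, and `start_watchdog` are hypothetical, and checking the owner's token before renewing is an addition of this sketch rather than something the description above requires.

```python
import threading

# Extend the TTL only if the key still holds our token (checked atomically).
RENEW_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('expire', KEYS[1], ARGV[2])
else
    return 0
end
"""


def renew_once(client, key: str, token: str, ttl_seconds: int) -> bool:
    """Reset the lock's TTL to ttl_seconds if we still own it."""
    return client.eval(RENEW_SCRIPT, 1, key, token, ttl_seconds) == 1


def start_watchdog(client, key: str, token: str, ttl_seconds: int) -> threading.Event:
    """Start a daemon thread that renews the lock every ttl_seconds / 3.

    Returns an Event; set it after releasing the lock to stop the thread.
    """
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(ttl_seconds / 3):
            if not renew_once(client, key, token, ttl_seconds):
                return  # lock lost or expired: nothing left to renew

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Renewal narrows the window rather than closing it: if the whole process stalls for longer than the TTL, the watchdog stalls with it and the lock can still expire.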

Of course, implementing this daemon thread in business code is fairly complex, and it may introduce problems of its own.

Redisson, an open-source framework widely used in production by Internet companies, solves this problem well. It is easy to use and supports a variety of deployment architectures, including single-instance Redis, Redis master-slave, Redis Sentinel, and Redis Cluster.

Interested readers can refer to the official documentation or source code:

https://github.com/redisson/redisson/wiki

[Figure: Redisson lock implementation principle, using a Redis cluster as an example]

2. Redlock: a distributed lock based on multiple Redis instances

All of the single-instance locks above share one problem: the lock lives on a single Redis node. Even if Redis is made highly available with Sentinel, Redis replication is asynchronous, so if the master fails after a client acquires the lock but before the key has been replicated, a thread on another client can still acquire the lock, and the lock's safety is lost.

The whole process is as follows:

Client A acquires the lock from the master node.

The master node fails before the key for the lock has been replicated to the slave node.

The slave is promoted to master, but the new master holds no lock data.

Client B sends its request to the new master and acquires the lock on the same resource.

Multiple clients now hold the lock on the same resource at the same time, which violates mutual exclusion.
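
The sequence above can be replayed with two in-memory stand-ins for the master and the slave. The Node class, the `pending` replication queue, and `replay_failover` are hypothetical and only model the ordering of events, not real Redis replication.

```python
class Node:
    """In-memory stand-in for one Redis node (no TTL handling, for brevity)."""

    def __init__(self) -> None:
        self.data = {}

    def set_nx(self, key: str, value: str) -> bool:
        if key in self.data:
            return False
        self.data[key] = value
        return True


def replay_failover():
    """Replay the failover steps and return who believes they hold the lock."""
    master, replica = Node(), Node()
    pending = []  # writes accepted by the master but not yet replicated
    holders = []

    # Client A acquires the lock on the master; replication is asynchronous,
    # so the write is only queued for the replica at this point.
    if master.set_nx("lock_resource_id", "A"):
        pending.append(("lock_resource_id", "A"))
        holders.append("A")

    # The master fails before the queued write reaches the replica.
    pending.clear()

    # The replica is promoted; it has never seen A's lock key.
    new_master = replica

    # Client B asks the new master for the same lock and succeeds.
    if new_master.set_nx("lock_resource_id", "B"):
        holders.append("B")
    return holders
```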

For this reason, antirez, the author of Redis, proposed the Redlock algorithm for implementing a distributed lock across multiple Redis nodes. It works roughly like this:

Suppose there are N (N >= 5) Redis nodes that are completely independent of one another, with no master-slave replication or other cluster coordination. Locks are acquired and released on each of these N nodes in the same way as on a single Redis instance.

To acquire the lock, the client does the following:

Get the current Unix time in milliseconds.

Try to acquire the lock from the five instances in sequence, using the same key and a unique value (for example, a UUID). When requesting the lock from each Redis instance, the client should set a network connection and response timeout that is much smaller than the lock's expiration time. For example, if the lock expires automatically after 10 seconds, the timeout should be between 5 and 50 milliseconds. This keeps the client from waiting on a Redis instance that is already down. If an instance does not respond within the timeout, the client should move on to the next Redis instance as soon as possible.

The client subtracts the start time (recorded in the first step) from the current time to get the time spent acquiring the lock. The lock is considered acquired if and only if it was obtained from a majority of the Redis nodes and the time spent is less than the lock's expiration time.

If the lock is acquired, the key's real validity time is its original validity time minus the time spent acquiring the lock (computed in the third step).

If acquisition fails for any reason (the lock was not obtained on at least N/2+1 Redis instances, or the time spent exceeded the validity time), the client should unlock on all Redis instances (using the Redis Lua script).
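
The acquisition steps above can be sketched as follows. This is a simplified illustration, not a reference implementation: it assumes each entry in `clients` is a redis-py-style client (with `set(..., nx=True, px=...)` and `eval`) already configured with a short socket timeout, it measures elapsed time with a monotonic clock instead of Unix time, and it omits the clock-drift allowance that the Redlock description on redis.io also subtracts. The names `redlock_acquire`, `release_all`, and `UNLOCK_SCRIPT` are hypothetical.

```python
import time
import uuid

UNLOCK_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""


def release_all(clients, key: str, token: str) -> None:
    """Send the compare-and-delete script to every node, ignoring node errors."""
    for client in clients:
        try:
            client.eval(UNLOCK_SCRIPT, 1, key, token)
        except Exception:
            pass  # an unreachable node drops the key when its TTL expires


def redlock_acquire(clients, key: str, ttl_ms: int):
    """Try to lock `key` on a majority of independent nodes.

    Returns (token, validity_ms) on success, or None on failure.
    """
    token = uuid.uuid4().hex
    start = time.monotonic()                        # record the start time
    acquired = 0
    for client in clients:                          # try every node in turn
        try:
            if client.set(key, token, nx=True, px=ttl_ms):
                acquired += 1
        except Exception:
            continue                                # node down or timed out
    elapsed_ms = (time.monotonic() - start) * 1000  # time spent acquiring
    validity_ms = ttl_ms - elapsed_ms               # remaining useful lifetime
    if acquired >= len(clients) // 2 + 1 and validity_ms > 0:
        return token, validity_ms
    release_all(clients, key, token)                # undo a partial acquisition
    return None
```

A caller would keep the returned token, finish its work within validity_ms, and then call release_all with the same token.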

Releasing the lock is relatively simple: the client sends a release request to all Redis nodes, including those on which locking failed. antirez emphasizes this point in the algorithm description. Why?

The reason is that the response sent back to the client after a node locks successfully may be lost. This can happen in an asynchronous communication model: requests from the client reach the server normally, but there is a problem in the opposite direction. From the client's point of view locking failed because the response timed out, but on the Redis node the SET command executed successfully, so the lock is held. Therefore, when releasing the lock, the client should also send a request to the Redis nodes on which acquisition appeared to fail.

In addition, to avoid losing locks when a Redis node crashes and restarts, antirez also proposes delayed restart: after a node crashes, do not restart it immediately, but wait for a period longer than the lock's validity time before restarting.

For a more in-depth study of Redlock, interested readers can refer to the official document: https://redis.io/topics/distlock

IV. Summary

Distributed system design is about balancing complexity against benefit: be as safe and reliable as possible while avoiding over-design. Redlock does provide a safer distributed lock, but at a cost: it requires more Redis nodes. In practice, a distributed lock on a single Redis instance meets most needs, and the occasional data inconsistency can be fixed by manual intervention. As the saying goes, "if the technology is not enough, we will do it manually."
