Exploring Redis Design and Implementation 15: The Evolutionary History of Redis Distributed Locks


2026-02-10 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article is from the Internet.

This series of articles is collected in my Java interview guide repository on GitHub, where you can find more content:

https://github.com/h3pl/Java-Tutorial

If you find it useful, please give it a Star.

This article was first published on my personal blog:

www.how2playlife.com

This article is part of the series "Exploring Redis Design and Implementation" from the WeChat official account [Java Technology jianghu]. Some of its content comes from the Internet: to explain the topic clearly and thoroughly, I have drawn on and quoted a number of technical blog posts that I consider good. If anything infringes your rights, please contact the author.

This series takes you from the basics to advanced topics: the basic usage of Redis, its basic data structures, and some advanced usage, along with a deeper look at the underlying data structures. It then covers master-slave replication, clustering, distributed locks and related topics, as well as usage notes and precautions for Redis as a cache, so that you gain a more complete understanding of the Redis technology stack and build your own knowledge framework.

If you have any suggestions or questions about this series of articles, you can also follow the official account [Java Technology jianghu] to contact the author. You are welcome to participate in the creation and revision of this series of blog posts.

Evolutionary History of Redis distributed Lock

Over the past two years microservices have become increasingly popular, and more and more applications are deployed in distributed environments, where data consistency is a problem that always needs attention. Distributed locks have therefore become a widely used technique. The common implementations are based on Redis or ZooKeeper, with Redis-based locks being the more widely used.

However, the Redis distributed lock implementations I have seen at work and online all contain some inaccuracy, in the design or in the code. A distributed lock that is used incorrectly can cause serious failures in production. This article collects the distributed lock versions I have encountered so far together with their defects, and offers some suggestions on how to choose a suitable Redis distributed lock.

Various versions of Redis distributed lock

V1.0

tryLock() {
    SETNX Key 1
    EXPIRE Key Seconds
}
release() {
    DELETE Key
}

This is probably the simplest version, and also one that appears frequently. The lock is given an expiration time so that it cannot stay held forever if the service restarts or an exception prevents the application from releasing it.

One problem with this scheme is that the two commands are sent as separate Redis requests: if the application fails or restarts after the first command has executed, the lock never expires. One improvement is to wrap SETNX and EXPIRE in a Lua script. However, if Redis crashes or a master-slave switch occurs after only one of the commands has executed, the lock still has no expiration time and can never be released.
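The failure mode described above can be replayed without a Redis server. The following is a minimal sketch, assuming a hypothetical in-memory `FakeRedis` class with a manually advanced clock; `try_lock_v1` and its `crash_after_setnx` flag are illustrative names that do not come from any real client library.

```python
class FakeRedis:
    """In-memory stand-in for one Redis node (hypothetical, for illustration)."""

    def __init__(self):
        self.data = {}      # key -> value
        self.expires = {}   # key -> absolute expiry time in seconds
        self.now = 0.0      # manually advanced clock keeps the demo deterministic

    def _purge(self, key):
        # Lazily drop a key whose TTL has passed.
        if key in self.expires and self.expires[key] <= self.now:
            self.data.pop(key, None)
            self.expires.pop(key, None)

    def setnx(self, key, value):
        self._purge(key)
        if key in self.data:
            return 0
        self.data[key] = value
        return 1

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self.data:
            return 0
        self.expires[key] = self.now + seconds
        return 1


def try_lock_v1(r, key, seconds, crash_after_setnx=False):
    """V1.0 tryLock: two separate requests, SETNX followed by EXPIRE."""
    if not r.setnx(key, "1"):
        return False
    if crash_after_setnx:
        # Simulate the application dying between the two requests.
        raise RuntimeError("client crashed before EXPIRE")
    r.expire(key, seconds)
    return True


r = FakeRedis()
try:
    try_lock_v1(r, "lock:order", 10, crash_after_setnx=True)
except RuntimeError:
    pass

r.now += 3600                                  # one hour later
stuck = try_lock_v1(r, "lock:order", 10)
print(stuck)                                   # False: the key never got a TTL
```

In this sketch the key stays in place indefinitely because EXPIRE was never sent, which is exactly why the two steps need to be a single atomic operation.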

Another problem is that many developers release the lock in a finally block regardless of whether it was acquired successfully. This is a misuse of the lock, and it is addressed in V3.0 below.

A solution to the problem that the lock cannot be released is based on the GETSET command.

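V1.1: based on GETSET

tryLock() {
    NewExpireTime = CurrentTimestamp + ExpireSeconds
    if (SETNX Key NewExpireTime) {
        return 1                              // key was free: lock acquired
    }
    oldExpireTime = GET(Key)
    if (oldExpireTime < CurrentTimestamp) {   // the existing lock has expired
        NewExpireTime = CurrentTimestamp + ExpireSeconds
        CurrentExpireTime = GETSET(Key, NewExpireTime)
        if (CurrentExpireTime == oldExpireTime) {
            return 1                          // nobody else took over in between
        }
    }
    return 0
}
release() {
    DELETE Key
}

Idea:

1. Acquire the lock with SETNX(Key, ExpireTime).
2. If that fails, use the timestamp returned by GET(Key) to check whether the lock has already expired.
3. Use GETSET(Key, ExpireTime) to change the Value to NewExpireTime.
4. Check the old value returned by GETSET. If it equals the value returned by GET, the lock is considered acquired.

Note: this version drops the EXPIRE command and instead decides expiration from the timestamp stored in the Value.

Problems:

1. Under heavy lock contention, the Value keeps being overwritten while no client actually acquires the lock.
2. Acquiring the lock keeps modifying the data of the existing lock. Imagine C1 and C2 competing for the lock: C1 acquires it, then C2 runs GETSET and changes the expiration time of C1's lock. If C1 does not release the lock correctly, the expiration time has been extended and other clients have to wait longer.

V2.0: based on SETNX

tryLock() {
    SETNX Key 1 Seconds
}
release() {
    DELETE Key
}

Since Redis 2.6.12 the value and its expiration time can be set in one atomic command (in real Redis this is SET Key 1 NX EX Seconds; the pseudocode above writes it as SETNX with a Seconds argument), which solves the problem that two separate commands cannot be made atomic. But consider the following scenario:

1. C1 acquires the lock. C1 then stalls because of GC, or its task runs too long for some unknown reason, and it does not release the lock before the lock expires.
2. C2 acquires the lock after C1's lock has timed out and starts executing. C1 and C2 are now running at the same time, and the duplicated execution can cause data inconsistency and other unpredictable results.
3. If C1 finishes first, it releases C2's lock, which may allow a third process C3 to acquire the lock.

[Figure: rough flowchart of this scenario]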
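The C1/C2/C3 scenario above can be replayed step by step. The sketch below is an illustration under stated assumptions: `FakeRedis`, `set_nx_ex` and `delete` are hypothetical in-memory stand-ins for SET ... NX EX and DEL, and time is advanced by hand instead of waiting.

```python
class FakeRedis:
    """In-memory stand-in for one Redis node (hypothetical, for illustration)."""

    def __init__(self):
        self.data = {}   # key -> (value, absolute expiry time in seconds)
        self.now = 0     # manually advanced clock keeps the replay deterministic

    def _live(self, key):
        entry = self.data.get(key)
        if entry is not None and entry[1] <= self.now:
            del self.data[key]          # lazily drop an expired key
            entry = None
        return entry

    def set_nx_ex(self, key, value, seconds):
        """Models SET key value NX EX seconds."""
        if self._live(key) is not None:
            return False
        self.data[key] = (value, self.now + seconds)
        return True

    def delete(self, key):
        """Models DEL key: removes the key regardless of who set it."""
        if self._live(key) is None:
            return 0
        del self.data[key]
        return 1


r = FakeRedis()
c1_locked = r.set_nx_ex("lock", "1", 10)   # t=0: C1 acquires, TTL 10 s
r.now = 11                                 # C1 stalls (e.g. GC); its lock expires
c2_locked = r.set_nx_ex("lock", "1", 10)   # C2 acquires and starts working
r.now = 12
released = r.delete("lock")                # C1 wakes up and releases "its" lock
c3_locked = r.set_nx_ex("lock", "1", 10)   # C3 acquires while C2 is still running

print(c1_locked, c2_locked, released, c3_locked)   # True True 1 True
```

All three clients believe they hold the lock at some point, and at t=12 both C2 and C3 are inside the critical section.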

Problems:

1. Because C1's pause lets C1 and C2 both hold the lock and execute at the same time, the business logic is implicitly required to be idempotent.
2. C1 releases a lock that does not belong to C1.

V3.0

tryLock() {
    SETNX Key UnixTimestamp Seconds
}
release() {
    EVAL(
        // LuaScript
        if redis.call("get", KEYS[1]) == ARGV[1] then
            return redis.call("del", KEYS[1])
        else
            return 0
        end
    )
}

This scheme sets the Value to a timestamp and, when releasing, checks that the lock's current Value is the one that was set when the lock was acquired. This prevents C1 from releasing the lock held by C2. Because releasing involves more than one Redis operation (a check followed by a delete), a Lua script is used to avoid the concurrency problem of the check-and-set pattern.
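The logic of the release script can be mirrored in a few lines of Python. This is only a sketch: `FakeRedis` and `compare_and_delete` are hypothetical names, and a `threading.Lock` is used here as an assumption to model the fact that Redis runs a Lua script as one uninterrupted unit.

```python
import threading


class FakeRedis:
    """In-memory stand-in for one Redis node (hypothetical, for illustration)."""

    def __init__(self):
        self.data = {}
        # Models Redis executing each command or script without interleaving.
        self._atomic = threading.Lock()

    def set_nx(self, key, value):
        with self._atomic:
            if key in self.data:
                return False
            self.data[key] = value
            return True

    def compare_and_delete(self, key, expected):
        """Same logic as the Lua script: DEL only if GET equals expected."""
        with self._atomic:
            if self.data.get(key) == expected:
                del self.data[key]
                return 1
            return 0


r = FakeRedis()
r.set_nx("lock", "c2-value")                         # C2 currently holds the lock

stale = r.compare_and_delete("lock", "c1-value")     # C1 tries to release
still_held = "lock" in r.data
owner = r.compare_and_delete("lock", "c2-value")     # C2 releases its own lock

print(stale, still_held, owner)                      # 0 True 1
```

If the check and the delete were sent as two separate requests, the lock could expire and be re-acquired by another client between them, which is why the comparison and the deletion have to happen atomically.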

Problem:

Under extremely high concurrency, such as a red-packet grab, two clients may produce the same UnixTimestamp. In addition, physical clocks are not guaranteed to be consistent in a distributed environment, which can also lead to duplicate UnixTimestamp values, although this should be rare.

V3.1

tryLock() {
    SET Key UniqId NX EX Seconds
}
release() {
    EVAL(
        // LuaScript
        if redis.call("get", KEYS[1]) == ARGV[1] then
            return redis.call("del", KEYS[1])
        else
            return 0
        end
    )
}

Since Redis 2.6.12, SET accepts an NX option, which makes it equivalent to SETNX, together with an expiration option. The official documentation notes that SETNX, SETEX and PSETEX may be deprecated and removed in future versions in favor of SET. The other improvement is to use an auto-incrementing unique UniqId instead of a timestamp, which avoids the clock problem mentioned in V3.0.
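To see why the lock value matters, compare second-resolution timestamps with unique identifiers. The sketch below uses random UUIDs from Python's standard `uuid` module as the unique value; this is an assumption made for illustration, since the article itself suggests an auto-incrementing ID. The clock readings are made-up sample data.

```python
import uuid


def timestamp_values(clock_readings):
    """Lock values as in V3.0: the Unix timestamp truncated to whole seconds."""
    return [str(int(t)) for t in clock_readings]


def unique_values(n):
    """Lock values as in V3.1: one fresh random identifier per acquisition."""
    return [uuid.uuid4().hex for _ in range(n)]


# Three clients that call tryLock within the same second (sample readings).
readings = [1700000000.10, 1700000000.35, 1700000000.90]

ts = timestamp_values(readings)
ids = unique_values(len(readings))

print(len(set(ts)))    # 1: all three clients would store the same value
print(len(set(ids)))   # 3: each client can recognise its own lock
```

With identical values the compare-and-delete check in release() cannot tell the clients apart, so a client could again delete a lock it does not own.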

This is currently the best of the schemes above, but it still has a problem in a Redis cluster (master-slave) environment:

Because Redis replication is asynchronous, if the master crashes after a client has acquired the lock but before the lock has been replicated, the lock can be acquired again on the newly promoted master, so multiple clients hold the lock at the same time.
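The failover problem can be pictured with two in-memory nodes. This is a deliberately simplified sketch: `Node` and `replication_backlog` are hypothetical names, and real Redis replication is far more involved than a Python list.

```python
class Node:
    """In-memory stand-in for a Redis master or replica (hypothetical)."""

    def __init__(self):
        self.data = {}

    def set_nx(self, key, value):
        if key in self.data:
            return False
        self.data[key] = value
        return True


master, replica = Node(), Node()
replication_backlog = []          # writes the master has not yet shipped

# C1 acquires the lock on the master; the write is only queued for the replica.
c1_locked = master.set_nx("lock", "c1-id")
replication_backlog.append(("lock", "c1-id"))

# The master crashes before the backlog is sent, and the replica is promoted.
replication_backlog.clear()
new_master = replica

# C2 asks the new master, which has never seen C1's lock.
c2_locked = new_master.set_nx("lock", "c2-id")

print(c1_locked, c2_locked)       # True True: both clients hold the lock
```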

Distributed Redis Lock: Redlock

V3.1 is safe only with a single Redis instance. How to implement a distributed lock on Redis has been hotly debated among distributed-systems experts abroad. antirez proposed the distributed locking algorithm Redlock, and a detailed description can be found under the distlock topic. The following description of the Redlock algorithm is translated from a Chinese write-up (reference):

Suppose there are N independent Redis nodes. To acquire the lock, a client performs the following steps:

1. Get the current time (in milliseconds).

2. Try to acquire the lock on each of the N Redis nodes in turn. Each acquisition is the same as acquiring a lock on a single Redis node, including the random string my_random_value and the expiration time (for example PX 30000, the lock validity time). So that the algorithm can keep running when a Redis node is unavailable, each acquisition also has a timeout that is much shorter than the lock validity time (on the order of tens of milliseconds). When the client fails to acquire the lock on one node, it should move on to the next node immediately. Failure here means any kind of failure, for example the node is unavailable or the lock on that node is already held by another client. (Note: the original Redlock text mentions only the case where a Redis node is unavailable, but other failures should be included as well.)

3. Compute how long the whole acquisition took by subtracting the time recorded in step 1 from the current time. If the client acquired the lock on a majority of the Redis nodes (>= N/2+1) and the total time spent does not exceed the lock validity time, the client considers the lock acquired. Otherwise it considers the acquisition failed.

4. If the lock was acquired, its validity time must be recalculated: it equals the original validity time minus the acquisition time computed in step 3.

5. If the acquisition failed (because fewer than N/2+1 nodes granted the lock, or because the whole process took longer than the original validity time), the client should immediately send a release operation to all Redis nodes (the Redis Lua script described earlier).

Release lock: send the release operation to all Redis nodes.
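The steps above can be sketched against in-memory nodes. This is a simplified illustration, not a production Redlock client: `Node`, `FakeClock`, `redlock_acquire` and `redlock_release` are hypothetical names, the per-node network timeout from step 2 is not modelled, and the clock simply advances 5 ms on every reading.

```python
import uuid


class Node:
    """In-memory stand-in for one independent Redis node (hypothetical)."""

    def __init__(self, available=True):
        self.available = available
        self.data = {}                      # key -> (token, expiry in ms)

    def set_nx_px(self, key, token, ttl_ms, now_ms):
        if not self.available:
            raise ConnectionError("node unreachable")
        entry = self.data.get(key)
        if entry is not None and entry[1] > now_ms:
            return False                    # held by another client
        self.data[key] = (token, now_ms + ttl_ms)
        return True

    def compare_and_delete(self, key, token):
        if not self.available:
            raise ConnectionError("node unreachable")
        entry = self.data.get(key)
        if entry is not None and entry[0] == token:
            del self.data[key]


class FakeClock:
    """Deterministic clock: every reading is step_ms later than the last."""

    def __init__(self, step_ms=5):
        self.t = 0
        self.step = step_ms

    def __call__(self):
        self.t += self.step
        return self.t


def redlock_release(nodes, key, token):
    for node in nodes:                      # release on ALL nodes
        try:
            node.compare_and_delete(key, token)
        except ConnectionError:
            pass


def redlock_acquire(nodes, key, ttl_ms, clock):
    """Returns (token, remaining_validity_ms) on success, otherwise None."""
    token = uuid.uuid4().hex                # plays the role of my_random_value
    start = clock()                                     # step 1
    acquired = 0
    for node in nodes:                                  # step 2
        try:
            if node.set_nx_px(key, token, ttl_ms, clock()):
                acquired += 1
        except ConnectionError:
            pass                            # any failure: try the next node
    elapsed = clock() - start                           # step 3
    quorum = len(nodes) // 2 + 1
    if acquired >= quorum and elapsed < ttl_ms:
        return token, ttl_ms - elapsed                  # step 4
    redlock_release(nodes, key, token)                  # step 5
    return None


clock = FakeClock(step_ms=5)
nodes = [Node(), Node(), Node(available=False), Node(), Node(available=False)]

first = redlock_acquire(nodes, "lock", 30000, clock)    # 3 of 5 nodes grant it
second = redlock_acquire(nodes, "lock", 30000, clock)   # no quorum left

print(first is not None, second is None)                # True True
```

Note that the validity returned in step 4 is shorter than the requested 30000 ms, because the time spent contacting the nodes has already been consumed.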

However, Martin Kleppmann questioned this algorithm and proposed using a fencing token mechanism instead, in which every operation on the resource must be validated against a token.

Redlock makes assumptions about the system model, in particular about clock consistency across nodes. Real systems suffer from clock drift and clock jumps, and Redlock is precisely a lock that depends on timing. In addition, because Redlock relies on automatic expiration, it still does not solve the problem of a lock expiring during a long GC pause or a similar stall, which leads to a safety violation.
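The fencing token idea can be sketched as follows. This is a minimal illustration under stated assumptions: `LockService` and `FencedStorage` are hypothetical classes, the lock service is assumed to hand out a strictly increasing number with every grant, and the protected resource is assumed to be able to remember the highest token it has seen.

```python
class LockService:
    """Hands out a strictly increasing token with each lock grant (hypothetical)."""

    def __init__(self):
        self.counter = 0

    def grant(self):
        self.counter += 1
        return self.counter


class FencedStorage:
    """Protected resource that rejects writes carrying an outdated token."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self.highest_token:
            return False                 # a newer lock holder has already written
        self.highest_token = token
        self.value = value
        return True


locks = LockService()
storage = FencedStorage()

t1 = locks.grant()                       # C1 gets the lock with token 1
# C1 pauses (e.g. a long GC); its lock expires in the meantime.
t2 = locks.grant()                       # C2 gets the lock with token 2

c2_ok = storage.write(t2, "from C2")     # accepted
c1_ok = storage.write(t1, "from C1")     # C1 wakes up late: rejected

print(c2_ok, c1_ok, storage.value)       # True False from C2
```

The check happens at the resource itself, so it holds even when a paused client still believes it owns the lock.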

antirez then replied to Martin Kleppmann's critique, arguing that the expiration mechanism is reasonable and discussing what to do when, in practice, a pause causes multiple clients to access the resource at the same time.

Regarding the problems with Redlock, the article "Is the Redis-based distributed lock really safe?" gives a detailed explanation in Chinese and analyzes the problems in the Redlock algorithm.

Summary

Whether it is a single-instance lock based on SETNX or the Redlock distributed lock, the goal is to guarantee the following properties:

1. Safety: multiple clients must never hold the lock at the same time.
2. Liveness (deadlock freedom): the lock must eventually be released even if a client crashes or a network partition occurs (usually via a timeout mechanism).
3. Fault tolerance: as long as more than half of the Redis nodes are available, the lock can be acquired and released correctly.

Therefore, both safety and liveness must be ensured when developing or using a distributed lock, to avoid unpredictable results.

In addition, every version of the distributed lock has some problems, so the lock must be chosen to fit the scenario in which it is used. Generally speaking, locks are used for one of two purposes:

Efficiency: only one client needs to complete the operation, and the work should not be repeated unnecessarily. This calls for a loose distributed lock that only needs to guarantee liveness.

Correctness: clients require strict mutual exclusion and must never hold the lock or operate on the same resource at the same time. This scenario demands more care in choosing and using the lock, and the business code should be made idempotent wherever possible.

Many problems remain to be solved in implementing Redis distributed locks. We need to be aware of them, understand how to implement a Redis distributed lock correctly, and then choose and use distributed locks sensibly in our work.

