2025-01-18 Update From: SLTechnology News&Howtos > Database
Shulou (Shulou.com) 05/31 Report --
This article shows how to delete 120 million keys with a specified prefix from Redis. The content is straightforward and easy to follow; hopefully it helps resolve your doubts as we study the problem step by step.
Background
Because of an IDC replacement, we need to migrate the cache to a new data center. The developers reported that the old cache holds 120 million invalid keys (no expiration time set) alongside business keys still in normal use, and asked that keys with a given prefix be deleted before the migration. So the question is: how do we quickly delete 120 million keys?
How to get the specified keys
Everyone knows that because Redis serves commands on a single thread, running keys * would block normal business requests, so that is definitely not an option.
Here we take advantage of the SCAN command provided by Redis. SCAN is a cursor-based iterator: each call returns a new cursor, and the caller passes that cursor as the cursor argument of the next SCAN call to continue the iteration.
An iteration starts when the cursor is set to 0, and ends when the server returns a cursor of 0. The syntax of SCAN is as follows:
SCAN cursor [MATCH pattern] [COUNT count]
Here cursor is the cursor, MATCH supports glob-style pattern matching, and COUNT is a hint for how many keys to fetch per call. We can take advantage of the MATCH option, for example to match keys with the prefix "dba_".
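As a side note, MATCH patterns are glob-style (*, ?, [...]), not full regular expressions. Python's fnmatch module is a rough stand-in for checking what a pattern would select (the semantics are close but not identical to Redis's matcher):

```python
from fnmatch import fnmatchcase

# Glob-style check, approximating what SCAN's MATCH would select.
print(fnmatchcase('dba_123', 'dba_*'))  # True: has the "dba_" prefix
print(fnmatchcase('key:12', 'dba_*'))   # False: different prefix
```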
redis 127.0.0.1:6379> scan 0
1) "17"
2)  1) "key:12"
    2) "key:8"
    3) "key:4"
    4) "key:14"
    5) "key:16"
    6) "key:17"
    7) "key:15"
    8) "key:10"
    9) "key:3"
   10) "key:7"
   11) "key:1"
redis 127.0.0.1:6379> scan 17
1) "0"
2) 1) "key:5"
   2) "key:18"
   3) "key:0"
   4) "key:2"
   5) "key:19"
   6) "key:13"
   7) "key:6"
   8) "key:9"
   9) "key:11"
In the example above, the first call starts a new iteration with cursor 0. The second call uses the cursor returned by the first, i.e. the first element of the reply, 17. On that second SCAN call the command returns cursor 0, which means the iteration is over and the entire data set has been fully traversed.
As you can see from the example, the reply to the SCAN command is an array of two elements: the first element is the new cursor for the next iteration, and the second is an array containing the iterated elements.
Note: starting a new iteration with cursor 0 and calling SCAN until the command returns cursor 0 again is what we call a full iteration. We will rely on this property in the code later.
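The full-iteration loop can be sketched in a few lines. The scan function below is a stand-in that mimics the server's contract (return the next cursor plus a chunk of keys, with cursor 0 marking the end), so the loop runs without a live Redis; against a real server you would call r.scan(cursor, match=..., count=...) from redis-py instead.

```python
def full_scan(scan_fn, count=10):
    """Start at cursor 0 and keep calling SCAN until the server
    returns cursor 0 again -- one 'full iteration'."""
    cursor, keys = 0, []
    while True:
        cursor, chunk = scan_fn(cursor, count)
        keys.extend(chunk)
        if cursor == 0:
            return keys

def make_fake_scan(all_keys):
    """Stand-in for a Redis server's SCAN: returns (next_cursor, chunk),
    handing back cursor 0 once everything has been visited."""
    def scan_fn(cursor, count):
        chunk = all_keys[cursor:cursor + count]
        nxt = cursor + count
        return (0 if nxt >= len(all_keys) else nxt), chunk
    return scan_fn

keys = ['key:%d' % i for i in range(20)]
print(len(full_scan(make_fake_scan(keys), count=7)))  # 20: every key visited
```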
Python's redis module provides a scan_iter iterator for traversing keys; it returns an iterator over the results.
In [53]: ret = r.scan_iter('dba_*', 20)
In [54]: print ret
Now that we have solved the problem of how to get the data, let's think about the second question.
How to perform deletion
This is relatively simple. Redis provides the DEL command.
127.0.0.1:6379[2]> get "dba_7"
"r06cVX9"
127.0.0.1:6379[2]> get "dba_1"
"ETX57PA"
127.0.0.1:6379[2]> del "dba_7" "dba_1"
(integer) 2
127.0.0.1:6379[2]>
redis-py provides the functions delete(key) and delete(*keys), where the *keys parameter accepts multiple key names. At this point the rough plan is clear: fetch the keys, then delete them in batches.
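To feed delete(*keys) efficiently, the scan_iter stream has to be grouped. A small batching helper (hypothetical name chunks, not part of redis-py) does the grouping; the commented-out lines show how it would be wired up against a live server:

```python
def chunks(iterable, size):
    """Group items from an iterator into lists of at most `size`."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # last partial group
        yield batch

# With a live server this would be (sketch, not run here):
#   for batch in chunks(r.scan_iter(match='dba_*', count=2000), 1000):
#       r.delete(*batch)
print([len(b) for b in chunks(iter(['dba_%d' % i for i in range(7)]), 3)])  # [3, 3, 1]
```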
(mytest) test git:(master) python delete_key.py
initial keys successfully,use time: 90.2497739792
normal ways end at: 68.685477972
normal ways delete numbers: 1000000
Deleting 1,000,000 keys the conventional way takes 68.7 seconds. How long would 120 million keys take? 68.7 × 120 ≈ 8,244 seconds, well over two hours. Can it be faster?
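The back-of-envelope estimate, using the measured run above (68.7 seconds per 1,000,000 keys deleted):

```python
per_million = 68.7            # seconds to delete 1,000,000 keys, measured above
total = per_million * 120     # scale up to 120 million keys
print(round(total), round(total / 3600, 2))  # 8244 2.29 -- over two hours
```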
How to improve the execution speed
Redis is based on a request/response protocol: the client sends a command and waits for the reply; Redis receives the command, processes it, and responds. The time to send a command plus the time to receive its result is called the RTT (Round Trip Time). If a client sends a large number of commands to Redis and waits for each reply before sending the next command, it not only pays many RTTs but also makes frequent system IO calls to send network requests.
The Pipeline (pipelining) feature greatly alleviates these shortcomings. A pipeline assembles a set of Redis commands, transmits them to Redis in one shot, and then returns the results of the whole set to the client in order.
It should be noted that although Pipeline is easy to use, the number of commands assembled into one pipeline cannot be unlimited; otherwise a single batch becomes too large, which both increases the client's wait time and can congest the network, so commands need to be assembled in batches. The code below compares the pipeline approach with the conventional one:
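A crude cost model shows why batching matters: every round trip costs one RTT, and a pipeline of N commands shares a single trip. The numbers below (0.1 ms RTT, 0.01 ms of server time per command) are made-up assumptions, purely for illustration:

```python
import math

def total_time(n_cmds, rtt, per_cmd, batch=1):
    """Rough model: round_trips * rtt + n_cmds * per_cmd,
    where a pipeline of `batch` commands shares one round trip."""
    round_trips = math.ceil(n_cmds / float(batch))
    return round_trips * rtt + n_cmds * per_cmd

naive = total_time(1000000, 0.1e-3, 0.01e-3)              # one trip per DEL
piped = total_time(1000000, 0.1e-3, 0.01e-3, batch=5000)  # 5000 DELs per trip
print(round(naive, 1), round(piped, 1))  # 110.0 10.0 -- seconds
```

Under these assumed numbers the RTT dominates the naive loop, and pipelining removes almost all of it; the real-world speedup depends on the actual network latency.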
Code
# encoding: utf-8
"""
author: yangyi@youzan.com
time: 2018-03-09 8:35
func:
"""
import redis
import random
import string
import time

pool = redis.ConnectionPool(host='127.0.0.1', port=6379, db=2)
r = redis.Redis(connection_pool=pool)


def random_str():
    # a random 7-character value
    return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(7))


def init_keys():
    start_time = time.time()
    # 1,000,000 keys, matching the benchmark output shown above
    for i in xrange(0, 1000000):
        key_name = 'dba_' + str(i)
        value_name = random_str()
        r.set(key_name, value_name)
    print 'initial keys successfully,use time:', time.time() - start_time


def del_keys_without_pipe():
    start_time = time.time()
    result_length = 0
    for key in r.scan_iter(match='dba_*', count=2000):
        r.delete(key)
        result_length += 1
    print "normal ways end at:", time.time() - start_time
    print "normal ways delete numbers:", result_length


def del_keys_with_pipe():
    start_time = time.time()
    result_length = 0
    pipe = r.pipeline()
    for key in r.scan_iter(match='dba_*', count=5000):
        pipe.delete(key)
        result_length += 1
        if result_length % 5000 == 0:
            pipe.execute()
    pip_time = time.time()
    print "use pipeline scan time:", time.time() - start_time
    pipe.execute()  # flush the last partial batch
    print "use pipeline end at:", time.time() - pip_time
    print "use pipeline ways delete numbers:", result_length


def main():
    init_keys()
    del_keys_without_pipe()
    init_keys()
    del_keys_with_pipe()


if __name__ == '__main__':
    main()
That is all the content of "how to delete 120 million keys with a specified prefix in Redis". Thank you for reading, and I hope sharing it has helped you.