
Redis: what most people get wrong about cache breakdown and the cache setting order


First, about the cache setting order:

Caches are used all the time in web design, and we all know that adding a cache improves a system's response time. When people implement the cache, the usual process is: on a read, try the cache first; if the data is not there, fetch it from the DB and then set it into the cache; on an update, update the DB first and, once that succeeds, update the cache. Nothing looks wrong with this simple scheme, but in fact the DB write and the cache write are supposed to behave like one transaction: either both fail or both succeed.
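A minimal sketch of that common pattern, assuming a redis-py client and a hypothetical db helper with query_one / execute methods (neither comes from this article):

    import redis

    redis_db = redis.Redis()          # assumed local Redis connection
    CACHE_TTL = 60                    # assumed cache TTL in seconds

    def read_user_name(db, user_workid):
        key = 't_users:' + str(user_workid)
        value = redis_db.get(key)
        if value is None:                              # cache miss: fall back to the DB
            value = db.query_one(                      # hypothetical DB helper
                "select user_name from t_users where user_workid=%s", (user_workid,))
            redis_db.set(key, value, ex=CACHE_TTL)     # repopulate the cache
        return value

    def update_user_name(db, user_workid, user_name):
        ok = db.execute(                               # hypothetical DB helper
            "update t_users set user_name=%s where user_workid=%s",
            (user_name, user_workid))
        if ok:
            # update the cache only after the DB write succeeds -- this is the
            # step the tests below show to be race-prone
            redis_db.set('t_users:' + str(user_workid), user_name, ex=CACHE_TTL)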

Therefore, there are often the following mistakes:

Error operation 1: update the DB and write the cache at the same time. e.g. process A writes the cache, process B interrupts A, writes the cache and then writes the DB, and only then does process A get to write the DB. The result: the cache holds B's data while the DB holds A's data, the two never reconcile, and the cache stays dirty, so every subsequent read returns dirty data. By the same token, writing the DB first and then writing the cache can be interrupted in exactly the same way and also leaves the cache dirty.

Error operation 2: delete the cache first, then update the DB. Under high concurrency: process A deletes the cache, process B interrupts it, reads the old data from the DB and sets it into the cache, and then process A updates the DB. From that point on every read hits the old cached value: dirty data forever.

The correct way should be:

1. Read: read from the cache; on a miss, read from the DB and then write the value into the cache.
2. Update: update the DB first, then delete the cache entry (it must be deleted, not updated).

Even this is not a full guarantee. e.g. process A reads from the DB on a cache miss, process B interrupts it, updates the DB and deletes the cache, and then process A writes its now-stale value back into the cache, so the cache again holds old data. But a database read is fast and a database write is slow, so the probability of a slow write interleaving inside a fast read is comparatively low, which is why this order is acceptable in practice.

As for why you delete the cache rather than update it: if process A updates the DB, then process B updates the DB and updates the cache, and then process A comes back and updates the cache, the cache ends up holding A's older value, i.e. dirty data.
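For contrast, a minimal sketch of the recommended update path under the same assumptions as the sketch above (the read path stays the same); the author's full multi-process comparison follows:

    def update_user_name(db, user_workid, user_name):
        '''Update the DB first, then delete the cache entry (do not rewrite it).'''
        ok = db.execute(                               # hypothetical DB helper
            "update t_users set user_name=%s where user_workid=%s",
            (user_name, user_workid))
        if ok:
            # delete instead of set: the next cache miss repopulates from the fresh DB row,
            # so stale data can only slip in through the narrower read-miss race described above
            redis_db.delete('t_users:' + str(user_workid))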

Let's verify this with code.

    import time
    import random
    from multiprocessing import Process, Manager

    # redis_db and mysql_extract_db are assumed to be pre-configured Redis and MySQL helpers
    # (redis_db is assumed to return strings, e.g. decode_responses=True).

    def worker_read_type1(write_flag, user_workid):
        '''Read from cache; on a miss, read from DB and then write the value back to cache.'''
        num = 0
        err_num = 0
        while True:
            redis_key = 't_users:' + str(user_workid)
            user_name = redis_db.get(redis_key)
            if not user_name:
                sql = ("select user_workid, user_name from t_users "
                       "where user_workid={user_workid} limit 1").format(user_workid=user_workid)
                data = mysql_extract_db.query_one_dict(sql=sql)
                user_name = data.get('user_name', '')
                redis_db.set(redis_key, user_name)
            num += 1
            if len(write_flag):
                if user_name != write_flag[0]:
                    err_num += 1
                    print('inconsistency--user_name:{}---write_flag:{}, errpercent: err_num/num={}'.format(
                        user_name, write_flag[0], str(float(err_num * 100) / num) + '%'))
            time.sleep(0.01)

    def worker_update_type1(write_flag, user_workid, user_name):
        '''Update DB first, then update cache.'''
        while True:
            sql = "update t_users set user_name='{user_name}' where user_workid={user_workid}".format(
                user_name=user_name, user_workid=user_workid)
            res = mysql_extract_db.execute_commit(sql=sql)
            write_flag[0] = user_name                # value written to the database
            if res:
                redis_key = 't_users:' + str(user_workid)
                redis_db.set(redis_key, user_name)   # then update cache
            time.sleep(0.01)

    def worker_update_type2(write_flag, user_workid, user_name):
        '''Update cache first, then update DB.'''
        while True:
            redis_key = 't_users:' + str(user_workid)
            redis_db.set(redis_key, user_name)       # update cache
            sql = "update t_users set user_name='{user_name}' where user_workid={user_workid}".format(
                user_name=user_name, user_workid=user_workid)
            res = mysql_extract_db.execute_commit(sql=sql)
            write_flag[0] = user_name                # value written to the database
            time.sleep(0.01)

    def worker_update_type3(write_flag, user_workid, user_name):
        '''Delete cache first, then update DB.'''
        while True:
            redis_key = 't_users:' + str(user_workid)
            redis_db.delete(redis_key)               # delete cache
            sql = "update t_users set user_name='{user_name}' where user_workid={user_workid}".format(
                user_name=user_name, user_workid=user_workid)
            res = mysql_extract_db.execute_commit(sql=sql)
            write_flag[0] = user_name                # value written to the database
            time.sleep(0.01)

    def worker_update_type4(write_flag, user_workid, user_name):
        '''Update DB first, then delete cache.'''
        while True:
            sql = "update t_users set user_name='{user_name}' where user_workid={user_workid}".format(
                user_name=user_name, user_workid=user_workid)
            res = mysql_extract_db.execute_commit(sql=sql)
            write_flag[0] = user_name                # value written to the database
            if res:
                redis_key = 't_users:' + str(user_workid)
                redis_db.delete(redis_key)           # then delete cache
            time.sleep(0.01)

    def test_check_run(read_nump=1, wri_nump=2, readfunc=None, wrifunc=None):
        '''Run the test: wri_nump writer processes against read_nump reader processes.'''
        write_flag = Manager().list()
        write_flag.append('1')
        for i in range(0, wri_nump):
            user_name = 'RobotZhu' + str(random.randint(0, 9999999))   # random name per writer
            p_write = Process(target=wrifunc, args=(write_flag, 2633, user_name))
            p_write.start()
        for i in range(0, read_nump):
            p_read = Process(target=readfunc, args=(write_flag, 2633,))
            p_read.start()
        print('p is running')
        while True:
            pass

Now run the tests. Generally the read requests of a system far outnumber the write requests, so here 100 processes read while 2 processes write (each call runs forever, so run them one at a time):

    # update DB first, then update cache
    test_check_run(read_nump=100, wri_nump=2, readfunc=worker_read_type1, wrifunc=worker_update_type1)
    # update cache first, then update DB
    test_check_run(read_nump=100, wri_nump=2, readfunc=worker_read_type1, wrifunc=worker_update_type2)
    # delete cache first, then update DB
    test_check_run(read_nump=100, wri_nump=2, readfunc=worker_read_type1, wrifunc=worker_update_type3)
    # update DB first, then delete cache
    test_check_run(read_nump=100, wri_nump=2, readfunc=worker_read_type1, wrifunc=worker_update_type4)

Measured inconsistency rates:

1. Update DB first, then update cache. Sample output: user_name:RobotZhu2038562---write_flag:RobotZhu669457, errpercent: err_num/num=11.4285714286%. Roughly 11.5% of reads see inconsistent data.
2. Update cache first, then update DB. Sample output: user_name:RobotZhu4607997---write_flag:RobotZhu8633737, errpercent: err_num/num=53.8461538462%. Roughly 50%; with concurrent writers this is the worst case.
3. Delete cache first, then update DB. Sample output: user_name:RobotZhu2034159---write_flag:RobotZhu4882794, errpercent: err_num/num=23.9436619718%. Over 20%; a read process interrupting a write process causes serious inconsistency.
4. Update DB first, then delete cache. Sample output: user_name:RobotZhu1536990---write_flag:RobotZhu1536990, errpercent: err_num/num=7.69230769231%. About 7%, caused by a write process interrupting a read process, so this order is the best of the four.

Second, cache "breakdown" handling:

Cache breakdown: a cached entry is given an expiration time, and when it expires a burst of concurrent requests all go straight to the DB, overloading it.

Solution: when a read finds the cached value empty (the cache has expired), do not hit the DB immediately. Instead, use something like redis's SETNX to set a temporary key, e.g. tempkey=1: if tempkey already exists the set fails; if it does not exist the set succeeds, and that process reads the data from the DB and writes it into the cache. The processes that fail wait a short while (say 30 s) and then read the cache again, by which time the data is probably back. Why do it this way? Because under concurrency only the first process to notice the expired cache manages to set tempkey and goes to the DB, while the other processes, unable to set tempkey, simply wait a moment and then re-read the cache. Code example:

    def get_data(key=None):
        value = redis.get(key)
        if not value:                                   # cache expired / missing
            if 1 == redis.setnx(key + 'tempkey', 1):    # set a temporary key; if another process already set it, this fails and we do not hit the DB
                redis.expire(key + 'tempkey', 60)       # let the temp key expire on its own in case this process dies
                value = db.query('select name from test')
                redis.set(key, value)
                redis.delete(key + 'tempkey')
                return value
            else:
                time.sleep(10)
                return get_data(key)                    # retry recursively; by now the cache should be repopulated
        return value                                    # cache hit: return directly from cache
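For reference, a sketch of the same idea against a real redis-py client; load_from_db, the key names, and the timeouts are placeholders, not part of the article. Note that SETNX followed by a separate EXPIRE is not atomic, whereas redis-py's set(..., nx=True, ex=...) sets the key and its TTL in one command:

    import time
    import redis

    redis_db = redis.Redis()      # assumed Redis connection
    LOCK_TTL = 60                 # temp key expires on its own if the holder dies
    CACHE_TTL = 300               # assumed TTL for the cached value

    def get_data(key, load_from_db):
        '''Cache-aside read with a mutex key to prevent a stampede when the value expires.'''
        value = redis_db.get(key)
        if value is not None:
            return value                                        # cache hit
        # Cache miss: only the process that wins the NX set rebuilds the value.
        if redis_db.set(key + ':tempkey', 1, nx=True, ex=LOCK_TTL):
            try:
                value = load_from_db()                          # hypothetical DB loader
                redis_db.set(key, value, ex=CACHE_TTL)
            finally:
                redis_db.delete(key + ':tempkey')
            return value
        # Another process is rebuilding the cache: wait briefly, then retry.
        time.sleep(0.05)
        return get_data(key, load_from_db)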
