2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article walks through 20 Redis questions that are worth mastering and that many people do not understand well. I hope you find it useful.
What is Redis?
Redis (Remote Dictionary Server) is a high-performance, non-relational key-value database written in C. Unlike traditional databases, Redis keeps its data in memory, so reads and writes are very fast, which is why it is widely used as a cache. Redis can also write data to disk so that it survives restarts, and each individual Redis command executes atomically.
What are the advantages of Redis?
Based on memory operation, memory read and write speed is fast.
Redis executes commands on a single thread, which avoids thread-switching overhead and contention between threads. "Single-threaded" means that one thread handles all network requests; the Redis process still uses other threads or child processes at runtime, for example during data persistence.
Multiple data types are supported, including String, Hash, List, Set, ZSet, etc.
Persistence is supported. Redis supports two persistence mechanisms, RDB and AOF, and persistence can effectively avoid data loss.
Transactions are supported. Every individual Redis command is atomic, and Redis can also run several commands as one batch through MULTI/EXEC (with the limitations described in the transaction section).
Master-slave replication is supported. The master node automatically synchronizes data to the slave nodes, which makes read/write separation possible.
Why is Redis so fast?
Memory-based: Redis uses memory storage with no overhead on disk IO. The data is stored in memory and the reading and writing speed is fast.
Single-threaded implementation (before Redis 6.0): Redis uses a single thread to process requests, avoiding the overhead of thread switching and lock contention between multiple threads. (Redis 6.0 added optional I/O threads for network reads and writes; command execution is still single-threaded.)
I/O multiplexing model: Redis uses I/O multiplexing so that a single thread can watch many socket descriptors, turn ready sockets into events, and avoid wasting time blocked on network I/O.
Efficient data structure: Redis optimizes the underlying layer of each data type in order to pursue faster speed.
Why did Redis choose single thread?
Avoid excessive context switching overhead. The program always runs in a single thread in the process, and there is no scenario of multi-thread switching.
Avoid the overhead of synchronization: with a multithreaded model Redis would have to deal with data synchronization and therefore introduce synchronization mechanisms such as locks, which add overhead to every data operation, increase program complexity, and reduce performance.
Easy to implement and easy to maintain: if Redis uses multithreaded mode, then all underlying data structures must be designed with thread safety in mind, and the implementation of Redis will become more complex.
What are the Redis application scenarios?
Cache hot data to ease the pressure on the database.
By using the atomic self-increment operation of Redis, the function of counter can be realized, such as counting the number of user likes, the number of user visits and so on.
For simple message queuing, you can use Redis's own publish / subscribe mode or List to implement simple message queuing and asynchronous operations.
Rate limiter: limit how often a user can call an interface, for example in a flash sale (seckill) scenario, to prevent unnecessary load caused by rapid repeated clicks.
Friend relationships: use set commands such as intersection, union, and difference to implement features like mutual friends and shared interests.
The difference between Memcached and Redis?
Redis uses only a single core, while Memcached can use multiple cores.
Memcached has a single data structure and is only used to cache data, while Redis supports multiple data types.
Memcached does not support data persistence, so its data disappears after a restart. Redis supports data persistence.
Redis provides master-slave synchronization and cluster deployment, which allow it to offer highly available services. Memcached has no native cluster mode, so the client has to take care of distributing data across servers.
Both are very fast. The common claim that Redis is much faster than Memcached depends heavily on the workload (value size, number of cores used), so treat it as a rule of thumb rather than a guarantee.
Redis uses a single-threaded I/O multiplexing model, while Memcached uses a multithreaded non-blocking I/O model.
What are the Redis data types?
Basic data types:
1. String: the most commonly used data type. The value of a String can be text, a number, or binary data, and a single value cannot exceed 512 MB.
2. Hash: a collection of key-value pairs.
3. Set: an unordered collection of unique elements. Set provides intersection, union, and similar operations, which makes features such as mutual friends and shared follows easy to implement.
4. List: an ordered collection that allows duplicates, implemented on top of a doubly linked list.
5. SortedSet (ZSet): an ordered Set. Each member carries a score that is used to maintain the ordering. It is suitable for scenarios such as leaderboards and weighted message queues.
Special data types:
1. Bitmap: can be thought of as an array of bits. Each cell in the array stores only 0 or 1, and the array index is called the offset. The size of a Bitmap depends on the largest offset used (the upper limit of the cardinality), not on the number of elements actually in the set.
2. HyperLogLog: an algorithm for cardinality estimation. Its advantage is that even when the number or volume of input elements is very large, the space needed to compute the cardinality stays fixed and very small. A typical use is counting unique visitors.
3. Geospatial: mainly used to store geographical location information and operate on it. It is suitable for scenarios such as positioning and finding nearby people.
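The Bitmap description above can be sketched in a few lines of Python. This is only an illustration of the idea of an offset-addressed bit array, not how Redis stores bitmaps internally; the class and method names are hypothetical and merely echo the SETBIT, GETBIT and BITCOUNT commands.

```python
class Bitmap:
    """Toy bit array addressed by offset (illustrative, not Redis internals)."""

    def __init__(self):
        self._bytes = bytearray()

    def setbit(self, offset, value):
        """Set the bit at `offset` to 0 or 1 and return its previous value."""
        byte_index, bit_index = divmod(offset, 8)
        # Grow on demand: the size follows the largest offset ever used.
        if byte_index >= len(self._bytes):
            self._bytes.extend(b"\x00" * (byte_index + 1 - len(self._bytes)))
        mask = 1 << (7 - bit_index)
        previous = 1 if self._bytes[byte_index] & mask else 0
        if value:
            self._bytes[byte_index] |= mask
        else:
            self._bytes[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset):
        """Return the bit at `offset`; offsets never written read as 0."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self._bytes):
            return 0
        return 1 if self._bytes[byte_index] & (1 << (7 - bit_index)) else 0

    def bitcount(self):
        """Count the bits that are set to 1."""
        return sum(bin(b).count("1") for b in self._bytes)

    def size_in_bytes(self):
        return len(self._bytes)
```

Marking user 1000 as signed in costs about 126 bytes here even though only one bit is set, which is the point made above: the size is driven by the largest offset, not by how many bits are 1.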
Redis transaction
The principle of a transaction is to send several commands within a transaction scope to Redis, and then have Redis execute these commands in turn.
The life cycle of the transaction:
Start a transaction using MULTI
When the transaction is started, the command for each operation will be inserted into a queue, and the command will not be actually executed
The EXEC command commits the transaction.
A runtime error in one command inside a transaction does not stop the other commands from executing, and nothing is rolled back, so atomicity is not guaranteed:
    first:0> MULTI
    "OK"
    first:0> set a 1
    "QUEUED"
    first:0> set b 23 4
    "QUEUED"
    first:0> set c 6
    "QUEUED"
    first:0> EXEC
    1) "OK"
    2) "ERR syntax error"
    3) "OK"
WATCH command
The WATCH command can monitor one or more keys. If any watched key is modified before EXEC, the transaction that follows is not executed (similar to an optimistic lock). After the EXEC command is executed, the monitoring is automatically canceled.
    first:0> watch name
    "OK"
    first:0> set name 1
    "OK"
    first:0> MULTI
    "OK"
    first:0> set name 2
    "QUEUED"
    first:0> set gender 1
    "QUEUED"
    first:0> EXEC
    (nil)
    first:0> get gender
    (nil)
In the code above:
watch name starts monitoring the key name.
The value of name is modified.
Transaction a is opened.
The values of name and gender are set inside transaction a.
The EXEC command is used to commit the transaction.
get gender shows that the key does not exist, which means transaction a was not executed.
UNWATCH cancels the monitoring set up by the WATCH command; all watches are released.
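The optimistic-lock behaviour of WATCH can be sketched with a small in-memory model. This is a simplified illustration under stated assumptions, not the Redis implementation: TinyStore and Session are hypothetical names, and a per-key version counter stands in for whatever the server uses to detect that a watched key was touched.

```python
class TinyStore:
    """In-memory stand-in for a Redis server; keeps a version number per key."""

    def __init__(self):
        self.data = {}
        self.versions = {}

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1
        return "OK"

    def get(self, key):
        return self.data.get(key)


class Session:
    """One client connection with its own watch list and command queue."""

    def __init__(self, store):
        self.store = store
        self.watched = {}   # key -> version seen when WATCH was issued
        self.queue = None   # None means "not inside MULTI"

    def watch(self, key):
        self.watched[key] = self.store.versions.get(key, 0)
        return "OK"

    def unwatch(self):
        self.watched.clear()
        return "OK"

    def multi(self):
        self.queue = []
        return "OK"

    def set(self, key, value):
        if self.queue is not None:
            self.queue.append((key, value))   # queued, not executed yet
            return "QUEUED"
        return self.store.set(key, value)

    def exec(self):
        queued, self.queue = self.queue, None
        dirty = any(self.store.versions.get(k, 0) != v
                    for k, v in self.watched.items())
        self.watched.clear()                  # EXEC always cancels the watches
        if dirty:
            return None                       # aborted, like the (nil) reply
        return [self.store.set(k, v) for k, v in queued]
```

Replaying the session above against this model (watch name, set name 1, MULTI, two queued sets, EXEC) returns None and leaves gender unset, while a second attempt without an intervening write goes through.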
Persistence mechanism
Persistence means writing in-memory data to disk so that it is not lost when the service goes down.
Redis supports two persistence mechanisms: RDB and AOF. RDB periodically saves the in-memory data to disk according to configured rules, while AOF logs each write command after it is executed. The two are generally used together.
RDB mode
RDB is the default persistence scheme for Redis. During RDB persistence, the data in memory is written to disk and a dump.rdb file is generated in the configured directory. When Redis restarts, it loads the dump.rdb file to recover the data.
BGSAVE is the mainstream way to trigger RDB persistence, and the execution process is as follows:
Execute the BGSAVE command
The Redis parent process determines whether there is currently an executing child process, and if so, the BGSAVE command returns directly.
The parent process executes the fork operation to create the child process, and the parent process blocks during the fork operation.
After the parent process fork completes, the parent process continues to receive and process requests from the client, while the child process begins to write the data in memory to the temporary file on the hard disk
When the child process has written all the data, it replaces the old RDB file with the temporary file.
When Redis starts, it reads the RDB snapshot file and loads the data from disk into memory. With RDB persistence, if Redis exits abnormally, any data changed since the last snapshot is lost.
How to trigger RDB persistence:
Manual trigger: the user executes the SAVE or BGSAVE command. The process of executing a snapshot with the SAVE command blocks all client requests and should be avoided in a production environment. The BGSAVE command can perform snapshot operations asynchronously in the background, and the server can continue to respond to client requests at the same time, so it is recommended to use the BGSAVE command when you need to perform snapshots manually.
Passive trigger:
Automatic snapshots are taken according to configuration rules. For example, save 100 10 takes a snapshot if at least 10 keys were modified within 100 seconds.
If a slave node performs a full resynchronization, the master node automatically runs BGSAVE to generate an RDB file and sends it to the slave node.
By default, when the shutdown command is executed and AOF persistence is not enabled, Redis automatically saves an RDB snapshot before exiting.
Advantages:
The way Redis loads RDB to recover data is much faster than AOF.
Persistence is done by a forked child process, so the main process performs no disk I/O for the snapshot, which keeps Redis performance high.
Disadvantages:
RDB cannot persist data in real time. Every BGSAVE has to fork a child process, which is a heavyweight operation, so running it frequently is expensive.
RDB files are saved in a specific binary format that has changed across Redis versions, so an older version of Redis may not be able to read an RDB file written by a newer version.
AOF mode
AOF (append only file) persistence: each write command is recorded as an independent log, and the command in the AOF file is re-executed when Redis is restarted to restore data. The main function of AOF is to solve the real-time performance of data persistence, and AOF is the mainstream way of Redis persistence.
By default, Redis does not enable AOF persistence; you can enable it with the appendonly parameter: appendonly yes. Once AOF persistence is enabled, every executed write command is written into the aof_buf buffer, and the buffer is synchronized to disk according to the configured policy.
By default, the operating system flushes buffered writes to disk about every 30 seconds. To avoid losing buffered data, Redis can explicitly ask the system to sync the buffer to disk after writing to the AOF file. The appendfsync parameter controls when this happens.
    # sync on every write to the AOF file; safest and slowest, not recommended
    appendfsync always
    # sync once per second; balances performance and safety, recommended
    appendfsync everysec
    # let the operating system decide when to sync
    appendfsync no
Next, take a look at the AOF persistence execution process:
All write commands are appended to the AOF buffer.
The AOF buffer synchronizes with the hard disk according to the corresponding policy.
As the AOF file becomes larger and larger, it is necessary to rewrite the AOF file regularly to achieve the purpose of compressing the file volume. AOF file rewriting is the process of converting data in the Redis process into write commands and synchronizing it to a new AOF file.
When the Redis server is restarted, the AOF file can be loaded for data recovery.
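The four steps above (append to a buffer, sync to disk, rewrite, replay on restart) can be illustrated with a toy store. This is a minimal sketch under simplifying assumptions: AofStore is a hypothetical class, only SET and DEL are modeled, commands are stored as plain space-separated text rather than in the Redis protocol format, and keys and values must not contain whitespace.

```python
import os
import tempfile


class AofStore:
    """Toy key-value store that logs every write command to an append-only file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self.buffer = []   # plays the role of the AOF buffer

    def execute(self, *command):
        """Apply a write command in memory, then append it to the buffer."""
        self._apply(command)
        self.buffer.append(" ".join(command) + "\n")

    def _apply(self, command):
        name = command[0].upper()
        if name == "SET":
            self.data[command[1]] = command[2]
        elif name == "DEL":
            self.data.pop(command[1], None)

    def flush(self):
        """Append buffered commands to the file and force them to disk."""
        with open(self.path, "a", encoding="utf-8") as f:
            f.writelines(self.buffer)
            f.flush()
            os.fsync(f.fileno())
        self.buffer.clear()

    def rewrite(self):
        """Replace the log with one SET per live key (a simplified rewrite)."""
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for key, value in self.data.items():
                f.write(f"SET {key} {value}\n")
        os.replace(tmp, self.path)
        self.buffer.clear()   # the new file already reflects the full state

    @classmethod
    def load(cls, path):
        """Rebuild the state on restart by replaying the logged commands."""
        store = cls(path)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    store._apply(line.split())
        return store
```

Calling flush() after every command corresponds roughly to appendfsync always, while calling it from a timer once per second corresponds to everysec. The rewrite shrinks a log of four commands that ends with one live key down to a single SET line.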
Advantages:
AOF protects data better against loss. AOF can be configured to perform an fsync every second, so if the Redis process dies, at most 1 second of data is lost.
The AOF file is written in append-only mode, so there is no disk seek overhead and write performance is very high.
Disadvantages:
For the same dataset, the AOF file is larger than the RDB snapshot.
Data recovery is slow.
Master-slave replication
The replication function of Redis is to support data synchronization between multiple databases. The master database can read and write, and the data will be automatically synchronized to the slave database when the data in the master database changes. The slave database is generally read-only, and it receives data synchronized from the master database. A master database can have multiple slave databases, while a slave database can have only one master database.
    # start a Redis instance as the master database
    redis-server
    # start another instance as the slave database
    redis-server --port 6380 --slaveof 127.0.0.1 6379
    # the same can be done at runtime from redis-cli on the slave
    slaveof 127.0.0.1 6379
    # stop receiving synchronization from other databases and convert to the master database
    SLAVEOF NO ONE
The principle of master-slave replication?
When a slave node is started, it sends a PSYNC command to the master node
If the slave node is connecting to the master node for the first time, a full resynchronization is triggered. The master node forks a background child process and starts generating an RDB snapshot file.
At the same time, the master caches in memory all new write commands received from clients. Once the RDB file has been generated, the master sends it to the slave node, which first writes it to its local disk and then loads it from disk into memory.
The master node then sends the write commands cached in memory to the slave node, and the slave node applies them to catch up.
If the network between the slave node and the master node fails and the connection drops, the slave node reconnects automatically. After reconnecting, the master node only sends the slave the data it missed.
Sentinel
Master-slave replication alone cannot fail over automatically and therefore cannot provide high availability. Sentinel mode solves this problem: the master and slave roles can be switched automatically through the sentinel mechanism.
A client first connects to Sentinel, which tells it the address of the current Redis master node; the client then connects to that master and performs subsequent operations. When the master node goes down, Sentinel detects the failure, promotes a suitable slave node to be the new master, and notifies the other slave servers through publish/subscribe to switch to the new master.
Working principle
Each Sentinel sends a PING command once per second to the master, the slaves, and the other Sentinel instances it knows about.
If the time since an instance's last valid reply to PING exceeds the configured value, that Sentinel marks the instance as subjectively offline.
If a master is marked as subjectively offline, all Sentinels monitoring that master confirm, once per second, whether it really is in the subjectively offline state.
When a sufficient number of Sentinels (greater than or equal to the value specified in the configuration file) confirm within the specified time range that the master is subjectively offline, the master is marked as objectively offline. If not enough Sentinels agree that the master is offline, its objectively offline status is removed. If the master again returns a valid reply to a Sentinel's PING command, its subjectively offline status is removed.
The Sentinel node elects the Sentinel leader to be responsible for the failover.
The Sentinel leader selects a well-behaved slave node as the new master node, and then notifies other slave nodes to update the master node information.
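The step from "subjectively offline" to "objectively offline" described above is essentially a vote count. The sketch below shows only that decision, under stated assumptions: the function names, the down_after_seconds parameter and the dictionary of last-reply timestamps are hypothetical, and leader election and the actual failover are left out.

```python
def is_subjectively_down(last_valid_reply, now, down_after_seconds):
    """One Sentinel's local view: no valid PING reply for too long."""
    return now - last_valid_reply > down_after_seconds


def is_objectively_down(last_replies_by_sentinel, now, down_after_seconds, quorum):
    """The master is objectively down once at least `quorum` Sentinels agree."""
    votes = sum(
        1
        for last_reply in last_replies_by_sentinel.values()
        if is_subjectively_down(last_reply, now, down_after_seconds)
    )
    return votes >= quorum
```

With three Sentinels whose last valid replies arrived at t=100, t=101 and t=129.5, a timeout of 30 seconds and the current time t=132, two Sentinels consider the master down. That is enough for a quorum of 2 but not for a quorum of 3.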
Redis cluster
Sentinel mode adds automatic failover and high availability to master-slave replication, but the write throughput and storage capacity of the master node are still limited by a single machine. Cluster mode implements distributed storage for Redis: each node stores different content, which removes the single-machine limit on write throughput and capacity.
A Redis Cluster normally needs at least 6 nodes (3 masters and 3 slaves). The master nodes handle reads and writes, while the slave nodes act as backups: by default they do not serve requests and are only used for failover.
Redis Cluster uses virtual slot partitioning. Every key is mapped by a hash function to one of the integer slots 0 to 16383, and each node is responsible for a subset of the slots and the key-value data mapped to them.
How are hash slots mapped to Redis instances?
Use the CRC16 algorithm to compute a checksum of the key.
Take the checksum modulo 16384. The result is the hash slot for the key.
Locate the corresponding instance according to the slot information
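The three steps above can be written out as a short Python sketch. The CRC16 variant used is CRC16-CCITT (XMODEM: polynomial 0x1021, initial value 0), which is the one the Redis Cluster specification names. The slot_ranges mapping and the node names are made up for illustration, and the sketch deliberately ignores hash tags ({...}).

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots (hash tags not handled)."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


def node_for_key(key, slot_ranges):
    """Find the node whose (first, last) slot range contains the key's slot."""
    slot = key_slot(key)
    for node, (first, last) in slot_ranges.items():
        if first <= slot <= last:
            return node
    raise KeyError(f"no node owns slot {slot}")


# Hypothetical three-master layout.
SLOT_RANGES = {
    "node-a": (0, 5500),
    "node-b": (5501, 11000),
    "node-c": (11001, 16383),
}
```

In this layout the key foo lands in slot 12182 and is therefore served by node-c, while hello lands in slot 866 on node-a.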
Advantages:
Decentralized architecture (no central node) that supports dynamic scaling.
Data is distributed across multiple nodes by slot and shared among the nodes, and the distribution can be adjusted dynamically.
High availability. When some nodes are unavailable, the cluster is still available. Cluster mode can realize automatic failover (failover). Nodes exchange status information through gossip protocol, and use voting mechanism to complete the role transition from Slave to Master.
Disadvantages:
Support for batch operations is limited: multi-key commands such as mset and mget only work when all keys map to the same slot.
Data is replicated asynchronously, so strong consistency is not guaranteed.
Transaction support is limited: only transactions whose keys are all on the same node are supported. The transaction feature cannot be used when the keys are spread across different nodes.
A key is the smallest unit of data partitioning, so a single large key such as a big hash or list cannot be spread across different nodes.
Multiple database spaces are not supported. A standalone Redis supports 16 databases by default, while cluster mode can use only one database space (db 0).
Delete policy for expired keys?
1. Passive (lazy) deletion. When a key is accessed and found to be expired, it is deleted at that moment.
2. Active (periodic) deletion. Redis cleans up keys at regular intervals. Each cleanup pass walks through all databases in turn and randomly samples 20 keys from each one, deleting those that have expired. If 5 of the sampled keys have expired, it keeps cleaning the same db; otherwise it moves on to the next db.
3. Cleanup when memory runs short. Redis has a maximum memory limit, set through the maxmemory parameter. When the memory in use exceeds this limit, Redis frees memory according to the configured eviction policy.
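The first two strategies can be modeled with a small dictionary wrapper. This is a sketch, not the Redis algorithm: ExpiringDict is a hypothetical class, there is a single keyspace instead of several databases, time comes from an injected clock so the behaviour is reproducible, and treating "at least 5 expired keys in a sample" as the condition to keep going is an assumption.

```python
import random


class ExpiringDict:
    """Toy keyspace with lazy and periodic deletion of expired keys."""

    def __init__(self, clock):
        self.clock = clock     # callable returning the current time in seconds
        self.values = {}
        self.expires = {}      # key -> absolute expiration time

    def set(self, key, value, ttl=None):
        self.values[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = self.clock() + ttl

    def _expired(self, key):
        return key in self.expires and self.expires[key] <= self.clock()

    def _delete(self, key):
        self.values.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        # Lazy deletion: expiry is checked only when the key is accessed.
        if self._expired(key):
            self._delete(key)
            return None
        return self.values.get(key)

    def periodic_cleanup(self, sample_size=20, threshold=5, rng=random):
        # Periodic deletion: sample keys that have a TTL, delete the expired
        # ones, and repeat while a sample still contains many expired keys.
        removed = 0
        while self.expires:
            candidates = list(self.expires)
            sample = rng.sample(candidates, min(sample_size, len(candidates)))
            expired = [k for k in sample if self._expired(k)]
            for key in expired:
                self._delete(key)
            removed += len(expired)
            if len(expired) < threshold:
                break
        return removed
```

Lazy deletion alone would leave expired keys that are never read sitting in memory indefinitely, which is why the periodic pass exists. Sampling keeps each pass cheap instead of scanning every key.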
What are the memory elimination strategies?
When the memory used by Redis exceeds the configured maximum, Redis triggers its memory eviction policy and deletes some less frequently used data so that the server keeps running normally.
Before Redis 4.0, six eviction policies were available:
volatile-lru: LRU (Least Recently Used). Evicts the least recently used keys among those with an expiration time set.
allkeys-lru: evicts the least recently used keys from the whole dataset when there is not enough memory to hold newly written data.
volatile-ttl: among keys with an expiration time set, evicts those that are closest to expiring.
volatile-random: evicts randomly chosen keys among those with an expiration time set.
allkeys-random: evicts randomly chosen keys from the whole dataset.
noeviction: no data is evicted. Writes return an error when there is not enough memory to hold the new data.
Redis 4.0 added two more:
volatile-lfu: LFU (Least Frequently Used). Evicts the least frequently used keys among those with an expiration time set.
allkeys-lfu: evicts the least frequently used keys from the whole dataset when there is not enough memory to hold newly written data.
The eviction policy can be changed in the configuration file through the maxmemory-policy item. The default is noeviction.
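To make the allkeys-lru idea concrete, here is a minimal exact-LRU cache in Python. It is only an analogy: LruCache is a hypothetical class, it limits the number of keys rather than bytes of memory, and it tracks recency exactly, whereas Redis approximates LRU by sampling a few keys.

```python
from collections import OrderedDict


class LruCache:
    """Toy allkeys-lru cache bounded by key count (Redis bounds by memory)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.items = OrderedDict()   # oldest access first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        while len(self.items) > self.max_keys:
            self.items.popitem(last=False)  # evict the least recently used key
```

With room for two keys, writing a and b, reading a, and then writing c evicts b, because a was touched more recently. A volatile-lru variant would apply the same rule but only consider keys that have an expiration time.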
How to ensure the data consistency between cache and database when double writing?
1. Delete the cache before updating the database
The update operation deletes the cache first and then updates the database. A later read misses the cache, reads the new data from the database, and writes it back to the cache.
The problem: if a read request arrives after the cache has been deleted but before the database update completes, it reads the old value from the database and writes it back to the cache. The cache and the database are then inconsistent, and all subsequent reads return the old data.
2. Update the database before deleting the cache
The update operation updates MySQL first and deletes the cache only after the update succeeds. A later read request writes the new data back to the cache.
The problem: between the MySQL update and the cache deletion, read requests still get the old cached data. Once the cache has been deleted, consistency is restored, so the impact is relatively small.
3. Update cache asynchronously
After the database update completes, the application does not operate on the cache directly. Instead, it wraps the operation in a message and puts it on a message queue, and a consumer then applies the update to Redis. The message queue preserves the order of the operations, which keeps the cached data correct.
Cache penetration, cache avalanche, cache breakdown
Cache penetration
Cache penetration means querying data that does not exist. The cache is only filled passively on a miss, and if the data cannot be found in the DB either, nothing is written to the cache. Every query for the non-existent data therefore goes to the DB, which defeats the purpose of the cache. Under heavy traffic the DB may go down.
Solutions:
Cache null values: when the DB has no result, cache an empty value (typically with a short expiration time) so that repeated lookups are answered from the cache instead of the database.
Use a Bloom filter: hash all the data that can possibly exist into a sufficiently large bitmap. Queries for data that does not exist are intercepted by the bitmap, which avoids the query pressure on the DB.
The principle of the Bloom filter: when an element is added to the set, K hash functions map it to K positions in a bit array, and those positions are set to 1. On lookup, the element is mapped through the same hash functions to K positions. If any of them is 0, the element is definitely not present and the request returns immediately; if all of them are 1, the element probably exists, and Redis and the database are queried.
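The Bloom filter principle described above fits in a few lines of Python. This is a minimal sketch: BloomFilter is a hypothetical class, the K hash functions are simulated by salting SHA-256 with an index, and the default sizes (1024 bits, 3 hashes) are arbitrary choices for illustration rather than tuned values.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive K positions by salting one hash function with the index i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # Any 0 bit means "definitely absent"; all 1s means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A request handler would call might_contain(key) first and return immediately on False, so only keys that may exist ever reach Redis or the database. The filter never reports an added key as absent, but it can report an absent key as present, so the cache and database lookup is still required on True.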
Cache avalanche
Cache avalanche means that many cache entries were set with the same expiration time, so they all expire at the same moment. All requests are then forwarded to the DB, which may go down under the sudden pressure.
Solution: add a random value to the original expiration time so that expirations are spread out.
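Adding a random offset to the expiration time takes one line. The helper below is a hypothetical example; the base TTL of one hour and the jitter of up to five minutes used in the note after the code are arbitrary illustrative values.

```python
import random


def ttl_with_jitter(base_ttl, max_jitter, rng=random):
    """Return base_ttl plus a random extra of 0..max_jitter seconds."""
    return base_ttl + rng.randint(0, max_jitter)
```

Calling ttl_with_jitter(3600, 300) for each key yields expiration times spread between 3600 and 3900 seconds, so keys written in the same batch no longer expire in the same instant.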
Cache breakdown
Cache breakdown: a key expires at the moment when a large number of requests are querying it, so all of those requests fall through to the database. Cache breakdown concerns a key that exists but has expired in the cache, while cache penetration concerns a key that does not exist at all.
Solution: add a distributed lock. The first requesting thread gets the lock, queries the data, and then populates the cache. Threads that fail to acquire the lock wait 50 ms and then read the cache again, which prevents a large number of requests from hitting the database.
    // redis and db are placeholder clients; systemId and expire_secs are defined elsewhere
    public String get(String key) throws InterruptedException {
        String value = redis.get(key);
        if (value != null) {
            return value;
        }
        // cache value expired: try to take a per-key lock
        String unique_key = systemId + ":" + key;
        // SET unique_key 1 NX PX 30000: the lock times out after 30s
        if (redis.set(unique_key, "1", "NX", "PX", 30000) == 1) {
            // lock acquired: load from the database and refill the cache
            try {
                value = db.get(key);
                redis.set(key, value, expire_secs);
            } finally {
                redis.del(unique_key);
            }
            return value;
        }
        // another thread is already loading the value from the database;
        // wait, then retry reading the cache
        Thread.sleep(50);
        return get(key);
    }
The role of pipeline?
A Redis client executes a command in four steps: sending the command, queuing the command, executing the command, and returning the result. With pipeline, you can send requests in a batch and receive the results in a batch, which is faster than executing the commands one by one.
The number of commands packed into one pipeline should not be too large. Otherwise the amount of data grows, the client waits longer, and the network may become congested. A large batch can be split into several smaller pipelines.
Native batch commands (mset and mget) compared to pipeline:
Native batch commands are atomic, while a pipeline is not. If a pipeline aborts midway, the commands that already succeeded are not rolled back.
A native batch command is a single command, while a pipeline can contain many different commands.
Lua script
Redis creates atomic commands through Lua scripts: while a Lua script is running, no other script or Redis command is executed, so a combination of commands runs atomically.
There are two commands for executing Lua scripts in Redis: eval and evalsha. The eval command uses the built-in Lua interpreter to evaluate the Lua script.
    # the first parameter is the Lua script, the second is the number of key names,
    # and the rest are the key names followed by additional arguments
    > eval "return {KEYS[1], KEYS[2], ARGV[1], ARGV[2]}" 2 key1 key2 first second
    1) "key1"
    2) "key2"
    3) "first"
    4) "second"
The role of Lua script
1. A Lua script is executed atomically in Redis; no other command is interleaved during its execution.
2. A Lua script can package multiple commands into one call, which effectively reduces network overhead.
Application scenario
For example: limit the frequency of interface access.
Maintain a key-value pair for the number of times an interface is accessed in Redis, where key is the interface name and value is the number of visits. Each time an interface is accessed, the following actions are performed:
Intercept requests to the interface through AOP and count them. Each incoming request increments the interface's count by 1, and the count is stored in Redis.
If this is the first request, count is set to 1 and an expiration time is set. Because the combination of set() and expire() is not atomic, a Lua script is used to make it atomic and avoid problems under concurrent access.
If the maximum number of visits is exceeded within a given time range, an exception is thrown.
    private String buildLuaScript() {
        return "local c"
            + "\nc = redis.call('get', KEYS[1])"
            + "\nif c and tonumber(c) > tonumber(ARGV[1]) then"
            + "\nreturn c;"
            + "\nend"
            + "\nc = redis.call('incr', KEYS[1])"
            + "\nif tonumber(c) == 1 then"
            + "\nredis.call('expire', KEYS[1], ARGV[2])"
            + "\nend"
            + "\nreturn c;";
    }

    String luaScript = buildLuaScript();
    RedisScript<Number> redisScript = new DefaultRedisScript<>(luaScript, Number.class);
    Number count = redisTemplate.execute(redisScript, keys, limit.count(), limit.period());
PS: this way of rate limiting an interface is relatively simple and has many problems, so it is generally not used. Interface rate limiting mostly uses the token bucket algorithm or the leaky bucket algorithm.
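For comparison with the fixed-window counter above, a token bucket can be sketched as follows. This is a single-process illustration under stated assumptions: TokenBucket is a hypothetical class, the clock is injectable for testing, and a production limiter shared by several application servers would have to keep this state in Redis (for example inside a Lua script), which is not shown here.

```python
import time


class TokenBucket:
    """Single-process token bucket: `rate` tokens per second, burst `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self, cost=1):
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Unlike the counter, which lets a full quota through on each side of a window boundary, the bucket refills continuously. A bucket with rate=2 and capacity=3 accepts a burst of three requests, then about two more per second.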