Analysis of the main points of using Redis


This article explains in detail the main points of using Redis. It is quite practical, so it is shared here as a reference; I hope you get something out of it.

I. Introduction

Redis (Remote Dictionary Server) is an open-source, in-memory key-value store written in ANSI C. It supports networking, can persist data to disk, and provides client APIs for many programming languages.

Thanks to its fast startup, high execution efficiency, rich data structures, and support for persistence and clustering, Redis is used by many Internet companies. However, improper use and operation can lead to serious consequences such as wasted memory or even system downtime.

II. Analysis of key points

2.1 Use the correct data type

Of Redis's five basic data types, the string type is the most commonly used and the simplest. However, the fact that a string can solve the problem does not mean it is the right data type for the job.

For example, to save a user's information (name, age, city) to Redis, there are three scenarios:

Scenario 1: use the string type, with each property stored under its own key

set user:1:name laowang
set user:1:age 40
set user:1:city shanghai

Advantages: simple and intuitive; each attribute can be updated independently

Disadvantages: too many keys, high memory usage, poor aggregation of the user's information, and more troublesome management and maintenance

Scenario 2: use the string type, serializing the user's information into a single string

// serialize the user object into a string
String userInfo = serialize(user);
set user:1 userInfo

Advantages: simplified storage, with only one key per user

Disadvantages: serialization and deserialization add some overhead

Scenario 3: use the hash type, with one field-value pair per property but only a single key

hmset user:1 name laowang age 40 city shanghai

Advantages: simple and intuitive; used reasonably, it can reduce memory usage
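For concreteness, here is a rough sketch of scenarios 1 and 3 from application code, using the Jedis client as one illustrative choice (the dependency, connection details, and key names are assumptions, not something prescribed by Redis itself):

import java.util.HashMap;
import java.util.Map;
import redis.clients.jedis.Jedis;

public class UserStorageDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // connection details assumed
            // Scenario 1: one string key per property
            jedis.set("user:1:name", "laowang");
            jedis.set("user:1:age", "40");
            jedis.set("user:1:city", "shanghai");

            // Scenario 3: one hash key holding all properties
            Map<String, String> user = new HashMap<>();
            user.put("name", "laowang");
            user.put("age", "40");
            user.put("city", "shanghai");
            jedis.hset("user:1", user);

            // With the hash, updating a single attribute touches only one field
            jedis.hset("user:1", "age", "41");
        }
    }
}

Note that jedis.hset(String, Map) requires Jedis 3.0 or later; on older clients the (now deprecated) hmset method serves the same purpose.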

Summary: minimize the number of keys in Redis.

2.2 Be vigilant against big keys

A big key generally refers to a string whose value is very large (greater than 10 KB), or a hash, list, set, or sorted set with a very large number of elements (more than 5,000).

Big keys can have many negative effects on Redis:

Uneven memory distribution: in a cluster environment, a big key is allocated to a single node. Since you cannot control which node it lands on, and it occupies a large amount of memory there, it works against uniform memory management across the cluster.

Timeout blocking: because Redis processes commands on a single thread, operations on a big key are time-consuming and can easily block other requests

Expired deletion: a big key is slow not only to read and write but also to delete, and deleting an expired big key is likewise time-consuming.

Migration difficulties: because of the data size, backup, restore, and migration can easily block or fail.

Knowing the harm big keys cause, how do we identify and find them? redis-cli provides the --bigkeys option: running redis-cli --bigkeys scans the keyspace and reports the biggest keys.

After finding a big key, we usually split it into multiple small keys for storage. This seems to contradict the summary in 2.1, but every option has its advantages and disadvantages, and weighing them depends on the actual situation; see the sketch after this paragraph.
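As an illustration of the splitting idea, one common approach shards the fields of a big hash into a fixed number of smaller hashes by hashing the field name (a sketch only; the bucket count, key names, and Jedis client are all illustrative assumptions):

import redis.clients.jedis.Jedis;

public class BigKeySplitDemo {
    private static final int BUCKETS = 16; // illustrative shard count

    // Route a field of the oversized hash to one of BUCKETS smaller hashes
    private static String bucketKey(String field) {
        int bucket = (field.hashCode() & 0x7fffffff) % BUCKETS;
        return "user:profile:" + bucket;
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Write goes to a small hash instead of one huge one
            jedis.hset(bucketKey("name"), "name", "laowang");
            // Read computes the same bucket to locate the field
            System.out.println(jedis.hget(bucketKey("name"), "name"));
        }
    }
}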

Summary: minimize big keys in Redis.

Note: if you want to see how much memory a key occupies, use the MEMORY USAGE command. This command was only introduced in Redis 4.0, so you must upgrade to 4.0 or later to use it.
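For example, in redis-cli (the key and the reported byte count are illustrative):

127.0.0.1:6379> memory usage user:1
(integer) 122

The reply is the number of bytes the key and its value occupy, including internal overhead.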

2.3 Memory consumption

Even if we use the correct data types to store data and split big keys into small ones, memory consumption problems can still occur. So where does Redis memory consumption come from? Generally from the following three situations:

As the business grows, the amount of stored data keeps increasing (inevitable)

Invalid or expired data is not cleaned up in a timely manner (optimizable)

Cold data is never demoted out of the cache (optimizable)

Before optimizing situation 2, we need to understand why expired data is not cleaned up in time, which comes down to the three expiration-deletion strategies Redis can use:

Timed deletion: a timer is created for each key that has an expiration time set, and the key is deleted as soon as it expires

Lazy deletion: when a key is accessed, Redis checks whether it has expired and deletes it if so.

Periodic deletion: at regular intervals, Redis scans the dictionary of keys with expiration times and clears some of the expired ones

Because timed deletion requires creating a timer for every key, which consumes memory, and precisely deleting a large number of keys also consumes considerable CPU, Redis adopts a combination of lazy deletion and periodic deletion. The catch is that if no client ever requests an expired key and the periodic deletion pass never scans and clears it, the key keeps occupying memory, resulting in waste.

Once we know the cause of the memory consumption, an optimization quickly suggests itself: delete the data manually.

Even when a cache entry has an expiration time set, we manually call the DEL method/command once we are done with it. If it cannot be deleted on the spot, we can also run a timer in application code that deletes these expired keys on a schedule. Compared with Redis's two deletion strategies, manually clearing the data is much more timely; a sketch follows.
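A minimal sketch of this manual cleanup, again with Jedis (key names, TTLs, and the sweep interval are illustrative; note that a Jedis instance is not thread-safe, so the background task opens its own connection):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class ManualCleanupDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // TTL as a safety net, but delete explicitly once the value is consumed
            jedis.setex("report:tmp", 3600, "intermediate result");
            // ... use the cached value ...
            jedis.del("report:tmp"); // manual, immediate cleanup
        }

        // Fallback: an in-app timer that sweeps known stale keys periodically
        ScheduledExecutorService sweeper = Executors.newSingleThreadScheduledExecutor();
        sweeper.scheduleAtFixedRate(() -> {
            try (Jedis j = new Jedis("localhost", 6379)) { // Jedis is not thread-safe; use a dedicated connection
                j.del("report:tmp");
            }
        }, 10, 10, TimeUnit.MINUTES);
    }
}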

Situation 3 is a smaller problem; it can be addressed by adjusting Redis's eviction policy as part of the same optimization effort.

2.4 Executing multiple commands

Redis serves clients synchronously, one request and one response at a time. That is, when multiple clients send commands to the Redis server, the server receives and processes only one client's command at a time; the other clients must wait until the server has processed the current command and responded before their commands are received and processed.

Processing a command involves three stages: receiving the command, executing it, and returning the result. Because the data is processed in memory, execution is usually extremely fast, on the order of microseconds (big keys being the exception). Consequently, most of the time is spent receiving commands and returning results. When a client sends multiple commands to the Redis server, if one command takes a long time, the others can only wait, dragging down overall performance.

To solve this kind of problem, Redis provides pipelining. The client puts multiple commands into a pipeline, sends them to the Redis server in one go, and the server returns all the results in one batch once it has processed them. This reduces the number of interactions between the client and the Redis server, cutting round-trip time and improving performance.
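A minimal Jedis pipeline sketch (the key names and command count are illustrative):

import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline pipeline = jedis.pipelined();
            // Commands are buffered locally instead of paying a round trip each
            for (int i = 0; i < 1000; i++) {
                pipeline.set("key:" + i, "value:" + i);
            }
            // Flush the batch and collect all replies at once
            List<Object> replies = pipeline.syncAndReturnAll();
            System.out.println("replies received: " + replies.size());
        }
    }
}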

Note:

Comparing a Redis pipeline with native batch commands (such as MSET and HMSET):

Native batch commands are atomic; a pipeline is not

A native batch command executes one kind of command at a time (against multiple keys), whereas a pipeline can carry multiple different commands

Native batch commands are implemented on the server side, while pipelining requires cooperation from both the server and the client

Considerations when using a Redis pipeline:

Do not load too many commands into one pipeline, since the buffered commands and their replies consume memory

Commands in a pipeline are executed in the order they were buffered, but they may be interleaved with commands from other clients; that is, exclusive timing is not guaranteed

If one command in a pipeline fails, the subsequent commands still execute; that is, atomicity is not guaranteed.

2.5 Cache penetration

Our usual design idea for using caching in a project is as follows:

A request comes in to query data; the rule is to check the cache first, query the database if the cache has nothing, put the retrieved data into the cache, and finally return the data to the client. If the requested data does not exist at all, every such request ultimately hits the database. This is called cache penetration.
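The read path just described looks roughly like this (a sketch; queryDatabase stands in for the real data-access call, and the key prefix and TTL are illustrative):

import redis.clients.jedis.Jedis;

public class CacheAsideDemo {
    // Stand-in for the real database query; returns null when no row exists
    static String queryDatabase(String id) { return null; }

    public static String getUser(Jedis jedis, String id) {
        String cacheKey = "user:" + id;
        String value = jedis.get(cacheKey);    // 1. check the cache first
        if (value != null) {
            return value;                      // cache hit
        }
        value = queryDatabase(id);             // 2. fall back to the database
        if (value != null) {
            jedis.setex(cacheKey, 600, value); // 3. populate the cache
        }
        // if value is still null, the request "penetrated" through to the database
        return value;
    }
}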

Cache penetration carries a serious security risk. If someone uses a tool to send a large number of requests for data that does not exist, all of those requests flow through to the database, increasing its load and possibly bringing it down, which in turn disrupts the whole application and can paralyze the system.

To solve this kind of problem, the focus is on reducing access to the database. The usual options are:

Cache preheating: after the system is released and goes live, load the relevant data into the cache in advance.

Set a default value: if a request reaches the database and the database finds no data, store a default value in the cache under that key. Note: since this default value is meaningless, set an expiration time on it to limit the memory footprint.

Bloom filter: hash all possibly existing data into a large enough bitmap; a request for data that does not exist will definitely be intercepted by the bitmap (see the sketch below).
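As a sketch of the Bloom filter option, Guava's BloomFilter is one convenient implementation (the expected-insertion count and false-positive rate are illustrative):

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomGuardDemo {
    public static void main(String[] args) {
        // Size for ~1,000,000 ids with a 1% false-positive rate
        BloomFilter<String> existingIds = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

        // Preload every id that actually exists (e.g., from the database)
        existingIds.put("user:1");

        // Reject requests for ids the filter has definitely never seen,
        // before they reach the cache or the database
        if (!existingIds.mightContain("user:999")) {
            System.out.println("definitely absent, request rejected");
        }
    }
}

A Bloom filter can return false positives but never false negatives, which is exactly the property needed here: anything it rejects is guaranteed not to exist.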

2.6 Cache avalanche

Cache avalanche: simply put, a large number of requests fail to find their data in the cache and fall through to the database, increasing its load, degrading performance, and potentially overloading it to the point of downtime, which disrupts the normal operation of the whole system or even paralyzes it.

For example, suppose a complete system consists of three subsystems, A, B, and C, with the data request chain system A -> system B -> system C -> database. If the cache holds no data and the database is down, system C cannot get the data it needs and can only retry and wait, which in turn stalls system B and system A. One anomalous node triggers a cascade of problems, like an avalanche set off on a snowy mountain by a gust of wind.

Reading this far, some readers may wonder: what is the difference between cache penetration and a cache avalanche?

Cache penetration emphasizes that the requested data is not in the cache, so requests go to the database as if they had passed straight through the cache.

A cache avalanche emphasizes the chain of failures triggered when a high volume of requests cannot find their data in the cache, driving up pressure on the database.

To solve the cache avalanche problem, we again need to know what causes it:

There is a problem with Redis itself

A large set of hotspot data expires at the same time

For cause 1, we can deploy a master-replica setup or a cluster for availability, and serve data from the cache as much as possible to reduce database access.

For cause 2, stagger the expiration times when setting them on cache entries (for example, add or subtract a random offset from the base TTL) so the entries do not all expire at once. We can also add a local cache (such as Ehcache), rate-limit the interface, or degrade the service to ease the pressure of database access. A sketch of the staggering trick follows below.
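A sketch of staggering expiration times (the base TTL and offset range are illustrative):

import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class StaggeredTtlDemo {
    // 10-minute base TTL plus a random offset of up to 5 minutes
    static int staggeredTtl() {
        return 600 + ThreadLocalRandom.current().nextInt(300);
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            for (int i = 0; i < 100; i++) {
                // Hotspot entries now expire at scattered times instead of all at once
                jedis.setex("hot:item:" + i, staggeredTtl(), "value:" + i);
            }
        }
    }
}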

This concludes "Analysis of the main points of using Redis". I hope the content above has been of some help and has taught you something new. If you found the article useful, please share it so more people can see it.
