2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains in detail how to apply the five data types in Redis, step by step, and is intended to clear up common questions about when to use each of them.
Problems with the MySQL + Memcached architecture
MySQL itself is well suited to storing massive amounts of data. Many companies have used this architecture, loading hot data into a Memcached cache to speed up access. However, as the volume of business data and traffic keeps growing, a number of problems appear:
1. MySQL has to be split into more databases and tables again and again, and Memcached has to be scaled out along with it. Scaling and maintenance take up a lot of development time.
2. Data consistency between Memcached and the MySQL database.
3. When the Memcached hit rate is low or a node goes down, a large number of requests pass straight through to the DB, and MySQL cannot cope.
4. Cache synchronization across data centers.
With so many NoSQL products flourishing, how do you choose?
In recent years many kinds of NoSQL products have emerged in the industry, so how to use them correctly and make the most of their strengths is something we need to study carefully. In the end, the most important thing is to understand the positioning of each product and the tradeoffs it makes, so that in practice you can play to its strengths and avoid its weaknesses. Generally speaking, these NoSQL products are mainly used to solve the following problems.
1. Small amounts of data with high-speed read and write access. This kind of product guarantees fast access by keeping all data in memory, while also providing a way to persist data to disk. This is in fact the most important scenario for Redis.
2. Massive data storage, distributed system support, data consistency guarantees, and convenient adding / removing of cluster nodes. The most representative ideas here are those described in the Dynamo and Bigtable papers. The former is a completely decentralized design in which cluster information is passed between nodes through gossip to guarantee eventual consistency. The latter is a centralized design that guarantees strong consistency through a distributed lock service. Writes go first to memory and a redo log, and periodic compaction then merges them to disk, turning random writes into sequential writes to improve write performance.
3. Schema-free, auto-sharding, and so on. For example, some common document databases are schema-free, store data directly in JSON format, and support features such as auto-sharding. MongoDB is one example.
Redis is best suited to scenarios where all data is held in memory. Although Redis also provides persistence, it is really more of a disk-backed feature and quite different from persistence in the traditional sense. That may raise a question: Redis looks like an enhanced version of Memcached, so when should you use Memcached and when should you use Redis?
If you simply compare Redis and Memcached, most people arrive at the following points:
1. Redis not only supports simple key/value data, but also provides data structures such as list, set, zset, and hash.
2. Redis supports data backup, that is, backup through master-slave replication.
3. Redis supports data persistence. It can keep in-memory data on disk and load it again on restart.
Let's look at how these different data types are described in Redis's internal memory management:
First of all, Redis uses a redisObject to represent every key and value. The main fields of redisObject are shown in the figure above: type is the specific data type of the value object, and encoding is how that data type is stored inside Redis. For example, type=string means the value holds an ordinary string, and the corresponding encoding can be raw or int. If it is int, Redis actually stores and represents the string internally as a number, provided of course that the string itself can be represented as a number, such as "123" or "456".
The vm field deserves a special note. Memory is only actually allocated for this field when Redis's virtual memory feature is enabled, and that feature is off by default. From the figure above we can see that using a redisObject for every key/value is fairly wasteful of memory. This memory management overhead exists mainly to give Redis's different data types a unified management interface. The Redis author also provides several ways to save as much memory as possible, which are discussed in detail later.
Redis supports five data types: string, hash, list, set, and zset (sorted set).
① string is the most basic type in Redis. You can think of it as exactly the same model as Memcached: one key maps to one value. The value can be a string or a number. The string type is binary safe, which means a Redis string can contain any data, such as JPG images or serialized objects. A string value can hold at most 512 MB.
Common commands: get, set, incr, decr, mget, etc.
Application scenario: String is the most commonly used data type. Ordinary key/value storage falls into this category, so it can fully cover what Memcached does today, and more efficiently. You also get Redis's scheduled persistence, operation log, replication, and other features. In addition to the same get, set, incr, decr, and similar operations that Memcached offers, Redis also provides the following operations:
Get the length of a string
Append content to a string
Set and get a segment of a string
Set and get a single bit of a string
Set the contents of a series of strings in one batch
Usage scenarios: regular key-value caching; regular counters such as the number of Weibo posts or the number of followers.
Implementation: a String is stored in Redis as a string by default and is referenced by a redisObject. When it encounters operations such as incr or decr, it is converted to a number for the calculation. In that case the encoding field of the redisObject is int.
redis 127.0.0.1:6379> SET name "runoob"
OK
redis 127.0.0.1:6379> GET name
"runoob"
In the above example, we used Redis's SET and GET commands. The key is name and the corresponding value is runoob.
Note: the value stored under a single key can be at most 512 MB.
② A Redis hash is a collection of key-value pairs (key => value). It is a mapping table of string fields to string values, and it is particularly suitable for storing objects.
Common commands: hget, hset, hgetall, etc.
Application scenario: here is a simple example of where Hash is useful. Suppose we want to store a user information object that contains the following information:
The user ID is the lookup key, and the stored value, the user object, contains name, age, birthday, and other information. With an ordinary key/value structure there are two main ways to store it:
The first method uses the user ID as the lookup key and stores the other information as one serialized object. The drawbacks are the added cost of serialization / deserialization and the fact that changing a single attribute requires fetching the whole object. The modification also has to be protected against concurrent updates, which introduces complications such as CAS.
The second method stores one key-value pair per attribute of the user object, using the user ID plus the attribute name as the unique key for that attribute's value. This removes the serialization overhead and the concurrency problem, but the user ID is stored repeatedly. With a large amount of such data the wasted memory is considerable.
The Hash provided by Redis solves this problem well. A Redis Hash stores its value internally as a HashMap and provides an interface for accessing the members of that map directly.
In other words, the key is still the user ID and the value is a Map. The keys of this Map are the attribute names and its values are the attribute values, so data can be modified and read directly through the key of the internal Map (Redis calls this inner key a field). That is, an attribute can be manipulated through key (user ID) + field (attribute name). The data does not need to be stored repeatedly, and neither serialization nor concurrent-modification control becomes a problem.
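The key + field access pattern described above can be sketched as follows. This is a minimal sketch, not the article's own code: it assumes `r` is a redis-py style client exposing `hset` and `hget`, and the `user:<id>` key format and the helper names are illustrative assumptions.

```python
def user_key(user_id):
    """Build the Redis key for one user's hash (the 'user:<id>' format is an assumption)."""
    return f"user:{user_id}"


def save_user(r, user_id, attrs):
    """Store every attribute of a user as a field of a single hash."""
    return r.hset(user_key(user_id), mapping=attrs)


def update_user_field(r, user_id, field, value):
    """Change one attribute without fetching or re-serializing the whole object."""
    return r.hset(user_key(user_id), field, value)


def get_user_field(r, user_id, field):
    """Read one attribute through key (user ID) + field (attribute name)."""
    return r.hget(user_key(user_id), field)
```

For example, `update_user_field(r, 1001, "age", "31")` touches only the `age` field of `user:1001`, which is the behaviour that neither of the two plain key/value layouts offers.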
Note also that Redis provides hgetall to fetch all attribute data at once, but if the internal Map has many members this means traversing the entire Map. Because Redis is single-threaded, the traversal can take a long time, and during that time requests from other clients get no response at all. This requires special attention.
Usage scenarios: storing data that changes partially, such as user information.
Implementation: as mentioned above, the value of a Redis Hash is internally a HashMap. In fact there are two different implementations. When the Hash has few members, Redis saves memory by using a compact layout similar to a one-dimensional array instead of a real HashMap, and the encoding of the corresponding redisObject is zipmap. When the number of members grows, it is automatically converted to a real HashMap and the encoding becomes ht.
redis> HMSET myhash field1 "Hello" field2 "World"
"OK"
redis> HGET myhash field1
"Hello"
redis> HGET myhash field2
"World"
In the example we use the Redis HMSET and HGET commands. HMSET sets two field => value pairs, and HGET gets the value of the given field. Each hash can store 2^32 - 1 key-value pairs (over 4 billion).
③ A Redis list is a simple list of strings, ordered by insertion order. You can add an element to the head (left) or the tail (right) of the list.
Common commands: lpush (add an element on the left), rpush, lpop (remove the first element on the left), rpop, lrange (get a slice of the list: LRANGE key start stop), and so on.
Application scenarios: Redis lists have many uses and are one of the most important data structures in Redis. For example, Twitter's following list, follower list, and similar features can be implemented with the Redis list structure.
A List is a linked list, and anyone with a little knowledge of data structures should be able to understand it. With a List we can easily implement features such as a ranking of the latest messages. Another application of List is message queuing.
You can use the PUSH operation of List to store tasks in the list, and worker threads then use the POP operation to take tasks out and execute them. Redis also provides an API for operating on a segment of a List, so you can directly query or delete the elements of a given segment.
Implementation: a Redis list is implemented as a doubly linked list, so it supports reverse lookup and traversal. This makes it more convenient to operate on but brings some extra memory overhead. Many internal parts of Redis, including the send buffer queues, also use this data structure.
Every element of a Redis list is of type String. You can add or remove elements at the head or tail of the list with push and pop operations, so a List can be used as either a stack or a queue. Access to elements near either end is fast, but access by index is slower.
Usage scenarios:
Message queue system: you can build a queue system with list, and even a priority queue system with sorted set. For example, when Redis is used as a log collector it is really a queue: multiple endpoints write log entries to Redis, and a worker then writes all the logs to disk.
Fetching the latest N items: record the ID list of the latest N logged-in users, and fetch anything outside that range from the database.
# add the current login to the list
ret = r.lpush("login:last_login_times", uid)
# trim the list so that it keeps only the latest N entries
ret = r.ltrim("login:last_login_times", 0, N-1)
# get the ID list of the latest N logged-in users
last_login_list = r.lrange("login:last_login_times", 0, N-1)
For example, Weibo:
The latest Weibo IDs are kept in Redis as a resident cache that is always being updated. We cap it at 5000 IDs, so our ID-fetching function always asks Redis first and only needs to access the database when the start/count parameters fall outside this range. The system never "flushes" the cache in the traditional way, and the information in the Redis instance is always consistent. The SQL database (or another type of on-disk database) is only touched when a user asks for "very old" data, while the home page or the first page of comments never bothers the on-disk database.
redis 127.0.0.1:6379> lpush runoob redis
(integer) 1
redis 127.0.0.1:6379> lpush runoob mongodb
(integer) 2
redis 127.0.0.1:6379> lpush runoob rabitmq
(integer) 3
redis 127.0.0.1:6379> lrange runoob 0 10
1) "rabitmq"
2) "mongodb"
3) "redis"
redis 127.0.0.1:6379>
A list can store up to 2^32 - 1 elements (4294967295, more than 4 billion per list).
④ A Redis set is an unordered collection of strings. Sets are implemented with a hash table, so adding, deleting, and looking up members are all O(1). The concept is basically the same as a mathematical set: you can take intersections, unions, and differences, and the elements have no order.
Sadd command: adds a string element to the set stored at key and returns 1 on success, or 0 if the element is already in the set. If the key does not exist, a new set is created. An error is returned only if the key holds a value that is not a set.
Common commands: sadd, spop, smembers, sunion, etc.
Application scenario: what a Redis set offers externally is similar to a list, except that a set de-duplicates automatically. When you need to store a list of data and do not want duplicates, set is a good choice. Set also provides an important operation for testing whether a member is in the set, which list does not.
A Set is a collection of non-repeating values. With the Set data structure provided by Redis you can store data that is naturally a collection.
Case: in Weibo, you can store all the accounts a user follows in one set and all of their followers in another. Redis also provides intersection, union, difference, and other set operations, which makes it easy to implement features such as mutual follows, shared interests, and second-degree friends. For all of these set operations you can also choose, by using different commands, whether to return the result to the client or to save it into a new set.
Implementation: internally a set is a HashMap whose values are always null. It de-duplicates quickly by computing hashes, which is also why a set can tell whether a member is in the collection.
Usage scenarios:
① Intersection, union, difference (Set)
# the book table stores book names
redis.set("book:1:name", "The Ruby Programming Language")
redis.set("book:2:name", "Ruby on rail")
redis.set("book:3:name", "Programming Erlang")
# the tag table uses sets, because sets are good at intersection and union
redis.sadd("tag:ruby", 1)
redis.sadd("tag:ruby", 2)
redis.sadd("tag:web", 2)
redis.sadd("tag:erlang", 3)
# books that belong to both ruby and web
inter_list = redis.sinter("tag:web", "tag:ruby")
# books that belong to ruby but not to web
diff_list = redis.sdiff("tag:ruby", "tag:web")
# books that belong to ruby or web
union_list = redis.sunion("tag:ruby", "tag:web")
② De-duplicating all the data collected over a period of time
The Redis set structure is the best fit here: just keep adding the data to the set. Because it is a set, duplicates are removed automatically.
sadd key member
redis 127.0.0.1:6379> sadd runoob redis
(integer) 1
redis 127.0.0.1:6379> sadd runoob mongodb
(integer) 1
redis 127.0.0.1:6379> sadd runoob rabitmq
(integer) 1
redis 127.0.0.1:6379> sadd runoob rabitmq
(integer) 0
redis 127.0.0.1:6379> smembers runoob
1) "redis"
2) "rabitmq"
3) "mongodb"
Note: rabitmq is added twice in the example above, but because members of a set are unique, the second insert is ignored. The maximum number of members in a set is 2^32 - 1 (4294967295, more than 4 billion per set).
⑤ A Redis zset, like a set, is a collection of string elements in which duplicate members are not allowed.
Zadd command: adds an element to the collection, or updates its score if the element already exists.
Common commands: zadd, zrange, zrem, zcard, etc.
Usage scenario: a Redis sorted set is used much like a set. The difference is that a set has no order, while a sorted set orders its members by an additional score parameter and keeps them ordered on insertion, that is, it sorts automatically. When you need a collection that is both ordered and free of duplicates, choose the sorted set. For example, Twitter's public timeline can be stored with the publication time as the score, so that reads come back already sorted by time. Compared with Set, Sorted Set associates a weight parameter score of type double with each member, and Redis orders the members from the lowest score to the highest. Members of a zset are unique, but scores may repeat. For example, in a Sorted Set that stores the exam results of a whole class, the member can be the student number and the score can be the exam result, so the data is sorted as soon as it is inserted. Sorted Set can also be used to build a weighted queue: give ordinary messages a score of 1 and important messages a score of 2, and have worker threads fetch tasks in descending order of score so that important tasks are handled first.
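The weighted queue idea can be sketched as follows. This is a hedged illustration rather than a reference implementation: it assumes `r` is a redis-py style client, that the server supports ZPOPMAX (Redis 5.0 or later), and the key name `tasks` and both helper names are made up for the example.

```python
TASKS_KEY = "tasks"  # hypothetical key name for the weighted queue


def enqueue(r, task_id, priority):
    """Add a task; a higher score means more important (e.g. 1 = normal, 2 = important)."""
    return r.zadd(TASKS_KEY, {task_id: priority})


def dequeue(r):
    """Pop the task with the highest score, or return None when the queue is empty."""
    popped = r.zpopmax(TASKS_KEY)  # list of (member, score) pairs, at most one here
    if not popped:
        return None
    member, _score = popped[0]
    return member
```

One design caveat: members with the same score are ordered lexicographically, not by arrival time, so this sketch gives priority ordering but not first-in-first-out ordering within a priority level.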
Implementation: internally a Redis sorted set uses a HashMap and a skip list (SkipList) to store the data and keep it ordered. The HashMap holds the member-to-score mapping, while the skip list holds all the members, sorted by the scores stored in the HashMap. The skip list gives high lookup efficiency and is relatively simple to implement.
zadd key score member
redis 127.0.0.1:6379> zadd runoob 0 redis
(integer) 1
redis 127.0.0.1:6379> zadd runoob 0 mongodb
(integer) 1
redis 127.0.0.1:6379> zadd runoob 0 rabitmq
(integer) 1
redis 127.0.0.1:6379> zadd runoob 0 rabitmq
(integer) 0
redis 127.0.0.1:6379> ZRANGEBYSCORE runoob 0 1000
1) "mongodb"
2) "rabitmq"
3) "redis"
Application scenarios for each data type:

String (string)
  Introduction: binary safe
  Features: can contain any data, such as JPG images or serialized objects; a single key can store at most 512 MB
  Scenarios: ---

Hash (dictionary)
  Introduction: a collection of key-value pairs, i.e. the Map type of a programming language
  Features: suitable for storing objects; a single attribute value can be modified on its own, like an UPDATE in a database (in Memcached you have to fetch the whole string, deserialize it into an object, modify it, and serialize it back)
  Scenarios: storing, reading, and modifying user attributes

List (list)
  Introduction: linked list (doubly linked list)
  Features: fast insertion and deletion; provides an API for operating on a segment of elements
  Scenarios: 1. latest-message rankings and similar features (such as the Moments timeline); 2. message queues

Set (set)
  Introduction: implemented with a hash table; elements do not repeat
  Features: 1. add, delete, and lookup are all O(1); 2. provides intersection, union, difference, and other set operations
  Scenarios: 1. mutual friends; 2. using uniqueness to count all the unique IPs that visit a site; 3. friend recommendation: intersect by tag and recommend when the overlap exceeds a threshold

Sorted Set (ordered set)
  Introduction: adds a weight parameter score to each element of a Set; elements are ordered by score
  Features: data is already sorted when it is inserted into the collection
  Scenarios: 1. leaderboards; 2. weighted message queues
Practical application scenarios of Redis
Redis differs from other database solutions in many ways: it uses memory as its primary storage and the hard disk only for persistence; its data model is quite distinctive; and it is single-threaded. Another big difference is that you can use Redis's features in your environment without having to switch your whole system over to Redis.
Switching to Redis is of course possible, and many developers make Redis their database of choice from the start. But if your environment is already set up and applications are already running on it, changing the database framework is obviously not so easy. In addition, Redis is not suitable for applications that need very large data sets, because its data set cannot exceed the memory available to the system. So if you have a big data application with a mostly read access pattern, Redis is not the right choice.
What I like about Redis, however, is that you can add it to your existing system and use it to solve a lot of problems, such as tasks that your existing database handles slowly. You can optimize these with Redis, or build new features for the application. In this article I want to explore how to add Redis to an existing environment and use its primitive commands and other features to solve some common problems found in traditional environments. In none of these examples is Redis the primary database.
1. Displaying a list of the latest items
The following statement is often used to show the latest items, and as the data grows the query will no doubt get slower and slower.
SELECT * FROM foo WHERE ... ORDER BY time DESC LIMIT 10
In Web applications, queries such as "list the latest replies" are very common, and they usually lead to scalability problems. This is frustrating, because the items are created in exactly this order, yet a sort has to be performed to output them in it.
Problems like this can be solved with Redis. Suppose one of our Web applications wants to list the latest 20 comments posted by users. Next to the latest comments there is a "Show all" link, and clicking it shows more comments.
Let's assume that every comment in the database has a unique, incrementing ID field. We can build both the home page and the paginated comments page with the following Redis pattern. Each time a new comment is posted, we add its ID to a Redis list:
LPUSH latest.comments <ID>
We trim the list to a fixed length, so Redis only keeps the latest 5000 comments:
LTRIM latest.comments 0 4999
Every time we need a range of items from the latest comments, we call a function to do it (in pseudo code):
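FUNCTION get_latest_comments(start, num_items):
    id_list = redis.lrange("latest.comments", start, start+num_items-1)
    IF id_list.length < num_items
        id_list = SQL_DB("SELECT ... ORDER BY time LIMIT ...")
    END
    RETURN id_list
END
What we do here is simple. The latest IDs are kept in Redis as a resident cache that is always being updated. We cap it at 5000 IDs, so the ID-fetching function always asks Redis first and only needs to access the database when the start/count parameters fall outside this range. The system never "flushes" the cache in the traditional way, and the information in the Redis instance is always consistent. The SQL database (or another type of on-disk database) is only touched when a user asks for "very old" data, while the home page or the first page of comments never bothers the on-disk database.
2. Deletion and filtering
We can use LREM to delete comments. If deletions are rare, another option is simply to skip the entry for that comment and report that the comment no longer exists.
redis 127.0.0.1:6379> LREM KEY_NAME COUNT VALUE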
Sometimes you want to attach different filters to different lists. If the number of filters is limited, you can simply use a separate Redis list for each filter. After all, each list holds only 5000 items, and Redis can handle millions of items with very little memory.
3. Leaderboards and related problems
Another common need, and one that databases whose data is not held in memory handle poorly, is sorting by score with near-real-time updates, where the scores change almost every second.
A typical example is the leaderboard of an online game, such as a Facebook game, where you usually want to:
- list the top 100 players by score
- show a user's current global rank
These operations are trivial for Redis, even if you have millions of users and millions of new scores every minute.
The pattern is that every time we get a new score, we run:
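ZADD leaderboard <score> <username>
You may want to replace username with a user ID, depending on your design.
Getting the top 100 users by score is simple: ZREVRANGE leaderboard 0 99.
A user's global rank is obtained in a similar way: ZRANK leaderboard <username>. (ZRANK counts from the lowest score; use ZREVRANK if the rank should count from the highest score, matching ZREVRANGE.)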
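The leaderboard pattern can be sketched with a redis-py style client as follows. This is a minimal sketch under stated assumptions: `r` is assumed to expose `zadd`, `zrevrange`, and `zrevrank`, the helper names are illustrative, and ZREVRANK is used in place of ZRANK so that rank 1 is the highest score.

```python
LEADERBOARD = "leaderboard"


def record_score(r, username, score):
    """ZADD leaderboard <score> <username>: insert the user or update their score."""
    return r.zadd(LEADERBOARD, {username: score})


def top_players(r, n=100):
    """ZREVRANGE leaderboard 0 n-1 WITHSCORES: the n highest scores, best first."""
    return r.zrevrange(LEADERBOARD, 0, n - 1, withscores=True)


def global_rank(r, username):
    """1-based rank counted from the highest score, or None for an unknown user."""
    rank = r.zrevrank(LEADERBOARD, username)  # 0-based, None when the member is missing
    return None if rank is None else rank + 1
```

Because the sorted set keeps members ordered as scores change, neither function needs to sort anything at read time.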
4. Sorting by user votes and time
A common variant of the leaderboard is the pattern used by Reddit or Hacker News, where news items are ranked by a score computed with a formula similar to this:
score = points / time^alpha
User votes therefore push a news item up, while time buries it at a rate set by the exponent. This is just our model; the actual algorithm is up to you.
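As a worked illustration of the formula, here is a small sketch in Python. The function name, the choice of hours as the time unit, and the default `alpha = 1.5` are assumptions for the example and do not come from the article.

```python
def rank_score(points, age_hours, alpha=1.5):
    """Compute score = points / time ** alpha.

    age_hours is the item's age; alpha controls how fast old items sink.
    Both the unit and the default alpha are illustrative assumptions.
    """
    if age_hours <= 0:
        raise ValueError("age_hours must be positive")
    return points / (age_hours ** alpha)
```

With `alpha = 1.5`, an item with 8 points that is 4 hours old scores 8 / 4^1.5 = 8 / 8 = 1.0, while the same 8 points at 1 hour old score 8.0, so age quickly outweighs votes.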
The pattern is to start by looking only at items that are likely to be recent. For example, the latest 1000 news items are the candidates for the front page, so we ignore everything else, which is easy to implement.
Each time a new news item is posted, we add its ID to a list and use LPUSH + LTRIM to make sure the list only keeps the latest 1000 items.
A background task fetches this list and continuously computes the final score of each of the 1000 news items. The results are written with ZADD into a sorted set in the new order, and old news is removed. The key idea here is that the sorting is done by a background task.
5. Handling expired items
Another common way to order items is by time. We can use the unix time as the score.
The pattern is as follows:
- Every time a new item is added to our non-Redis database, we also add it to a sorted set. As the score we use a time value: current_time + time_to_live.
- A background task queries the sorted set with ZRANGE ... WITHSCORES and takes the first 10 items. If an item's unix time is already in the past, the entry is deleted from the database.
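The two steps above can be sketched like this. It is an illustration under assumptions, not the article's implementation: `r` is assumed to be a redis-py style client, the key name `items:expiry` and the `delete_from_db` callback are hypothetical, and the sketch uses ZRANGEBYSCORE (bounded by the current time) instead of ZRANGE ... WITHSCORES so that only already-expired items are returned.

```python
import time

EXPIRY_KEY = "items:expiry"  # hypothetical key name for the sorted set


def track_item(r, item_id, time_to_live, now=None):
    """Add the item with its expiry time (current_time + time_to_live) as the score."""
    now = time.time() if now is None else now
    return r.zadd(EXPIRY_KEY, {item_id: now + time_to_live})


def purge_expired(r, delete_from_db, now=None, batch=10):
    """Background step: delete up to `batch` items whose expiry time has passed."""
    now = time.time() if now is None else now
    expired = r.zrangebyscore(EXPIRY_KEY, "-inf", now, start=0, num=batch)
    for item_id in expired:
        delete_from_db(item_id)       # hypothetical callback into the non-Redis database
        r.zrem(EXPIRY_KEY, item_id)   # stop tracking the item in Redis
    return expired
```

A background job would call `purge_expired` in a loop; the `now` parameter exists only to make the functions easy to test with fixed timestamps.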
6. Counting
Redis is a good counter, thanks to INCRBY and similar commands.
You have probably wanted many times to add new counters to a database to gather statistics or display new information, only to give up in the end because they were too write-intensive.
With Redis you no longer need to worry. With atomic increments you can safely add all kinds of counters, reset them with GETSET, or let them expire.
For example, do this:
INCR user:<id>
EXPIRE user:<id> 60
This counts the page views in a user's most recent run of visits with pauses of no more than 60 seconds between pages. When the count reaches, say, 20, you can display a banner or anything else you want to show.
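The INCR + EXPIRE pattern can be sketched as follows, assuming `r` is a redis-py style client with `incr` and `expire`. The helper names and the threshold of 20 are illustrative assumptions.

```python
def count_page_view(r, user_id, window_seconds=60):
    """Run INCR user:<id> then EXPIRE user:<id> 60 and return the running count."""
    key = f"user:{user_id}"
    views = r.incr(key)            # atomic increment; a missing key starts from 0
    r.expire(key, window_seconds)  # restart the countdown on every page view
    return views


def should_show_banner(views, threshold=20):
    """True once the user's current run of page views reaches the threshold."""
    return views >= threshold
```

Because every view resets the expiry, the counter only disappears after the user has been idle for the whole window, which is what makes it a count of the "recent run" rather than a count per fixed minute.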
7. Unique items within a given time window
Another thing that is hard for other databases but easy for Redis is counting how many distinct users accessed a particular resource during a particular period of time. For example, I want to know how many distinct registered users or IP addresses have accessed an article.
Every time I get a new page view, I just need to do this:
SADD page:day1:<page_id> <user_id>
Of course, you may want to replace day1 with a unix time, such as time() - (time() % (3600*24)).
Want to know the number of unique users? Just use:
SCARD page:day1:<page_id>
Need to test whether a particular user has visited this page?
SISMEMBER page:day1:<page_id> <user_id>
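Putting the three commands together with a unix-time day bucket, a hedged sketch might look like this. It assumes `r` is a redis-py style client with `sadd`, `scard`, and `sismember`; the helper names and the `page:<day>:<page_id>` key layout are illustrative, and the bucket is the start of the current UTC day computed as time() - (time() % (3600*24)).

```python
import time

SECONDS_PER_DAY = 3600 * 24


def day_bucket(now=None):
    """Unix time of the start of the current UTC day."""
    now = int(time.time()) if now is None else int(now)
    return now - (now % SECONDS_PER_DAY)


def page_key(page_id, now=None):
    """Key of the per-day visitor set for one page (layout is an assumption)."""
    return f"page:{day_bucket(now)}:{page_id}"


def record_view(r, page_id, user_id, now=None):
    """SADD: returns 1 for a first visit today, 0 for a repeat visit."""
    return r.sadd(page_key(page_id, now), user_id)


def unique_visitors(r, page_id, now=None):
    """SCARD: number of distinct users seen today."""
    return r.scard(page_key(page_id, now))


def has_visited(r, page_id, user_id, now=None):
    """SISMEMBER: has this user seen the page today?"""
    return bool(r.sismember(page_key(page_id, now), user_id))
```

Each day gets its own set, so old days can be dropped or given an expiry independently of the current one.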
8. Real-time analysis of what is happening, for statistics, spam prevention, and so on
We have only shown a few examples, but if you study the Redis command set and combine the commands, you can build a large number of real-time analysis methods that are both effective and cheap. With Redis primitive commands it becomes easier to implement a spam filtering system or other real-time tracking systems.
9. Pub/Sub
Redis's Pub/Sub is very simple, stable, and fast. It supports pattern matching and lets clients subscribe to and unsubscribe from channels in real time.
10. Queue
You will have noticed that Redis commands such as list push and list pop make queue operations easy, but they can do more than that: for example, Redis also has a variant of list pop that blocks when the list is empty.
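A queue built on push plus the blocking pop variant can be sketched like this. It assumes `r` is a redis-py style client with `rpush` and `blpop`; the queue name `queue:tasks` and the helper names are hypothetical.

```python
def submit_task(r, payload, queue="queue:tasks"):
    """Producer side: append a task to the tail of the list."""
    return r.rpush(queue, payload)


def work_once(r, handle, queue="queue:tasks", timeout=5):
    """Consumer side: wait up to `timeout` seconds for a task and handle it.

    Returns True if a task was handled, False if the wait timed out.
    """
    item = r.blpop(queue, timeout=timeout)  # None on timeout, else (queue_name, payload)
    if item is None:
        return False
    _queue_name, payload = item
    handle(payload)
    return True
```

A worker would typically call `work_once` in a loop. Pushing on the right and popping on the left gives first-in-first-out order, and the blocking pop means an idle worker waits on the server instead of polling.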
Message queuing is widely used in modern Internet applications. Message queues are used not only for communication between components inside a system, but also for interaction between the system and other services. Using message queues can improve a system's scalability, flexibility, and user experience. In a system that is not based on message queues, overall speed depends on the slowest component. With a message queue the components can be decoupled, so the system is no longer bound by its slowest component and each component can run asynchronously and finish its own work faster.
In addition, when the server is under highly concurrent load, such as frequently writing log files, you can use a message queue to process the work asynchronously and so achieve high-performance concurrent operation.
11. Caching
Caching with Redis deserves an article of its own, so I will only mention it briefly here. Redis can replace memcached, turning your cache from something that only stores data into something that can also update data, so you no longer need to regenerate the data every time.
That concludes this introduction to how to apply the five data types in Redis. To really master the points covered here, you still need to practice and use them yourself.