What's the difference between Redis and Memcached?

2025-01-18 Update From: SLTechnology News&Howtos




Memcached and redis have been the most commonly used cache servers in recent years, and everyone is familiar with them. Two years ago, when I was still in school, I read through their main source code. Now I am writing up a note comparing their implementations from a personal point of view, partly as a review for myself. If there are misunderstandings, corrections are welcome.

Most of the architecture diagrams used in this article come from the internet, and some of them differ from the latest implementations; this is pointed out in the text where it matters.

1. Summary

To read the source code of a piece of software, you first have to understand what it is for. So what are memcached and redis for? As we all know, data generally lives in a database, but querying it is relatively slow, especially when there are many users and queries are frequent. What can be done? Where can data be queried quickly? In memory. Memcached and redis store data in memory and query it by key-value, which greatly improves efficiency. They are therefore generally used as cache servers: they cache commonly used data so that it can be fetched directly when queried, reducing the number of database queries and improving query efficiency.

2. Service mode

How do memcached and redis provide their service? They are standalone processes, which can be turned into daemons if needed, so a user process that wants to use them has to communicate across process boundaries. Since the user process and memcached or redis are not necessarily on the same machine, communication across machines must also be supported. Memcached and redis are therefore network servers in their own right, and user processes exchange data with them over the network; the simplest and most common transport is TCP. Both memcached and redis also support the UDP protocol, and when the user process runs on the same machine as the server, unix domain sockets can be used as well.

3. Event model

Now let's look at exactly how they are implemented, starting with their event models.

Since epoll appeared, almost all network servers have abandoned select and poll in favor of it. Redis is no exception, although it still provides select and poll as fallbacks; which one is used is configurable, but epoll is the usual choice, and kqueue is supported on BSD. Memcached is built on libevent, whose backend is also epoll, so you can assume that both of them use epoll. The details of epoll will not be covered here; there are plenty of articles about it online.
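The single event loop idea described above can be sketched as a tiny reactor. This is a toy Python model using the portable selectors module (which wraps epoll/select/kqueue much as redis does); all names here are illustrative, not redis code:

```python
import selectors
import socket

def minimal_reactor():
    """A one-pass reactor: wait for ready fds, then dispatch each
    ready fd to the handler that was registered for it."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    received = []

    def on_readable(sock):
        received.append(sock.recv(1024))

    # register fd b with its handler, like adding a client socket to epoll
    sel.register(b, selectors.EVENT_READ, on_readable)
    a.sendall(b"ping")                    # make b readable
    for key, _ in sel.select(timeout=1):  # block until events are ready
        key.data(key.fileobj)             # dispatch to the registered handler
    sel.close()
    a.close()
    b.close()
    return received
```

A real server would loop over `sel.select()` forever; this sketch runs a single iteration so the flow is easy to follow.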

Both use epoll for their event loop, but redis is a single-threaded server (redis does have other threads besides the main one, but they run no event loop and only do background work such as storage), while memcached is multithreaded. Redis's event model is simple: a single event loop, a straightforward reactor implementation.

There is one bright spot in redis's event model, though. Epoll works with fds, and the fds redis deals with are the sockets connecting the server to its clients; when an event fires, the client information for that fd has to be looked up. The usual approach is a red-black tree mapping fd to client information, giving O(log n) lookups. Redis, however, is special: the maximum number of clients can be configured, so the upper limit on fds open at the same time is known in advance, and within a process an fd is never duplicated at the same time (an fd can only be reused after it has been closed). So redis simply uses an array with the fd as the index; the array element is the client's information. That way the client information is located directly from the fd, lookup is O(1), and the complicated red-black tree implementation is avoided. (I once wrote a network server in C; because I wanted to maintain the fd-to-connection mapping and did not want to write a red-black tree myself, I used the set from STL, which turned the project into C++, and in the end the project had to be compiled with g++. True story.) Obviously this approach only works for network servers where the upper limit on the number of connections is known and not too large; it is not applicable to an http server like nginx, which wrote its own red-black tree.
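The fd-as-array-index trick can be sketched in a few lines. This is a toy Python model with invented names, not redis code:

```python
class Client:
    """Toy stand-in for redis's per-connection client state."""
    def __init__(self, fd, name):
        self.fd = fd
        self.name = name

class ClientArray:
    """fd-indexed lookup: because the maximum client count is known in
    advance and an open fd is unique within a process, the fd itself can
    serve as the array index."""
    def __init__(self, max_clients):
        self.clients = [None] * max_clients   # preallocated slots

    def register(self, client):
        self.clients[client.fd] = client

    def lookup(self, fd):
        return self.clients[fd]               # O(1), no tree search

    def close(self, fd):
        self.clients[fd] = None               # the kernel may now reuse the fd
```

The trade-off is exactly the one described above: the array must be as large as the configured connection limit, which is fine for redis but not for a server with an unbounded number of connections.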

Memcached, on the other hand, is multithreaded, using a master-worker model: the main thread listens on the port and accepts connections, then assigns them to the worker threads in turn. Each worker thread runs its own event loop and serves its own set of clients. The master and worker threads communicate through pipes: each worker creates a pipe, keeps both the write end and the read end, and adds the read end to its event loop to watch for readable events. Each worker also has a ready-connection queue; after the main thread accepts a connection, it puts the connection item into a worker's queue and writes a connect command to the write end of that worker's pipe. The pipe read end registered in the worker's event loop then becomes readable, the worker reads and parses the command, sees that a connection has arrived, and fetches it from its own ready queue to handle it. The advantage of multithreading is that it can exploit multiple cores, but it makes the program somewhat harder to write: memcached has all kinds of locks and condition variables for thread synchronization.
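The master-to-worker handoff described above (ready queue plus a notification byte down a pipe) can be sketched like this. A toy Python model with invented names; real memcached does this in C with libevent watching the pipe fd:

```python
import os
import queue
import threading

class Worker:
    """Toy model of a memcached worker thread: the master puts a new
    connection into the worker's ready queue, then writes one byte to the
    worker's pipe; the worker wakes on the pipe, reads the byte, and pops
    the connection from its own queue."""
    def __init__(self):
        self.r, self.w = os.pipe()
        self.ready = queue.Queue()
        self.accepted = []
        self.thread = threading.Thread(target=self.loop, daemon=True)
        self.thread.start()

    def loop(self):
        while True:
            cmd = os.read(self.r, 1)   # blocks, like an event loop waiting on the pipe fd
            if cmd == b"q":            # shutdown command (invented for this sketch)
                return
            conn = self.ready.get()    # fetch the connection the master queued
            self.accepted.append(conn)

    def dispatch(self, conn):          # called by the master thread
        self.ready.put(conn)
        os.write(self.w, b"c")         # 'c' for "connection arrived"

    def stop(self):
        os.write(self.w, b"q")
        self.thread.join()
```

Because the pipe is FIFO, the worker processes handoffs in the order the master dispatched them.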

4. Memory allocation

The core task of both memcached and redis is to manipulate data in memory, and memory management is naturally the core content.

First look at how they allocate memory. Memcached has its own memory pool: a large block of memory is allocated in advance, and further allocations are served from this pool. This reduces the number of system memory allocations and improves efficiency, and it is how most network servers are implemented, although the management of the pool varies with the specific situation. Redis, by contrast, has no memory pool of its own: it allocates directly when memory is needed, leaving memory management to the kernel and handling only the fetching and releasing itself (redis is single-threaded and has no memory pool, so the implementation feels very simple, doesn't it?). That is because its focus is on the database module. Redis does, however, support replacing glibc's malloc with tcmalloc, a Google product that is faster than glibc's malloc.
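The memory-pool idea (preallocate once, then serve allocations from a free list) can be sketched as follows. This is a deliberately simplified toy in Python; memcached's real slab allocator is far more elaborate:

```python
class BlockPool:
    """Toy memory pool: preallocate a fixed number of fixed-size blocks and
    hand them out from a free list, so the system allocator is hit only once."""
    def __init__(self, block_size, n_blocks):
        self.blocks = [bytearray(block_size) for _ in range(n_blocks)]
        self.free_list = list(range(n_blocks))   # indices of free blocks

    def alloc(self):
        if not self.free_list:
            return None                # pool exhausted; a real pool might grow
        return self.free_list.pop()

    def release(self, idx):
        self.free_list.append(idx)     # returned to the pool, not to the OS
```

Note how `release` never gives memory back to the operating system; that is exactly the bookkeeping burden the article attributes to memcached, and what redis avoids by just calling malloc and free.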

Since redis has no memory pool of its own, managing memory requests and releases is much simpler: a direct malloc and free is all it takes, which is very convenient. Memcached supports a memory pool, so allocations are served from the pool and free returns memory to the pool, which requires a lot of extra bookkeeping and is considerably more work to implement; this will be analyzed below in the explanation of memcached's slab mechanism.

5. Database implementation

Next, take a look at their core content, the implementation of their respective databases.

1. Memcached database implementation

Memcached only supports plain key-value storage: one key maps to one value. Its data is stored in memory as key-value pairs, using the slab mechanism.

First look at how memcached stores data, that is, how it stores key-value pairs. As shown in the figure below, each key-value pair is stored in an item structure, which contains the related attributes and the key and value themselves.

The item holds the key-value pair; when there are many items, finding a specific one becomes a problem, so memcached maintains a hash table for fast item lookup. The hash table resolves key collisions by chaining (as redis does): each bucket holds a linked list whose nodes are pointers to items; the h_next field in the figure above points to the next node in the bucket's list. The hash table supports expansion (triggered when the number of items exceeds 1.5 times the number of buckets). There is a primary_hashtable and an old_hashtable; normally primary_hashtable is used, but for expansion old_hashtable is set to primary_hashtable, primary_hashtable is set to a newly allocated hash table (with twice as many buckets), and the data is then moved from old_hashtable to the new table bucket by bucket, with a variable expand_bucket recording how many buckets have been moved; afterwards the original old_hashtable is freed. (Redis also has two hash tables and moves data between them, but not with a background thread; it moves one bucket at a time.) The expansion work is done by a dedicated background expansion thread, which is notified through a condition variable when expansion is needed and, once expansion completes, goes back to blocking on that condition variable. During expansion, a given item may be in either primary_hashtable or old_hashtable; to find it, its bucket index is compared with expand_bucket to determine which table it is in.

Where are items allocated from? From slabs. As shown in the figure below, memcached has many slabclasses, which manage slabs; each slab is actually a collection of chunks, and items are really allocated inside chunks, one item per chunk. Within a slab all chunks have the same size, and across slabs the chunk size grows by a fixed factor. When a new item needs to be allocated, a chunk is chosen according to the item's size, the rule being the smallest chunk that still fits it. In this way items of different sizes are assigned to different slabs and managed by different slabclasses. The disadvantage is some wasted memory, because a chunk may be larger than the item it holds: as shown in figure 2, when a 100-byte item is allocated into a 112-byte chunk, 12 bytes are wasted, memory that is never used.
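The size-class selection described above can be sketched numerically. A toy Python model: the starting size, growth factor, and cap below are illustrative (memcached's actual defaults differ; its growth factor option defaults to 1.25):

```python
def build_slab_classes(min_chunk=80, factor=1.25, max_chunk=1024):
    """Chunk sizes grow by a constant factor from class to class."""
    sizes = []
    size = min_chunk
    while size <= max_chunk:
        sizes.append(size)
        size = int(size * factor)
    return sizes

def pick_class(item_size, sizes):
    """An item goes to the smallest chunk that fits it;
    the difference between chunk and item size is wasted."""
    for s in sizes:
        if s >= item_size:
            return s
    return None  # too large for any slab class
```

With these illustrative parameters the classes are 80, 100, 125, 156, ... bytes, so a 101-byte item lands in a 125-byte chunk and wastes 24 bytes, which is exactly the trade-off the article describes.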

As shown in the figure above, the overall structure is this: slabclasses manage slabs; a slabclass has a slab_list and can manage multiple slabs, and all slabs in the same slabclass have the same chunk size. Each slabclass also has a pointer called slots, which collects items that have been freed (the memory is not really freed, the item is just no longer in use). When an item is released it is pushed onto the head of slots, so whenever an item needs to be allocated from the current slab, one can simply be taken from slots, regardless of whether that item was never allocated or was released.

Each slabclass also corresponds to a linked list, with a heads array and a tails array holding, respectively, the head and tail nodes of the lists. The list nodes are the items allocated from that slabclass; newly allocated items are placed at the head, so items toward the tail have gone unused the longest. When a slabclass runs out of memory and needs to evict some expired items, it can delete them from the tail of the list. Yes, this linked list exists to implement LRU. The list alone is not enough, because searching a linked list is O(n), so for locating items the hash table is used, which is already there: all allocated items are already in the hash table. So the hash finds the item, and the linked list records the recency order of item use, which is also the standard way to implement LRU.
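The hash-plus-list combination above is the classic LRU structure. A compact sketch in Python (OrderedDict plays both roles at once here, giving O(1) lookup while tracking recency order; memcached keeps the dict and the list as two separate structures):

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of LRU: lookup by hash, evict from the cold end."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # most-recently-used at the end

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # like moving the item to the list head
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
```
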

Each time a new item needs to be allocated, the linked list of the corresponding slabclass is examined from the tail forward, checking whether an item there has expired. If one has, the expired item is simply reused as the new item. If none has expired, a chunk must be allocated from the slab, and if the slab is used up, a new slab must be added to the slabclass.

Memcached supports setting an expiration time (expire time), but internally it does not periodically check whether data has expired. Instead, when a client process uses the data, memcached checks the expire time and directly returns an error if the item has expired. The advantage is that no extra CPU is spent checking expire times; the disadvantage is that expired data may go unused for a long time without being released, occupying memory all the while.
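This lazy-expiration behavior can be sketched directly. A toy Python model (the `now` parameter makes the clock explicit for testing; real memcached reads the system clock):

```python
import time

class LazyExpiringStore:
    """Sketch of lazy expiration: nothing scans in the background;
    expiry is only checked when a key is read."""
    def __init__(self):
        self.data = {}    # key -> (value, expire_at or None)

    def set(self, key, value, ttl=None, now=None):
        now = time.time() if now is None else now
        self.data[key] = (value, None if ttl is None else now + ttl)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        if key not in self.data:
            return None
        value, expire_at = self.data[key]
        if expire_at is not None and now >= expire_at:
            del self.data[key]   # reclaimed only when someone touches it
            return None
        return value
```

The memory-waste downside is visible in the model: an expired key that is never read again stays in `self.data` forever.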

Memcached is multithreaded and maintains only one database, so multiple client processes may operate on the same data, which can cause problems. For example, A modifies the data, then B modifies it too, overwriting A's update; A may not know this and may still assume the data holds the value it wrote, which can lead to trouble. To solve this, memcached uses the CAS protocol: in short, each item carries a 64-bit unsigned int as a version mark, incremented on every update (every modification of the value). On each modification, the version number sent by the client process is compared for consistency with the version of the item on the server side; if they do not match, the client can be told the data is dirty.
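The version-check flow can be sketched as follows. A toy Python model of the idea (memcached's real commands are `gets`, which returns the CAS token, and `cas`, which supplies it back):

```python
class CasStore:
    """Sketch of CAS: each item carries a version; an update only
    succeeds if the caller saw the current version."""
    def __init__(self):
        self.data = {}   # key -> (value, version)

    def gets(self, key):
        return self.data.get(key)        # returns (value, version)

    def cas(self, key, value, version):
        cur = self.data.get(key)
        if cur is None or cur[1] != version:
            return False                 # someone else changed it: dirty data
        self.data[key] = (value, version + 1)
        return True

    def set(self, key, value):
        cur = self.data.get(key)
        self.data[key] = (value, (cur[1] + 1) if cur else 1)
```
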

This is an introduction to how memcached implements a key-value database.

2. Redis database implementation

First of all, the redis database is more powerful, because unlike memcached, which only supports storing strings, redis supports five data structures: string, list, set, sorted set, and hash table. For example, to store a person's information you can use a hash table, with the person's identifier as the key and fields such as name "super" and age 24; via the key and "name" you get the name super, and via the key and "age" you get the age 24. That way, when you only need the age, you don't have to fetch the entire record and dig the age out of it; you get the age directly, which is efficient and convenient.

To implement these data structures, redis defines an abstract object, the redis object, shown in the figure below. It has five types: string, linked list, set, sorted set, and hash table. At the same time, to improve efficiency, redis prepares several underlying implementations for each type and chooses the appropriate one for the specific scenario; the encoding field records which implementation an object uses. The object also records its lru, that is, the last time it was accessed, relative to a current time kept in the redis server (an approximate value, since that clock is only updated at intervals when the server does its periodic maintenance); the difference between the two gives how long the object has gone unaccessed. The redis object also has a reference count, used for sharing objects and for deciding when to delete them. Finally, a void* pointer points to the object's real content. Precisely because of the abstract redis object, operating on data in the database becomes much more convenient: all redis objects can be handled uniformly, and when the type matters it is determined from the type field. And precisely because of this object-oriented approach, the redis code reads somewhat like C++ code, although in fact it is all written in C.

// #define REDIS_STRING 0   // string type
// #define REDIS_LIST 1     // linked list type
// #define REDIS_SET 2      // set type (unordered); supports union, intersection, etc.
// #define REDIS_ZSET 3     // sorted set type
// #define REDIS_HASH 4     // hash type

// #define REDIS_ENCODING_RAW 0         /* Raw representation */
// #define REDIS_ENCODING_INT 1         /* Encoded as integer */
// #define REDIS_ENCODING_HT 2          /* Encoded as hash table */
// #define REDIS_ENCODING_ZIPMAP 3      /* Encoded as zipmap */
// #define REDIS_ENCODING_LINKEDLIST 4  /* Encoded as regular linked list */
// #define REDIS_ENCODING_ZIPLIST 5     /* Encoded as ziplist */
// #define REDIS_ENCODING_INTSET 6      /* Encoded as intset */
// #define REDIS_ENCODING_SKIPLIST 7    /* Encoded as skiplist */
// #define REDIS_ENCODING_EMBSTR 8      /* Embedded sds string encoding */

typedef struct redisObject {
    unsigned type:4;             /* Object type: one of the five above */
    unsigned encoding:4;         // underlying representation; one type of data
                                 // can be stored in different ways to save space
    unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
    int refcount;                // reference count
    void *ptr;                   // points to the actual data
} robj;

In the final analysis, redis is still a key-value database: however many data structures it supports, data is ultimately stored as key-value, only the value can be a linked list, set, sorted set, hash table, and so on. As in memcached, every key is a string, and strings are also used inside the concrete storage of set, sorted set, hash table, and the rest. C has no ready-made string type, so the first task of redis was to implement one, named sds (simple dynamic string). In the code below, a very simple structure: len records the length of the string currently stored, free records how many allocated bytes are unused, and buf holds the data itself, so the total buffer size is len + free.

struct sdshdr {
    int len;
    int free;
    char buf[];
};

With strings taken care of, all keys are saved as sds. But how is a key related to its value? Key-value is easy to handle in a scripting language: just use a dictionary. C has no dictionary, so what to do? Write one yourself (redis is very keen on building its own wheels). Look at the code below: privdata stores extra information and is rarely used, at least as far as we found. dictht is the concrete hash table; one dict holds two hash tables, for rehashing (together with rehashidx, used during expansion). dictType stores the hash table's operations. Redis also implements an iterator for dict (which is another reason the code reads like C++).

The hash table implementation is similar to memcached's: it also resolves collisions by chaining, but it uses a few tricks. For example, dictType stores function pointers, so the operations on the elements in the buckets can be configured dynamically; and dictht stores sizemask, which is size (the number of buckets) minus 1, used to compute key & sizemask instead of a remainder operation, which is faster. In general, a dict holds two hash tables; each hash table's buckets hold dictEntry linked lists, and a dictEntry stores a concrete key and value.
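The sizemask trick deserves a tiny demonstration. When the bucket count is a power of two, masking with size - 1 gives the same result as the modulo, without the division:

```python
def bucket_index(hash_value, size):
    """hash & (size - 1) == hash % size whenever size is a power of two;
    this is the identity dictht's sizemask relies on."""
    assert size & (size - 1) == 0, "size must be a power of two"
    return hash_value & (size - 1)
```

This is also why redis always grows its tables by doubling: the bucket count stays a power of two, so the mask stays valid.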

As mentioned earlier, the purpose of one dict holding two dichts is expansion (and, in fact, shrinking too). Normally the dict only uses ht[0]; when the ratio of entries in ht[0] to the number of buckets reaches a threshold, the expansion or shrink operation, collectively called rehash, is triggered. At that point, memory of the post-rehash size is allocated for ht[1], the data in ht[0] is moved to ht[1], and rehashidx records how many buckets have been moved. When all buckets have been moved, the rehash is complete; ht[1] then becomes ht[0], and the old ht[0] becomes ht[1] and is set to null. Unlike memcached, there is no background thread doing this: it happens in the event loop, and the rehash is not completed in one go but spread over many steps. Before each user operation on the dict, redis moves one bucket of data, until the rehash completes. The migration is thus split into many small moves, and the time cost of the rehash is amortized over the user's operations. This avoids the situation where one unlucky user request triggers a rehash and gets no response until the whole rehash finishes. The price is that during a rehash every operation is slightly slower, and the user has no idea that redis has sneaked data-migration work into his request. :-D
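The incremental rehash can be sketched end to end. A toy Python model with invented names and an illustrative load-factor threshold (one bucket is migrated before each operation, exactly as described above):

```python
class IncrementalDict:
    """Toy model of redis's incremental rehash: a dict owns two tables,
    and each user operation first migrates one bucket from ht0 to ht1."""
    def __init__(self, nbuckets=4):
        self.ht0 = [[] for _ in range(nbuckets)]
        self.ht1 = None
        self.rehashidx = -1          # -1 means no rehash in progress
        self.used = 0

    def _bucket(self, table, key):
        return hash(key) & (len(table) - 1)   # size is a power of two

    def _rehash_step(self):
        """Migrate a single bucket, as redis does before each dict operation."""
        if self.rehashidx < 0:
            return
        while self.rehashidx < len(self.ht0) and not self.ht0[self.rehashidx]:
            self.rehashidx += 1
        if self.rehashidx >= len(self.ht0):       # migration finished
            self.ht0, self.ht1, self.rehashidx = self.ht1, None, -1
            return
        for k, v in self.ht0[self.rehashidx]:
            self.ht1[self._bucket(self.ht1, k)].append((k, v))
        self.ht0[self.rehashidx] = []
        self.rehashidx += 1

    def _remove(self, key):
        for table in (self.ht0, self.ht1):
            if table is None:
                continue
            bucket = table[self._bucket(table, key)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    del bucket[i]
                    self.used -= 1
                    return

    def set(self, key, value):
        self._rehash_step()
        self._remove(key)                          # overwrite semantics
        target = self.ht1 if self.rehashidx >= 0 else self.ht0
        target[self._bucket(target, key)].append((key, value))
        self.used += 1
        if self.rehashidx < 0 and self.used > len(self.ht0):
            self.ht1 = [[] for _ in range(len(self.ht0) * 2)]
            self.rehashidx = 0                     # start incremental rehash

    def get(self, key):
        self._rehash_step()
        for table in (self.ht0, self.ht1):
            if table is None:
                continue
            for k, v in table[self._bucket(table, key)]:
                if k == key:
                    return v
        return None
```

During a rehash, inserts go to the new table while lookups consult both; once the last bucket has moved, the new table takes over and the cycle can repeat.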

typedef struct dict {
    dictType *type;      // operations on the hash table's keys and values
    void *privdata;      // extra information
    dictht ht[2];        // two hash tables, primary and secondary, used for rehashing
    int rehashidx;       /* rehashing not in progress if rehashidx == -1 */
                         // records how far the current data migration has gone
    int iterators;       /* number of iterators currently running */
} dict;

typedef struct dictht {
    dictEntry **table;      // each bucket is a linked list of dictEntry (the items);
                            // table is an array of the list head pointers
    unsigned long size;     // the number of buckets
    unsigned long sizemask; // size - 1; when size is 2^n, hashkey & sizemask
                            // gives the same bucket as hashkey % size, but using
                            // & is much faster -- that is the whole point
    unsigned long used;     // the number of dictEntry stored
} dictht;

typedef struct dictType {
    unsigned int (*hashFunction)(const void *key);               // hash method
    void *(*keyDup)(void *privdata, const void *key);            // key copy method
    void *(*valDup)(void *privdata, const void *obj);            // value copy method
    int (*keyCompare)(void *privdata, const void *key1, const void *key2); // key comparison
    void (*keyDestructor)(void *privdata, void *key);            // key destructor
    void (*valDestructor)(void *privdata, void *obj);            // value destructor
} dictType;

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    struct dictEntry *next;
} dictEntry;

With dict in place, the database is easy to implement: all data is read from and stored in a dict; the key is stored as the key (a string) in a dictEntry, and a void* points to a redis object as the value, which can be any of the five types. As shown in the figure below, the structure looks like this, though the diagram is out of date and differs somewhat from redis 3.0.

Each of the five object types has at least two underlying implementations. string has three: REDIS_ENCODING_RAW, REDIS_ENCODING_INT, and REDIS_ENCODING_EMBSTR. list can be an ordinary doubly linked list or a ziplist (compressed list). A ziplist, simply put, turns the list into an array: one contiguous block of memory that simulates a linked list by storing each entry's size information. Compared with an ordinary linked list it saves space, but it has side effects: because the space is contiguous, changing its size requires reallocation, and because each entry's byte size is stored, a change may trigger cascading updates (see the code for details). set has dict and intset (used when all stored elements are integers); sorted set has skiplist and ziplist; hash table has ziplist and dict. A skiplist approaches the efficiency of a red-black tree but is much simpler to implement, which is why it was adopted (strangely, no wheel-building here; is it because this wheel is a bit hard?).

A hash table can be implemented with dict: in each dictEntry, key stores the key (the key of the key-value pair in the hash table) and value stores the value, both strings. For a set's dict, each dictEntry's key holds the value of a concrete set element, and its value is null. The zset in the figure is wrong: a zset (sorted set) is implemented with skiplist and ziplist. skiplist is easy to understand; just think of it as a substitute for a red-black tree that, like it, keeps elements sorted. How is a zset stored in a ziplist? In a zset, every element of the set has a score used for ordering, so the ziplist stores, in score order, an element, then its score, then the next element, then its score, contiguously. Because storage is contiguous, inserts and deletes require memory reallocation. Once the number of elements exceeds a threshold, or any single element exceeds a certain number of characters, redis switches to the skiplist implementation of zset (if a ziplist is currently in use, its data is taken out and stored into a new skiplist, and the ziplist is deleted; this is an underlying-encoding conversion, and the other redis object types can convert similarly). And how does a ziplist implement a hashtable? Very simply: store a key, then a value, then the next key, then its value, all sequentially, similar to the zset layout; here too, once an element-count or element-length threshold is exceeded, it is converted to a dict. The underlying implementations are interconvertible, and redis can choose the most appropriate one for the situation, which is the advantage of this object-oriented style of implementation.
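The flat element-score layout of a ziplist-encoded zset can be sketched as follows. A toy Python model: a plain list stands in for the contiguous byte buffer, and names are invented:

```python
class ToyZiplistZset:
    """Sketch of zset-on-ziplist: entries stored flat as
    [member, score, member, score, ...], kept ordered by score.
    Small sets only; every operation is a sequential scan."""
    def __init__(self):
        self.entries = []   # flat list standing in for the packed buffer

    def insert(self, member, score):
        i = 0
        while i < len(self.entries) and self.entries[i + 1] < score:
            i += 2
        self.entries[i:i] = [member, score]   # like a realloc + memmove

    def rank(self, member):
        for i in range(0, len(self.entries), 2):
            if self.entries[i] == member:
                return i // 2
        return None
```

The O(n) insert is the reason for the conversion threshold: once the set grows, redis moves the same data into a skiplist.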

It should be pointed out that when a zset is implemented with a skiplist, a dict is actually used as well, storing the same key-value pairs. Why? Because a skiplist lookup is only O(log n) (and may degrade toward O(n)), while a dict can reach O(1); the dict is there to speed up lookups, and since the skiplist and the dict can point to the same redis object, not much memory is wasted. Then why not use a dict to speed up lookups in the ziplist implementation too? Because a ziplist only holds a small number of elements (when the count grows it is converted to a skiplist), and sequential scans over it are very fast, so no dict is needed.

Seen this way, dict, dictType, dictht, dictEntry, and redis object are all very thoughtfully designed: together they implement a flexible, efficient database with an object-oriented flavor. I have to say, the design of the redis database is quite impressive.

Unlike memcached, redis has more than one database: 16 by default, numbered 0 to 15. Clients can choose which database to use; database 0 is the default. Data is not shared between databases, that is, the same key can exist in different databases, but within one database each key must be unique.

Redis also supports setting an expire time. Look at the redis object above: there is no field holding an expiry, so how does redis record a datum's expire time? For each database redis adds another dict, called the expire dict; the key in each of its dict entries is the key of the pair, and the value is a redis object holding a 64-bit int, which is the expire time. When judging whether a key has expired, you look it up in the expire dict and compare its expire time against the current time. Why do it this way? Because not every key gets an expiration time set; for keys without one, storing an expire time would waste space, whereas keeping them separately in the expire dict uses memory flexibly, as needed (when a key is detected as expired, it is removed from the expire dict).
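The two-dict layout described above can be sketched like this. A toy Python model with an explicit `now` parameter for clarity (real redis reads its server clock):

```python
class ToyDb:
    """Sketch of a redis database: a main dict for key -> value and a
    separate expires dict holding timestamps only for keys with a TTL."""
    def __init__(self):
        self.dict = {}      # all keys live here
        self.expires = {}   # only keys with an expire time live here

    def set(self, key, value, expire_at=None):
        self.dict[key] = value
        if expire_at is not None:
            self.expires[key] = expire_at
        else:
            self.expires.pop(key, None)

    def get(self, key, now):
        if key in self.expires and now >= self.expires[key]:
            del self.dict[key]          # lazy deletion on access
            del self.expires[key]
            return None
        return self.dict.get(key)
```

Note how a key without a TTL costs nothing in `expires`, which is exactly the space saving the article points out.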

What is redis's expire mechanism? Similar to memcached's, redis uses lazy deletion: when data is about to be used, it checks whether the key has expired, deletes it if so, and then returns an error. Relying on lazy deletion alone may waste memory, as mentioned above, so redis has a supplementary scheme. Redis has a periodically executed function called servercron, the server-maintenance function, and in it expired data is deleted. Note that not all expired data is deleted: within a certain time budget, keys are randomly sampled from each database's expire dict, those that have expired are deleted, and sampling continues until the allotted time runs out. In other words, expired data is randomly sampled and deleted. This time budget comes in two flavors, one longer and one shorter: normally the short deletion runs, with the long one running at larger intervals. This effectively alleviates the memory waste of pure lazy deletion.
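The sample-and-delete cycle can be sketched as follows. All parameters here are illustrative, not redis's real tuning, and the round limit stands in for the real time budget:

```python
import random

def active_expire_cycle(expires, now, samples=5, threshold=0.25, max_rounds=16):
    """Sketch of periodic expiry: repeatedly sample a few keys from the
    expires dict, delete the expired ones, and stop once a round finds
    too few expired keys (or the round budget runs out)."""
    removed = 0
    for _ in range(max_rounds):
        if not expires:
            break
        keys = random.sample(list(expires), min(samples, len(expires)))
        expired = [k for k in keys if expires[k] <= now]
        for k in expired:
            del expires[k]
            removed += 1
        if len(expired) <= threshold * len(keys):
            break   # few expired keys found; stop this cycle early
    return removed
```

The early exit is the key design point: when the sample comes back mostly live, the dict is probably mostly live, so the cycle stops rather than scanning everything.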

This is the data implementation of redis. Unlike memcached, redis also supports data persistence, which is described below.

4. Redis database persistence

The biggest difference between redis and memcached is that redis supports persisting its data, which is the main reason many people choose redis over memcached. Redis persistence comes in two strategies, and users can configure which one to use.

4.1 RDB persistence

When a user executes save or bgsave, an RDB persistence operation is triggered. Its core idea is to save a complete copy of the database in a file.

So how is it stored? First a magic string "REDIS" is stored, to verify that this is an RDB file, then the redis version information, then the databases themselves, then an EOF terminator, and finally a checksum. The key part is the databases section: as the name suggests, it stores multiple databases, in numerical order; after database 0 is written comes database 1, then 2, all the way to the last one.

Each database is stored as follows: first a one-byte constant SELECTDB indicating a switch to a new db, then the database number, which is variable-length encoded, and then the data of the concrete key-value pairs.
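The overall file layout (magic, version, per-database sections, EOF, checksum) can be sketched as a toy serializer. The byte values and encodings below are invented for illustration; the real RDB format uses variable-length encodings and a CRC64 checksum:

```python
import struct

def toy_rdb_dump(databases):
    """Toy serialization following the layout described above:
    magic + version, then each database (marker, number, pairs),
    then an EOF marker and a stand-in checksum."""
    out = bytearray(b"REDIS")
    out += b"0006"                       # version, as ASCII digits
    for dbnum, pairs in sorted(databases.items()):
        out += b"\xfe"                   # SELECTDB-style marker (illustrative)
        out += struct.pack("B", dbnum)   # real RDB uses a variable-length number
        for key, value in pairs.items():
            for s in (key, value):
                data = s.encode()
                out += struct.pack("B", len(data)) + data  # length-prefixed string
    out += b"\xff"                       # EOF marker
    out += struct.pack("<Q", sum(out) & 0xFFFFFFFFFFFFFFFF)  # stand-in checksum
    return bytes(out)
```

A loader would walk the same structure in order, which is exactly how redis restores a database from an RDB file at startup.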

int rdbSaveKeyValuePair(rio *rdb, robj *key, robj *val,
                        long long expiretime, long long now)
{
    /* Save the expire time */
    if (expiretime != -1) {
        /* If this key is already expired skip it */
        if (expiretime < now) return 0;
        if (rdbSaveType(rdb,REDIS_RDB_OPCODE_EXPIRETIME_MS) == -1) return -1;
        if (rdbSaveMillisecondTime(rdb,expiretime) == -1) return -1;
    }
    /* Save type, key, value */
    if (rdbSaveObjectType(rdb,val) == -1) return -1;
    if (rdbSaveStringObject(rdb,key) == -1) return -1;
    if (rdbSaveObject(rdb,val) == -1) return -1;
    return 1;
}

As the code above shows, when saving a pair the expire time is checked first: if the key has already expired it is simply not stored; otherwise the expire time is saved. Note that even the expire time is written with its type marker, REDIS_RDB_OPCODE_EXPIRETIME_MS, first, followed by the actual expiration timestamp. After that the real key-value pair is stored: first the value's type, then the key (stored as a string), then the value.

In rdbSaveObject, val is stored differently depending on its type, but fundamentally everything ends up converted to strings. For example, if val is a linked list, the byte count of the whole list is stored first, then the list is traversed and each element is written to the file as a string. For a hash table, the byte count is likewise computed first, then each dictEntry is taken from the hash table in turn, its key and value are stored as strings, and then the next dictEntry follows.

In short, for each key-value pair RDB stores the expire time (if there is one), then the value's type, then the key (as a string), then the value converted to a string according to its type and underlying implementation. To compress the data, and to be able to restore the data from the file, redis uses many encoding tricks, some of which I did not fully understand either; the key is to grasp the idea, not to worry about these details.

Once the RDB file is saved, redis restores the database from it on the next startup. Since the file records the database numbers, the key-value pairs they contain, and for each pair the value's concrete type, implementation, and data, redis only has to read the file sequentially and reconstruct the objects. Because the expire time is saved as well, any pair whose expire time is already earlier than the current time, meaning the data has expired, is simply not restored.

Saving an RDB file is a huge job, so redis also provides a background save mechanism. When bgsave is executed, redis forks a child process that does the saving while the parent process continues to provide normal database service. Since the child copies the parent's address space, it owns the database exactly as it was at fork time, so it just writes the database it inherited from the parent into a temp file. While the child is copying, redis records the number of modifications to the database (dirty). When the child finishes, it sends the parent a SIGUSR1 signal; the parent catches it, knows the child has completed the copy, and renames the child's temp file to the real rdb file (renaming to the target file only after the save has truly succeeded is the safe way to do it). It then records the end time of this save.

There is a problem here: while the child is saving, the parent's database is being modified, and the parent only records the number of modifications (dirty) without any correction. So RDB does not save a real-time view of the database, which seems a bit unimpressive. However, the AOF persistence introduced below solves this problem.

Besides a client running the save or bgsave command, RDB save conditions can also be configured: in the config file you can specify that if the database is modified dirty times within time t, a background save is performed. During server cron, redis uses the dirty count and the time of the last save to decide whether the condition is met, and if so performs a bgsave. Note that at most one child process may be doing a background save at any moment, because saving is a very IO-heavy operation; many processes doing heavy IO at once is inefficient and hard to manage.

4.2 AOF persistence

First consider a question: to save a database, must all of its data be written out, as RDB does? Is there another way?
RDB saves only the final database — a result. Where did the result come from? It was built up by the users' commands, so instead of saving the result, we can save only the commands that produced it. That is the idea of redis AOF: unlike RDB, which saves the db's data, it saves, one by one, the commands that built the database.

Look first at the format of the AOF file: it holds command after command, storing each command's length and then the command itself. The exact delimiters and so on you can dig into yourself — that is not the point; just know that the AOF file stores the commands executed by redis clients.

The redis server has an sds aof_buf; if AOF persistence is switched on, every command that modifies the database is put into this aof_buf (as a string in the AOF file's command format). Then, on every pass of the event loop, server cron calls flushaofbuf to write the commands in aof_buf into the AOF file (really a write, so the data actually lands in the kernel buffer), clears aof_buf, and enters the next loop. In this way every change to the database can be reproduced from the commands in the AOF file, achieving the effect of saving the database.

Note that the write called in flushaofbuf only puts the data into the kernel buffer; when it is actually written to the file is decided by the kernel itself, possibly after some delay. redis is configurable, though: you can configure a sync after every write, in which case redis calls sync to push the kernel's data into the file — at the cost of one extra system call per write. You can instead configure a sync once per second, in which case redis starts a background thread (so redis is not single-threaded, only single-event-loop) that calls sync once a second. Why was sync never an issue with RDB? Because RDB stores everything in one shot, not in many small writes like AOF; one sync during an RDB save costs little, and with bgsave the child process exits on its own, and exit flushes the buffers, so the data is written to the file automatically.

Next: even without recording every modification command through aof_buf, AOF persistence can still be used. redis provides aof_rewrite, which generates commands from the current database and writes those commands into the AOF file. Curious, isn't it? Yes, it is that clever. During aof_rewrite, redis iterates over every database and, according to the concrete type of each key's value, generates different commands — for a list, say, a command that saves the list, containing all the data needed to rebuild it; if the list is very long, it is split into several commands: first create the list, then append elements to it. In short, it reverse-generates, from the data, the commands that would save that data, and stores these commands in the AOF file. Doesn't that achieve the same effect as AOF append?

AOF also supports a background mode. When aof_bgrewrite executes, redis again forks a child process, which performs the aof_rewrite, writing its copy of the database into a temporary file, then notifies the parent with a signal when it has finished writing. The parent checks whether the child exited correctly and then renames the temporary file to the final AOF file. Now, a problem: while the child is persisting, the parent's database may be updated — how is the update made known to the child? Interprocess communication? Isn't that a bit troublesome? Guess what redis does: it does not notify the child at all. What — no notification? Then what about the updates?

While the child is executing aof_bgrewrite, the parent saves the commands of every operation that changes the database (adds, deletes, updates and so on) in aof_rewrite_buf_blocks — a linked list in which each block can hold commands; when one fills up, a new block is allocated and appended to the list. Once the child reports that the save is complete, the parent simply appends the commands in aof_rewrite_buf_blocks to the AOF file. What a beautiful design. To think I once considered interprocess communication, while they solved the problem perfectly with the simplest possible method — it is true that the better a design, the simpler it tends to be, while complicated things are often unreliable.

As for loading the AOF file, it just means executing the commands in it one by one. Since these commands are exactly what clients send to redis, redis simply builds a fake client that has no network connection to redis and executes the commands directly. To be clear, the "fake client" here is not a real client but the client state stored inside redis, with its read and write buffers; it lives in the redis server. So the AOF commands are read into this client's read buffer and then executed as the client's commands, as sketched below. That completes the loading of the AOF file.

```c
// create the fake client
fakeClient = createFakeClient();
while (/* there are commands left in the AOF file */) {
    // fetch one command's argument info: argc, argv ...
    // execute it
    fakeClient->argc = argc;
    fakeClient->argv = argv;
    cmd->proc(fakeClient);
}
```
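The AOF idea — persist the commands rather than the data, rebuild by replaying them, and regenerate a minimal log from current data when rewriting — can be sketched with a toy class (MiniAOF and its three-command repertoire are invented for illustration, not Redis's API):

```python
class MiniAOF:
    """Toy append-only log: record write commands, rebuild state by replaying them."""

    def __init__(self):
        self.db = {}
        self.log = []            # stands in for aof_buf plus the AOF file

    def execute(self, *cmd):
        op = cmd[0]
        if op == 'SET':
            self.db[cmd[1]] = cmd[2]
            self.log.append(cmd)             # only mutations are logged
        elif op == 'DEL':
            self.db.pop(cmd[1], None)
            self.log.append(cmd)
        elif op == 'GET':
            return self.db.get(cmd[1])

    def replay(self):
        """What AOF loading does: feed the logged commands to a fresh 'fake client'."""
        fresh = MiniAOF()
        for cmd in self.log:
            fresh.execute(*cmd)
        return fresh.db

    def rewrite(self):
        """What aof_rewrite does: regenerate a minimal command log from current data."""
        self.log = [('SET', k, v) for k, v in self.db.items()]
```

After a few writes, replay() reproduces exactly the current database, and rewrite() shrinks the log to the minimal set of commands that would rebuild it.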

The whole AOF persistence design is, I think, quite elegant, with many places worth admiring.

5. Redis transactions

Another way redis is more powerful than memcached is its support for simple transactions. Put simply, a transaction merges several commands and executes them all in one go. In a relational database, a transaction also has a rollback mechanism: either all of its commands succeed, or on any failure everything rolls back to the pre-transaction state. redis does not support rollback; its transactions only guarantee that the commands execute in sequence, continuing even when a command in the middle fails, so it supports only simple transactions.

First, how a redis transaction runs: the client executes the multi command to start the transaction, then enters the commands to be executed, and finally enters exec to run the transaction. On receiving multi, the redis server sets the client's state to REDIS_MULTI, marking it as inside a transaction, and keeps the transaction's command details in the client's multiState structure (after first checking that each command is recognized — unrecognized commands are not saved), i.e. the number of commands and the commands themselves. On receiving exec, redis executes the commands saved in multiState in order and saves each command's return value; when a command errors, redis does not stop the transaction but records the error and continues, and once all commands have run it returns all the return values to the client. Why doesn't redis support rollback? The explanation usually seen online is that a failure stems from a bug in the client program, so there is no need for the server to roll back; at the same time, not supporting rollback lets the redis server run much more efficiently. In my view, the redis transaction is not a traditional relational-database transaction with its strict ACID requirements — it is hardly a transaction at all, merely a way for the client to execute several commands at once. Treat a transaction as an ordinary command, and supporting rollback indeed becomes unnecessary.
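The queue-then-run flow can be simulated like this (TxClient and its command table are invented names, not Redis's API): unrecognized commands are refused at queue time, and at exec time every queued command runs, with errors collected into the results instead of aborting the transaction.

```python
COMMANDS = {
    'SET':  lambda db, k, v: db.__setitem__(k, v) or 'OK',
    'GET':  lambda db, k: db.get(k),
    'INCR': lambda db, k: db.__setitem__(k, int(db.get(k, 0)) + 1) or db[k],
}

class TxClient:
    """Simulated client: MULTI sets a flag, commands queue up, EXEC runs them all."""

    def __init__(self, db):
        self.db, self.in_multi, self.queue = db, False, []

    def send(self, name, *args):
        if name == 'MULTI':
            self.in_multi, self.queue = True, []
            return 'OK'
        if name == 'EXEC':
            self.in_multi = False
            results = []
            for cname, cargs in self.queue:     # run every queued command in order,
                try:                            # saving errors instead of aborting
                    results.append(COMMANDS[cname](self.db, *cargs))
                except Exception as e:
                    results.append(f'ERR {e}')
            return results
        if self.in_multi:
            if name not in COMMANDS:            # unrecognized commands are refused
                raise ValueError(f'unknown command {name}')
            self.queue.append((name, args))
            return 'QUEUED'
        return COMMANDS[name](self.db, *args)
```

Note how an INCR on a non-numeric value produces an error entry in the middle of the results while the commands after it still execute — exactly the no-rollback behavior described above.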

We know redis is a single event loop, so once a transaction actually executes (i.e. after redis receives exec), its execution is never interrupted: all of its commands run within one iteration of the event loop. However, while the user is entering the transaction's commands one by one, other clients may modify the data the transaction uses, which can cause problems. So redis also provides the watch command: before entering multi, the client can watch the data it cares about, and if another client modifies the watched data before exec, the exec that would operate on the modified data fails, indicating the data is dirty. How is this implemented? Each redisDb has a dict watched_keys; the key of each dictEntry in watched_keys is a watched database key, and the value is a list of the clients watching it. Each client likewise has a watched_keys recording the keys it currently watches. When watch executes, redis looks up the key in the corresponding database's watched_keys (creating a new dictEntry if absent), adds the client to that key's client list, and adds the key to the client's own watched_keys. When some client executes a command that modifies data, redis first looks the key up in watched_keys; if found, that proves some client is watching it, so redis traverses all the watching clients and sets them to REDIS_DIRTY_CAS, marking the watched key as dirty. When a client executes a transaction, redis first checks whether its REDIS_DIRTY_CAS is set; if so, the data is dirty, the transaction cannot run, and an error is returned immediately. The transaction executes only if the client's REDIS_DIRTY_CAS is not set.
It is important to note that after exec runs, all the keys the client watches are cleared, and the client is likewise removed from each key's client list in the db — that is, after exec the client no longer watches any key (even if the exec did not succeed). This is why redis's transaction is a simple transaction, not a real one.
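A minimal simulation of the watch bookkeeping (the class names are invented; watched_keys and the dirty flag follow the description above): modifying a key marks every watcher dirty, a dirty client's exec refuses to run, and exec always clears the client's watches whether or not it succeeded.

```python
class WatchDb:
    """Per-database watch bookkeeping, as described above."""

    def __init__(self):
        self.data = {}
        self.watched_keys = {}        # key -> list of watching clients

    def watch(self, client, key):
        self.watched_keys.setdefault(key, []).append(client)
        client.watched.add(key)

    def set(self, key, value):
        self.data[key] = value
        for c in self.watched_keys.get(key, []):
            c.dirty_cas = True        # REDIS_DIRTY_CAS: a watched key changed

    def unwatch_all(self, client):
        for key in client.watched:
            self.watched_keys[key].remove(client)
        client.watched.clear()
        client.dirty_cas = False

class Client:
    def __init__(self):
        self.watched, self.dirty_cas = set(), False

    def exec(self, db, ops):
        ok = not self.dirty_cas       # dirty data: refuse to run the transaction
        if ok:
            for k, v in ops:
                db.set(k, v)
        db.unwatch_all(self)          # exec always clears watches, success or not
        return ok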

So much for redis transactions. The implementation, I feel, is very simple, and in practice it is not of much use.

6. Redis publish/subscribe channels

Redis supports channels: a client that subscribes to a channel joins a group, and every client in the channel receives the messages any client publishes to that channel.

The implementation is also very simple, similar to that of watched_keys. The redis server has a dict pubsub_channels, whose keys are channel names (obviously unique) and whose values are linked lists of the clients subscribed to each channel. Each client likewise has a pubsub_channels recording the channels it follows. When a client publishes a message to a channel, redis first finds the channel in the server's pubsub_channels, then traverses its client list and sends the message to each. Subscribing and unsubscribing are just operations on pubsub_channels, easy to understand.

Redis also supports pattern channels, i.e. channels matched by glob patterns. If there is a pattern channel p*1, then publishing a message to the ordinary channel p1 matches p*1, so besides the ordinary channel's clients, the clients of the pattern channel p*1 also receive the message. Note that it is the ordinary channel named in the publish command that is matched against the existing pattern channels — the publish command does not name a pattern to be matched against the channels stored in redis. The implementation is again simple: the redis server has a pubsub_patterns list (why not a dict here? because the number of patterns is usually small, so a plain list suffices) holding pubsubPattern structures, each containing a pattern and a client, as shown below — one pattern, one client. So if several clients listen on the same pattern, the list holds several pubsubPattern entries, recording the client-to-pattern relationships. Each client also has a pubsub_patterns list, but it stores the patterns the client listens on (as sds strings), not pubsubPattern structures.

```c
typedef struct pubsubPattern {
    redisClient *client;  /* the listening client */
    robj *pattern;        /* the pattern it listens on */
} pubsubPattern;
```

When a client publishes a message to a channel, redis first looks the channel up in the server's pubsub_channels and sends the message to each client in its list. It then scans the server's pubsub_patterns for matching patterns and sends the message to those clients as well. Duplicates are not removed here: a message may be delivered to a client once via pubsub_channels and then again (possibly more than once) via pubsub_patterns. Presumably redis considers this the client's own problem and does not handle it.
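The delivery logic — exact channel first, then every matching pattern, with no de-duplication — can be sketched in Python with fnmatch standing in for redis's glob matcher (an illustrative simulation, not the real implementation; client inboxes are plain lists):

```python
from fnmatch import fnmatchcase

class PubSub:
    def __init__(self):
        self.channels = {}        # channel name -> list of subscribed client inboxes
        self.patterns = []        # (pattern, client) pairs, like pubsubPattern

    def subscribe(self, client, channel):
        self.channels.setdefault(channel, []).append(client)

    def psubscribe(self, client, pattern):
        self.patterns.append((pattern, client))

    def publish(self, channel, message):
        """Deliver to the exact channel, then to every matching pattern.

        A client subscribed both ways receives the message twice, just as
        pubsubPublishMessage does not de-duplicate receivers."""
        receivers = 0
        for client in self.channels.get(channel, []):
            client.append(('message', channel, message))
            receivers += 1
        for pattern, client in self.patterns:
            if fnmatchcase(channel, pattern):
                client.append(('pmessage', pattern, channel, message))
                receivers += 1
        return receivers
```

A client subscribed to both the channel p1 and the pattern p*1 receives the same publish twice, matching the duplicate-delivery behavior described above.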

```c
/* Publish a message */
int pubsubPublishMessage(robj *channel, robj *message) {
    int receivers = 0;
    dictEntry *de;
    listNode *ln;
    listIter li;

    /* Send to clients listening for that channel */
    de = dictFind(server.pubsub_channels,channel);
    if (de) {
        list *list = dictGetVal(de);
        listNode *ln;
        listIter li;

        listRewind(list,&li);
        while ((ln = listNext(&li)) != NULL) {
            redisClient *c = ln->value;

            addReply(c,shared.mbulkhdr[3]);
            addReply(c,shared.messagebulk);
            addReplyBulk(c,channel);
            addReplyBulk(c,message);
            receivers++;
        }
    }
    /* Send to clients listening to matching channels */
    if (listLength(server.pubsub_patterns)) {
        listRewind(server.pubsub_patterns,&li);
        channel = getDecodedObject(channel);
        while ((ln = listNext(&li)) != NULL) {
            pubsubPattern *pat = ln->value;

            if (stringmatchlen((char*)pat->pattern->ptr,
                               sdslen(pat->pattern->ptr),
                               (char*)channel->ptr,
                               sdslen(channel->ptr),0)) {
                addReply(pat->client,shared.mbulkhdr[4]);
                addReply(pat->client,shared.pmessagebulk);
                addReplyBulk(pat->client,pat->pattern);
                addReplyBulk(pat->client,channel);
                addReplyBulk(pat->client,message);
                receivers++;
            }
        }
        decrRefCount(channel);
    }
    return receivers;
}
```

7. Summary

Generally speaking, redis has many more features and a more complex implementation than memcached. That does not make redis better than memcached: memcached focuses on storing key-value data (which is enough for most usage scenarios), while redis provides richer data structures and other features. From a source-reading point of view, though, redis is probably the more rewarding of the two. In addition, redis 3.0 supports clustering; I have not yet studied that part of the code and will follow up on it later.

After reading this article, I believe you have a better understanding of the differences between Redis and Memcached. Thank you for reading!
