
What are the differences between redis and Memcached

2025-04-05 Update From: SLTechnology News&Howtos


This article explains the differences between Redis and Memcached. The content is simple and clear, and easy to learn and understand.

Redis is a database, but unlike traditional databases it keeps its data in memory, so reads and writes are very fast; for this reason Redis is widely used as a cache. Memcached is a high-performance distributed in-memory cache server, generally used to reduce the number of database visits by caching query results, thereby improving the speed and scalability of dynamic web applications.

An authoritative comparison

Salvatore Sanfilippo, the author of Redis, has compared these two memory-based data storage systems:

Server-side data operations: Redis has more data structures and supports richer data operations than Memcached. In Memcached you usually have to fetch the data to the client, make the modification there, and set it back, which greatly increases the number of network round trips and the volume of data transferred. In Redis, these complex operations are usually as efficient as an ordinary GET/SET, so if you need the cache to support more complex structures and operations, Redis is a good choice.

Memory efficiency: for simple key-value storage, Memcached has higher memory utilization; but if Redis uses its hash structure for key-value storage, its memory utilization is higher than Memcached's, thanks to the compact encoding it applies to small hashes.

Performance: because Redis executes commands on a single core while Memcached can use multiple cores, on average Redis has higher per-core performance than Memcached when storing small values. For values above roughly 100 KB, Memcached outperforms Redis; and although Redis has since optimized its handling of large values, it still lags slightly behind Memcached there.

The specific reasons for the above conclusions are as follows:

1. Differences in supported data types

Unlike Memcached, which supports only simple key-value records, Redis supports a much richer range of data types. The five most commonly used are String, Hash, List, Set, and Sorted Set. Internally, Redis uses a redisObject structure to represent every key and value; its main fields are type, encoding, and vm.

type records the concrete data type of the value object, and encoding records how that type is stored inside redis. For example, type=string means the value holds an ordinary string, and the corresponding encoding can be raw or int. With int, redis stores and represents the string internally as a number, provided the string itself can be represented numerically, e.g. "123" or "456". The vm field only has memory allocated when Redis's virtual memory feature is turned on, which it is not by default (the feature was later removed from Redis entirely).
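As a rough illustration of the raw-vs-int decision described above, here is a small Python sketch. The function name and simplified rules are my own; real redis also has an embstr encoding for short strings and stricter 64-bit integer parsing:

```python
def choose_string_encoding(value: str) -> str:
    """Pick a (simplified) redis string encoding: strings that are
    canonical base-10 integers fitting in a signed 64-bit word get
    'int'; everything else falls back to 'raw'."""
    try:
        n = int(value)
    except ValueError:
        return "raw"
    # reject leading zeros, '+5', etc. by requiring a canonical form,
    # and require the number to fit in a signed 64-bit long
    if str(n) == value and -(2**63) <= n < 2**63:
        return "int"
    return "raw"
```

For example, "123" and "456" would be stored as numbers, while "hello", "12.5", or "0123" stay as raw strings.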

1) String

Common commands: set/get/decr/incr/mget, etc.

Application scenario: String is the most commonly used data type, and ordinary key/value storage can be classified as this type.

Implementation: a String is stored inside redis as a character string by default, referenced by a redisObject. When it meets operations such as incr and decr, it is converted to a number for the calculation, and in that case the encoding field of the redisObject is int.
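The set/get/incr behavior can be mimicked with a toy store. MiniStringStore is a hypothetical helper for illustration, not part of redis; like real INCR, a missing key starts from 0 and a non-integer value raises an error:

```python
class MiniStringStore:
    """Toy model of redis String commands: values are kept as strings
    and only converted to integers for incr/decr."""
    def __init__(self):
        self._data = {}  # key -> str value

    def set(self, key, value):
        self._data[key] = str(value)

    def get(self, key):
        return self._data.get(key)

    def incr(self, key):
        # like INCR: absent keys count from 0; int() raises on
        # non-numeric values, mirroring redis's type error
        value = int(self._data.get(key, "0")) + 1
        self._data[key] = str(value)
        return value
```

For example, incrementing a fresh key yields 1, and incrementing a key set to 10 yields 11.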

2) Hash

Common commands: hget/hset/hgetall, etc.

Application scenario: suppose we need to store a user-information object with fields such as user ID, user name, age, and birthday, and we want to fetch the user's name, age, or birthday by user ID.

Implementation: a Redis Hash is in fact a Value stored internally as a HashMap, with direct access to that Map's members. The Key is the user ID and the Value is a Map whose keys are the member's attribute names and whose values are the attribute values. Data can then be modified and read directly via the inner Map's key (called a field in Redis), i.e. the attribute data is addressed by key (user ID) + field (attribute name). The HashMap currently has two implementations: when it has relatively few members, Redis saves memory by using a compact layout similar to a one-dimensional array instead of a real HashMap, and the encoding of the value's redisObject is zipmap (ziplist, and later listpack, in newer Redis versions); when the member count grows, it is automatically converted to a real HashMap and the encoding becomes ht.
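The key + field addressing can be sketched like this. MiniHashStore is an illustrative name of my own; real redis stores field values as strings:

```python
class MiniHashStore:
    """Toy model of redis Hash commands: each key maps to an inner
    dict of field -> value, addressed as key + field."""
    def __init__(self):
        self._data = {}  # key -> {field: value}

    def hset(self, key, field, value):
        self._data.setdefault(key, {})[field] = value

    def hget(self, key, field):
        # missing key or field yields None, like a nil reply
        return self._data.get(key, {}).get(field)

    def hgetall(self, key):
        return dict(self._data.get(key, {}))
```

Storing a user object this way lets a single attribute (say, the age) be read or changed without deserializing the whole object.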

3) List

Common commands: lpush/rpush/lpop/rpop/lrange, etc.

Application scenarios: the Redis list has many uses and is one of Redis's most important data structures. For example, Twitter's following list and follower list can both be implemented with Redis's list structure.

Implementation: the Redis list is implemented as a doubly linked list, so it supports reverse lookup and traversal, which makes it convenient to operate, at the cost of some extra memory overhead. Many internal parts of Redis, including the send buffer queue, also use this data structure.
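The double-ended behavior of lpush/rpush/lpop/rpop maps naturally onto Python's collections.deque; this sketch (class and method names are mine) also models LRANGE's inclusive stop index, though for simplicity it only handles -1 among the negative indices:

```python
from collections import deque

class MiniListStore:
    """Toy model of redis List commands backed by a deque,
    which gives O(1) pushes and pops at both ends."""
    def __init__(self):
        self._data = {}  # key -> deque

    def lpush(self, key, value):
        self._data.setdefault(key, deque()).appendleft(value)

    def rpush(self, key, value):
        self._data.setdefault(key, deque()).append(value)

    def lpop(self, key):
        return self._data[key].popleft()

    def rpop(self, key):
        return self._data[key].pop()

    def lrange(self, key, start, stop):
        # like LRANGE, stop is inclusive; -1 means "to the end"
        items = list(self._data.get(key, deque()))
        return items[start:] if stop == -1 else items[start:stop + 1]
```

A timeline, for instance, would lpush new entries and lrange(key, 0, -1) to read them newest-first.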

4) Set

Common commands: sadd/spop/smembers/sunion, etc.

Application scenario: the functionality Redis set offers to the outside is similar to that of list, except that a set de-duplicates automatically. When you need to store a list of data without duplicates, set is a good choice; set also provides an important interface that list cannot: testing whether a member is in the collection.

Implementation: a set is implemented internally as a HashMap whose values are always null. It de-duplicates quickly by hashing the members, which is also why a set can test membership efficiently.
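The values-are-always-null trick can be shown directly with a Python dict (MiniSet is an illustrative name; like real SADD, adding a member returns 1 if it was new and 0 if it already existed):

```python
class MiniSet:
    """A set modeled as a hash whose values are all None,
    mirroring redis's internal representation."""
    def __init__(self):
        self._members = {}  # member -> None

    def sadd(self, member):
        if member in self._members:
            return 0  # duplicate: de-duplicated automatically
        self._members[member] = None
        return 1

    def sismember(self, member):
        # membership test is just a hash lookup
        return member in self._members

    def smembers(self):
        return set(self._members)
```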

5) Sorted Set

Common commands: zadd/zrange/zrem/zcard, etc.

Application scenario: a sorted set is used much like a set, except that a set is unordered while a sorted set orders its members by an extra score parameter supplied on insertion, keeping them sorted automatically. When you need an ordered, duplicate-free collection, choose the sorted set. For example, Twitter's public timeline can be stored with the publication time as the score, so that fetching it returns entries automatically sorted by time.

Implementation: internally, a Redis sorted set uses both a HashMap and a skip list (SkipList) to store the data and keep it ordered. The HashMap holds the member-to-score mapping, while the skip list holds all the members, sorted by the scores stored in the HashMap. Using a skip list gives fairly high search efficiency while remaining relatively simple to implement.
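The two-structure design can be modeled with a dict plus a list kept sorted via bisect, standing in for the HashMap and skip list respectively (MiniSortedSet and its methods are illustrative names; a real skip list gives O(log n) inserts where insort is O(n)):

```python
import bisect

class MiniSortedSet:
    """Toy model of redis Sorted Set: a member->score map (HashMap
    role) plus a list ordered by (score, member) (skip-list role)."""
    def __init__(self):
        self._scores = {}  # member -> score
        self._sorted = []  # [(score, member)] kept in order

    def zadd(self, member, score):
        if member in self._scores:
            # remove the old (score, member) entry before re-inserting
            old = (self._scores[member], member)
            self._sorted.pop(bisect.bisect_left(self._sorted, old))
        self._scores[member] = score
        bisect.insort(self._sorted, (score, member))

    def zrange(self, start, stop):
        # stop is inclusive, like ZRANGE; -1 means "to the end"
        span = self._sorted[start:] if stop == -1 else self._sorted[start:stop + 1]
        return [member for _, member in span]
```

Re-adding an existing member with a new score moves it to its new position, which is exactly the timeline use case above.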

2. Differences in memory management mechanisms

Not all Redis data necessarily stays in memory at all times, and this is the biggest difference from Memcached. (What follows describes Redis's virtual memory feature, which is off by default and was deprecated and later removed from Redis.) When physical memory runs out, Redis can swap values that have not been used for a long time to disk, while keeping all keys cached in memory. If Redis finds that memory usage exceeds a certain threshold, it triggers a swap operation and decides which keys' values to move to disk based on "swappability = age * log(size_in_memory)". The values for the chosen keys are then persisted to disk and cleared from memory. This feature lets Redis hold more data than the machine's own memory, although that memory must still be able to hold all the keys, since keys are never swapped.

When Redis swaps in-memory data to disk, the main thread serving requests and the child thread performing the swap share that memory, so if data being swapped needs to be updated, Redis blocks the operation until the child thread completes the swap. When a read hits a key whose value is not in memory, Redis must load the data from the swap file before returning it to the requester, and here the I/O thread pool comes into play. By default, Redis blocks until all the needed swap files have been loaded. This strategy suits batch operations with a small number of clients, but it clearly cannot cope with the high concurrency of a large website. So when running Redis in that mode, we set the size of the I/O thread pool so that read requests that must load data from swap files can be handled concurrently, reducing blocking time.

For memory-based database systems such as Redis and Memcached, the efficiency of memory management is a key factor in system performance. The malloc/free functions of traditional C are the most common way to allocate and release memory, but they have serious drawbacks: first, mismatched malloc and free calls easily cause memory leaks; second, frequent calls produce large amounts of memory fragmentation that cannot be reclaimed, reducing memory utilization; finally, allocations can fall through to actual system calls, whose overhead is much higher than that of ordinary function calls. Therefore, to manage memory efficiently, high-performance systems do not use raw malloc/free directly. Redis and Memcached both use custom memory management mechanisms, but their implementations differ greatly; both are described below.

By default, Memcached uses a Slab Allocation mechanism to manage memory. Its main idea is to pre-divide allocated memory into blocks of specific lengths, chosen according to predetermined sizes, to hold key-value records of the corresponding length, which completely avoids memory fragmentation. The Slab Allocation system is designed only for storing external data: all key-value data lives in it, while Memcached's other memory requests go through ordinary malloc/free, since their number and frequency mean they do not affect the performance of the system as a whole.

The principle of Slab Allocation is quite simple: Memcached first requests a large block of memory from the operating system, splits it into chunks of various sizes, and groups chunks of the same size into Slab Classes. A chunk is the smallest unit used to store key-value data. The size ratio between successive Slab Classes is controlled by a Growth Factor set when Memcached starts. Assuming a Growth Factor of 1.25, if the chunk size of the first class is 88 bytes, the chunk size of the second class is 112 bytes, and so on.
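The Growth Factor arithmetic above can be reproduced with a short sketch. The 88-byte starting size and 1.25 factor mirror the article's example; the 8-byte alignment matches Memcached's behavior, while max_chunk and the function names are illustrative:

```python
def slab_class_sizes(first_chunk=88, growth_factor=1.25,
                     max_chunk=1024, align=8):
    """Derive slab-class chunk sizes: each class is the previous
    one times the growth factor, rounded up to the alignment."""
    sizes = [first_chunk]
    while True:
        nxt = int(sizes[-1] * growth_factor)
        nxt = (nxt + align - 1) // align * align  # round up to 8 bytes
        if nxt > max_chunk:
            break
        sizes.append(nxt)
    return sizes

def pick_chunk(sizes, item_size):
    """Choose the smallest slab class whose chunk fits the item."""
    for size in sizes:
        if size >= item_size:
            return size
    raise ValueError("item too large for any slab class")
```

With these defaults the first classes come out as 88, 112, 144, ... bytes, matching the 88-to-112 step in the text.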

When Memcached receives data from a client, it first picks the most appropriate Slab Class based on the size of the data, then consults that class's free-chunk list to find a chunk that can hold the data. When a data record expires or is discarded, the chunk it occupied is reclaimed and returned to the free list.

From the process above we can see that Memcached's memory management is efficient and causes no memory fragmentation, but its biggest disadvantage is wasted space. Because each chunk is allocated with a fixed length, variable-length data cannot use the space fully: caching 100 bytes of data in a 128-byte chunk wastes the remaining 28 bytes.

Redis's memory management is implemented mainly in two source files, zmalloc.h and zmalloc.c. To make management easier, after allocating a block of memory Redis stores the block's size at the head of the block. Here real_ptr is the pointer returned by malloc; Redis writes the block size into the header (which itself occupies a known sizeof(size_t) bytes) and returns ret_ptr, which points just past the header. When memory needs to be freed, ret_ptr is passed to the memory manager; from ret_ptr the program can easily compute real_ptr, and real_ptr is then passed to free to release the memory.
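The size-in-header trick can be simulated in a few lines. This is a Python model of the idea only; the real code is C in zmalloc.c, and the struct format 'N' stands in for size_t here:

```python
import struct

HEADER = struct.calcsize("N")  # bytes taken by the size_t-like prefix

def zmalloc_sim(size):
    """Simulate zmalloc: allocate header + payload and record the
    payload size in the header; the caller uses buf[HEADER:]."""
    buf = bytearray(HEADER + size)
    struct.pack_into("N", buf, 0, size)
    return buf

def zfree_size(buf):
    """Recover the payload size from the header, as zfree does to
    keep the used_memory accounting accurate."""
    (size,) = struct.unpack_from("N", buf, 0)
    return size
```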

Redis records all memory allocations in an array of length ZMALLOC_MAX_ALLOC_STAT. Each element holds the number of allocated memory blocks of the size given by its index; in the source this array is zmalloc_allocations, so zmalloc_allocations[16] is the number of 16-byte blocks allocated so far. A static variable used_memory in zmalloc.c records the total amount of memory currently allocated. Overall, then, Redis simply uses wrapped malloc/free, which is much simpler than Memcached's memory management approach.

3. Differences in data persistence support

Although Redis is a memory-based storage system, it supports persisting in-memory data and provides two main persistence strategies: RDB snapshots and AOF logs. Memcached does not support data persistence.

1) RDB Snapshot

Redis supports persisting a snapshot of the current data into a data file, i.e. an RDB snapshot. But how does a database under continuous writes generate a snapshot? Redis relies on the copy-on-write semantics of fork: to generate a snapshot, it forks the current process into a child, and the child iterates over all the data and writes it into an RDB file. The timing of RDB snapshots is configured with Redis's save directive, e.g. generate a snapshot every 10 minutes, or after 1000 writes, and multiple rules can be combined. The rules are defined in the Redis configuration file, and they can also be set at runtime with the CONFIG SET command, without restarting Redis.
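In the configuration file the snapshot rules are written as save lines; for example, the classic defaults (the thresholds are tunable and shown here for illustration):

```
# snapshot if at least 1 key changed within 900 s,
# 10 keys within 300 s, or 10000 keys within 60 s
save 900 1
save 300 10
save 60 10000
```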

A Redis RDB file will not become corrupted, because its writes happen in a fresh process: when generating a new RDB file, the child process Redis spawns first writes the data to a temporary file and then moves the temporary file onto the RDB file with an atomic rename system call, so a valid RDB file exists at any moment of failure. The RDB file is also part of the internal implementation of Redis master-slave synchronization. RDB does have a shortcoming: once the database has a problem, the data saved in the RDB file is not up to date, and everything written between the last snapshot and the moment Redis went down is lost. In some businesses this is tolerable.

2) AOF log

AOF stands for append-only file, an append-only log file. Unlike a typical database binlog, the AOF file is recognizable plain text; its contents are standard Redis commands. Only commands that change data are appended to the AOF file. Since every data-modifying command produces a log entry, the AOF file keeps growing, so Redis provides another feature called AOF rewrite. Its job is to regenerate the AOF file so that each record's operation appears only once, unlike the old file, which may record multiple operations on the same value. Its generation is similar to RDB: a forked child walks the data directly and writes a new temporary AOF file. While the new file is being written, all write logs are still appended to the old AOF file and also recorded in an in-memory buffer. When the rewrite finishes, the buffered logs are written into the temporary file in one go, and an atomic rename call then replaces the old AOF file with the new one.

AOF is a file-write operation whose purpose is to put the operation log on disk, so it also encounters the write process we discussed above. After Redis calls write to append to the AOF, the appendfsync option controls when fsync is called to flush the data to disk. The three appendfsync settings below offer progressively stronger durability.

appendfsync no: Redis never actively calls fsync to flush the AOF contents to disk; flushing depends entirely on the operating system's scheduling. On most Linux systems this means the data in the page cache is written back roughly every 30 seconds.

appendfsync everysec: Redis makes an fsync call once per second by default, flushing the buffered data to disk. But if an fsync takes longer than one second, Redis adopts a delayed-fsync strategy and waits one more second: the next fsync happens two seconds after the previous one, and that fsync runs no matter how long it takes. Because the file descriptor is blocked during fsync, the current write operation blocks too. The conclusion: in the vast majority of cases Redis fsyncs once per second; in the worst case, once every two seconds. Most database systems call this technique group commit: combine several write operations and flush their log to disk in a single pass.

appendfsync always: fsync is called for every write operation. The data is safest, but because every write pays for an fsync, performance is clearly affected.
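In redis.conf the choice looks like this (only one of the three directives is active at a time; everysec is the common default):

```
# appendfsync always   # fsync on every write: safest, slowest
appendfsync everysec   # fsync about once per second: the usual compromise
# appendfsync no       # leave flushing to the OS: fastest, least safe
```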

For general business requirements, it is recommended to use RDB for persistence because the overhead of RDB is much lower than that of AOF logs. For those applications that cannot bear data loss, it is recommended to use AOF logs.

4. Differences in cluster management

Memcached is a purely in-memory data caching system; Redis supports persistence, but operating fully in memory is still the essence of its high performance. For a memory-based storage system, the machine's physical memory caps the amount of data the system can hold. If the amount of data to process exceeds the physical memory of a single machine, a distributed cluster must be built to expand storage capacity.

Memcached itself does not support distributed operation, so distributed storage of Memcached can only be realized on the client side, through distributed algorithms such as consistent hashing. Before the client sends data to the Memcached cluster, it first computes the data's target node with its built-in distributed algorithm, then sends the data directly to that node for storage. Likewise, when querying, the client computes which node holds the data and sends the query request directly to that node to obtain it.

In contrast to Memcached, which can only be distributed on the client side, Redis prefers to build distributed storage on the server side, and recent versions already support it. Redis Cluster is the distributed flavor of Redis that tolerates single points of failure; it has no central node and scales linearly. In Redis Cluster, nodes communicate with each other over a binary protocol, while nodes and clients communicate over the text-based protocol. For data placement, Redis Cluster divides the whole key space into 16384 hash slots (early design drafts used 4096), and each node stores one or more hash slots. The distributed algorithm Redis Cluster uses is also simple: crc16(key) % HASH_SLOTS_NUMBER.
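The slot computation can be sketched directly. Released Redis Cluster uses 16384 slots and the CRC16-CCITT (XMODEM) checksum; this sketch omits hash-tag handling, where only the part of the key between { and } is hashed:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): poly 0x1021, initial value 0,
    no bit reflection -- the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

HASH_SLOTS = 16384

def key_slot(key: str) -> int:
    """Map a key to its hash slot, as in crc16(key) % slot count."""
    return crc16(key.encode()) % HASH_SLOTS
```

Every client and node computes the same slot for a given key, so any of them can route a request to the node owning that slot.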

To keep data available under single-node failures, Redis Cluster introduces Master and Slave nodes. In a Redis Cluster deployment where each Master node has two corresponding Slave nodes for redundancy, the failure of any two nodes in the entire cluster will not make data unavailable. When a Master node goes down, the cluster automatically promotes one of its Slave nodes to become the new Master.

That is the content of "what is the difference between redis and Memcached". After studying this article, you should have a deeper understanding of the differences between Redis and Memcached; their concrete use still needs to be verified in practice. Thank you for reading.
