

What is the difference between redis and memcached

2025-01-16 Update · SLTechnology News & Howtos · Database


Shulou (Shulou.com), 05/31 report

This article explains in detail the differences between Redis and Memcached. I hope it leaves you with a solid understanding of the topic.

Problems with the traditional MySQL + Memcached architecture

MySQL is well suited to massive data storage, and many companies have used this architecture: hot data is loaded into Memcached to accelerate access. However, as business data volume and traffic keep growing, a number of problems appear:

1. MySQL requires continual sharding of databases and tables, and Memcached must be expanded along with it; expansion and maintenance consume a great deal of development time.

2. Keeping data consistent between Memcached and the MySQL database is hard.

3. When the Memcached hit rate drops or a node goes down, a flood of requests penetrates straight through to the database, and MySQL cannot sustain the load.

4. Cache synchronization across data centers is a problem.

NoSQL products are blossoming everywhere; how do you choose?

In recent years many kinds of NoSQL products have emerged, so how to use them correctly, maximizing their strengths, is something we must study and think about carefully. Ultimately, the key is to understand each product's positioning and its tradeoffs, so that in practice we can play to its strengths and avoid its weaknesses. Broadly, these NoSQL products address the following problems:

1. Small data volume, high-speed reads and writes. Products in this class keep all data in memory to guarantee fast access, while also offering persistence to disk. This is in fact Redis's most important use case.

2. Massive data storage, with distributed-system support, data-consistency guarantees, and easy addition and removal of cluster nodes.

3. The most representative designs here follow the approaches described in the Dynamo and Bigtable papers. The former is a fully decentralized design, where cluster information is propagated between nodes via gossip and data is eventually consistent; the latter is a centralized design that guarantees strong consistency through a distributed lock service. Writes first go to memory and a redo log, and periodic compaction merges them to disk, turning random writes into sequential writes to improve write performance.

4. Schema-free storage, auto-sharding, and so on. For example, common document databases are schema-free, store data directly in JSON format, and support features such as auto-sharding; MongoDB is the typical example.

In the face of these different types of NoSQL products, we need to choose the most appropriate product according to our business scenario.

Which scenarios suit Redis, and how to use it correctly

As analyzed above, Redis is best suited to scenarios where all data fits in memory. Although Redis offers persistence, it is really more of a disk-backed feature than persistence in the traditional sense, which may leave you wondering: Redis looks like an enhanced Memcached, so when should you use Memcached and when Redis?

If you simply compare Redis and Memcached, most people arrive at the following points:

1 Redis supports far more data types than simple key/value, such as lists, sets, sorted sets, and hashes.

2 Redis supports data backup, that is, backup in master-slave mode.

3 Redis supports data persistence: in-memory data can be kept on disk and reloaded after a restart.

In Redis, not all data necessarily stays in memory at all times; this is the biggest difference from Memcached. (This describes Redis's virtual-memory feature, discussed again below.) Redis keeps all key information cached in memory. If Redis finds memory usage has crossed a threshold, it triggers a swap: it computes which keys' values should be swapped to disk according to "swappability = age * log(size_in_memory)", persists those values to disk, and clears them from memory. This lets Redis hold more data than the machine's memory; the memory must still be able to hold all keys, since keys are never swapped. While a sub-thread swaps data to disk, it shares that memory with the main serving thread, so if a value being swapped needs to be updated, Redis blocks the operation until the sub-thread finishes the swap.
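To make the eviction formula concrete, here is a minimal pure-Python sketch. The function mirrors the swappability formula quoted above; the key names and ages are hypothetical, and this is not a real Redis API (the VM feature itself was later removed from Redis):

```python
import math

def swappability(age_seconds: float, size_in_memory: int) -> float:
    """Priority score from the formula above: given equal sizes, the value
    idle longer is swapped first; given equal ages, the larger value wins."""
    return age_seconds * math.log(size_in_memory)

# Two hypothetical values of equal size but different idle times:
candidates = {
    "session:42": swappability(age_seconds=600, size_in_memory=4096),
    "session:7":  swappability(age_seconds=60,  size_in_memory=4096),
}
victim = max(candidates, key=candidates.get)   # swapped to disk first
```

The long-idle `session:42` scores ten times higher and would be chosen as the swap victim.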

A comparison before and after enabling Redis's VM memory model:

VM off: 300k keys, 4096-byte values: 1.3 GB used
VM on: 300k keys, 4096-byte values: 73 MB used
VM off: 1 million keys, 256-byte values: 430.12 MB used
VM on: 1 million keys, 256-byte values: 160.09 MB used
VM on: 1 million keys, values as large as you want, still: 160.09 MB used

When Redis reads a key whose value is not in memory, it must load the value from the swap file before returning it to the requester, which raises the question of the I/O thread pool. By default Redis blocks: it serves no other requests until the value has been loaded from the swap file. This strategy suits batch operations with a small number of clients, but if Redis backs a large website with heavy concurrency it clearly falls short. We can therefore set the size of Redis's I/O thread pool so that read requests needing data from the swap file are handled concurrently, reducing blocking time.

If you want to use Redis well in an environment with a lot of data, I believe it is essential to understand the memory design and blocking conditions of Redis.

Additional knowledge points:

Comparison between memcached and redis

1. Network I/O model

Memcached uses a multi-threaded, non-blocking I/O multiplexing network model, split into a listening main thread and worker sub-threads. The main thread listens for network connections; once a connection is accepted, its descriptor is passed over a pipe to a worker thread, which performs the read/write I/O. The network layer uses the event library libevent. The multi-threaded model exploits multiple cores, but introduces cache-coherency and locking problems. For example, stats, Memcached's most commonly used command, maintains global state: every operation must lock the global variables to update the counts, which costs performance.

(Memcached network IO model)

Redis uses a single-threaded I/O multiplexing model and wraps a simple AeEvent event-handling framework that mainly implements epoll, kqueue, and select. For plain I/O operations, a single thread maximizes speed; but Redis also provides some computation, such as sorting and aggregation, and for those the single-threaded model seriously hurts overall throughput: while the CPU is computing, the entire event loop is blocked.

2. Memory management

Memcached uses a pre-allocated memory pool managed with slabs and chunks of various sizes; an item is stored in the chunk class that best fits its size. The memory-pool approach saves the overhead of allocating and freeing memory and reduces fragmentation, but wastes some space, and new data may be evicted even while plenty of memory remains free. The reasons are explained in Tim Yang's article: http://timyang.net/data/Memcached-lru-evictions/
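A small pure-Python sketch of the slab idea above (the starting chunk size, growth factor, and item sizes here are illustrative, not Memcached's exact defaults): chunk-size classes grow geometrically by a factor (Memcached's -f flag), and each item lands in the smallest chunk that fits, the slack being the wasted space:

```python
def chunk_sizes(min_chunk: int = 88, growth_factor: float = 1.25,
                max_item: int = 1024 * 1024):
    """Build chunk-size classes the way a slab allocator does at startup:
    each class is growth_factor times the previous one."""
    sizes, size = [], min_chunk
    while size < max_item:
        sizes.append(size)
        size = int(size * growth_factor)
    return sizes

def pick_chunk(item_size: int, sizes):
    """An item goes into the smallest chunk class that fits it."""
    for s in sizes:
        if item_size <= s:
            return s
    raise ValueError("item larger than any chunk class")

sizes = chunk_sizes()            # 88, 110, 137, 171, ...
chunk = pick_chunk(100, sizes)   # a 100-byte item fits in the 110-byte class
waste = chunk - 100              # 10 bytes of internal waste
```

This shows why a pool trades some space for allocation speed: the 100-byte item occupies a 110-byte chunk.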

Redis allocates memory on demand to store data and rarely uses a free list to optimize allocation, so some fragmentation occurs. Depending on the storage command's parameters, Redis keeps data that carries an expiration time separately and treats it as temporary; non-temporary data is never removed, even when physical memory runs out. Consequently, Redis's swap never evicts non-temporary data (though it may try to evict temporary data), which makes Redis more storage-like than cache-like.

3. Data consistency problem

Memcached provides the cas command to keep concurrent accesses to the same item consistent. Redis offers no cas command and gives no such guarantee; instead, it provides transactions, which ensure that a sequence of commands executes atomically without being interleaved with other operations.
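To illustrate the cas semantics mentioned above, here is a toy pure-Python store (MiniCache is a made-up class, not a real client library): every write bumps a version number, and cas succeeds only if the caller's version is still current, i.e. no concurrent writer got in first:

```python
class MiniCache:
    """Toy in-memory store mimicking Memcached's gets/cas semantics."""
    def __init__(self):
        self._data = {}   # key -> (value, version)

    def set(self, key, value):
        _, ver = self._data.get(key, (None, 0))
        self._data[key] = (value, ver + 1)

    def gets(self, key):
        return self._data[key]          # (value, version), like gets

    def cas(self, key, value, version):
        _, current = self._data[key]
        if current != version:
            return False                 # someone else wrote in between
        self._data[key] = (value, current + 1)
        return True

c = MiniCache()
c.set("counter", 1)
value, ver = c.gets("counter")           # read value and its version
c.set("counter", 5)                      # a concurrent writer sneaks in
ok = c.cas("counter", value + 1, ver)    # our stale update is rejected
```

The stale cas fails, leaving the concurrent writer's value of 5 intact, which is exactly the lost-update protection cas provides.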

4. Storage mode and other aspects

Memcached supports essentially only simple key/value storage; it offers no enumeration, no persistence, and no replication.

Beyond key/value, Redis supports many data structures such as list, set, sorted set, and hash, and provides KEYS for enumeration. KEYS, however, cannot safely be used online; if you need to enumerate data, Redis provides tools that scan its dump file directly and enumerate everything. Redis also provides persistence and replication.
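The reason KEYS is dangerous online is that it walks the whole keyspace in one blocking call on the single-threaded server; later Redis versions added SCAN, which iterates a few keys per call. Here is a simplified pure-Python sketch of the cursor idea (real SCAN walks hash-table buckets with a reverse-binary cursor; this sketch just slices a stable snapshot, and the store contents are illustrative):

```python
def scan(store: dict, cursor: int, count: int = 2):
    """Return up to `count` keys plus the next cursor; cursor 0 means done."""
    keys = sorted(store)                 # stable order for the sketch
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    return (0 if next_cursor >= len(keys) else next_cursor), batch

store = {"user:1": "a", "user:2": "b", "user:3": "c",
         "user:4": "d", "user:5": "e"}
cursor, seen = 0, []
while True:
    cursor, batch = scan(store, cursor)
    seen.extend(batch)                   # server stays responsive between calls
    if cursor == 0:
        break
```

Each call does a small bounded amount of work, so other clients are served between calls, unlike one monolithic KEYS.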

5. About client support in different languages

Both Memcached and Redis have a rich choice of third-party clients in many languages. Because Memcached has been around longer, many of its clients are more mature and stable, whereas Redis clients sometimes lag behind: its protocol is more complex than Memcached's, and the author keeps adding new features, so you may occasionally need to modify a third-party client to make the best use of Redis.

From the comparison above it is easy to see: when you do not want data to be evicted, when you need data types beyond key/value, or when you need persistence, Redis is the better choice over Memcached.

About some peripheral functions of Redis

Beyond storage, Redis provides other features such as aggregate computation, pub/sub, and scripting. To use them correctly you must understand how they are implemented and know their limits. For example, pub/sub has no persistence at all: any messages published while a consumer's connection is briefly broken or reconnecting are lost. Features such as aggregation and scripting are constrained by Redis's single-threaded model and cannot achieve high throughput, so use them with care.

Overall, the Redis author is a very diligent developer, and we often see him trying out all kinds of new ideas; features of this sort deserve extra scrutiny before you rely on them.

Summary:

1. Redis works best when all data fits in memory.

2. Redis is most often used as an enhanced replacement for Memcached.

3. Redis is the better fit when you need data types beyond key/value.

4. Redis is the better fit when stored data must not be evicted.

Talking about Memcached and Redis

1. Brief introduction to Memcached

Memcached is a high-performance distributed memory cache server originally developed by Brad Fitzpatrick of Danga Interactive for LiveJournal. It is essentially an in-memory key-value database, but it does not support persistence: all data is lost when the server shuts down. Memcached is written in C and runs on most POSIX systems such as Linux, BSD, and Solaris as long as libevent is installed; an unofficial Windows build is also available (http://code.jellycan.com/memcached/). Memcached has client implementations in many languages, including C/C++, PHP, Java, Python, Ruby, Perl, Erlang, and Lua. It is widely deployed: besides LiveJournal, users include Wikipedia, Flickr, Twitter, YouTube, and WordPress.

On Windows, installing Memcached is easy: download the executable from the address above and run memcached.exe -d install. On systems such as Linux, first install libevent, then build from source with make && make install. By default the Memcached server binary is installed into /usr/local/bin. When starting Memcached we can pass various startup parameters.

1.1 Memcache configuration

The Memcached server takes its key parameters on the command line at startup. Let's look at which parameters matter and what they do.

1) -p  TCP port Memcached listens on; default 11211.

2) -U  UDP port Memcached listens on; default 11211; 0 disables UDP.

3) -s  path of the UNIX socket Memcached listens on.

4) -a  octal permission mask for the UNIX socket; default 0700.

5) -l  IP address to listen on; defaults to all interfaces.

6) -d  run the Memcached server as a daemon.

7) -r  maximize the core file size limit.

8) -u  user to run Memcached as; required if currently root.

9) -m  memory allocated to Memcached, in MB.

10) -M  return an error when memory is exhausted instead of evicting data with the LRU algorithm.

11) -c  maximum number of concurrent connections; default 1024.

12) -v / -vv / -vvv  verbosity of server-side messages: -v prints only errors and warnings; -vv additionally prints client commands and responses; -vvv additionally prints internal state-transition information.

13) -f  growth factor for chunk sizes.

14) -n  minimum chunk size; default 48 bytes.

15) -t  number of threads the Memcached server uses; default 4.

16) -L  try to use large memory pages.

17) -R  maximum number of requests per event; default 20.

18) -C  disable CAS; CAS adds 8 bytes of overhead per item.
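Putting the flags above together, a typical startup command might look like this (the memory size, thread count, IP, and user name are illustrative values, not recommendations):

```shell
# Run as a daemon with 1 GB of memory, 4 worker threads,
# listening on one interface, as the "memcache" user:
memcached -d -m 1024 -t 4 -l 192.168.1.10 -p 11211 -u memcache -c 2048
```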

2. Introduction to Redis

Redis is an open-source key-value storage system. Like Memcached, it keeps most data in memory, and it supports data types such as strings, hash tables, linked lists, sets, and sorted sets, along with operations on them. Redis is written in C and runs on most POSIX systems such as Linux, BSD, and Solaris with no external dependencies. Client support is also very rich: C, C#, C++, Objective-C, PHP, Python, Java, Perl, Lua, Erlang, and other common languages all have clients for accessing a Redis server. Redis is already widely used, by Sina and Taobao in China and by Flickr and GitHub abroad, among others.

Installing Redis is easy: get the source from http://redis.io/download, then make && make install. By default the Redis server and client programs are installed in /usr/local/bin. When starting the Redis server we need to give it a configuration file; by default this is redis.conf in the Redis source directory.

2.1 Redis profile

To get a direct feel for how Redis works, let's first look at the main parameters defined in Redis's configuration file and what they do.

1) daemonize no — by default Redis does not run in the background; change this to yes if you want it daemonized.

2) pidfile /var/run/redis.pid — when running in the background, Redis writes its pid to /var/run/redis.pid by default; you can configure another path. When running multiple redis services, specify different pid files and ports.

3) port 6379 specifies the port on which redis is running. The default is 6379.

4) bind 127.0.0.1 — makes redis accept requests only on the specified IP address; if unset, requests on all interfaces are processed. It is best to set this in production.

5) loglevel debug — specifies the logging level. Redis supports four levels: debug, verbose, notice, and warning; the default is verbose. debug records a great deal of information, for development and testing; verbose records useful information, but less than debug; notice is the normal level, often used in production; warning records only very important or serious messages.

6) logfile /var/log/redis/redis.log — log file path; the default value is stdout. In daemon mode, without a configured file, output goes to /dev/null.

7) databases 16 — number of available databases; the default is 16. The default database is 0, and the range is 0 to (databases - 1).

8) save 900 1 — saves data to disk in the format save <seconds> <changes>: after the given number of seconds, if at least that many updates have occurred, the data is synchronized to the rdb data file. This is condition-triggered snapshotting, and multiple conditions can be combined. save 900 1 means: if at least one key changed within 900 seconds, save the data to disk.

9) rdbcompression yes — whether to compress data when persisting to the local rdb file; default yes.

10) dbfilename dump.rdb local persistent database file name. Default is dump.rdb.

11) dir ./ — working directory: the path where database snapshot files are placed. The path and file name must be configured separately because, during a backup, Redis first writes the current database state to a temporary file and then, when the backup completes, renames the temporary file over the file configured above. Both the temporary file and the backup file live in this directory, as do AOF files. Note that this must be a directory, not a file.

12) slaveof <masterip> <masterport> — master-slave replication: makes this instance a slave of another database. Set it on the slave machine with the master's IP address and port; on startup, Redis automatically synchronizes data from the master.

13) masterauth <master-password> — the password the slave uses to connect to the master when the master is password-protected (via requirepass).

14) When the slave loses its connection to the master, or replication is still in progress, the slave can behave in one of two ways: if slave-serve-stale-data is yes (the default), the slave keeps serving client requests; if it is no, every request except INFO and SLAVEOF returns the error "SYNC with master in progress".

15) repl-ping-slave-period 10 — the interval at which the slave sends PING to the master, set via repl-ping-slave-period; the default is 10 seconds.

16) repl-timeout 60 sets the bulk data transfer time or ping reply time interval of the main database. The default value is 60 seconds. Make sure that repl-timeout is greater than repl-ping-slave-period.

17) requirepass foobared — the password a client must supply after connecting, before issuing any other commands. Because Redis is so fast, on a good server an outside attacker can try 150k passwords per second, so pick a very strong password to resist brute force.

18) rename-command CONFIG "" — in a shared environment, relatively dangerous commands can be renamed; for example, rename CONFIG to something hard to guess: rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52. To disable a command entirely, rename it to the empty string: rename-command CONFIG ""

19) maxclients 128 — maximum number of simultaneous client connections; unlimited by default, bounded only by the maximum number of file descriptors the Redis process may open. Setting maxclients 0 means no limit. When the limit is reached, Redis closes new connections and returns the error "max number of clients reached".

20) maxmemory — specifies the maximum memory limit for Redis. Redis loads data into memory at startup; on reaching the limit it first tries to evict keys that have expired or are about to expire and trims empty list objects. If the limit is still exceeded after that, write operations fail while reads continue to work. Note: Redis's vm mechanism kept keys in memory and values in the swap area.

21) maxmemory-policy volatile-lru — which data Redis evicts when memory reaches the limit. There are six choices: volatile-lru removes keys that have an expiration set, using the LRU (Least Recently Used) algorithm; allkeys-lru removes any key using LRU; volatile-random removes a random key among those with an expiration; allkeys-random removes a random key; volatile-ttl removes the key closest to expiring (smallest TTL); and noeviction removes nothing and simply returns an error on writes.

Note: with any of the above policies, if no suitable key can be removed, Redis returns an error on writes.

22) appendonly no — by default Redis asynchronously dumps a database image to disk in the background, but such a dump is time-consuming and cannot run very frequently, so an incident (power loss, unplugging, and so on) can lose a wide range of data. Redis therefore provides a more robust persistence mode: with appendonly enabled, Redis appends every write request it receives to the appendonly.aof file, and on restart it restores the previous state from that file. This makes appendonly.aof grow large, so Redis also supports the BGREWRITEAOF command to compact it. You can enable asynchronous dumps and AOF at the same time.

23) appendfilename appendonly.aof AOF file name, default is "appendonly.aof"

24) appendfsync everysec — Redis supports three strategies for syncing the AOF file: no means never sync and let the operating system flush when it chooses; always means sync after every write; everysec means accumulate writes and sync once per second. The default is everysec, the best compromise between speed and safety.

25) slowlog-log-slower-than 10000 — logs commands that exceed a given execution time. The time measured excludes I/O such as talking to the client or returning the result; it is only the command's execution time. The slow log is controlled by two parameters: slowlog-log-slower-than (in microseconds) sets the threshold above which a command is logged, and slowlog-max-len sets the length of the slow-log queue; when a new command is logged and the queue is full, the oldest entry is dropped. The unit is microseconds, so 1000000 represents one second. Setting the threshold to a negative number disables the slow log, while 0 forces every command to be logged.

26) hash-max-zipmap-entries 512 && hash-max-zipmap-value 64 — when a hash contains no more than the given number of elements and its largest element does not exceed the given threshold, it is stored in a special encoding that greatly reduces memory usage; these two thresholds are set here. A Redis hash is in effect a HashMap inside the value, and there are two implementations: when the hash has few members, Redis saves memory by using a compact storage layout similar to a one-dimensional array rather than a real HashMap, and the encoding of the value's redisObject is zipmap; as members grow, it is automatically converted to a real HashMap, with encoding ht.

27) list-max-ziplist-entries 512 — lists with no more than this many nodes use the compact, pointer-free storage format.

28) list-max-ziplist-value 64 — list nodes whose values are smaller than this many bytes use the compact storage format.

29) set-max-intset-entries 512 — a set whose members are all integers and number no more than this is stored in a compact format.

30) zset-max-ziplist-entries 128 — sorted sets with no more than this many nodes use the compact, pointer-free storage format.

31) zset-max-ziplist-value 64 — sorted-set nodes whose values are smaller than this many bytes use the compact storage format.

32) activerehashing yes — every 100 ms, Redis spends 1 ms of CPU time rehashing its main hash table, which reduces memory usage. If your scenario has strict real-time requirements and cannot accept Redis occasionally adding 2 ms of latency to a request, set this to no; otherwise leave it yes so memory is freed as quickly as possible.
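Pulling the parameters above together, a minimal redis.conf sketch might look like this (the paths, memory limit, and extra save lines are illustrative, not a tuned production configuration):

```
daemonize yes
pidfile /var/run/redis.pid
port 6379
bind 127.0.0.1
loglevel notice
logfile /var/log/redis/redis.log
save 900 1
dbfilename dump.rdb
dir /var/lib/redis
maxmemory-policy volatile-lru
appendonly yes
appendfsync everysec
slowlog-log-slower-than 10000
activerehashing yes
```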

Common data types of Redis

Unlike Memcached, which only supports simple key-value structure data records, Redis supports a much richer range of data types. The five most commonly used data types are String, Hash, List, Set, and Sorted Set. Before describing these data types in detail, let's look at a diagram to see how these different data types are described in Redis's internal memory management.

Figure 1 Redis object

Internally, Redis uses a redisObject to represent every key and value. Its main fields are shown in Figure 1: type records the value's data type, and encoding records how that type is stored inside Redis. For example, type=string means the value is an ordinary string, and its encoding may be raw or int; with int, Redis actually stores and represents the string internally as a number, provided the string can be parsed as one, e.g. "123" or "456". The vm field deserves a note: only when Redis's virtual-memory feature is enabled (it is off by default) does this field actually allocate memory. Figure 1 also shows that representing all key/value data through redisObject wastes some memory; this overhead mainly buys a unified management interface across Redis's data types, and the author also provides several ways to save memory where possible. Let's analyze the use and internal implementation of the five data types one by one.

1) String

Common commands: set/get/decr/incr/mget, etc.

Application scenario: String is the most commonly used data type, and ordinary key/value storage can be classified as this type.

Implementation: a string is stored in Redis as a plain string by default, referenced by a redisObject. When operations such as incr and decr are applied, it is converted to a number for the calculation, and the redisObject's encoding field becomes int.
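A pure-Python sketch of the raw-versus-int encoding idea just described (MiniString and its key names are hypothetical, not the redis-py API): values stay strings until a numeric operation requires an integer interpretation:

```python
class MiniString:
    """Sketch of string encodings: raw for ordinary strings,
    int when the value parses as a number (as with redisObject)."""
    def __init__(self):
        self._data = {}   # key -> (value, encoding)

    def set(self, key, value: str):
        enc = "int" if value.lstrip("-").isdigit() else "raw"
        self._data[key] = (value, enc)

    def get(self, key):
        return self._data[key][0]

    def incr(self, key):
        value, enc = self._data.get(key, ("0", "int"))
        if enc != "int":
            raise ValueError("value is not an integer")   # as INCR errors
        new = str(int(value) + 1)
        self._data[key] = (new, "int")
        return int(new)

s = MiniString()
s.set("page:views", "41")          # numeric string -> int encoding
views = s.incr("page:views")       # arithmetic works: 42
s.set("greeting", "hello")         # non-numeric -> raw encoding
```

Calling incr on "greeting" would raise, matching Redis's behavior of rejecting INCR on a non-numeric string.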

2) Hash

Common commands: hget/hset/hgetall, etc.

Application scenario: we need to store a user information object data, including user ID, user name, age and birthday. Through the user ID, we want to get the user's name or age or birthday.

Implementation: a Redis hash is actually a HashMap stored inside the value, with direct access to the map's members. As shown in Figure 2, the key is the user ID and the value is a map whose keys are the member attribute names and whose values are the attribute values. Data can then be read and modified directly through the inner map's key (called a field in Redis), i.e. through key (user ID) + field (attribute name). The HashMap currently has two implementations: when it has relatively few members, Redis saves memory with a compact layout similar to a one-dimensional array instead of a real HashMap, and the value's redisObject encoding is zipmap; as members grow it is automatically converted to a real HashMap, with encoding ht.

Figure 2 Hash data type of Redis
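A pure-Python sketch of the zipmap-to-hashtable promotion described above (MiniHash is made up, and the threshold is shrunk to 4 for illustration; the real hash-max-zipmap-entries default is 512): small hashes live in a flat list of pairs, and crossing the threshold promotes the structure to a real dict:

```python
class MiniHash:
    """Sketch of Redis's zipmap -> ht encoding promotion."""
    THRESHOLD = 4      # real default is 512; tiny here for illustration

    def __init__(self):
        self.encoding = "zipmap"
        self._pairs = []            # compact array-like representation
        self._map = None

    def hset(self, field, value):
        if self.encoding == "zipmap":
            for i, (f, _) in enumerate(self._pairs):
                if f == field:                       # update in place
                    self._pairs[i] = (field, value)
                    return
            self._pairs.append((field, value))
            if len(self._pairs) > self.THRESHOLD:
                self._map = dict(self._pairs)        # promote to hashtable
                self._pairs, self.encoding = [], "ht"
        else:
            self._map[field] = value

    def hget(self, field):
        if self.encoding == "zipmap":
            return next((v for f, v in self._pairs if f == field), None)
        return self._map.get(field)

h = MiniHash()
for i in range(4):
    h.hset(f"field{i}", i)
small_encoding = h.encoding         # still "zipmap" at 4 fields
h.hset("field4", 4)                 # fifth field crosses the threshold
```

The promotion is one-way, as in Redis: once a hash becomes ht it stays ht.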

3) List

Common commands: lpush/rpush/lpop/rpop/lrange, etc.

Application scenarios: the Redis list has many uses and is one of Redis's most important data structures. For example, Twitter's following lists and follower lists can be implemented with the Redis list structure.

Implementation: the Redis list is implemented as a doubly linked list, so it supports reverse lookup and traversal, which is convenient to operate on, at the cost of some extra memory. Many parts of Redis, including the send buffer queue, use this data structure internally.

4) Set

Common commands: sadd/spop/smembers/sunion, etc.

Application scenario: a Redis set offers much the same external functionality as a list, except that a set automatically de-duplicates its members. When you need to store a list of data without duplicates, a set is a good choice, and it provides an important operation that a list cannot: testing whether a member is in the set.

Implementation: a set is internally a HashMap whose values are always null. De-duplication is achieved quickly by hashing, which is also why a set can efficiently test whether a member belongs to the collection.

5) Sorted Set

Common commands: zadd/zrange/zrem/zcard, etc.

Application scenario: a Redis sorted set is used much like a set, except that where a set is unordered, a sorted set orders its members by an extra score parameter supplied on insert, keeping them automatically sorted. Choose a sorted set when you need an ordered, duplicate-free collection: for example, Twitter's public timeline can be stored with publication time as the score, so fetching it returns entries automatically sorted by time.

Implementation: internally a sorted set uses both a HashMap and a skip list (SkipList) to store data in order. The HashMap holds the member-to-score mapping, while the skip list holds all members sorted by the scores stored in the HashMap. The skip list structure gives good search efficiency while remaining relatively simple to implement.
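A pure-Python sketch of that dual structure (MiniZSet is made up; a sorted list via bisect stands in for the skip list, which has similar ordered-access behavior but different complexity bounds): the dict gives O(1) member-to-score lookup, and the ordered pairs serve range queries:

```python
import bisect

class MiniZSet:
    """Sketch of sorted set internals: dict = the HashMap role,
    sorted list of (score, member) = the skip-list role."""
    def __init__(self):
        self._scores = {}        # member -> score
        self._sorted = []        # [(score, member)] kept ordered

    def zadd(self, member, score):
        if member in self._scores:               # re-adding moves the member
            self._sorted.remove((self._scores[member], member))
        self._scores[member] = score
        bisect.insort(self._sorted, (score, member))

    def zscore(self, member):
        return self._scores.get(member)

    def zrange(self, start, stop):
        """Inclusive index range over members ordered by score."""
        return [m for _, m in self._sorted[start:stop + 1]]

z = MiniZSet()
z.zadd("post:a", 1700000300)     # score = publish time, as in the
z.zadd("post:b", 1700000100)     # timeline example above
z.zadd("post:c", 1700000200)
timeline = z.zrange(0, 2)        # oldest to newest
```

Fetching the range needs no sort at read time, which is exactly what the ordered structure buys.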

2.3 persistence of Redis

Although Redis is a memory-based storage system, it supports persisting in-memory data and provides two main persistence strategies: RDB snapshots and AOF logs. We introduce both below.

2.3.1 Redis RDB snapshots

Redis supports a persistence mechanism that saves a snapshot of the current data as a data file: the RDB snapshot. That is easy to state, but how does a database being continuously written to produce a snapshot? Redis borrows the copy-on-write mechanism of the fork system call: to generate a snapshot, the current process forks a child process, and the child then loops over all the data and writes it out as an RDB file.

We can configure when RDB snapshots are generated through Redis's save directive. For example, you can configure a snapshot to be generated when there are 1000 writes within 10 minutes, or 100 writes within an hour, and multiple rules can be combined. These rules are defined in the Redis configuration file, and you can also set them at runtime through Redis's CONFIG SET command without restarting Redis.
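Expressed in redis.conf syntax, the two example rules above would look roughly like this (the save directive takes a number of seconds followed by a minimum number of changes; the figures simply mirror the examples in the text):

```
# snapshot if at least 1000 writes occurred within 10 minutes (600 s)
save 600 1000
# snapshot if at least 100 writes occurred within an hour (3600 s)
save 3600 100
```

The same rules can be applied at runtime with `CONFIG SET save "600 1000 3600 100"`.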

Redis's RDB file does not get corrupted, because its writes are performed in a new process: when generating a new RDB file, the child process forked by Redis first writes the data to a temporary file, then renames the temporary file to the RDB file through the atomic rename system call, so a usable RDB file exists at every moment, even if a failure occurs mid-write. The RDB file is also part of the internal implementation of Redis master-slave synchronization.
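The write-then-rename pattern is worth seeing in isolation. A minimal Python sketch (function name and file layout are illustrative, not Redis's actual code):

```python
import os
import tempfile


def save_snapshot(data: bytes, path: str):
    """Dump everything to a temporary file first, then atomically rename
    it over the target. A crash mid-write leaves the old snapshot intact,
    so a usable file exists at every point in time."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix="temp-rdb-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit the disk first
        os.replace(tmp, path)     # atomic on POSIX: old file or new, never half
    except BaseException:
        os.unlink(tmp)            # clean up the partial temp file on failure
        raise
```

Because `os.replace` is atomic within a filesystem, a reader opening `path` at any instant sees either the previous complete snapshot or the new complete one, never a truncated mixture.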

However, RDB has an obvious shortcoming: once the database has a problem, the data saved in the RDB file is not up to date, and everything written between the last RDB generation and the Redis outage is lost. Some businesses can tolerate this, and we recommend they persist with RDB, because the cost of enabling RDB is not high. But for applications with high data-safety requirements that cannot tolerate any data loss, RDB is not enough, so Redis introduces another important persistence mechanism: the AOF log.

2.3.2 Redis's AOF log

The full name of AOF is append only file, and as the name suggests it is an append-only log file. Unlike a typical database's binlog, the AOF file is human-readable plain text whose contents are standard Redis commands. Not every command sent to Redis is recorded in the AOF log; only the commands that change data are appended to the AOF file, so every data-modifying command generates a log entry.

Will the AOF file therefore grow very large? Yes, it will keep growing, so Redis provides a feature called AOF rewrite, which regenerates the AOF file so that each record's operation appears only once in the new file, unlike the old file, which may record many operations on the same value. The rewrite process is similar to RDB generation: Redis forks a child process, traverses the data directly, and writes it to a new temporary AOF file. While the new file is being written, all write logs are still appended to the old AOF file and also recorded in an in-memory buffer. When the rewrite finishes, the buffered logs are written to the temporary file in one go, and then an atomic rename call replaces the old AOF file with the new one.
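The compaction effect of a rewrite can be shown with a toy command log. This sketch uses a simplified two-command vocabulary (`SET key value` / `DEL key`), which is an assumption for illustration; real AOF entries are full Redis protocol commands:

```python
def rewrite_aof(old_log):
    """Sketch of AOF rewrite: replay the old command log to rebuild the
    current state, then emit exactly one SET per surviving key, so many
    operations on the same key collapse into a single command."""
    state = {}
    for cmd, *args in old_log:
        if cmd == "SET":
            key, value = args
            state[key] = value
        elif cmd == "DEL":
            state.pop(args[0], None)
    # The new log reproduces the current state in one pass.
    return [("SET", key, value) for key, value in state.items()]
```

A log that set the same key twice and created then deleted another key shrinks to a single command, which is why the rewritten file is bounded by the size of the live dataset rather than by write history.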

Appending to the AOF is a file-write operation whose purpose is to put the operation log on disk, so it goes through the same write-to-disk stages discussed earlier. How safe is writing to the AOF, then? That is configurable: after Redis calls write(2) to append to the AOF, the moment at which fsync is called to flush the data to disk is controlled by the appendfsync option. The three settings of appendfsync below increase in safety from weakest to strongest.

1) appendfsync no

When appendfsync is set to no, Redis does not actively call fsync to flush the AOF log to disk, so flushing depends entirely on the operating system's writeback policy. On most Linux systems, a flush happens roughly every 30 seconds, writing the buffered data to disk.

2) appendfsync everysec

When appendfsync is set to everysec, Redis by default makes one fsync call per second to write the buffered data to disk. But if that fsync call takes longer than one second, Redis delays the next fsync and waits another second; that is, the fsync happens two seconds later, and this time it is performed no matter how long it takes. Because the file descriptor is blocked during fsync, the current write operation blocks as well. The conclusion: in the vast majority of cases Redis fsyncs once per second, and in the worst case once every two seconds. Most database systems call this technique group commit: the data of multiple writes is combined and the log is written to disk in one pass.

3) appendfsync always

When appendfsync is set to always, fsync is called once for every write operation, so the data is safest; of course, since fsync runs every time, performance suffers accordingly.
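The three policies can be sketched as a small append-only writer. Class and method names are illustrative; only the `write`-then-`fsync` distinction mirrors what Redis actually does:

```python
import os
import time


class AofWriter:
    """Sketch of the three appendfsync policies: 'always' fsyncs after
    every append, 'everysec' fsyncs at most once per second, and 'no'
    leaves flushing entirely to the operating system."""

    def __init__(self, path, policy="everysec"):
        assert policy in ("always", "everysec", "no")
        self.f = open(path, "ab")
        self.policy = policy
        self.last_sync = time.monotonic()

    def append(self, command: str):
        self.f.write(command.encode() + b"\n")
        self.f.flush()  # hand the bytes to the kernel, like write(2)
        now = time.monotonic()
        if self.policy == "always" or (
            self.policy == "everysec" and now - self.last_sync >= 1.0
        ):
            os.fsync(self.f.fileno())  # force the kernel buffer to disk
            self.last_sync = now
```

With `policy="always"` every append pays a disk flush; with `"everysec"` at most one flush per second is amortized over all writes in that window, which is the group-commit trade-off described above.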

3. Comparison of key technologies between Memcached and Redis

As in-memory data caching systems, both Memcached and Redis offer high performance, but they differ greatly in key implementation techniques, which gives them different characteristics and different applicable conditions. Below we compare the key technologies of the two in order to reveal those differences.

Comparison of memory Management Mechanism between Memcached and Redis

For memory-based database systems such as Redis and Memcached, memory-management efficiency is a key factor in system performance. The malloc/free functions of traditional C are the most common way to allocate and release memory, but they have significant drawbacks: first, mismatched malloc and free calls easily cause memory leaks for developers; second, frequent calls produce large amounts of unreclaimable memory fragments, reducing memory utilization; finally, malloc/free may trigger system calls such as brk or mmap, whose overhead is much higher than that of ordinary function calls. Therefore, to improve memory-management efficiency, efficient memory managers avoid calling malloc/free directly. Redis and Memcached both use self-designed memory-management mechanisms, but their implementations differ greatly; both are described below.

Memory Management Mechanism of Memcached

Memcached by default uses the Slab Allocation mechanism to manage memory. Its main idea is to split allocated memory into blocks of predetermined lengths that store key-value records of the corresponding size, completely solving the memory fragmentation problem. The Slab Allocation mechanism is designed only for storing external data: all key-value data is stored in the Slab Allocation system, while Memcached's other memory requests go through ordinary malloc/free, because their number and frequency are such that they do not affect the performance of the overall system.

The principle of Slab Allocation is quite simple. As shown in figure 3, it first requests a large block of memory from the operating system, splits it into chunks (Chunk) of various sizes, and groups chunks of the same size into Slab Classes. A Chunk is the smallest unit used to store key-value data. The size of each Slab Class can be controlled by setting a Growth Factor when Memcached starts. Assuming the Growth Factor in figure 3 is 1.25, if the first group's Chunk size is 88 bytes, the second group's is 112 bytes, and so on.

Figure 3 Memcached memory management architecture

When Memcached receives data from a client, it first chooses the most suitable Slab Class according to the data's size, then consults Memcached's list of free Chunks for that Slab Class to find one that can store the data. When a record expires or is evicted, the Chunk it occupied can be reclaimed and returned to the free list. From this process we can see that Memcached's memory management is efficient and does not cause memory fragmentation, but its biggest drawback is wasted space: because each Chunk is allocated with a fixed length, variable-length data cannot fully use that space. As shown in figure 4, caching 100 bytes of data in a 128-byte Chunk wastes the remaining 28 bytes.

Figure 4 waste of storage space in Memcached
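The chunk-size progression and the waste per item can both be computed directly. This sketch assumes 8-byte alignment of chunk sizes (which is how 88 × 1.25 becomes 112 rather than 110 in the example above); the function names and the 1024-byte cap are illustrative:

```python
def slab_sizes(base=88, factor=1.25, align=8, max_size=1024):
    """Chunk size per slab class: each class grows by `factor` and is
    rounded up to an `align`-byte boundary."""
    sizes, size = [], base
    while size <= max_size:
        sizes.append(size)
        size = int(size * factor)
        size = (size + align - 1) // align * align  # round up to alignment
    return sizes


def pick_chunk(item_size, sizes):
    """Select the smallest chunk that fits the item; the difference
    between chunk size and item size is the wasted space."""
    for s in sizes:
        if s >= item_size:
            return s, s - item_size
    raise ValueError("item larger than the biggest chunk")
```

With these parameters the first classes come out as 88, 112, 144, 184 bytes, and a 100-byte item lands in the 112-byte class, wasting 12 bytes, which is the same kind of overhead figure 4 illustrates with a 128-byte chunk.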

Memory Management Mechanism of Redis

Redis's memory management is implemented mainly through two files in the source code, zmalloc.h and zmalloc.c. To simplify management, after allocating a block of memory Redis stores the block's size at its head. As shown in figure 5, real_ptr is the pointer returned when Redis calls malloc. Redis stores the block size, size, in the header (its footprint is known: it is of type size_t), then returns ret_ptr. When the memory needs to be freed, ret_ptr is passed to the memory manager; from ret_ptr the program can easily compute real_ptr, which is then passed to free to release the memory.

Figure 5 Redis block allocation

Redis records all memory allocations in an array of length ZMALLOC_MAX_ALLOC_STAT; each element holds the number of allocated blocks whose size equals that element's index. In the source code this array is zmalloc_allocations: zmalloc_allocations[16] is the number of 16-byte blocks that have been allocated. A static variable used_memory in zmalloc.c records the total amount of memory currently allocated. Overall, Redis uses a wrapped malloc/free, which is much simpler than Memcached's memory-management scheme.

Comparison of Cluster implementation Mechanism between Redis and Memcached

Memcached is a purely in-memory data caching system, and although Redis supports data persistence, keeping all data in memory is still the essence of its high performance. For a memory-based storage system, the machine's physical memory caps the amount of data the system can hold. To handle more data than a single machine's physical memory allows, a distributed cluster must be built to scale out storage capacity.

Distributed Storage of Memcached

Memcached itself does not support distributed storage, so Memcached can only be distributed on the client side through algorithms such as consistent hashing. Figure 6 shows Memcached's distributed-storage architecture: before sending data to the Memcached cluster, the client computes the data's target node with its built-in distributed algorithm and sends the data directly to that node for storage; likewise, when querying, the client computes which node holds the data and sends the query request directly to that node to fetch it.

Figure 6 Memcached client distributed storage implementation
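A minimal consistent-hash ring of the kind a Memcached client library builds can be sketched as follows. The use of MD5, 100 virtual nodes, and the class name are illustrative choices, not any particular client's implementation:

```python
import bisect
import hashlib


class HashRing:
    """Client-side consistent hashing: each server is hashed onto a ring
    (with virtual nodes to even out the distribution), and a key is
    routed to the first server clockwise from the key's hash."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        """Route a key to its node; wrap around past the end of the ring."""
        i = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._ring[i][1]
```

The property that matters for caching is stability: because routing depends only on hash positions, adding or removing one server remaps only the keys that fell on that server's arcs, instead of reshuffling everything as naive `hash(key) % n` would.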

Distributed Storage of Redis

Compared with Memcached, which can only implement distributed storage on the client side, Redis prefers to build distributed storage on the server side. Although distributed storage has not been added to the stable version of Redis that has been released, the basic functions of Redis Cluster are already available in the Redis development version. It is expected that after version 2.6, Redis will release a stable version that fully supports distribution, no later than the end of 2012. Next we will briefly introduce the core idea of Redis Cluster according to the implementation in the development version.

Redis Cluster is an advanced, distributed version of Redis that tolerates single points of failure; it has no central node and is linearly scalable. Figure 7 shows the distributed storage architecture of Redis Cluster, in which nodes communicate with each other over a binary protocol while nodes and clients communicate over an ASCII protocol. In its data placement strategy, Redis Cluster divides the entire key space into 4096 hash slots, and each node can hold one or more hash slots, meaning the maximum number of nodes Redis Cluster supports is 4096. The distributed algorithm it uses is also simple: crc16(key) % HASH_SLOTS_NUMBER.

Figure 7 Redis distributed architecture
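The slot computation is easy to reproduce. The checksum Redis Cluster uses is CRC-16/XMODEM (polynomial 0x1021); the 4096-slot figure below follows the development-version description in the text (the released Redis Cluster later settled on 16384 slots):

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM, the checksum Redis Cluster uses for key-to-slot
    mapping (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


HASH_SLOTS_NUMBER = 4096  # per the development version described here;
                          # released Redis Cluster uses 16384 slots


def slot_for(key: bytes) -> int:
    """The placement rule from the text: crc16(key) % HASH_SLOTS_NUMBER."""
    return crc16(key) % HASH_SLOTS_NUMBER
```

Because every client and every node computes the same slot for a key, any party can determine data placement locally, which is what lets the cluster work without a central routing node.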

To ensure data availability under single-node failures, Redis Cluster introduces Master nodes and Slave nodes. As shown in figure 8, each Master node in Redis Cluster has two corresponding Slave nodes for redundancy, so the failure of any two nodes in the entire cluster will not make data unavailable. When a Master node goes down, the cluster automatically elects a Slave node to become the new Master.

Figure 8 Master node and Slave node in Redis Cluster

Overall comparison between Redis and Memcached

Salvatore Sanfilippo, the author of Redis, has himself compared these two memory-based data storage systems; on the whole his comparison is fairly objective and is summarized as follows:

1) Performance: since Redis uses only a single core while Memcached can use multiple cores, on average Redis has higher per-core performance than Memcached when storing small data. For data above 100k, Memcached outperforms Redis; although Redis has recently been optimized for storing large data, it still lags slightly behind Memcached there.

2) Memory-usage efficiency: for simple key-value storage, Memcached has higher memory utilization; but if Redis uses its hash structure for key-value storage, its memory utilization will be higher than Memcached's thanks to its combined compression.

3) Redis supports server-side data operations: compared with Memcached, Redis has more data structures and supports richer data operations. In Memcached, you usually need to fetch the data to the client, make the modification, and set it back, which greatly increases the number of network I/O round trips and the volume of data transferred. In Redis, these complex operations are usually as efficient as a plain GET/SET. So if your cache needs to support more complex structures and operations, Redis is a good choice.

That covers the differences between redis and memcached. I hope the content above is of some help and lets you learn more. If you think the article is good, feel free to share it for more people to see.
