Shulou (Shulou.com) — SLTechnology News & Howtos > Servers — 05/31 report, updated 2025-01-16
What are the features, application scenarios, and trade-offs of Memcache and Redis distributed cache clusters, and how do you choose between them? Many newcomers find this confusing, so this article walks through the comparison point by point; hopefully it answers the question for you.
Distributed cache cluster schemes: features, scenarios, trade-offs, and selection
Distributed cache features:
1) High performance: when a traditional database faces massive data access, disk I/O tends to become the bottleneck and response latency grows unacceptably. A distributed cache uses memory as the storage medium for data objects, stores data in key/value form, and in the ideal case delivers DRAM-level read/write performance;
2) Dynamic scalability: supports elastic scale-out, coping with changing access loads by dynamically adding or removing nodes, while providing predictable performance and scalability and maximizing resource utilization;
3) High availability: covers both data availability and service availability. High availability is achieved through redundancy, with no single point of failure; automatic failure detection and transparent failover mean a server failure causes neither cache-service interruption nor data loss. Data partitions are rebalanced automatically during dynamic scaling while the cache service remains continuously available;
4) Ease of use: provides a single view of data and management; the API is simple and topology-independent; dynamic scaling and failure recovery need no manual configuration; backup nodes are selected automatically; and most cache systems provide a graphical management console for unified maintenance;
5) Distributed code execution: task code is shipped to each data node for parallel execution and the results are aggregated for the client, which avoids moving and transmitting the cached data itself. The Java Data Grid specification JSR-347 adds support for distributed code execution and Map/Reduce APIs, and major distributed caching products such as IBM WebSphere eXtreme Scale, VMware GemFire, GigaSpaces XAP, and Red Hat Infinispan support this programming model.
Distributed cache application scenarios:
1) Page caching: caching content fragments of web pages, including HTML, CSS, and images; mostly used on social networking sites;
2) Application object caching: the cache system serves as the second-level cache of an ORM framework, reducing load on the database and accelerating application access;
3) State caching: caching session state and other state data produced during horizontal scaling. Such data is generally hard to recover and demands high availability; this is mostly used in high-availability clusters (it solves the session-synchronization problem of distributed web deployments);
4) Parallel processing: workloads that produce large amounts of intermediate results that need to be shared;
5) Event handling: distributed caches provide continuous query processing over event streams to meet real-time requirements;
6) Extreme transaction processing: distributed caches provide a high-throughput, low-latency solution for transactional applications, support highly concurrent transaction processing, and are widely used in railways, financial services, telecommunications, and other fields;
7) Distributed cache services in the cloud computing field (e.g. Qingyun/QingCloud, UnitedStack);
8) Anywhere a cache is needed but a local cache is too small. A distributed cache can also effectively prevent the database avalanche that follows a local cache failure.
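The "cache avalanche" point above can be illustrated with a minimal cache-aside sketch: reads go through the cache and fall back to the database on a miss, and expiry times get random jitter so a wall of keys does not all expire at the same instant. The `CacheAside` class and `fake_db` loader are made-up names for illustration, not from any particular library.

```python
import random
import time

class CacheAside:
    """Minimal cache-aside sketch: read through a dict-based cache,
    falling back to a (simulated) database loader on a miss.
    Expiry times get random jitter so many keys do not expire at
    the same instant (one trigger of a cache avalanche)."""

    def __init__(self, loader, ttl=300, jitter=60):
        self._cache = {}          # key -> (value, expires_at)
        self._loader = loader     # function simulating a DB query
        self._ttl = ttl
        self._jitter = jitter

    def get(self, key):
        entry = self._cache.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                       # cache hit
        value = self._loader(key)                 # miss: hit the "database"
        expires = now + self._ttl + random.uniform(0, self._jitter)
        self._cache[key] = (value, expires)
        return value

calls = []
def fake_db(key):
    calls.append(key)             # record every trip to the "database"
    return key.upper()

cache = CacheAside(fake_db, ttl=300, jitter=60)
print(cache.get("user:1"))   # prints USER:1 (loaded from fake_db)
print(cache.get("user:1"))   # prints USER:1 (served from cache, no DB call)
```

With the jittered TTL, a burst of keys cached at the same moment expires spread over a window instead of all at once, so the database does not absorb the whole reload spike simultaneously.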
Comparison of two open-source cache systems: Memcache vs. Redis
1. Redis supports not only simple k/v data but also list, set, zset (sorted set), hash, and other data structures; Memcache supports only simple types, leaving clients to handle complex objects themselves.
2. Redis supports data persistence: in-memory data can be saved to disk and reloaded on restart (PS: persistence via RDB and AOF). For RDB, Redis relies on the copy-on-write semantics of fork: when taking a snapshot, it forks the current process into a child, which iterates over all the data and writes it out as an RDB file. AOF stands for append-only file; as the name suggests, it is an append-only log. Unlike the binlog of a typical database, an AOF file is human-readable plain text whose contents are standard Redis commands. Not every command sent to Redis is recorded: only commands that change data are appended, so each data-modifying command produces a log entry. (PS: Memcache does not support persistent storage.)
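As a rough illustration of the AOF principle just described (a sketch of the idea, not Redis's actual file format), here is a toy store that appends only state-changing commands to a log and rebuilds its state after a "restart" by replaying that log:

```python
class ToyStore:
    """Toy append-only-log store illustrating the AOF idea:
    writes are logged as commands; replaying the log rebuilds state."""

    def __init__(self):
        self.data = {}
        self.aof = []                 # in-memory stand-in for the AOF file

    def set(self, key, value):
        self.data[key] = value
        self.aof.append(f"SET {key} {value}")   # only writes are logged

    def get(self, key):
        return self.data.get(key)     # reads change nothing: not logged

    @classmethod
    def replay(cls, aof):
        """Rebuild state after a 'restart' by re-running every logged command."""
        store = cls()
        for line in aof:
            cmd, key, value = line.split(" ", 2)
            if cmd == "SET":
                store.data[key] = value
        store.aof = list(aof)
        return store

s = ToyStore()
s.set("lang", "python")
s.set("lang", "go")       # the later write supersedes the earlier one on replay
recovered = ToyStore.replay(s.aof)
print(recovered.get("lang"))   # prints go
```

Note that the log keeps both SET commands even though only the last matters; this is why real Redis periodically rewrites (compacts) the AOF.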
3. Because Memcache has no persistence mechanism, all cached data is lost on a crash. Redis, when configured for persistence, automatically reloads the data from the time of shutdown back into the cache after a restart, giving it a better disaster-recovery story.
4. Memcache can be distributed with client-side consistent hashing, e.g. via Magent. Redis supports server-side distribution (PS: Twemproxy/Codis/Redis-cluster are several distributed implementations).
5. Memcached imposes simple limits on keys and values: a key is at most 250 characters, and a stored item cannot exceed 1 MB by default (configurable), because that is the maximum slab page size. Redis allows keys and values of up to 512 MB.
6. Redis uses a single-threaded model, which guarantees that commands are applied in order. Memcache needs CAS to ensure data consistency. CAS (Check and Set) is a mechanism for concurrency consistency in the "optimistic lock" family; the principle is simple: read the version number along with the value, do the work, then compare the version number on write; if it matches, commit the operation, and if not, give up.
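The check-and-set cycle can be sketched in a few lines. This is a self-contained simulation of the gets/cas idea (version tokens on every value), not memcached's client API:

```python
import itertools

class CasStore:
    """Sketch of memcached-style gets/cas: every value carries a version
    token; a cas write succeeds only if the token is unchanged."""

    def __init__(self):
        self._data = {}                       # key -> (value, token)
        self._tokens = itertools.count(1)

    def set(self, key, value):
        self._data[key] = (value, next(self._tokens))

    def gets(self, key):
        return self._data[key]                # returns (value, token)

    def cas(self, key, value, token):
        if self._data[key][1] != token:
            return False                      # someone wrote in between: give up
        self._data[key] = (value, next(self._tokens))
        return True

store = CasStore()
store.set("counter", 1)
value, token = store.gets("counter")
store.set("counter", 5)                         # a concurrent writer sneaks in
print(store.cas("counter", value + 1, token))   # prints False (stale token)
value, token = store.gets("counter")
print(store.cas("counter", value + 1, token))   # prints True (fresh token)
print(store.gets("counter")[0])                 # prints 6
```

The failed first cas is the "if inconsistent, give up" branch: the caller re-reads and retries rather than blindly overwriting the concurrent update.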
7. CPU utilization. Redis uses only a single core while Memcached can use several, so per core Redis performs better than Memcached on average for small values, while for values over about 100 KB Memcached outperforms Redis. (PS: Redis can raise CPU utilization by running multiple instances; Memcached runs multiple worker threads.) Since distributed caches are I/O-intensive systems, performance is largely bounded by the network: Memcached uses the libevent network library, Redis implements its own event library, and threading is not a decisive factor for throughput. As noted above, a program generally processes in-memory data far faster than the NIC can deliver it. The advantage of threads is handling many connections at once, which in extreme cases may improve response time; but a single-threaded design is sometimes faster than multi-threading or multi-processing because it avoids context-switch overhead, and the code is simpler and executes more efficiently.
8. Memcache memory management: Slab Allocation. The principle is simple: pre-allocate groups of fixed-size chunks and store each item in the most appropriate (smallest fitting) chunk, which avoids external memory fragmentation. By default, each memcached slab class is 1.25 times the size of the previous one.
9. Redis memory management: Redis records all allocations in an array and thinly wraps malloc/free, which is much simpler than Memcached's scheme. Because malloc searches a linked list of managed memory for an available block, memory fragmentation is comparatively high.
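The slab scheme above can be sketched as follows: chunk sizes grow by a fixed factor, and an item is placed in the smallest chunk that fits. The starting chunk size and 8-byte alignment here are simplifying assumptions, not memcached's exact defaults:

```python
def slab_classes(min_chunk=96, factor=1.25, max_item=1024 * 1024):
    """Compute chunk sizes in the spirit of memcached's slab allocator:
    each class is `factor` times the previous, rounded up to an 8-byte
    boundary, up to the 1 MB item limit. The starting size and rounding
    rule are simplified assumptions, not memcached's exact defaults."""
    sizes = []
    size = min_chunk
    while size < max_item:
        sizes.append(size)
        size = int(size * factor)
        size = (size + 7) // 8 * 8        # align to 8 bytes
    sizes.append(max_item)
    return sizes

def pick_class(sizes, item_size):
    """An item goes in the smallest chunk that fits; the gap between the
    item and the chunk is internal fragmentation, the price paid for
    avoiding external fragmentation."""
    for s in sizes:
        if item_size <= s:
            return s
    raise ValueError("item larger than the 1 MB limit")

sizes = slab_classes()
print(sizes[:5])              # the first few geometrically growing classes
print(pick_class(sizes, 100)) # a 100-byte item lands in the 120-byte class
```

This shows the trade-off directly: a 100-byte item occupies a 120-byte chunk, wasting 20 bytes, but no chunk ever needs to be coalesced or compacted.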
Summary:
In practice, choosing between Memcache and Redis should come down to the business scenario (both offer more than enough performance and stability). If a scenario requires persistent caching, or caching of multiple data structures, Redis is the better choice.
(PS: Redis cluster options are also better than Memcache's: Memcache clusters rely on client-side consistent hashing, while Redis offers decentralized server-side cluster solutions.)
To sum up: if the caching system must support more business scenarios, Redis is the better pick, and more and more vendors are choosing it.
Next, a closer comparison of three Redis cluster solutions: Twemproxy vs. Codis vs. Redis-cluster.
There are three common solutions for Redis clustering:
1. Client-side sharding: this scheme puts the sharding logic in the business application itself; the code accesses multiple Redis instances directly according to preset routing rules. The advantage is no dependence on third-party middleware: the implementation and code are fully under your control and can be adjusted at any time. It is, however, essentially static sharding: whenever Redis instances are added or removed, the sharding code must be adjusted by hand. Open-source products based on this mechanism are rare. It performs better than a proxy (one fewer hop), but upgrades are painful and the approach depends heavily on individual developers: it takes strong in-house development capability, and if the lead programmer leaves, the successor may well rewrite everything. Operability is therefore poor; when a failure occurs, locating and fixing it requires development and operations to work together, lengthening the outage. This approach is hard to standardize operationally and is a poor fit for small and medium-sized companies (unless they have strong enough DevOps).
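The routing rule in client-side sharding is typically consistent hashing. A minimal ring (virtual nodes hashed with MD5; the node names are made up) demonstrates the key property: removing a node only remaps the keys that lived on it, so the rest of the cache stays warm:

```python
import bisect
import hashlib

class HashRing:
    """Sketch of consistent hashing for client-side sharding: each node is
    hashed to many virtual points on a ring, and a key is routed to the
    first node clockwise from the key's own hash. Adding or removing one
    node only remaps the keys near that node's points."""

    def __init__(self, nodes, replicas=100):
        self._ring = []                      # sorted (point, node) pairs
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # first ring point at or after the key's hash, wrapping around
        i = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]

ring = HashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
print(ring.node_for("user:42"))   # the same key always routes to the same node
```

If `redis-c:6379` is removed, only the keys whose nearest clockwise point belonged to it move elsewhere; keys owned by the surviving nodes keep their old placement, unlike naive `hash(key) % n` routing where almost everything shuffles.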
2. Proxy sharding: this scheme hands the sharding work to a dedicated proxy program. The proxy receives data requests from the business program and, following its routing rules, forwards them to the correct Redis instance and returns the results. A third-party proxy (rather than an in-house one) is usually chosen here; because it fronts multiple backend Redis instances, such a program is also called distributed middleware. The advantage is that the business program need not care about the backend Redis instances, and operations become easier. There is some performance loss, but for a memory-speed store like Redis it is usually tolerable. This is our recommended cluster approach. Twemproxy and Codis are open-source products based on this mechanism, and both are widely used.
3. Server-side sharding: built on a decentralized architecture (no proxy-node performance bottleneck). Redis-Cluster is the official solution based on this architecture. Redis Cluster maps all keys into 16384 slots, each Redis instance in the cluster is responsible for a subset of them, and business programs operate through an integrated Redis Cluster client. The client can send a request to any instance; if the requested data is not on that instance, the instance redirects the client to the instance that owns it. Cluster membership information (node name, IP, port, status, role, etc.) is exchanged and updated through regular pairwise communication between nodes.
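The slot mapping Redis Cluster uses is documented as CRC16(key) mod 16384, with an optional {hash tag} so related keys land in the same slot. A small stand-alone version:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant: poly 0x1021, init 0x0000), the
    checksum Redis Cluster applies to keys."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots. Per the cluster spec,
    a non-empty {hash tag} means only the tag is hashed, so related
    keys can be forced into the same slot (e.g. for multi-key ops)."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:   # tag must be non-empty
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# The cluster spec's reference CRC value: "123456789" -> 0x31C3.
print(hex(crc16(b"123456789")))   # prints 0x31c3
# Both keys hash only their {42} tag, so they share one slot:
print(key_slot("user:{42}:profile") == key_slot("user:{42}:settings"))  # prints True
```

Each node owns a contiguous or arbitrary subset of the 16384 slots; resharding moves whole slots between nodes rather than rehashing individual keys.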
Next, the advantages and disadvantages of a representative product for each solution:
Twemproxy:
Twemproxy is a proxy-sharding solution open-sourced by Twitter. As a proxy, Twemproxy accepts connections from multiple programs, forwards requests to the backend Redis servers according to its routing rules, and returns the results. This solves the carrying-capacity problem of a single Redis instance. Twemproxy is of course itself a single point, so Keepalived is needed for a high-availability setup. For years it was the most widely used, most stable, and most battle-tested distributed middleware. But it still has real inconveniences. Its biggest pain point is that it cannot scale out or in smoothly, which raises operational difficulty: a traffic surge calls for adding Redis servers, a shrinking workload calls for removing them, and with Twemproxy both are basically hard to do. Twemproxy amounts to server-side static sharding; to avoid having to scale under a sudden traffic spike, teams are sometimes even forced to stand up a fresh Twemproxy-fronted Redis cluster. Its other pain point is operator-unfriendliness: it does not even have a control panel.
Codis:
Codis was open-sourced by Wandoujia (Pea Pod) in November 2014 and is developed in Go and C; it is one of the notable recent open-source projects from Chinese developers and has been widely used across Wandoujia's Redis workloads. Stress tests show its stability meets the demands of efficient operations, and its performance has improved greatly: initially about 20% slower than Twemproxy, it is now nearly 100% faster (conditions: multiple instances, average value length). Codis has a visual operations and management interface, and it clearly remedies Twemproxy's shortcomings, so overall it beats Twemproxy by a wide margin; more and more companies are choosing it. Codis introduces the concept of a Group: each Group consists of one Redis master and at least one Redis slave, which is one of its differences from Twemproxy. The benefit is that if the current master fails, an operator can switch to a slave "self-service" via the Dashboard, without carefully editing program configuration files. To support auto-rebalancing, the Codis team modified the Redis server source code, calling the result Codis Server. Codis uses a pre-sharding mechanism with 1024 slots defined in advance (so it can support at most 1024 Codis Servers on the backend), and this routing information is stored in ZooKeeper.
Redis-cluster:
Redis-cluster was introduced in Redis 3.0 and supports a distributed Redis deployment model with a decentralized architecture. All Redis nodes are interconnected (a PING-PONG mechanism) and use a binary protocol internally to optimize transmission speed and bandwidth. A node's failure takes effect once more than half of the nodes in the cluster have detected it. Clients connect directly to Redis nodes with no intermediate proxy layer; a client need not connect to every node in the cluster, only to any available one, and removing the proxy layer improves performance considerably. Redis-cluster maps all keys onto slots [0-16383], distributes the slots across the physical nodes, and the cluster maintains the node-slot-key mapping. Jedis already supports Redis-cluster. In terms of both architecture and performance, Redis-cluster is undoubtedly the best option. (PS: although Redis-cluster wins on design, it has only just been released; the official documentation declares it release-quality, but its stability remains to be proven in production.)
That covers the features, trade-offs, and selection of Memcache and Redis distributed cache clusters; hopefully it helps you make the choice. Thanks for reading!