Redis High-Availability Architecture Design Methods

2025-04-04 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article describes how Qunar designs its Redis high-availability architecture: the design principles, the security mechanism, and automated operation and maintenance.

I. Design principles of the high-availability architecture

1. Overview

The Qunar Redis cluster is a distributed high-availability architecture. It consists of the following main parts:

Redis Server nodes: each node consists of two instances, one master and one slave, and multiple nodes together hold the complete data set of the cluster. Within each node only the master serves external requests; the slave is used only for node high availability, data persistence and scheduled backups.

Zookeeper cluster: consists of five zk nodes. After the configuration of a Redis cluster changes, it notifies clients to reconnect.

Redis Sentinel cluster: consists of five Sentinel nodes and handles high availability of the Redis Server nodes, including master-slave switchover, failover and configuration updates.

Configuration center cluster: a PXC cluster of five MySQL nodes that stores the shard information of each Redis cluster, that is, the master instance of each node and the consistent-hash value range of the keys assigned to it.

Application client: listens for zk changes and obtains Redis instance information from the configuration center in order to connect.

2. Architecture schematic diagram

3. Client implementation

1) When the client establishes a connection using the namespace of a Redis cluster, it first looks up the /config_addr node in zk. This node stores the instance information of the configuration center cluster, and the client randomly selects one database instance to connect to.

2) In the relevant database table of the configuration center, the client queries the connection configuration of the cluster's nodes by Redis namespace, and then establishes the Redis connections.

3) After the client has established its Redis connections, it starts two threads:

One listens for changes in zk. Each Redis cluster has a /redis/namespace node in zk. If the cluster configuration changes, Sentinel notifies zk to update the value of this node. When the client notices the change, it fetches the new connection configuration from the configuration center and re-establishes the connection.

The other polls the connection configuration in the configuration center. To guard against a failed zk notification, this thread polls the configuration center every 10 seconds. If the configuration there differs from the local cache, the client uses the configuration center's version to establish a new connection.
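The compare-and-reconnect step shared by both threads can be sketched as follows. This is a minimal Python illustration, not Qunar's client: `ConfigPoller`, `fetch_config` and `connect` are hypothetical names, and the configuration center is stood in for by a plain callable.

```python
import threading
from typing import Callable, Dict, List, Optional

# Hypothetical shape of a cluster's connection config: one dict per node.
Config = List[Dict]


class ConfigPoller:
    """Polls a configuration source and reconnects when the config changes."""

    def __init__(self, fetch_config: Callable[[], Config],
                 connect: Callable[[Config], None],
                 interval: float = 10.0):
        self._fetch_config = fetch_config  # stand-in for the configuration center query
        self._connect = connect            # stand-in for (re)building Redis connections
        self._interval = interval          # the article's 10-second polling period
        self._cached: Optional[Config] = None
        self._stop = threading.Event()

    def check_once(self) -> bool:
        """Fetch the config; reconnect and return True if it differs from the cache."""
        latest = self._fetch_config()
        if latest == self._cached:
            return False
        self._connect(latest)
        self._cached = latest
        return True

    def run(self) -> None:
        """Loop until stop() is called; meant to run in a background thread."""
        while not self._stop.is_set():
            self.check_once()
            self._stop.wait(self._interval)

    def stop(self) -> None:
        self._stop.set()
```

A zk watch callback could call `check_once()` as well, so that the notification path and the polling path share one reconnect routine.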

The diagram of the relationship between the client and other components is as follows:

4. Data sharding method

After a developer submits a Redis cluster application ticket, the DBA plans the number of shard nodes N from the memory size, QPS and other key figures in the ticket. The nodes are assigned equal shares of the value range [0, 4294967295], that is, 2 to the power of 32 hash values in total. A key's hash value is computed with the murmurhash3 algorithm, so each key falls on exactly one node of the cluster.
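The mapping from key to node can be sketched in Python as below. The article names only murmurhash3 and the 32-bit range; the choice of the x86 32-bit variant, seed 0, UTF-8 key encoding and the helper names `split_ranges` and `node_for_key` are assumptions made for illustration.

```python
import bisect

HASH_SPACE = 1 << 32  # hash values lie in [0, 4294967295]


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit; returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF
    h = seed & mask
    n_blocks = len(data) // 4

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    # Body: process 4-byte little-endian blocks.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 0-3 bytes.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k

    # Finalization mix.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def split_ranges(n_nodes: int):
    """Divide [0, 2**32 - 1] into n_nodes contiguous, near-equal ranges."""
    bounds = [HASH_SPACE * i // n_nodes for i in range(n_nodes + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n_nodes)]


def node_for_key(key: str, ranges) -> int:
    """Return the index of the node whose range contains the key's hash."""
    h = murmur3_32(key.encode("utf-8"))
    starts = [start for start, _ in ranges]
    return bisect.bisect_right(starts, h) - 1
```

With `ranges = split_ranges(4)`, a call such as `node_for_key("user:42", ranges)` returns the index of the single node whose range holds that key's hash.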

The schematic diagram of the sharding node is as follows:

The sharding node information is stored in the configuration center as follows:

5. Architectural features

The Qunar Redis high-availability architecture has the following characteristics:

It uses a self-implemented Redis client. The client no longer accesses Sentinel, and Sentinel is responsible only for high availability.

Configuration is managed centrally through the ZK cluster and the configuration center.

Ports are treated as a resource: the master and slave instances of a node share one port number, and the ports of decommissioned clusters can be reused.

The role of the Sentinel machines is weakened, which reduces the direct coupling between Sentinel and the clusters.

Fewer Sentinel machines are needed. Currently only 5 Sentinel machines form the Sentinel cluster.

Clients access a cluster by namespace. Ports map to namespaces and namespaces map to business units, which simplifies DBA management and operations and is transparent to applications.

6. Architectural limitations

The Qunar Redis high-availability architecture has the following limitations:

Few client languages are supported. Currently the client is available only for Java and Python.

Rapid horizontal expansion is not supported. When a cluster runs short of memory, the memory of each node instance can be enlarged quickly to grow the whole cluster, but a single instance's memory has an upper limit and cannot grow indefinitely. When the number of nodes must be increased, the consistent-hash range of every node changes, so all keys have to be redistributed. For larger clusters this process is tedious and time-consuming.

The architecture depends on many components. Although zookeeper, the configuration center and Sentinel are all highly available multi-node clusters, the more components the architecture relies on, the greater the chance of failure and the greater the difficulty and workload of operations, which places higher demands on operations staff.

Some native Redis features are unavailable. Because of client limitations, features such as transactions and Lua scripts cannot be used.
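The cost of adding nodes under equal-range sharding can be estimated with a short Python sketch. It assumes that a node is identified by its index in the range list and that ranges are recomputed as equal shares when the node count changes; neither detail is stated in the article, and `owner` and `moved_fraction` are hypothetical helpers.

```python
HASH_SPACE = 1 << 32


def owner(hash_value: int, n_nodes: int) -> int:
    """Index of the node owning hash_value when the space is split into equal shares."""
    return hash_value * n_nodes // HASH_SPACE


def moved_fraction(old_nodes: int, new_nodes: int, samples: int = 100_000) -> float:
    """Estimate the share of the hash space whose owning node index changes."""
    step = HASH_SPACE // samples
    points = range(0, HASH_SPACE, step)  # evenly spaced sample of hash values
    moved = sum(1 for h in points if owner(h, old_nodes) != owner(h, new_nodes))
    return moved / len(points)
```

Under these assumptions, `moved_fraction(4, 5)` comes out at roughly one half, so growing a cluster by a single node already relocates about half of the hash space.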

II. Security mechanism

Redis is designed to be accessed by trusted users in a trusted environment. It is optimized for high performance and ease of use rather than for maximum security, so it does not offer permission control as strict as that of a relational database. Exposing Redis instances directly on the network, or letting untrusted users reach the Redis TCP port directly, is therefore very dangerous.

To improve security, the Redis Server used by Qunar carries some source code modifications on top of the official Redis 4.0.14. A whitelist parameter, trustedip, has been added, and some high-risk commands are blocked: they cannot be executed by any client connection other than those from IPs configured in trustedip. In addition, to improve performance, the master and slave instances are configured differently.

1. Clientcipher and IP whitelist

The Qunar Redis client does not connect to a Redis instance directly over TCP. It must first be verified with the cluster namespace and the cluster's unique clientcipher, and only then obtains the real connection information from the configuration center and connects to the Redis instance. The whitelist mechanism additionally filters high-risk commands in client requests, which prevents inappropriate operations on online Redis instances and further strengthens security.

The client accesses the cluster using a namespace and a clientcipher.

Different namespaces have different clientciphers. When a cluster is created, its clientcipher is generated by encrypting a randomly generated password.

Even with the password, blocked high-risk commands cannot be used unless the client's IP address is on the whitelist.

Commands are unrestricted for local logins and logins from whitelisted IPs, which is convenient for DBA management and keeps the various monitoring and statistics scripts working.

The IP whitelist can be configured dynamically and holds at most 32 IPs.

The IP whitelist feature involves the following code changes:

1) add the parameter trustedip to the configGetCommand method of the config.c file

    void configGetCommand(client *c) {
        robj *o = c->argv[2];
        void *replylen = addDeferredMultiBulkLength(c);
        char *pattern = o->ptr;
        char buf[128];
        int matches = 0;
        serverAssertWithInfo(c,o,sdsEncodedObject(o));

        /* add trustedip parameter */
        if (stringmatch(pattern,"trustedip",0)) {
            sds buf = sdsempty();
            int j;
            int numips;

            numips = server.trusted_ips.numips;
            for (j = 0; j < numips; j++) {
                buf = sdscat(buf, server.trusted_ips.ips[j]);
                if (j != numips - 1)
                    buf = sdscatlen(buf," ",1);
            }
            addReplyBulkCString(c,"trustedip");
            addReplyBulkCString(c,buf);
            sdsfree(buf);
            matches++;
        }
        setDeferredMultiBulkLength(c,replylen,matches*2);
    }

2) add the trustedIPArray structure that stores the whitelist (the source repeats the heading of step 5 here, "add is_super_client authentication to the processCommand method of the server.c file", but the accompanying code is this type definition)

    typedef struct trustedIPArray {
        int numips;
        sds* ips;
    } trustedIPArray;

3) add the isTrustedIP method to the networking.c file

    /* Check whether the client IP is in the IP whitelist */
    int isTrustedIP(int fd) {
        char ip[128];
        int i, port;

        anetPeerToString(fd,ip,128,&port);
        if (strcmp(ip, "127.0.0.1") == 0) {
            return 1;
        }
        for (i = 0; i < server.trusted_ips.numips; i++) {
            if (strcmp(ip, server.trusted_ips.ips[i]) == 0) {
                return 1;
            }
        }
        return 0;
    }

4) set is_super_client in the createClient method of the networking.c file

    client *createClient(int fd) {
        client *c = zmalloc(sizeof(client));

        /* passing -1 as fd it is possible to create a non connected client.
         * This is useful since all the commands needs to be executed
         * in the context of a client. When commands are executed in other
         * contexts (for instance a Lua script) we need a non connected client. */
        if (fd != -1) {
            anetNonBlock(NULL,fd);
            anetEnableTcpNoDelay(NULL,fd);
            if (server.tcpkeepalive)
                anetKeepAlive(NULL,fd,server.tcpkeepalive);
            if (aeCreateFileEvent(server.el,fd,AE_READABLE,
                readQueryFromClient, c) == AE_ERR)
            {
                close(fd);
                zfree(c);
                return NULL;
            }
        }
        ...
        /* set is_super_client */
        if (isTrustedIP(fd)) {
            c->is_super_client = 1;
        } else {
            c->is_super_client = 0;
        }
        ...
        return c;
    }

5) add is_super_client authentication to the processCommand method of the server.c file

    int processCommand(client *c) {
        /* The QUIT command is handled separately. Normal command procs will
         * go through checking for replication and QUIT will cause trouble
         * when FORCE_REPLICATION is enabled and would be implemented in
         * a regular command proc. */
        if (!strcasecmp(c->argv[0]->ptr,"quit")) {
            addReply(c,shared.ok);
            c->flags |= CLIENT_CLOSE_AFTER_REPLY;
            return C_ERR;
        }
        ...
        /* Check if the user is authenticated */
        /* add is_super_client authentication */
        if (!c->is_super_client && server.requirepass && !c->authenticated &&
            c->cmd->proc != authCommand)
        ...
        return C_OK;
    }

6) add the checkCommandBeforeExec method to the db.c file

    /* Return 1 if the client is a super client or the master, otherwise 0.
     * Under master-slave replication the master (acting as a client)
     * needs to execute dangerous commands on the slave. */
    int checkCommandBeforeExec(client *c) {
        if (c->is_super_client ||
            (server.masterhost && (c->flags & CLIENT_MASTER)))
        {
            return 1;
        }
        addReplyError(c,"No permission to execute this command");
        return 0;
    }

2. Block high-risk instructions

By modifying the Redis source code, some dangerous commands are blocked on the server side, so that only client connections that pass the whitelist check can execute them. The check runs before the high-risk command is executed. For example, to block the save command, add the checkCommandBeforeExec check as the first line of the saveCommand method in the rdb.c file.

    void saveCommand(client *c) {
        /* check before executing the command; return immediately if not permitted */
        if (!checkCommandBeforeExec(c)) return;

        if (server.rdb_child_pid != -1) {
            addReplyError(c,"Background save already in progress");
            return;
        }
        rdbSaveInfo rsi, *rsiptr;
        rsiptr = rdbPopulateSaveInfo(&rsi);
        if (rdbSave(server.rdb_filename,rsiptr) == C_OK) {
            addReply(c,shared.ok);
        } else {
            addReply(c,shared.err);
        }
    }

The high-risk commands to block are:

Time-consuming commands: info, keys *.

Data-clearing commands: shutdown, flushdb, flushall.

Data persistence commands: save, bgsave, bgrewriteaof.

Configuration commands: config get, config set, config rewrite.

Operations and management commands: slaveof, monitor, client list, client kill.

Every place in the Redis source code that implements one of these commands needs the checkCommandBeforeExec check added.
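The resulting permission rule can be modeled outside Redis in a few lines of Python. This is only an illustrative model of the behavior described above, not the patched server code; `BLOCKED`, `is_blocked` and `may_execute` are hypothetical names.

```python
# Commands (and subcommands) the article lists as blocked for ordinary clients.
BLOCKED = {
    ("info",), ("keys",),
    ("shutdown",), ("flushdb",), ("flushall",),
    ("save",), ("bgsave",), ("bgrewriteaof",),
    ("config", "get"), ("config", "set"), ("config", "rewrite"),
    ("slaveof",), ("monitor",),
    ("client", "list"), ("client", "kill"),
}

MAX_TRUSTED_IPS = 32  # whitelist size limit stated in the article


def is_blocked(argv) -> bool:
    """True if the command (first word, or first two words) is on the blocked list."""
    words = tuple(a.lower() for a in argv[:2])
    return words[:1] in BLOCKED or words in BLOCKED


def may_execute(argv, client_ip: str, trusted_ips: set,
                from_master: bool = False) -> bool:
    """Model of the whitelist rule: blocked commands need a super client or the master."""
    if len(trusted_ips) > MAX_TRUSTED_IPS:
        raise ValueError("at most 32 trusted IPs are supported")
    if not is_blocked(argv):
        return True
    is_super_client = client_ip == "127.0.0.1" or client_ip in trusted_ips
    return is_super_client or from_master
```

For instance, `may_execute(["flushall"], "192.168.1.9", {"10.0.0.5"})` is rejected, while the same command from `10.0.0.5` or from `127.0.0.1` is allowed.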

3. Configuration optimization

The master and slave instances of each node are configured differently. Because only the master of each node serves requests, some time-consuming operations are moved to the slave in order to maximize the master's concurrency.

The main settings are as follows:

The master disables bgsave and bgrewriteaof.

The slave enables AOF and rewrites the AOF file on a schedule to free server disk space.

The slave runs bgsave on a schedule to back up RDB files.

The slave enables the slave-read-only parameter, making it read-only.

Once a Redis cluster is deployed, a scheduled task checks the role of each Redis instance on the server, adjusts the relevant configuration parameters according to the role, and persists the changes to the configuration file.
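A role check of this kind might look like the following Python sketch. The mapping from the bullet points to concrete Redis 4.0 directives (`save ""`, `appendonly`, `auto-aof-rewrite-percentage`, `slave-read-only`) is an assumption, as are the names `ROLE_SETTINGS`, `parse_role` and `settings_for`; the article does not show the actual scheduled task.

```python
from typing import Dict

# Assumed mapping from the article's settings to Redis 4.0 directives.
ROLE_SETTINGS: Dict[str, Dict[str, str]] = {
    "master": {
        "save": "",                          # no automatic RDB snapshots
        "appendonly": "no",                  # AOF off, so no AOF rewrites either
    },
    "slave": {
        "appendonly": "yes",                 # AOF enabled on the slave
        "auto-aof-rewrite-percentage": "0",  # assumed: rewrites are driven by cron
        "slave-read-only": "yes",
    },
}


def parse_role(info_replication: str) -> str:
    """Extract the role from the text returned by INFO replication."""
    for line in info_replication.splitlines():
        if line.startswith("role:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("role not found in INFO output")


def settings_for(info_replication: str) -> Dict[str, str]:
    """Return the settings a scheduled task could apply for this instance."""
    return dict(ROLE_SETTINGS[parse_role(info_replication)])
```

Each returned pair could then be applied with `CONFIG SET` and persisted with `CONFIG REWRITE`, issued from a whitelisted host since both commands are blocked for ordinary clients.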

III. Automated operation and maintenance

1. Initialize the system environment

Before a cluster is deployed on a Redis server, the system environment must be initialized. These settings are placed in the spec file used to build the Redis rpm package, so the relevant configuration is changed automatically when the Redis package is installed. The main system environment parameters are as follows:

    sed -i -r '/vm.overcommit_memory.*/d' /etc/sysctl.conf
    sed -i -r '/vm.swappiness.*/d' /etc/sysctl.conf
    sed -i -r '/vm.dirty_bytes.*/d' /etc/sysctl.conf
    echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf
    echo "vm.swappiness = 0" >> /etc/sysctl.conf
    echo "vm.dirty_bytes = 33554432" >> /etc/sysctl.conf
    /sbin/sysctl -q -p /etc/sysctl.conf

    groupadd redis >/dev/null 2>&1 || true
    useradd -M -g redis redis -s /sbin/nologin >/dev/null 2>&1 || true

    sed -i -r '/redis soft nofile.*/d' /etc/security/limits.conf
    sed -i -r '/redis hard nofile.*/d' /etc/security/limits.conf
    echo "redis soft nofile 288000" >> /etc/security/limits.conf
    echo "redis hard nofile 288000" >> /etc/security/limits.conf
    sed -i -r '/redis soft nproc.*/d' /etc/security/limits.conf
    sed -i -r '/redis hard nproc.*/d' /etc/security/limits.conf
    echo "redis soft nproc unlimited" >> /etc/security/limits.conf
    echo "redis hard nproc unlimited" >> /etc/security/limits.conf

    echo never > /sys/kernel/mm/transparent_hugepage/enabled

2. Unified operation and maintenance management tools

The unified management toolkit for Qunar Redis clusters packages the scripts for system environment initialization, instance installation, instance startup, instance shutdown, monitoring and alerting, scheduled tasks and so on, and automates operations such as monitoring, statistics and registration.

    /etc/cron.d/appendonly_switch
    /etc/cron.d/auto_upgrade_toolkit
    /etc/cron.d/bgrewriteaof
    /etc/cron.d/check_maxmemory
    /etc/cron.d/dump_rdb_keys
    /etc/cron.d/rdb_backup
    /etc/profile.d/q_redis_path.sh
    /xxx/collectd/etc/collectd.d/collect_redis.conf
    /xxx/collectd/lib/collectd/collect_redis.py
    /xxx/collectd/share/collectd/types_redis.db
    /xxx/nrpe/libexec/q-check-redis-cpu-usage
    /xxx/nrpe/libexec/q-check-redis-latency
    /xxx/nrpe/libexec/q-check-redis-memory-usage
    /xxx/nrpe/libexec/q-check-zookeeper-ruok
    /xxx/redis/tools/cron_appendonly_switch.sh
    /xxx/redis/tools/cron_bgrewrite_aof.sh
    /xxx/redis/tools/cron_check_maxmemory.sh
    /xxx/redis/tools/cron_dump_rdb_keys.sh
    /xxx/redis/tools/cron_rdb_backup.sh
    /xxx/redis/tools/dump_rdb_keys.py
    /xxx/redis/tools/redis-cli5
    /xxx/redis/tools/redis-latency
    /xxx/redis/tools/redis_install.sh
    /xxx/redis/tools/redis_start.sh
    /xxx/redis/tools/redis_stop.sh

3. Stand-alone multi-instance multi-version deployment

The Qunar Redis installation toolkit supports installing multiple instances on a single machine. The installation script provides options and configuration file templates for installing different Redis versions. The currently supported Redis Server versions are 2.8.6, 3.0.7 and 4.0.14.

    /* installation package and Redis instance directory structure */
    .
    ├── multi
    │   ├── server_2800    /* Redis 2.8.6 package */
    │   │   ├── bin
    │   │   └── utils
    │   ├── server_3000    /* Redis 3.0.7 package */
    │   │   ├── bin
    │   │   └── utils
    │   └── server_4000    /* Redis 4.0.14 package */
    │       ├── bin
    │       └── utils
    ├── redis10088    /* data directory of the Redis instance on port 10088; stores the instance's configuration files, logs, AOF files and RDB files */
    │   ├── bin
    │   └── utils
    ├── redis10803    /* data directory of the Redis instance on port 10803; stores the instance's configuration files, logs, AOF files and RDB files */
    │   ├── bin
    │   └── utils
    ├── redis11459    /* data directory of the Redis instance on port 11459; stores the instance's configuration files, logs, AOF files and RDB files */
    │   ├── bin
    │   └── utils

    /* usage of the Redis instance installer */
    Usage: redis_install.sh -P -v -p -m

    required parameters:
        -P    redis port
        -p    redis password
        -v    redis version to install; version 4.0 is strongly recommended
        -m    maximum memory size allowed for the redis instance, in GB
    optional parameters:
        --cluster    cluster mode, version >= 3.0
        --testenv    test environment
    example:
        sudo redis_install.sh -P 6379 -v 4.0 -m 20 -p 1qaz2wsx

4. Use git to manage Redis Sentinel

All Sentinel configurations are managed centrally in git. When a change is made in one place, every server in the Sentinel cluster pulls it and updates in step. The detailed commit log also makes it easy to trace the modification history of each configuration file. Qunar's Redis Sentinel setup has the following characteristics:

One Sentinel group manages exactly one node, that is, it monitors and fails over only one group of Redis instances (one master with one slave, or one master with several slaves) that share the same port number.

Sentinel is responsible only for the high availability of the node. Clients do not need to go through Sentinel to reach a Redis instance.

Sentinel configuration files are managed in git and named uniformly as [node port number + 20000]_[cluster namespace].conf, for example 30708_redis_delay_test.conf. From the port number or namespace of any node in a cluster, the information of all nodes in that cluster can be looked up.

When a node monitored by Sentinel fails over, the master configuration of that node in the configuration center and the dataVersion of the corresponding node in zookeeper are updated. When the client detects the change in zookeeper, it fetches the node's latest information from the configuration center and reconnects. Sentinel also sends the switchover information to the DBA and to the operations event platform.

The IP of each Sentinel server is added to the whitelist of Redis instances by default, so any Redis instance can be accessed from a Sentinel server with all commands available. Access to the Sentinel servers must therefore be strictly controlled, and only DBAs are allowed to log in.
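The file naming rule above is simple enough to express directly. The following Python sketch builds and parses such names; the function names are hypothetical, and only the rule itself (node port plus 20000, underscore, namespace, `.conf`) comes from the article.

```python
SENTINEL_PORT_OFFSET = 20000  # from the article's naming rule


def sentinel_conf_name(node_port: int, namespace: str) -> str:
    """Build the file name [node port + 20000]_[namespace].conf."""
    return f"{node_port + SENTINEL_PORT_OFFSET}_{namespace}.conf"


def parse_sentinel_conf_name(file_name: str):
    """Recover (node_port, namespace) from a Sentinel config file name."""
    if not file_name.endswith(".conf"):
        raise ValueError(f"not a .conf file: {file_name}")
    # Split on the first underscore only, since namespaces may contain underscores.
    prefix, _, namespace = file_name[:-len(".conf")].partition("_")
    return int(prefix) - SENTINEL_PORT_OFFSET, namespace
```

For the article's example, `sentinel_conf_name(10708, "redis_delay_test")` yields `30708_redis_delay_test.conf`, and parsing that name returns the node port 10708 and the namespace again.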

5. Operation and maintenance platform

The standardized processes described above underpin the whole Qunar Redis operations platform. More than 90% of Qunar Redis operations are now automated on the platform, including ticket submission and approval, cluster deployment, instance migration, vertical scaling of clusters, and information views along different dimensions (server, cluster, instance). The following describes how cluster deployment and instance migration are carried out.

Cluster deployment

The main steps for Qunar Redis cluster deployment are as follows:

1) The developer submits a cluster application ticket through the platform to start the process. After TL approval, the ticket is routed to the DBA.

2) The DBA plans the cluster from the information in the ticket: number of nodes, memory size, deployment data center, Redis version and so on.

3) The DBA enters the deployment information on the Redis cluster deployment page according to the cluster plan.

4) After the deployment information is submitted, the platform automatically selects servers with idle resources and deploys the cluster on them.

5) When the deployment is complete, the DBA is notified through Qtalk and the cluster's clientcipher is sent to the developer by email. The deployment is also pushed to the company's OPS event platform to keep an operation record.

Instance migration

In day-to-day operations, instance migration falls into two categories:

1) Partial instance migration. When a server runs short of available resources, some of its instances are migrated to other servers that have resources available. Enter the source host and the current host of the instance on the page, and the migration task is generated automatically after submission.

2) Whole-machine migration. This is used mainly to replace out-of-warranty servers or when a server needs downtime for maintenance. All instances on the machine are automatically migrated to other servers with more spare resources. Enter the hostname to be migrated on the page, and the migration task is generated automatically after submission.

Once a migration task has started, the whole migration needs no human intervention. The execution progress is updated automatically and logs are written as it runs.
