
Example Analysis of sentinel failover in Redis

2025-01-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article walks through an example analysis of Sentinel failover in Redis. The material is straightforward, and I hope it clears up any doubts you have. Let's study it together.

When two or more Redis instances form a master-slave relationship, the cluster they form has a certain degree of high availability: when the master fails, a slave can become the new master and continue to provide read and write services. This mechanism is called failover.

So who detects that the master has failed and makes the failover decision?

One way is to run a daemon process that monitors all master and slave nodes, as shown in the following figure:

The Redis cluster has one master and two slaves, and the daemon process monitors these three nodes. However, the daemon is a single node, so its own availability cannot be guaranteed. Multiple daemons need to be introduced, as shown in the following figure:

Multiple daemons solve the availability problem, but they raise a consistency problem: how do the daemons agree on whether a master is available? For example, in the figure above, if the network between daemon1 and the master is down while the connection between the other daemon and the master is healthy, does the master node need a failover?

Redis Sentinel provides a mechanism for interaction between multiple daemons. The daemons form a Sentinel cluster, and the daemon nodes are also called Sentinel nodes, as shown in the following figure:

These nodes communicate, hold elections, and negotiate with each other so that they behave consistently when detecting master failures and making failover decisions.

A Sentinel cluster can monitor any number of masters and the slaves under each master. When a monitored master goes offline, Sentinel automatically promotes one of that master's slaves to be the new master, which then takes over processing command requests in place of the offline master.

Start and initialize Sentinel

You can start a Sentinel with the command:

./redis-sentinel ../sentinel.conf

Or with the command:

./redis-server ../sentinel.conf --sentinel

When a Sentinel starts, it needs to perform the following steps:

Initialize the server

Sentinel is essentially a Redis server running in a special mode. It does a different job from an ordinary Redis server, so its initialization process is different too. For example, an ordinary Redis server loads an RDB or AOF file during initialization to restore its data, while Sentinel loads neither at startup, because Sentinel does not use a database.

Replace the code used by an ordinary Redis server with Sentinel-specific code

Some of the code used by an ordinary Redis server is replaced with Sentinel-specific code. For example, an ordinary Redis server uses server.c/redisCommandTable as the server's command table:

struct redisCommand redisCommandTable[] = {
    {"module",moduleCommand,-2,"as",0,NULL,0,0,0,0,0},
    {"get",getCommand,2,"rF",0,NULL,1,1,1,0,0},
    {"set",setCommand,-3,"wm",0,NULL,1,1,1,0,0},
    {"setnx",setnxCommand,3,"wmF",0,NULL,1,1,1,0,0},
    {"setex",setexCommand,4,"wm",0,NULL,1,1,1,0,0},
    {"psetex",psetexCommand,4,"wm",0,NULL,1,1,1,0,0},
    {"append",appendCommand,3,"wm",0,NULL,1,1,1,0,0},
    {"del",delCommand,-2,"w",0,NULL,1,-1,1,0,0},
    {"unlink",unlinkCommand,-2,"wF",0,NULL,1,-1,1,0,0},
    {"exists",existsCommand,-2,"rF",0,NULL,1,-1,1,0,0},
    {"setbit",setbitCommand,4,"wm",0,NULL,1,1,1,0,0},
    {"getbit",getbitCommand,3,"rF",0,NULL,1,1,1,0,0},
    {"bitfield",bitfieldCommand,-2,"wm",0,NULL,1,1,1,0,0},
    {"setrange",setrangeCommand,4,"wm",0,NULL,1,1,1,0,0},
    {"getrange",getrangeCommand,4,"r",0,NULL,1,1,1,0,0},
    {"substr",getrangeCommand,4,"r",0,NULL,1,1,1,0,0},
    {"incr",incrCommand,2,"wmF",0,NULL,1,1,1,0,0},
    {"decr",decrCommand,2,"wmF",0,NULL,1,1,1,0,0},
    {"mget",mgetCommand,-2,"rF",0,NULL,1,-1,1,0,0},
    {"rpush",rpushCommand,-3,"wmF",0,NULL,1,1,1,0,0},
    {"lpush",lpushCommand,-3,"wmF",0,NULL,1,1,1,0,0},
    /* ... */
};

Sentinel uses sentinel.c/sentinelcmds as the server's command table, as shown below:

struct redisCommand sentinelcmds[] = {
    {"ping",pingCommand,1,"",0,NULL,0,0,0,0,0},
    {"sentinel",sentinelCommand,-2,"",0,NULL,0,0,0,0,0},
    {"subscribe",subscribeCommand,-2,"",0,NULL,0,0,0,0,0},
    {"unsubscribe",unsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
    {"psubscribe",psubscribeCommand,-2,"",0,NULL,0,0,0,0,0},
    {"punsubscribe",punsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
    {"publish",sentinelPublishCommand,3,"",0,NULL,0,0,0,0,0},
    {"info",sentinelInfoCommand,-1,"",0,NULL,0,0,0,0,0},
    {"role",sentinelRoleCommand,1,"l",0,NULL,0,0,0,0,0},
    {"client",clientCommand,-2,"rs",0,NULL,0,0,0,0,0},
    {"shutdown",shutdownCommand,-1,"",0,NULL,0,0,0,0,0},
    {"auth",authCommand,2,"sltF",0,NULL,0,0,0,0,0}
};

Initialize the Sentinel state

The server initializes a sentinel.c/sentinelState structure, which holds all of the server state related to the Sentinel function.

struct sentinelState {
    char myid[CONFIG_RUN_ID_SIZE+1]; /* This sentinel ID. */
    // Current epoch, used to implement failover
    uint64_t current_epoch;          /* Current epoch. */
    // Monitored masters:
    // the dictionary key is the name of the master,
    // the dictionary value is a pointer to a sentinelRedisInstance structure
    dict *masters;      /* Dictionary of master sentinelRedisInstances.
                           Key is the instance name, value is the
                           sentinelRedisInstance structure pointer. */
    // Whether TILT mode has been entered
    int tilt;           /* Are we in TILT mode? */
    // Number of scripts currently being executed
    int running_scripts;    /* Number of scripts in execution right now. */
    // Time at which TILT mode was entered
    mstime_t tilt_start_time;   /* When TILT started. */
    // Time of the last run of the time handler
    mstime_t previous_time;     /* Last time we ran the time handler. */
    // FIFO queue containing all user scripts that need to be executed
    list *scripts_queue;        /* Queue of user scripts to execute. */
    char *announce_ip;  /* IP addr that is gossiped to other sentinels if
                           not NULL. */
    int announce_port;  /* Port that is gossiped to other sentinels if
                           non zero. */
    unsigned long simfailure_flags; /* Failures simulation. */
    int deny_scripts_reconfig; /* Allow SENTINEL SET ... to change script
                                  paths at runtime? */
};

Initialize Sentinel's list of monitored masters from the given configuration file

Initializing the Sentinel state triggers initialization of the masters dictionary, and the masters dictionary is initialized from the loaded Sentinel configuration file.

Each key of the dictionary is the name of a monitored master, and each value is the sentinel.c/sentinelRedisInstance structure for that master.

The attributes of the sentinelRedisInstance structure are as follows:

typedef struct sentinelRedisInstance {
    // Flags recording the type of the instance and its current state
    int flags;      /* See SRI_... defines */
    // Name of the instance.
    // The name of a master is set by the user in the configuration file.
    // The names of slaves and Sentinels are set automatically by Sentinel
    // in ip:port format, for example "127.0.0.1:26379".
    char *name;     /* Master name from the point of view of this sentinel. */
    // Run ID of the instance
    char *runid;    /* Run ID of this instance, or unique ID if is a Sentinel.*/
    // Configuration epoch, used to implement failover
    uint64_t config_epoch;  /* Configuration epoch. */
    // Address of the instance
    sentinelAddr *addr; /* Master host. */
    // Value of the sentinel down-after-milliseconds option:
    // number of milliseconds without a valid reply after which the
    // instance is judged subjectively down
    mstime_t down_after_period; /* Consider it down after that period. */
    // The quorum argument of the sentinel monitor option:
    // number of votes required to judge this instance objectively down
    unsigned int quorum;/* Number of sentinels that need to agree on failure. */
    // The numreplicas value of the sentinel parallel-syncs option:
    // during a failover, the number of slaves that may synchronize
    // with the new master at the same time
    int parallel_syncs; /* How many slaves to reconfigure at same time. */
    // Value of the sentinel failover-timeout option:
    // maximum time allowed for refreshing the failover state
    mstime_t failover_timeout;  /* Max time to refresh failover state. */
} sentinelRedisInstance;

For example, suppose Sentinel is started with the following configuration file:

# sentinel monitor <master-name> <ip> <redis-port> <quorum>
sentinel monitor master1 127.0.0.1 6379 2
# sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds master1 30000
# sentinel parallel-syncs <master-name> <numreplicas>
sentinel parallel-syncs master1 1
# sentinel failover-timeout <master-name> <milliseconds>
sentinel failover-timeout master1 900000

Then Sentinel creates the instance structure shown in the following figure for the primary server master1:

The Sentinel status and masters dictionary are organized as follows:
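
As a rough illustration of how such a configuration could be turned into a masters dictionary, here is a minimal Python sketch. It is not Sentinel's actual loader: the MasterInstance class, the load_masters function, and the default values are hypothetical stand-ins that only borrow the field names of sentinelRedisInstance.

```python
from dataclasses import dataclass, field


@dataclass
class MasterInstance:
    """Hypothetical Python stand-in for a master's sentinelRedisInstance."""
    name: str
    ip: str
    port: int
    quorum: int
    # Placeholder defaults (assumed), overridden by the configuration below.
    down_after_period: int = 30000
    parallel_syncs: int = 1
    failover_timeout: int = 180000
    slaves: dict = field(default_factory=dict)     # filled in later, keyed by "ip:port"
    sentinels: dict = field(default_factory=dict)  # filled in later, keyed by "ip:port"


# Maps a per-master option name to the attribute it sets.
OPTION_ATTRS = {
    "down-after-milliseconds": "down_after_period",
    "parallel-syncs": "parallel_syncs",
    "failover-timeout": "failover_timeout",
}


def load_masters(config_text):
    """Build the masters dictionary (master name -> MasterInstance)."""
    masters = {}
    for raw in config_text.splitlines():
        parts = raw.split()
        if len(parts) < 4 or parts[0] != "sentinel":
            continue  # blank lines, comments, and unrelated directives
        option, name = parts[1], parts[2]
        if option == "monitor" and len(parts) >= 6:
            masters[name] = MasterInstance(name, parts[3], int(parts[4]), int(parts[5]))
        elif option in OPTION_ATTRS and name in masters:
            setattr(masters[name], OPTION_ATTRS[option], int(parts[3]))
    return masters


CONFIG = """
sentinel monitor master1 127.0.0.1 6379 2
sentinel down-after-milliseconds master1 30000
sentinel parallel-syncs master1 1
sentinel failover-timeout master1 900000
"""

masters = load_masters(CONFIG)
print(masters["master1"])
```

The dictionary is keyed by master name, matching the description above; the slaves and sentinels dictionaries start empty because they are only populated once Sentinel begins exchanging information with the master.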

Create network connections to the master

Sentinel creates network connections to each monitored master and becomes a client of that master. It can then send commands to the master and obtain information from the command replies.

Sentinel creates two asynchronous network connections to the master:

Command connection: used to send commands to the master and receive command replies

Subscription connection: used to subscribe to the master's __sentinel__:hello channel

Sentinel sends and acquires information

By default, Sentinel sends the INFO command to each monitored master and slave over the command connection once every ten seconds.

From the master's reply, Sentinel obtains information about the master itself, including the server's run ID recorded in the run_id field and the server's role recorded in the role field. It also obtains information about all slaves under the master, including each slave's IP address and port number. Users therefore do not need to give Sentinel the addresses of the slaves: Sentinel discovers them automatically from the IP addresses and ports that the master returns.

When Sentinel discovers a new slave under the master, it creates an instance structure for the new slave and also creates a command connection and a subscription connection to it.
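
The discovery step can be sketched in Python as follows. This is an illustration rather than Sentinel's implementation: parse_info_replication and discover_slaves are hypothetical helper names, the sample reply is made up, and the sketch assumes the slaveN:ip=...,port=... line format that a master's INFO reply uses for its slaves.

```python
SAMPLE_INFO = """# Server
run_id:7611c59dc3a29aa6fa0609f841bb6a1019008a9c
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=11111,state=online,offset=43,lag=0
slave1:ip=127.0.0.1,port=22222,state=online,offset=43,lag=0
"""


def parse_info_replication(info_text):
    """Split an INFO reply into plain fields and a list of (ip, port) slaves."""
    fields = {}
    slaves = []
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines and section headers
        key, value = line.split(":", 1)
        # slave0, slave1, ... describe slaves; slave_priority etc. do not.
        if key.startswith("slave") and key[5:].isdigit():
            attrs = dict(item.split("=", 1) for item in value.split(","))
            slaves.append((attrs["ip"], int(attrs["port"])))
        else:
            fields[key] = value
    return fields, slaves


def discover_slaves(known_slaves, info_text):
    """Add unseen slaves to known_slaves (keyed by "ip:port"); return the new keys."""
    _, slaves = parse_info_replication(info_text)
    new_keys = []
    for ip, port in slaves:
        key = f"{ip}:{port}"
        if key not in known_slaves:
            known_slaves[key] = {"ip": ip, "port": port}
            new_keys.append(key)
    return new_keys


known = {}
print(discover_slaves(known, SAMPLE_INFO))
```

Calling discover_slaves again with the same reply finds nothing new, which mirrors the idea that an instance is created only the first time a slave is seen.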

Based on the reply to the INFO command from a slave, Sentinel extracts the following information:

1. The slave's run ID: run_id

2. The slave's role: role

3. The master's IP address and port: master_host and master_port

4. The connection status between master and slave: master_link_status

5. The slave's priority: slave_priority

6. The slave's replication offset: slave_repl_offset

By default, Sentinel sends a message over the command connection to the __sentinel__:hello channel of every monitored master and slave once every two seconds.

The command has the following format:

PUBLISH __sentinel__:hello "<s_ip>,<s_port>,<s_runid>,<s_epoch>,<m_name>,<m_ip>,<m_port>,<m_epoch>"

The parameters of the above command have the following meanings:

Parameter    Meaning
s_ip         Sentinel IP address
s_port       Sentinel port number
s_runid      Sentinel run ID
s_epoch      Sentinel current configuration epoch
m_name       Master name
m_ip         Master IP address
m_port       Master port number
m_epoch      Master current configuration epoch
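
To make the message layout concrete, here is a small Python sketch that builds and parses such a payload. It assumes the payload consists of eight comma-separated fields in the order s_ip, s_port, s_runid, s_epoch, m_name, m_ip, m_port, m_epoch; the Hello type and the encode_hello and decode_hello functions are hypothetical names used only for illustration.

```python
from typing import NamedTuple

HELLO_CHANNEL = "__sentinel__:hello"


class Hello(NamedTuple):
    s_ip: str      # Sentinel IP address
    s_port: int    # Sentinel port number
    s_runid: str   # Sentinel run ID
    s_epoch: int   # Sentinel current epoch
    m_name: str    # master name
    m_ip: str      # master IP address
    m_port: int    # master port number
    m_epoch: int   # master configuration epoch


def encode_hello(hello):
    """Join the eight fields with commas, in declaration order."""
    return ",".join(str(value) for value in hello)


def decode_hello(payload):
    """Parse a payload back into a Hello, converting the numeric fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 comma-separated fields, got %d" % len(parts))
    s_ip, s_port, s_runid, s_epoch, m_name, m_ip, m_port, m_epoch = parts
    return Hello(s_ip, int(s_port), s_runid, int(s_epoch),
                 m_name, m_ip, int(m_port), int(m_epoch))


msg = Hello("127.0.0.1", 26379, "abc", 0, "master1", "127.0.0.1", 6379, 0)
payload = encode_hello(msg)
print("PUBLISH", HELLO_CHANNEL, '"%s"' % payload)
```

A receiving Sentinel would run the equivalent of decode_hello on each message and compare s_runid with its own run ID to tell its own messages apart from those of other Sentinels.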

After Sentinel establishes a subscription connection with a master or slave, it subscribes to the __sentinel__:hello channel over that connection. The subscription lasts until the Sentinel disconnects from the server.

The command is as follows:

SUBSCRIBE __sentinel__:hello

As shown in the figure above, for each server it is connected to, Sentinel both sends messages to the server's __sentinel__:hello channel over the command connection and receives messages from that channel over the subscription connection.

Sentinels discover each other through this channel. A newly joined Sentinel publishes a message containing its own information to the master's __sentinel__:hello channel, and the other Sentinels subscribed to the channel discover the new Sentinel. The new Sentinel and the existing Sentinels then create persistent connections to each other.

Connected Sentinels can exchange information. The sentinels dictionary in the instance structure that a Sentinel creates for a master holds information about all the other Sentinels monitoring the same master, but not about the Sentinel itself.

As mentioned earlier, Sentinel creates instances for slaves (in the slaves dictionary of the master instance). We now also know that, by exchanging information with other Sentinels, it creates instances for them too (in the sentinels dictionary of the master instance). The structure of the instances stored in a Sentinel is shown in the following figure:

From the figure above, you can see that the keys of the slaves and sentinels dictionaries are in ip:port format, and the values are the corresponding sentinelRedisInstance structures.

Fault Discovery of master

Subjectively unavailable

By default, Sentinel sends a PING command once per second to every instance with which it has a command connection (masters, slaves, and other Sentinels), and uses the reply to judge whether the instance is online.

A PING reply falls into one of two categories:

Valid reply: the instance returns one of three replies: +PONG, -LOADING, or -MASTERDOWN.

Invalid reply: any reply other than the three above, or no reply within the specified time limit.

If an instance keeps returning invalid replies to Sentinel for the number of milliseconds set by the down-after-milliseconds option in the Sentinel configuration file (each Sentinel may be configured differently), Sentinel marks the instance as subjectively offline and turns on the SRI_S_DOWN flag in the flags attribute of the instance structure it maintains, as shown for a master in the following figure:
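
The subjective check can be sketched in Python as below. This is a simplified model under stated assumptions: the Instance class and both functions are hypothetical, the flag's bit value is illustrative, and timestamps are plain millisecond integers passed in by the caller instead of being read from a clock.

```python
VALID_REPLIES = ("+PONG", "-LOADING", "-MASTERDOWN")
SRI_S_DOWN = 1 << 3  # illustrative bit value; the real value depends on the Redis version


class Instance:
    """Hypothetical stand-in for the instance state a Sentinel keeps."""

    def __init__(self, down_after_period, now):
        self.down_after_period = down_after_period  # milliseconds
        self.last_valid_reply = now                 # time of the last valid PING reply
        self.flags = 0


def on_ping_reply(inst, reply, now):
    """Record a reply; only valid replies refresh the timestamp.

    reply is None when no reply arrived within the time limit.
    """
    if reply is not None and reply.startswith(VALID_REPLIES):
        inst.last_valid_reply = now


def check_subjectively_down(inst, now):
    """Set or clear SRI_S_DOWN depending on how long the instance has been silent."""
    if now - inst.last_valid_reply > inst.down_after_period:
        inst.flags |= SRI_S_DOWN
    else:
        inst.flags &= ~SRI_S_DOWN
    return bool(inst.flags & SRI_S_DOWN)


master = Instance(down_after_period=30000, now=0)
on_ping_reply(master, "+PONG", now=1000)
on_ping_reply(master, None, now=25000)            # timeout: counts as invalid
print(check_subjectively_down(master, now=31001))  # silent for more than 30000 ms
```

Keeping only the time of the last valid reply is enough here, because a run of invalid replies and a run of timeouts both show up as a growing gap since that timestamp.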

Objectively unavailable

After a Sentinel judges a master to be subjectively unavailable, it asks the other Sentinels to confirm. When the number of Sentinel nodes that confirm is >= quorum, the master is judged objectively unavailable, and the failover process begins.

As mentioned above, the Sentinel asks the other Sentinels using the following command:

SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>

The meaning of each parameter is as follows:

ip: the IP address of the master that the Sentinel judged subjectively offline

port: the port number of the master that the Sentinel judged subjectively offline

current_epoch: the Sentinel's current configuration epoch, used to elect the leader Sentinel

runid: either * or the run ID of a Sentinel. * means the command is only used to detect the objective offline state of the master; a Sentinel run ID means the command is used to elect the leader Sentinel

A Sentinel that receives the above command returns a Multi Bulk reply with three parameters:

1) down_state: the result of the target Sentinel's check of the master. 1: the master is offline; 0: the master is not offline.

2) leader_runid: either * or a run ID. * means the reply is only used to detect the offline state of the master; otherwise it is the run ID of the target Sentinel's local leader Sentinel (used to elect the leader Sentinel).

3) leader_epoch: when leader_runid is *, leader_epoch is always 0. Otherwise it is the configuration epoch of the target Sentinel's local leader Sentinel (used to elect the leader Sentinel).

The number of confirming nodes required is the quorum value of the sentinel monitor option in the Sentinel configuration file. Different Sentinels may be configured with different quorum values.

When a Sentinel judges the master to be objectively offline, it turns on the SRI_O_DOWN flag in the flags attribute of the master's instance structure, as shown in the following figure:
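
The quorum check itself is simple to sketch. The Python function below is an illustration with a hypothetical name; it assumes each reply is a (down_state, leader_runid, leader_epoch) tuple as described above, and that the asking Sentinel counts its own subjective judgment as one vote.

```python
def is_objectively_down(own_view_down, replies, quorum):
    """Decide whether a master is objectively down.

    own_view_down: True if this Sentinel already considers the master
                   subjectively down.
    replies:       (down_state, leader_runid, leader_epoch) tuples from
                   other Sentinels; down_state 1 means "down".
    quorum:        number of agreeing Sentinels required.
    """
    if not own_view_down:
        return False  # only a Sentinel that sees the master as down asks others
    # Assumption: the asking Sentinel's own judgment counts as one vote.
    votes = 1 + sum(1 for down_state, _, _ in replies if down_state == 1)
    return votes >= quorum


replies = [(1, "*", 0), (0, "*", 0)]
print(is_objectively_down(True, replies, quorum=2))
```

Because quorum is a per-Sentinel setting, two Sentinels looking at the same replies can reach different conclusions if they are configured with different quorum values.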

Electing Sentinel Leader

When a master goes down, several Sentinel nodes may detect it at the same time, confirm each other's subjectively unavailable judgments, reach the objectively unavailable state together, and all intend to initiate a failover. In the end only one Sentinel node can act as the failover initiator, so a Sentinel leader election is started to choose it.

Redis's Sentinel mechanism implements this election algorithm using a protocol similar to Raft:

1. The current_epoch variable of sentinelState plays a role similar to the term (election round) in the Raft protocol.

2. Each Sentinel node that has confirmed the master as objectively unavailable broadcasts its own election request to the other Sentinels (SENTINEL is-master-down-by-addr, where current_epoch is its own configuration epoch and runid is its own run ID).

3. A Sentinel node that receives an election request and has not yet received one in this round records the requesting Sentinel as its choice for this round and replies accordingly (first come, first served). If it has already made a choice in this round, it rejects the other candidates and replies with its existing choice. The reply is the three-parameter Multi Bulk reply described above: down_state is 1, leader_runid is the run ID of the Sentinel whose request arrived first, and leader_epoch is the configuration epoch of that Sentinel.

4. Each Sentinel node that initiated an election request treats a candidate (possibly itself) as the leader once it learns that more than half of the Sentinels chose that candidate. If no leader has been elected after the round has lasted long enough, the next round starts.
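
The voting rule in steps 3 and 4 can be sketched as a small Python simulation. It is a simplified model, not Sentinel's code: SentinelVoter and elect are hypothetical names, requests are delivered synchronously in a fixed order, and the candidate is assumed to be one of the voters and to vote for itself.

```python
class SentinelVoter:
    """Hypothetical model of the per-Sentinel voting state."""

    def __init__(self, runid):
        self.runid = runid
        self.current_epoch = 0
        self.leader = None      # run ID chosen in leader_epoch
        self.leader_epoch = 0   # epoch in which that choice was made

    def vote(self, req_epoch, candidate_runid):
        """Handle an election request; return (leader_runid, leader_epoch)."""
        if req_epoch > self.current_epoch:
            self.current_epoch = req_epoch
        # First come, first served: only one choice per epoch.
        if self.leader_epoch < req_epoch:
            self.leader = candidate_runid
            self.leader_epoch = req_epoch
        return self.leader, self.leader_epoch


def elect(candidate_runid, epoch, voters):
    """Return True if more than half of the voters chose the candidate in this epoch."""
    votes = sum(1 for v in voters
                if v.vote(epoch, candidate_runid) == (candidate_runid, epoch))
    return votes > len(voters) // 2


sentinels = [SentinelVoter("a"), SentinelVoter("b"), SentinelVoter("c")]
print(elect("a", 1, sentinels))  # "a" asks first in epoch 1
print(elect("b", 1, sentinels))  # "b" is too late in the same epoch
print(elect("b", 2, sentinels))  # a new epoch gives everyone a fresh vote
```

Tying each vote to an epoch is what keeps a round from electing two leaders: once a Sentinel has answered in an epoch, later requests in that epoch just get its existing choice back.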

After the leader Sentinel is determined, it selects one of the offline master's slaves as the new master according to certain rules.

Failover

After the leader Sentinel is elected, it performs a failover on the offline master:

The leader Sentinel selects a slave that is in good state and has complete data from all the slaves of the offline master, then sends it the command SLAVEOF no one to convert it into a master.

Let's look at how the new master is selected. The leader Sentinel saves all the slaves of the offline master in a list and filters the list according to the following rules:

Select the slave with the highest priority, which is set by the replica-priority option in redis.conf. The default is 100, and the lower the value, the higher the priority. 0 is a special value that marks a slave as never eligible for promotion to master.

If several slaves have the same priority, select the one with the largest replication offset (the most complete data).

If several slaves have the same priority and the same largest replication offset, select the one with the smallest run ID.
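
The three rules above amount to sorting by a composite key, which can be sketched in Python as follows. The function name and the dictionary layout of each slave are hypothetical, and run IDs are compared as plain strings; any other eligibility checks Sentinel performs before ranking are left out.

```python
def select_new_master(slaves):
    """Pick the promotion candidate from a list of slave descriptions.

    Each slave is a dict with "runid", "priority" and "repl_offset".
    Returns the chosen slave, or None if no slave is eligible.
    """
    # Priority 0 means the slave must never be promoted.
    candidates = [s for s in slaves if s["priority"] != 0]
    if not candidates:
        return None
    # Lower priority value first, then larger offset, then smaller run ID.
    return min(candidates,
               key=lambda s: (s["priority"], -s["repl_offset"], s["runid"]))


slaves = [
    {"runid": "c", "priority": 100, "repl_offset": 500},
    {"runid": "a", "priority": 100, "repl_offset": 900},
    {"runid": "b", "priority": 100, "repl_offset": 900},
    {"runid": "d", "priority": 0, "repl_offset": 9999},
]
print(select_new_master(slaves)["runid"])
```

In this sample, slave d has the most data but is excluded by its priority of 0, and the tie between a and b on priority and offset is broken by the smaller run ID.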

After selecting the slave to be promoted, the leader Sentinel sends it the SLAVEOF no one command. Sentinel then sends INFO to the promoted slave once per second (instead of the usual once every ten seconds). When the role in the reply changes from slave to master, the leader Sentinel knows that the slave has been promoted to master.

The leader Sentinel sends a SLAVEOF command to the other slaves of the offline master so that they replicate the new master.

The old master is set as a slave of the new master, and Sentinel continues to monitor it. When it comes back online, Sentinel sends it a command that makes it a slave of the new master.

That is all for "Example Analysis of sentinel failover in Redis". Thank you for reading! I hope the content shared here has given you a working understanding of the topic.
