MySQL group replication background information details

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces background information on MySQL Group Replication: the replication technologies it builds on, typical use cases, and the services that underpin it.

The most common way to create a fault-tolerant system is to make components redundant: a component can be removed and the system should continue to operate as expected. This creates a set of challenges that raise the complexity of such systems to a completely different level. Specifically, a replicated database has to maintain and manage several servers, not just one. In addition, because multiple servers work together, the system must deal with other classic distributed-systems problems, such as network partitions or split brain.

Therefore, the biggest challenge is to fuse the logic of the database and of data replication with the logic of coordinating several servers in a simple and consistent way. In other words, the servers must agree on the state of the system and of the data after every change the system goes through. This can be summarized as having the servers agree on each database state transition, so that they all progress as a single database, or at least eventually converge to the same state. That is, they need to operate as a (distributed) state machine.
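As a toy illustration of the state-machine idea, the sketch below shows that replicas which start from the same state and apply the same deterministic transitions in the same order end up identical, while a different order can diverge. The key/value "database" and the names `apply` and `replay` are assumptions made for this example only; they are not Group Replication internals.

```python
def apply(state, transition):
    """Apply one deterministic transition (key, value) to a copy of the state."""
    key, value = transition
    new_state = dict(state)
    new_state[key] = value
    return new_state


def replay(log, initial=None):
    """Replay an ordered log of transitions, starting from the initial state."""
    state = dict(initial or {})
    for transition in log:
        state = apply(state, transition)
    return state


# The agreed, totally ordered sequence of state transitions.
ordered_log = [("x", 1), ("y", 2), ("x", 3)]

# Three replicas replay the same log independently.
replicas = [replay(ordered_log) for _ in range(3)]
same_order_agrees = all(r == replicas[0] for r in replicas)

# A replica that applies the same transitions in another order can diverge.
reordered = replay([("x", 3), ("y", 2), ("x", 1)])
diverged = reordered != replicas[0]
```

This is why agreeing on the order of transitions, and not only on their content, is the central requirement.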

MySQL Group Replication provides distributed state machine replication with strong coordination between servers. Servers that belong to the same group coordinate automatically. In single-primary (single-master) mode, the group elects the primary automatically, and only one server accepts updates at a time. In multi-primary (multi-master) mode, all servers can accept updates, even concurrently. This capability comes at a cost: applications have to work around the limitations that such a deployment imposes.

A built-in group membership service keeps the view of the group consistent and available to all servers at any given point in time. When servers leave or join the group, the view is updated accordingly. A server may also leave the group unexpectedly, in which case the failure detection mechanism detects this automatically and notifies the group that the view has changed.

For a transaction to commit, a majority of the group must agree on its position in the global sequence of transactions. Each server decides individually whether to commit or abort the transaction, but all servers reach the same decision. If a network partition prevents the members from reaching agreement, the system does not make progress until the issue is resolved. Group Replication therefore has a built-in, automatic split-brain protection mechanism.

All of this is powered by the Group Communication System (GCS) protocols. These provide a failure detection mechanism, a group membership service, and safe, totally ordered message delivery. At the core of this technology is an implementation of the Paxos algorithm, which is the key to keeping data consistent across the group. It acts as the engine of the group communication system.

18.1.1 Replication technology

Before going into the details of MySQL Group Replication, this section introduces some background concepts and gives an overview of how things work. It explains what Group Replication requires and how it differs from traditional asynchronous MySQL replication.

18.1.1.1 Master-Slave replication

Traditional MySQL replication provides a simple master-slave approach. There is one master and one or more slaves. The master executes and commits transactions and then sends them (asynchronously) to the slaves, where they are either re-executed (in statement-based replication) or applied (in row-based replication). This is a shared-nothing system in which, by default, every server has a complete copy of the data.

Figure 18.1 MySQL asynchronous replication

There is also semi-synchronous replication, which adds one synchronization step to the protocol: when committing, the master waits for a slave to acknowledge that it has received the transaction. Only then does the master continue with the commit operation.

Figure 18.2 MySQL semi-synchronous replication

The two figures above show the traditional asynchronous MySQL replication protocol and its semi-synchronous variant. The blue arrows represent messages exchanged between servers or between servers and the client application.
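The difference between the two protocols can be sketched with a small in-memory model. This is a simplified illustration under stated assumptions: the `Slave` class, the `master_commit` function, and the "blocked" result are hypothetical, and real semi-synchronous replication has details (such as a timeout after which the master stops waiting) that are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    reachable: bool = True
    relay_log: list = field(default_factory=list)

    def receive(self, txn):
        """Store the transaction and acknowledge receipt, if reachable."""
        if not self.reachable:
            return False
        self.relay_log.append(txn)
        return True


def master_commit(txn, binlog, slaves, semi_sync):
    """Return 'committed' or 'blocked' for one transaction on the master."""
    binlog.append(txn)
    acks = [slave.receive(txn) for slave in slaves]
    # Asynchronous: the master never waits for the slaves.
    # Semi-synchronous: at least one slave must confirm receipt first.
    if semi_sync and not any(acks):
        return "blocked"
    return "committed"


unreachable = [Slave(reachable=False)]
async_result = master_commit("t1", [], unreachable, semi_sync=False)
semi_result = master_commit("t1", [], unreachable, semi_sync=True)
healthy_result = master_commit("t1", [], [Slave()], semi_sync=True)
```

With an unreachable slave, the asynchronous master commits anyway, while the semi-synchronous master waits; in both cases the acknowledgement only means the transaction was received, not that it was applied.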

18.1.1.2 Group replication

Group Replication is a technique that can be used to implement fault-tolerant systems. A replication group is a set of servers that interact with each other through message passing. The communication layer provides guarantees such as atomic message delivery and totally ordered message exchange. These are powerful properties that serve as building blocks for more advanced database replication solutions.

Building on these properties, MySQL Group Replication implements a multi-master, update-everywhere replication protocol. A replication group consists of multiple servers, and each server in the group can execute transactions independently. However, a read-write (RW) transaction commits only after it has passed conflict detection. Read-only (RO) transactions need no conflict detection and can commit immediately. In other words, for any RW transaction the commit decision is not made unilaterally by the originating server but by the group. More precisely, when a transaction is ready to commit on the originating server, the server broadcasts the write values (the changed rows) and the corresponding write set (the unique identifiers of the updated rows). A global total order is then established for the transaction. Ultimately, this means that all servers receive the same set of transactions in the same order. As a result, all servers apply the same changes in the same order and therefore remain consistent within the group.

Transactions that execute concurrently on different servers may conflict. Group Replication detects such conflicts by inspecting and comparing the write sets of concurrent transactions, a process called certification. If two concurrent transactions executed on different servers update the same row, there is a conflict. The transaction that was ordered first commits on all servers; the transaction ordered second is rolled back on its originating server and dropped by the other servers in the group. This is in effect a distributed first-commit-wins rule.

Figure 18.3 MySQL Group replication Protocol

Finally, Group Replication is a shared-nothing replication scheme in which each server has its own complete copy of the data.

The figure above depicts the MySQL Group Replication protocol; comparing it with MySQL replication (or even MySQL semi-synchronous replication) shows some differences. Note that the figure omits some of the underlying consensus and Paxos-related messages.
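The conflict detection rule described in this section can be sketched as a small certification function. This is a minimal model, not the plugin's implementation: the integer `snapshot` (how many committed transactions the originating server had applied when it executed the transaction), the row identifiers, and the function name `certify` are all assumptions made for illustration.

```python
def certify(ordered_txns):
    """Certify transactions in their agreed global order.

    Each transaction is (name, snapshot, write_set). Returns a dict that
    maps each transaction name to 'commit' or 'rollback'.
    """
    last_writer = {}  # row id -> sequence number of its last committed writer
    committed = 0     # sequence number of the most recent committed transaction
    outcome = {}
    for name, snapshot, write_set in ordered_txns:
        # Conflict: some row was changed by a transaction this one never saw.
        conflict = any(last_writer.get(row, 0) > snapshot for row in write_set)
        if conflict:
            outcome[name] = "rollback"
            continue
        committed += 1
        for row in write_set:
            last_writer[row] = committed
        outcome[name] = "commit"
    return outcome


result = certify([
    ("T1", 0, {"t1:pk=1"}),  # ordered first, commits
    ("T2", 0, {"t1:pk=1"}),  # concurrent with T1 on the same row, loses
    ("T3", 1, {"t1:pk=1"}),  # executed after seeing T1, commits
    ("T4", 0, {"t1:pk=2"}),  # touches a different row, commits
])
```

Because every server runs the same deterministic check over the same ordered input, all servers reach the same commit or rollback decision without further communication.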

18.1.2 Group replication use cases

Group Replication enables you to create fault-tolerant systems with redundancy by replicating the system state across a set of servers. Consequently, even if some servers fail, the system remains available as long as not all or a majority of them have failed, although performance and scalability may be degraded. Server failures are isolated and independent. They are tracked by the group membership service, which relies on a distributed failure detector that can signal when a server leaves the group, either voluntarily or because of an unexpected halt. A distributed recovery procedure ensures that servers joining the group are brought up to date automatically. Multi-master updates ensure that updates are not blocked even when a single server fails, so there is no need for server failover. Therefore, MySQL Group Replication guarantees that the database service is continuously available.

It is important to understand that although the database service remains available, when a server crashes the clients connected to it must be redirected, or failed over, to a different server. This is not something Group Replication attempts to solve. A connector, load balancer, router, or other form of middleware is better suited to deal with this problem.
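A minimal sketch of what such client-side failover might look like is shown below. It is deliberately independent of any real connector: the `connect` callable is injected by the caller, and the host names and the `fake_connect` stand-in are hypothetical.

```python
def connect_with_failover(hosts, connect):
    """Try each host in order; return (host, connection) for the first success."""
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except ConnectionError as exc:
            errors[host] = str(exc)
    raise ConnectionError(f"no reachable server: {errors}")


def fake_connect(host):
    """Stand-in for a real connector call; the first server has crashed."""
    if host == "db1.example":
        raise ConnectionError("server crashed")
    return f"connection to {host}"


group_members = ["db1.example", "db2.example", "db3.example"]
host, conn = connect_with_failover(group_members, fake_connect)
```

In production this logic usually lives in a router or load balancer rather than in application code, so that the list of group members is maintained in one place.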

In short, MySQL Group Replication provides a highly available, highly elastic, and dependable MySQL service.

18.1.2.1 Sample use case scenarios

The following examples are typical use cases for Group Replication.

Elastic replication: environments that require a very fluid replication infrastructure, in which the number of servers must grow or shrink dynamically with as few side effects as possible. Cloud database services are an example.

Highly available sharding: sharding is a popular approach to achieving write scale-out. Use MySQL Group Replication to implement highly available shards, where each shard maps to a replication group.

Alternative to master-slave replication: in some situations, a single master server becomes a single point of contention. Writing to an entire group may prove more scalable.

Automated systems: you can also deploy MySQL Group Replication purely for the automation built into the replication protocol (described in this and previous chapters).

18.1.3 Group replication details

This section provides more detail about the services that Group Replication builds on.

18.1.3.1 Fault detection

Group Replication provides a failure detection mechanism that finds and reports which servers are silent and are therefore assumed to be dead. At a high level, the failure detector is a distributed service that provides information about which servers may be dead (suspicions). If the group then agrees that a suspicion is probably true, it decides that the given server has indeed failed. This means that the remaining members of the group take a coordinated decision to exclude that member.

A suspicion is triggered when a server goes silent: when server A receives no messages from server B for a given period, a timeout occurs and a suspicion is raised.
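The timeout rule can be sketched as follows, from the point of view of a single server. The `FailureDetector` class, its method names, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are assumptions for illustration and do not correspond to the plugin's API.

```python
class FailureDetector:
    """Timeout-based suspicion as seen from one server (hypothetical API)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}  # peer name -> time of the last message received

    def heard_from(self, peer, now):
        """Record that a message from the peer arrived at time `now`."""
        self.last_heard[peer] = now

    def suspects(self, now):
        """Return the peers that have been silent for longer than the timeout."""
        return sorted(
            peer for peer, seen in self.last_heard.items()
            if now - seen > self.timeout
        )


# Server A's view: B goes silent after time 0, C keeps sending messages.
detector = FailureDetector(timeout=5)
detector.heard_from("B", now=0)
detector.heard_from("C", now=0)
detector.heard_from("C", now=4)
suspected = detector.suspects(now=6)
```

A suspicion on its own changes nothing; it only becomes an expulsion once the rest of the group agrees with it.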

If a server becomes isolated from the rest of the group, it suspects that all the other servers have failed. Because it cannot reach agreement with the group (it cannot secure a quorum), its suspicion has no consequences. A server isolated from the group in this way cannot execute any local transactions.

18.1.3.2 Group membership

MySQL Group Replication relies on a group membership service, which is built into the plugin. It defines which servers are online and participating in the group. The list of online servers is often referred to as a view. Every server in the group therefore has a consistent view of which members are actively participating in the group at a given moment.

Servers in a group have to agree not only on transaction commits but also on the current view. If the servers agree that a new server becomes part of the group, the group itself is reconfigured to integrate that server, which triggers a view change. Conversely, if a server leaves the group, voluntarily or not, the group dynamically rearranges its configuration and a view change is triggered.

Note that when a member leaves voluntarily, it first initiates a dynamic reconfiguration of the group. This triggers a procedure in which all members have to agree on a new view that excludes the departing server. However, if a member leaves involuntarily (for example, it stops unexpectedly or its network connection goes down), the failure detection mechanism proposes a reconfiguration of the group to remove the failed member. As mentioned above, this requires agreement from a majority of the servers in the group. If the group cannot reach agreement (for example, because a majority of servers are not online), the system cannot change the configuration dynamically and blocks to prevent a split-brain situation. Ultimately, this means that an administrator needs to step in and fix the problem.
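The majority requirement for a view change can be illustrated with a short sketch. The function `try_reconfigure` and its arguments are hypothetical names chosen for this example; the real reconfiguration protocol involves message exchange that is not modelled here.

```python
def try_reconfigure(view, reachable, leaving):
    """Return the new view, or None if the group has to block.

    view:      set of members in the current view
    reachable: members that can still communicate and agree
    leaving:   members to be removed from the view
    """
    majority = len(view) // 2 + 1
    if len(view & reachable) < majority:
        return None  # no quorum: the configuration cannot change
    return view - leaving


view = {"s1", "s2", "s3"}

# s3 fails: s1 and s2 are a majority of three, so the view shrinks.
view_after_first = try_reconfigure(view, reachable={"s1", "s2"}, leaving={"s3"})

# s2 then fails too: s1 alone is not a majority of two, so the group blocks.
view_after_second = try_reconfigure(
    view_after_first, reachable={"s1"}, leaving={"s2"}
)
```

The second call returns `None`, which corresponds to the blocked state in which an administrator has to intervene.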

18.1.3.3 Fault tolerance

MySQL Group Replication builds on an implementation of the Paxos distributed algorithm to provide distributed coordination between servers. As such, it requires a majority of servers to be active in order to reach quorum and make a decision. This directly determines the number of failures the system can tolerate without compromising itself and its overall functionality. The number of servers n needed to tolerate f failures is n = 2 × f + 1.

In practice, this means that to tolerate one failure the group must have three servers. If one server fails, the two remaining servers still form a majority (two out of three) and the system continues to make decisions automatically and to progress. However, if a second server fails involuntarily, the group (with one server left) blocks, because there is no majority to reach a decision.

The following is a small table illustrating the above formula.

Group size    Majority    Instant failures tolerated
1             1           0
2             2           0
3             2           1
4             3           1
5             3           2
6             4           2
7             4           3
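The relationship n = 2 × f + 1 can be checked with a few lines of arithmetic. The helper names below are chosen for this example only.

```python
def majority(n):
    """Smallest number of servers that forms a majority of a group of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """Largest f such that n >= 2 * f + 1."""
    return (n - 1) // 2


def min_group_size(f):
    """Servers needed to tolerate f failures."""
    return 2 * f + 1


# (group size, majority, failures tolerated) for groups of 1 to 7 servers.
rows = [(n, majority(n), tolerated_failures(n)) for n in range(1, 8)]
```

Note that an even-sized group tolerates no more failures than the next smaller odd-sized group: four servers tolerate one failure, exactly like three.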
