Introduction to several common high availability schemes for MySQL databases

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces several common high availability schemes for MySQL databases and compares their advantages and disadvantages.

I. Overview

When we consider a high availability architecture for a MySQL database, we mainly consider the following aspects:

If the database fails, for example through downtime or an unexpected interruption, it should be restored as quickly as possible, so that downtime is minimized and the business is not interrupted by the database failure.

The data on non-primary nodes, such as backups and read-only replicas, should be consistent with the data on the primary node, either in real time or eventually.

When the business switches from one database to another, the data before and after the switch should be consistent, so that the business is not affected by missing or inconsistent data.

We will not discuss the classification of high availability in detail here. We only discuss the advantages and disadvantages of commonly used high availability schemes and how to choose among them.

II. High availability options

1. Master-slave or master-master semi-synchronous replication

Use two database nodes to build one-way or two-way semi-synchronous replication. In later versions (MySQL 5.7 and newer), new features such as lossless replication and logical multithreaded replication make MySQL's native semi-synchronous replication more reliable.

Common architectures are as follows:

This setup is usually combined with third-party software such as a proxy and keepalived, which monitor the health of the database and execute management commands. If the primary database fails, the application can continue to use the database after switching to the standby.
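The monitoring logic described above can be sketched as follows. This is only an illustration and not how keepalived or any particular proxy is implemented. The function name `decide_failover` and the threshold of three consecutive failed checks are assumptions chosen for the example.

```python
from typing import Iterable


def decide_failover(checks: Iterable[bool], max_failures: int = 3) -> int:
    """Return the index of the check at which failover triggers, or -1.

    `checks` is a sequence of health-check results for the primary
    (True = healthy). Failover triggers only after `max_failures`
    consecutive failed checks, so a single dropped probe is ignored.
    """
    consecutive = 0
    for i, healthy in enumerate(checks):
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= max_failures:
            return i
    return -1


# One transient failure does not trigger a switch; three in a row do.
print(decide_failover([True, False, True, True]))          # -1
print(decide_failover([True, False, False, False, True]))  # 3
```

Requiring several consecutive failures trades a slightly slower switch for fewer unnecessary switches during short network glitches.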

Advantages:

The architecture is relatively simple and uses native semi-synchronous replication as the basis for data synchronization.

With only two nodes, there is no master election problem after the primary goes down. You can switch directly.

Two nodes mean low resource requirements and simple deployment.

Disadvantages:

It depends entirely on semi-synchronous replication. If semi-synchronous replication degrades to asynchronous replication, data consistency can no longer be guaranteed.

The high availability of haproxy and keepalived themselves needs additional consideration.
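Because the whole scheme depends on semi-synchronous replication staying active, it can help to watch for the moment it degrades. The sketch below assumes the output of `SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync%'` on the primary has already been fetched into a name-to-value mapping. The helper name `semi_sync_is_active` is hypothetical, and the code does not connect to a database. The status variable is named `Rpl_semi_sync_master_status` in older releases and `Rpl_semi_sync_source_status` in newer ones, so both are checked.

```python
from typing import Mapping


def semi_sync_is_active(status: Mapping[str, str]) -> bool:
    """Return True if the primary reports semi-synchronous replication as ON.

    `status` is assumed to hold rows from
    SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync%' as a name -> value mapping.
    """
    for name in ("Rpl_semi_sync_master_status", "Rpl_semi_sync_source_status"):
        if name in status:
            return status[name].strip().upper() == "ON"
    return False  # plugin not installed or variable missing


print(semi_sync_is_active({"Rpl_semi_sync_master_status": "ON"}))   # True
print(semi_sync_is_active({"Rpl_semi_sync_master_status": "OFF"}))  # False
```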

2. Semi-synchronous replication optimization

The semi-synchronous replication mechanism itself is reliable: as long as semi-synchronous replication stays in effect, the data can be considered consistent. However, network fluctuations and other external factors can cause semi-synchronous replication to time out and fall back to asynchronous replication, at which point data consistency can no longer be guaranteed. Data consistency can therefore be improved by keeping replication semi-synchronous as much of the time as possible.
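The timeout fallback can be illustrated with a deliberately simplified model. The function `replay_commits` and its rules are assumptions made for this example, not MySQL's actual implementation: a commit whose acknowledgement arrives within the timeout counts as semi-synchronous, a late or missing acknowledgement switches the master to asynchronous mode, and the master returns to semi-synchronous mode for the commit after the slave next responds in time.

```python
from typing import List, Optional


def replay_commits(ack_delays_ms: List[Optional[float]],
                   timeout_ms: float = 10_000) -> List[str]:
    """Label each commit 'semi-sync' or 'async' under the simplified model.

    ack_delays_ms[i] is how long the slave took to acknowledge commit i;
    None means no acknowledgement arrived at all.
    """
    modes = []
    degraded = False
    for delay in ack_delays_ms:
        in_time = delay is not None and delay <= timeout_ms
        if degraded:
            modes.append("async")   # master is not waiting for acks
            degraded = not in_time  # slave caught up: re-enable next time
        elif in_time:
            modes.append("semi-sync")
        else:
            modes.append("async")   # timed out while waiting for the ack
            degraded = True
    return modes


print(replay_commits([5, 250, None, 30, 8], timeout_ms=100))
# ['semi-sync', 'async', 'async', 'async', 'semi-sync']
```

Every commit labeled `async` is one that could be lost if the master failed at that moment, which is why the optimizations in this section try to shorten the degraded window.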

This scheme also uses a two-node architecture, but adds functional optimizations on top of native semi-synchronous replication to make the mechanism more reliable.

Two optimization schemes that can be used as references are as follows:

(1) dual-channel replication

When semi-synchronous replication breaks because of a timeout and replication is re-established, two channels are set up at the same time. The semi-synchronous channel starts replicating from the master's current position, so the slave always knows how far the master has progressed. The other channel is asynchronous and catches up on the data that the slave is missing. When the asynchronous channel reaches the position where the semi-synchronous channel started, normal semi-synchronous replication is resumed.
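The split of work between the two channels can be sketched as below. This assumes, purely for illustration, that binlog positions are simple increasing event numbers. The names `ChannelPlan`, `plan_channels` and `ticks_to_catch_up` are hypothetical and do not correspond to any MySQL API.

```python
from dataclasses import dataclass


@dataclass
class ChannelPlan:
    live_start: int     # first event sent on the semi-synchronous channel
    backlog_first: int  # first event the asynchronous channel must replay
    backlog_last: int   # last event the asynchronous channel must replay


def plan_channels(slave_pos: int, master_pos: int) -> ChannelPlan:
    """Decide what each channel carries when replication is re-established."""
    if slave_pos > master_pos:
        raise ValueError("slave cannot be ahead of master")
    return ChannelPlan(live_start=master_pos + 1,
                       backlog_first=slave_pos + 1,
                       backlog_last=master_pos)


def ticks_to_catch_up(plan: ChannelPlan, events_per_tick: int) -> int:
    """Ticks until the asynchronous channel reaches the live channel's start."""
    backlog = plan.backlog_last - plan.backlog_first + 1
    return -(-backlog // events_per_tick)  # ceiling division


plan = plan_channels(slave_pos=100, master_pos=160)
print(plan.live_start, plan.backlog_first, plan.backlog_last)  # 161 101 160
print(ticks_to_catch_up(plan, events_per_tick=25))             # 3
```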

(2) binlog file server

Build two semi-synchronous replication channels, one to the slave and one to a binlog file server. The channel to the file server is normally disabled. When master-slave semi-synchronous replication degrades because of network problems, the semi-synchronous channel to the file server is started. When master-slave semi-synchronous replication is restored, the channel to the file server is closed again.

Advantages:

Two nodes mean low resource requirements and simple deployment.

The architecture is simple, there is no master election problem, and you can switch directly.

Compared with native replication, the optimized semi-synchronous replication ensures data consistency better.

Disadvantages:

It requires modifying the kernel source code or working with the MySQL communication protocol, which demands some understanding of the source code and the ability to do a certain amount of secondary development.

It still relies on semi-synchronous replication, so it does not fundamentally solve the data consistency problem.

3. High availability architecture optimization

Extend the two-node database to a multi-node database or a multi-node database cluster. Depending on your needs, you can choose one master with two slaves, one master with multiple slaves, or multiple masters with multiple slaves.

With semi-synchronous replication, a commit is considered successfully replicated as soon as an acknowledgement is received from one slave, so semi-synchronous replication with multiple slaves is more reliable than with a single slave. The probability that several nodes are down at the same time is also lower than the probability that a single node is down, so to some extent a multi-node architecture offers better high availability than a two-node architecture.
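The reliability argument can be made concrete with a small calculation. It assumes that slaves become unavailable independently of each other and that one acknowledgement is enough. Both are simplifications, and the function name `p_no_ack` is hypothetical.

```python
def p_no_ack(p_slave_unavailable: float, n_slaves: int) -> float:
    """Probability that no slave at all can acknowledge a commit.

    Assumes each slave is unavailable independently with the same
    probability, and that a single acknowledgement is sufficient.
    """
    if not 0.0 <= p_slave_unavailable <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if n_slaves < 1:
        raise ValueError("need at least one slave")
    return p_slave_unavailable ** n_slaves


# With a 1% chance that any given slave is unavailable:
for n in (1, 2, 3):
    print(n, "slave(s):", p_no_ack(0.01, n))
```

Under these assumptions each extra slave multiplies the chance of degradation by the per-slave failure probability. Correlated failures, such as a shared network link, make the real improvement smaller.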

However, because there are more database nodes, database management software is needed to keep the cluster maintainable. You can choose MMM, MHA, or one of the various proxies.

Common scenarios are as follows:

(1) MHA + multi-node cluster

MHA Manager periodically checks the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The whole failover process is transparent to the application.

MHA Node runs on every MySQL server. Its main job is to process binary logs during the switchover so that as little data as possible is lost.
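The promotion step described above can be sketched as follows. This is not MHA's code. It assumes each slave's progress can be summarized as one increasing number, and the names `plan_failover` and `slave1` to `slave3` are hypothetical.

```python
from typing import Dict, Tuple


def plan_failover(slave_positions: Dict[str, int]) -> Tuple[str, Dict[str, int]]:
    """Pick the most advanced slave and report how far the others lag.

    Returns the name of the new master and, for every other slave, the
    number of events it must receive before it matches the new master.
    """
    if not slave_positions:
        raise ValueError("no slaves available for promotion")
    new_master = max(slave_positions, key=slave_positions.get)
    target = slave_positions[new_master]
    missing = {name: target - pos
               for name, pos in slave_positions.items()
               if name != new_master}
    return new_master, missing


new_master, missing = plan_failover({"slave1": 1200, "slave2": 1350, "slave3": 1290})
print(new_master)  # slave2
print(missing)     # {'slave1': 150, 'slave3': 60}
```

Promoting the most advanced slave minimizes what has to be recovered from binary logs, and the `missing` counts are the gaps the remaining slaves need filled before they can follow the new master.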

MHA can also be extended to the following multi-node clusters:

Advantages:

Automatic fault detection and failover.

Good scalability: the number and structure of MySQL nodes can be expanded as needed.

Compared with two-node MySQL replication, a three-node or multi-node MySQL deployment is less likely to become unavailable.

Disadvantages:

At least three nodes are needed, which requires more resources than two nodes.

The logic is more complex, so troubleshooting and locating problems after a failure is more difficult.

Data consistency is still guaranteed only by native semi-synchronous replication, so there is still a risk of data inconsistency.

Split-brain may occur when the network is partitioned.

(2) zookeeper+proxy

ZooKeeper uses a distributed algorithm to keep cluster data consistent. Using ZooKeeper can effectively ensure the high availability of the proxy and avoid the split-brain that a network partition can cause.
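The reason a majority-based coordinator avoids split-brain can be shown with a short calculation. The function names `majority` and `has_quorum` are hypothetical and the code is not part of any ZooKeeper API. It only encodes the rule that a side of a partition may act when it can reach a strict majority of the cluster.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if a group of `reachable` nodes (counting itself) may act."""
    return reachable >= majority(cluster_size)


# Five nodes split 3/2: only the larger side may continue.
print(has_quorum(3, 5), has_quorum(2, 5))  # True False
# Two nodes split 1/1: neither side has a majority.
print(has_quorum(1, 2), has_quorum(1, 2))  # False False
```

Since two disjoint groups can never both hold a strict majority, at most one side of a partition keeps making decisions. The two-node case also shows why quorum-based schemes are normally deployed with three or more nodes.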

Advantages:

It ensures the high availability of the whole system, including proxy and MySQL.

It has good scalability and can be extended to a large-scale cluster.

Disadvantages:

Data consistency still depends on native MySQL semi-synchronous replication.

Introducing ZooKeeper makes the logic of the whole system more complex.

4. Shared storage

Shared storage decouples the database server from the storage device. Data synchronization between database nodes no longer depends on MySQL's native replication. Instead, data consistency is ensured by synchronizing data at the disk level.

(1) SAN shared storage

A SAN (storage area network) provides a direct, high-speed network connection (compared with a LAN) between storage devices and servers in order to centralize data storage.

Common architectures are as follows:

With shared storage, the MySQL server mounts the file system and operates normally. If the master goes down, the slave mounts the same file system, which ensures that the master and the slave use the same data.

Advantages:

Only two nodes are needed, and both deployment and switching logic are simple.

Strong data consistency is well guaranteed.

MySQL logic errors cannot cause data inconsistency.

Disadvantages:

The high availability of the shared storage itself must be considered.

High price.

(2) DRBD disk replication

DRBD is a software-based, network-based block replication storage solution, mainly used to mirror the data of disks, partitions and logical volumes between servers. When data is written to the local disk, it is also sent to the disk of another host on the network, so that the data on the local host (primary node) and the remote host (standby node) stays synchronized in real time.

Common architectures are as follows:

When the local host has a problem, the remote host still holds an identical copy of the data and can continue to serve it, which keeps the data safe.

DRBD is a block-level synchronous replication technology implemented as a Linux kernel module. It can achieve the same shared-storage effect as a SAN.
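The mirroring behaviour described above can be modelled in a few lines. This toy class is an assumption made for illustration: `MirroredDisk` is a hypothetical name, the "network" is just a second in-memory list, and only the fully synchronous case is modelled, where a write completes after both copies hold the data.

```python
from typing import List


class MirroredDisk:
    """Toy model of a block device mirrored synchronously to a standby host."""

    def __init__(self, n_blocks: int) -> None:
        self.primary: List[bytes] = [b""] * n_blocks
        self.standby: List[bytes] = [b""] * n_blocks

    def write(self, block_no: int, data: bytes) -> None:
        """Complete the write only after both copies hold the data."""
        self.primary[block_no] = data
        self.standby[block_no] = data  # in reality: sent over the network

    def fail_over(self) -> List[bytes]:
        """The primary is lost; the standby's copy is what MySQL now uses."""
        return list(self.standby)


disk = MirroredDisk(4)
disk.write(0, b"page-0")
disk.write(2, b"page-2")
print(disk.fail_over() == disk.primary)  # True
```

Because every write waits for the remote copy, the standby is always identical to the primary. The same wait is also the source of the I/O overhead of this approach.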

Advantages:

Only two nodes are needed, and both deployment and switching logic are simple.

Compared with a SAN storage network, the price is low.

Strong data consistency is ensured.

Disadvantages:

It has a significant impact on I/O performance.

The standby node cannot serve read operations.

5. Distributed protocol

Distributed protocols can solve the problem of data consistency. The more common scenarios are as follows:

(1) MySQL Cluster

MySQL Cluster is the official cluster deployment scheme. It achieves high availability and data consistency by using the NDB storage engine to keep redundant copies of the data in real time.

Common architectures are as follows:

Advantages:

It uses only official components and does not rely on third-party software.

Strong data consistency can be achieved.

Disadvantages:

It is less widely used in China.

The configuration is complex, and it requires the NDB storage engine, which differs from the conventional MySQL engines.

At least three nodes are required.

(2) Galera

A MySQL high availability cluster based on Galera is a multi-master MySQL cluster solution with synchronized data. It is easy to use, has no single point of failure, and offers high availability.

Common architectures are as follows:

Advantages:

Multi-master writes and replication without delay, which can ensure strong data consistency.

It has a mature community, and Internet companies use it at large scale.

Automatic failover, and automatic addition and removal of nodes.

Disadvantages:

Native MySQL nodes need to be patched with wsrep.

Only the InnoDB storage engine is supported.

At least three nodes are required.

(3) Paxos

The Paxos algorithm solves the problem of how a distributed system can agree on a certain value (a resolution). It is widely considered to be among the most effective algorithms of its kind. Combining Paxos with MySQL can achieve strong consistency of distributed MySQL data.
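To make the idea of agreeing on a value concrete, here is a minimal in-memory sketch of single-decree Paxos. It is not taken from any MySQL product, and it leaves out networking, retries and persistence. The class `Acceptor`, the function `propose` and the example values are hypothetical names for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                          # highest ballot promised so far
    accepted: Optional[Tuple[int, str]] = None  # (ballot, value) last accepted
    up: bool = True                             # False simulates a crashed node

    def prepare(self, ballot: int):
        """Phase 1: promise to ignore lower ballots, report any accepted value."""
        if not self.up or ballot <= self.promised:
            return None
        self.promised = ballot
        return ("promise", self.accepted)

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept unless a higher ballot has been promised."""
        if not self.up or ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Run one round; return the chosen value, or None without a majority."""
    quorum = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [r for r in replies if r is not None]
    if len(promises) < quorum:
        return None
    # A value accepted earlier must be kept: adopt the highest-ballot one.
    prior = [acc for _, acc in promises if acc is not None]
    if prior:
        value = max(prior, key=lambda pair: pair[0])[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


nodes = [Acceptor(), Acceptor(), Acceptor()]
print(propose(nodes, 1, "node-a is primary"))  # node-a is primary
nodes[0].up = False                            # one node fails
print(propose(nodes, 2, "node-b is primary"))  # node-a is primary
nodes[1].up = False                            # majority lost
print(propose(nodes, 3, "node-c is primary"))  # None
```

The second call shows the key safety property: once a value has been chosen, a later proposer with a different value still ends up confirming the original one. The third call shows that nothing can be decided without a majority, which is why at least three nodes are needed to survive one failure.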

Common architectures are as follows:

Advantages:

Multi-master writes and replication without delay, which can ensure strong data consistency.

It has a mature theoretical basis.

Automatic failover, and automatic addition and removal of nodes.

Disadvantages:

Only the InnoDB storage engine is supported.

At least three nodes are required.
