
An introduction to implementing load balancing in MySQL

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--


This article introduces how to implement load balancing for MySQL, in the hope that it will help in practical work. Load balancing touches on many topics but involves little theory, and plenty of material is already available in books and online. Here we summarize it using experience accumulated in the industry.

The basic idea of load balancing is simple: spread the load across a cluster of servers as evenly as possible. Following this idea, the common practice is to place a load balancer in front of the servers. The load balancer's job is to route each incoming connection to the least busy available server.

Figure 1 shows the load-balancing setup of a large website, which uses two load balancers: one handles HTTP traffic and the other handles MySQL access.

Load balancing has five common purposes:

Scalability. Load balancing helps with some scaling strategies, such as reading from the standby databases when reads and writes are separated.

Efficiency. Load balancing helps use resources more efficiently, because it lets you control where requests are routed.

Availability. A flexible load-balancing scheme can greatly improve service availability.

Transparency. The client does not need to know whether a load balancer exists or how many machines sit behind it. What the client sees looks like a single server.

Consistency. If the application is stateful (database transactions, website sessions, and so on), the load balancer can direct related queries to the same server to prevent state from being lost.

Load balancing is generally implemented in one of two ways: direct connection, or introducing middleware.


1 direct connection

Some people assume that load balancing always means placing something between the application and the MySQL servers, but that is not the only way to do it. The application can also balance load while connecting directly. Below we discuss the common direct-connection methods and their caveats.

1.1 read-write separation of replication

The biggest problem with this approach is dirty data, meaning stale results read from a standby that has not yet caught up with the master. A typical example is a user who comments on a blog post, reloads the page, and does not see the new comment.

Of course, we should not abandon read-write separation just because of dirty data. Many applications tolerate dirty data quite well, and for them this approach can be adopted with confidence.

How, then, should reads and writes be separated for applications with a low tolerance for dirty data? Below we break read-write separation down into several strategies; one of them will likely suit your application.

1) based on query separation

If only a small part of the application's data cannot tolerate dirty data, we can send all writes, plus the reads that cannot tolerate dirty data, to the master, and send all other reads to the standby. This strategy is easy to implement, but if few queries can tolerate dirty data, the standby is likely to be underused.
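The routing rule described above can be sketched as follows. This is a minimal illustration that assumes the application tags each query with whether it tolerates dirty data; the names `Query`, `choose_server`, `MASTER` and `STANDBY` are hypothetical and not part of any MySQL API.

```python
from dataclasses import dataclass

MASTER = "master"    # hypothetical label for the master connection
STANDBY = "standby"  # hypothetical label for a standby connection


@dataclass
class Query:
    sql: str
    tolerates_dirty_data: bool = True  # decided by the application, per query


def is_write(sql: str) -> bool:
    """Crude check: anything that is not a plain SELECT is treated as a write."""
    return not sql.lstrip().lower().startswith("select")


def choose_server(query: Query) -> str:
    """Writes and reads that cannot tolerate dirty data go to the master;
    all other reads go to a standby."""
    if is_write(query.sql) or not query.tolerates_dirty_data:
        return MASTER
    return STANDBY
```

The `is_write` check is deliberately simple; a real application would more likely mark each query explicitly than inspect the SQL text.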

2) based on dirty data separation

This is a small improvement on the query-based strategy. It requires some extra work, such as having the application check the replication delay to decide whether the standby's data is fresh enough. Many reporting applications can use this strategy: as long as the data loaded overnight has been replicated to the standby, they do not care whether the standby has fully caught up with the master.
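A minimal sketch of the delay check, assuming the application already has some way to measure replication delay in seconds. The function name `choose_by_delay` and the tolerance parameter are hypothetical.

```python
from typing import Optional


def choose_by_delay(delay_seconds: Optional[float], max_delay_seconds: float) -> str:
    """Read from the standby only if its measured delay is within tolerance.

    A delay of None means the delay could not be measured (for example,
    replication is stopped), so the read falls back to the master.
    """
    if delay_seconds is None or delay_seconds > max_delay_seconds:
        return "master"
    return "standby"
```

Each class of query can pass its own tolerance: a nightly report might accept hours of delay, while a page showing recent comments might accept only a few seconds.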

3) based on session separation

This strategy goes a step further than the dirty-data-based one. It decides based on whether the user has modified data: users do not need to see the latest data from other users, only their own updates.

Specifically, you can set a flag at the session layer to indicate whether the user has made an update. Once the user has made an update, that user's queries are directed to the master for a period of time.

This strategy strikes a good compromise between simplicity and effectiveness, and is the one we recommend.

Of course, if you want to go further, you can combine session-based separation with replication delay monitoring. If the user updated data 10 seconds ago and every standby is less than 5 seconds behind, it is safe to read from a standby. Remember to use the same standby for the entire session; otherwise, because different standbys lag by different amounts, the user may see inconsistent results.
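The session-based strategy can be sketched roughly as below. This is an illustration under stated assumptions, not a definitive implementation: the class `SessionRouter`, its method names, and the fixed `pin_seconds` window are all hypothetical, and the window is assumed to be longer than the worst replication delay observed on the standbys.

```python
import time
import zlib
from typing import Dict, List, Optional


class SessionRouter:
    """Send a session's reads to the master for a while after it writes."""

    def __init__(self, standbys: List[str], pin_seconds: float = 10.0):
        self.standbys = standbys
        self.pin_seconds = pin_seconds            # how long to stay on the master
        self.last_write: Dict[str, float] = {}    # session id -> time of last write

    def record_write(self, session_id: str, now: Optional[float] = None) -> None:
        self.last_write[session_id] = time.time() if now is None else now

    def choose_server(self, session_id: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        written_at = self.last_write.get(session_id)
        if written_at is not None and now - written_at < self.pin_seconds:
            return "master"
        # A stable hash keeps one session on the same standby every time.
        index = zlib.crc32(session_id.encode("utf-8")) % len(self.standbys)
        return self.standbys[index]
```

Hashing the session id is one simple way to keep a session on a single standby; storing the chosen standby in the session itself would work just as well.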

4) based on global version / session separation

This strategy checks whether the standby has applied an update by recording the master's log coordinates and comparing them with the coordinates the standby has replicated. When the application performs a write, it runs SHOW MASTER STATUS after the transaction commits and stores the master's log coordinates in a cache as the version number of the modified object or session. When the application connects to a standby, it runs SHOW SLAVE STATUS and compares the standby's coordinates with the version number in the cache. If the standby is at or beyond the recorded point, it has already applied the corresponding change and can be read safely.
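The coordinate comparison can be sketched as below. The sketch assumes the saved coordinates come from the File and Position columns of SHOW MASTER STATUS, that the standby's coordinates come from the Relay_Master_Log_File and Exec_Master_Log_Pos columns of SHOW SLAVE STATUS, and that binary log file names end in a numeric sequence such as `mysql-bin.000012`. The function names are hypothetical, and running the statements and caching the result are left out.

```python
from typing import Tuple

Coordinates = Tuple[str, int]  # (binary log file name, position in that file)


def log_sequence(file_name: str) -> int:
    """Extract the numeric suffix, e.g. 'mysql-bin.000012' -> 12."""
    return int(file_name.rsplit(".", 1)[1])


def standby_has_caught_up(saved: Coordinates, executed: Coordinates) -> bool:
    """True if the standby has executed up to or past the saved master coordinates."""
    saved_key = (log_sequence(saved[0]), saved[1])
    executed_key = (log_sequence(executed[0]), executed[1])
    return executed_key >= saved_key
```

Comparing the parsed sequence number rather than the raw file name avoids relying on string ordering of the file names.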

In fact, many read-write separation strategies need to monitor replication delay to decide where to send read queries. Note, however, that the Seconds_Behind_Master column returned by SHOW SLAVE STATUS does not accurately represent the delay. The pt-heartbeat tool in Percona Toolkit monitors delay more reliably.
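The general idea behind heartbeat-based monitoring is that the master writes a timestamp at regular intervals and the delay is the difference between the current time and the latest timestamp that has reached the standby. The sketch below shows only that arithmetic; it is not pt-heartbeat's actual implementation or table layout, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def heartbeat_delay_seconds(heartbeat_ts: datetime, now: Optional[datetime] = None) -> float:
    """Delay = current time minus the last heartbeat timestamp seen on the standby.

    heartbeat_ts must be timezone-aware. Small negative values caused by
    clock skew between machines are clamped to zero.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return max(0.0, (now - heartbeat_ts).total_seconds())
```

Because the result depends on two clocks, it is only as accurate as the time synchronization between the master and the machine doing the check.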

1.2 modify DNS name

For relatively simple applications, you can create DNS names for different purposes. The simplest approach is to give the read-only servers one DNS name (read.mysql-db.com) and the server that handles writes another (write.mysql-db.com). If the standby keeps up with the master, point the read-only DNS name at the standby; otherwise, point it at the master.

This strategy is very easy to implement, but it has a big problem: you do not have complete control over DNS.

DNS changes are neither immediate nor atomic. It takes a long time for a change to propagate across a network or between networks. DNS data is cached in many places, and its expiration time is advisory rather than mandatory. An application or server may need to be restarted before a DNS change takes full effect.

This strategy is dangerous. Even if you work around the lack of control over DNS by modifying the /etc/hosts file instead, it is still far from ideal.

1.3 transfer IP addr

Load balancing can also be achieved by moving virtual IP addresses between servers. This may sound like modifying DNS, but it is a completely different thing. Moving IP addresses lets DNS names stay the same, and we can use the ARP command to notify the local network of an IP address change quickly and atomically.

A convenient technique is to assign each physical server a fixed IP address that never changes. You can then use a virtual IP address for each logical "service" (which can be understood as a container).

In this way, IP addresses can be moved between servers easily without reconfiguring the application, and the approach is simple to implement.

2 introduction of middleware

The strategies above assume that the application connects directly to the MySQL servers, but many load-balancing setups introduce middleware that acts as a proxy for network traffic. It accepts all traffic on one side, distributes the requests to the designated servers on the other, and sends the results back to the requesting machine. Figure 2 shows this architecture.

2.1 load balancer

There is a lot of load-balancing hardware and software, but little of it is designed specifically for MySQL servers. Web servers need load balancing more often, so many general-purpose load balancers have rich HTTP support and only a few basic features for other uses.

MySQL connections are just normal TCP/IP connections, so a general-purpose load balancer can be used with MySQL. However, because it lacks MySQL-specific features, there are more restrictions:

It distributes requests, which is not necessarily the same as balancing load.

It has little support for MySQL sessions, and may not know how to "pin" all connection requests sent from a single HTTP session to one MySQL server.

Connection pooling and persistent connections may prevent the load balancer from distributing connection requests.

It cannot perform proper health and load checks on a MySQL server.

2.2 load balancing algorithm

There are many algorithms to determine which server accepts the next connection. Each vendor has its own algorithm, and there are the following common methods:

Random. Pick a server at random from the pool of available servers to handle the request.

Round-robin. Send requests to the servers in a repeating sequence, for example: A, B, C, A, B, C.

Hashing. Hash the connection's source IP address so that it always maps to the same server in the pool.

Fastest response. Assign the connection to the server that can handle the request fastest.

Fewest connections. Assign the connection to the server with the fewest active connections.

Weighted. Give machines different weights according to their performance and other conditions, so that more powerful machines handle more connections.

None of these methods is best in every case; which one is most suitable depends on the specific workload.
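Most of the selection methods listed above fit in a few lines each. The sketch below is illustrative only: the function names are hypothetical, the server pool is a plain list of names, and the fastest-response method is left out because it needs live timing data.

```python
import itertools
import random
import zlib
from typing import Dict, Iterator, List


def pick_random(servers: List[str]) -> str:
    """Random: any available server."""
    return random.choice(servers)


def round_robin(servers: List[str]) -> Iterator[str]:
    """Round-robin: an endless A, B, C, A, B, C ... sequence."""
    return itertools.cycle(servers)


def pick_by_hash(servers: List[str], source_ip: str) -> str:
    """Hashing: the same source IP always maps to the same server."""
    return servers[zlib.crc32(source_ip.encode("ascii")) % len(servers)]


def pick_fewest_connections(active: Dict[str, int]) -> str:
    """Fewest connections: the server with the smallest active connection count."""
    return min(active, key=lambda name: active[name])


def pick_weighted(weights: Dict[str, int]) -> str:
    """Weighted: servers are chosen in proportion to their configured weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

Note that the hashing method only stays stable while the pool is unchanged: adding or removing a server changes the modulus and remaps most clients.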

In addition, the methods above all dispatch requests immediately. Sometimes a queuing algorithm is more efficient. For example, an algorithm might maintain a given level of concurrency on the database servers, allowing no more than N active transactions at a time. If there are too many active transactions, it puts new requests in a queue and serves them from the first server that becomes available.
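The queuing idea can be sketched as a small single-threaded model. It assumes one pool with a limit of N active transactions; the class `TransactionQueue` and its methods are hypothetical, and a real balancer would also need locking and timeouts.

```python
from collections import deque
from typing import Deque, List, Optional


class TransactionQueue:
    """Allow at most max_active transactions at once; queue the rest in FIFO order."""

    def __init__(self, max_active: int):
        self.max_active = max_active
        self.active: List[str] = []
        self.waiting: Deque[str] = deque()

    def submit(self, request: str) -> bool:
        """Start the request if a slot is free, otherwise queue it.

        Returns True if the request started immediately.
        """
        if len(self.active) < self.max_active:
            self.active.append(request)
            return True
        self.waiting.append(request)
        return False

    def finish(self, request: str) -> Optional[str]:
        """Mark a request as done and start the oldest waiting request, if any.

        Returns the request that was started, or None if the queue was empty.
        """
        self.active.remove(request)
        if self.waiting:
            started = self.waiting.popleft()
            self.active.append(started)
            return started
        return None
```

For example, with `max_active=2`, a third request waits in the queue until one of the first two finishes.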

2.3 load balancing between one master and multiple backups

The most common replication topology is one master with multiple standbys. This architecture does not scale well, but combining load balancing with the following methods gives better results.

Functional partitioning. Configure one standby, or a group of standbys, for a specific function such as reporting, analysis, data warehousing, or full-text indexing, so that the capacity of that function can be expanded on its own.

Making sure the standby keeps up with the master. The main problem with standbys is dirty data. To deal with it, we can call the function MASTER_POS_WAIT() on the standby, which blocks until the standby has caught up with a given synchronization point on the master. We can also use a replication heartbeat to check for delay.
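A rough sketch of how an application might use MASTER_POS_WAIT() before reading from a standby. It only builds the statement and interprets the result; executing it is left to whatever database driver is in use. The helper names are hypothetical, and the `%s` placeholder style is an assumption about the driver. According to the MySQL documentation, the function returns the number of log events it waited for, -1 if the timeout was exceeded, and NULL if replication is not running or an error occurred.

```python
from typing import Optional, Tuple


def master_pos_wait_query(log_file: str, log_pos: int,
                          timeout_seconds: int) -> Tuple[str, Tuple[str, int, int]]:
    """Return the statement and its parameters, to be executed on the standby."""
    sql = "SELECT MASTER_POS_WAIT(%s, %s, %s)"  # placeholder style is driver-specific
    return sql, (log_file, log_pos, timeout_seconds)


def standby_is_ready(result: Optional[int]) -> bool:
    """Interpret the value returned by MASTER_POS_WAIT().

    result >= 0   : the standby reached the requested position
    result == -1  : the timeout expired first
    result is None: NULL, e.g. replication is not running
    """
    return result is not None and result >= 0
```

If `standby_is_ready` returns False, the simplest fallback is to send the read to the master instead.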

We cannot, and should not try to, build an Alibaba-style architecture at the very start of an application. The best approach is to implement what the application clearly needs right now while planning ahead for possible rapid growth.

It also makes sense to set a numeric goal for scalability, just as we set precise performance goals such as 10K or 100K concurrency. With such a goal, scalability theory can help us avoid introducing overheads such as serialization or interaction into the application.

As for MySQL scaling strategy, when a typical application grows very large, it usually moves from a single server to a scale-out architecture with standbys, and then to data sharding or functional partitioning. Note that we do not endorse advice such as "shard as early as possible, as much as possible". Sharding is complex and expensive, and, most importantly, many applications may never need it. Before spending heavily on sharding, look at what new hardware and new MySQL versions offer; the improvements may surprise you.

Summary: with direct connection, the emphasis is on "separation"; load balancers and load-balancing algorithms each have their limitations.

Set quantitative targets for scalability.

Finally, I hope this article will be helpful to you.
