2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to understand read-write separation in database clusters. Many people run into these problems in real-world cases, so let us walk through how to deal with them.
Some hard questions to start with:
What solutions exist for the database read and write bottleneck?
What problems does each solution solve?
What are the advantages and disadvantages of each?
Behind every system that withstands high concurrent traffic there must be a high-performance database cluster, just as the saying goes that behind every successful man there is a strong woman. A database cluster is distributed in its deployment, so the CAP principle applies to it just as it does to any other distributed system.
Sharding (splitting databases and tables) is such a common solution that it has almost become a favorite weapon of interviewers, yet few people pay attention to its side effects. The idea is divide and conquer: data is sliced across multiple databases and tables, so concurrent reads and writes are spread out and high traffic is absorbed by adding hardware. It does bring problems such as cross-database queries and cross-table joins, but these can generally be handled by other means.
Read-write separation is another solution to the database performance bottleneck, and it is fundamentally different from sharding. Sharding spreads the data over multiple databases and tables and reads and writes it according to the sharding rules, whereas read-write separation uses redundancy, that is, extra copies of the same data, to absorb heavy traffic.
Principle of read-write separation
The basic principle of read-write separation is to distribute reads and writes to different database nodes. Writes generally happen only on the master node, while reads that can tolerate a small delay are served by the slave nodes.
Read-write separation is implemented as follows:
Multiple database servers are grouped into a cluster and configured with master-slave relationships.
The master node handles both reads and writes; the slave nodes handle reads only.
The master node synchronizes its data to all slave nodes through the data replication mechanism.
The business side uses program code or middleware to send writes to the master node and reads to the slave nodes.
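The last step above can be sketched in a few lines. This is only a minimal illustration, not a production router: the class name `ReadWriteRouter`, the node names, and the round-robin policy for choosing a slave are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Assumption: fall back to the master for reads when no slave exists.
        self._read_cycle = itertools.cycle(slaves or [master])

    def node_for_write(self):
        # Every write must land on the master node.
        return self.master

    def node_for_read(self):
        # Round-robin over the slave nodes to spread the read load.
        return next(self._read_cycle)


router = ReadWriteRouter("master-db", ["slave-db-1", "slave-db-2"])
print(router.node_for_write())
print([router.node_for_read() for _ in range(3)])
```

A real router would hand out database connections rather than node names, but the decision it makes is the same.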
Advantages of read-write separation
Most systems follow the 80/20 rule: roughly 80% of operations are reads and 20% are writes. The larger the share of reads, the more obvious the advantage of read-write separation, because read pressure can be relieved simply by adding slave nodes. The number of slave nodes cannot grow without limit, however. Beyond a certain number, the extra slaves inevitably slow master-slave synchronization and degrade the performance of the master node. At that point the balance between consistency and availability has to be considered.
In addition, many businesses have reporting and statistics requirements. With a single database, the statistical SQL runs alongside the business SQL and interferes with normal business operations to some extent, especially when the data volume is large. With read-write separation in place, the statistics workload can have a slave database to itself, so even a time-consuming query does not affect normal business operations.
Read-write separation offers the greatest advantage in read-heavy scenarios.
Disadvantages of read-write separation
Many systems that adopt read-write separation run into the same problem: some business flows need to read data immediately after a successful write, but synchronizing the data from the master node to the slave nodes takes time. In many cases the business side therefore cannot read the latest data from a slave node in real time. This is a typical scenario in which the master node also has to serve reads. If the system has a cache module, the cache can instead be updated right after the write to the master node succeeds, which also meets the business need for real-time data.
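The cache-based workaround can be illustrated with a toy model. This is a sketch under stated assumptions: plain dictionaries stand in for the master node, a lagging slave node, and the cache module, and the names `CachedStore` and `replicate` are invented for the example.

```python
class CachedStore:
    """Toy model of a master, one lagging slave, and a cache module."""

    def __init__(self):
        self.master = {}  # stands in for the master node
        self.slave = {}   # stands in for a slave node that lags behind
        self.cache = {}   # stands in for the cache module

    def write(self, key, value):
        self.master[key] = value
        # Update the cache only after the write to the master has succeeded.
        self.cache[key] = value

    def replicate(self):
        # Simulates the delayed master-to-slave synchronization.
        self.slave.update(self.master)

    def read(self, key):
        # Serve fresh data from the cache; otherwise read from the slave.
        if key in self.cache:
            return self.cache[key]
        return self.slave.get(key)


store = CachedStore()
store.write("order:1", "paid")
print(store.slave.get("order:1"))  # the slave has not caught up yet
print(store.read("order:1"))       # the cache hides the delay
```

The other option mentioned above, reading from the master for flows that need real-time data, avoids the cache but puts read load back on the master node.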
Routing mechanism
Read-write separation places strict requirements on writes: every write must happen on the master node. The cluster is built on a centralized design, which requires the data on the master node to be up to date and complete. The caller must therefore be able to tell the master node apart from the slave nodes.
Code encapsulation
Encapsulating the read-write separation logic in program code means abstracting a data access layer that handles both operation separation and database connection management.
Encapsulating read-write separation in code is not easy to get right in practice, and it needs a long period of testing before it can go to production. If the company has teams working in several languages, the logic may have to be implemented once per language, so the development effort is considerable. On the other hand, this approach can be customized for each business. The implementation must also account for master-slave switchover: when a switch happens, the code needs something like an election process to choose the new master.
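As one way to picture the election step on switchover, the sketch below promotes the slave that has applied the most of the old master's changes. The rule itself, the function name `elect_new_master`, and the `applied_position` field are assumptions for illustration; the article does not specify how the election should work.

```python
def elect_new_master(slaves):
    """Pick the slave that has applied the most of the old master's log.

    `slaves` is a list of dicts with hypothetical keys "name" and
    "applied_position" (how far each slave has replicated).
    """
    if not slaves:
        raise ValueError("no slave available for promotion")
    # Assumed rule: the most up-to-date slave loses the least data.
    return max(slaves, key=lambda s: s["applied_position"])


candidates = [
    {"name": "slave-db-1", "applied_position": 1500},
    {"name": "slave-db-2", "applied_position": 1720},
]
print(elect_new_master(candidates)["name"])
```

After the election, the data access layer would also have to redirect writes to the promoted node and repoint the remaining slaves, which is part of why this approach takes long to test.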
Database middleware
Database middleware is a standalone system built on the SQL protocol that the database exposes, and it is independent of any specific business. Its job is also operation separation and database connection management. It is an abstraction layer for read-write separation as well, but one that sits at the database protocol level. For the business side, using it is as convenient as accessing a single database.
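The operation separation that middleware performs comes down to classifying each statement. The sketch below is a deliberately simplified assumption: real middleware parses the database protocol, while this example only inspects the leading keyword. The names `READ_PREFIXES`, `is_read_statement`, and `route` are hypothetical.

```python
# Assumed set of statement keywords that are safe to send to a slave.
READ_PREFIXES = ("select", "show", "describe")


def is_read_statement(sql: str) -> bool:
    """Return True if the statement looks like a plain read."""
    stripped = sql.lstrip().lower()
    if not stripped.startswith(READ_PREFIXES):
        return False
    # SELECT ... FOR UPDATE takes locks, so treat it as a write.
    return "for update" not in stripped


def route(sql: str) -> str:
    """Return which kind of node should execute the statement."""
    return "slave" if is_read_statement(sql) else "master"


print(route("SELECT name FROM users WHERE id = 1"))
print(route("UPDATE users SET name = 'x' WHERE id = 1"))
print(route("SELECT * FROM stock WHERE id = 1 FOR UPDATE"))
```

When in doubt the classifier sends the statement to the master, which is the safe default because the master can serve reads but a slave must never take writes.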
Synchronization delay
No distributed system can escape the consistency problem, and the master-slave architecture of a database is no exception: operations on the master node have to be synchronized to every slave. MySQL master-slave replication, for example, relies on the binlog, and replication means copying the binlog contents from the master to the slaves. This process is usually asynchronous, because under network latency synchronous replication would greatly reduce the availability of the master.
During binlog replication there is a very small chance that the disk fails or the machine goes down before the binlog has been flushed to disk. The result is inconsistent data between master and slave, but this kind of unavoidable failure is generally tolerated.
Another issue is that replication from the master to a slave runs in single-threaded mode. If the master generates new data faster than the slave can apply it, the master-slave delay keeps growing. In that case, consider enabling multithreaded replication or using a cache module to hide the synchronization delay.
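A little arithmetic shows why the delay keeps growing. The sketch below is an idealized model, not a measurement: the rates are made-up numbers, and it assumes apply throughput scales linearly with the number of worker threads, which real replication only approximates.

```python
def backlog_after(seconds, write_rate, apply_rate, workers=1):
    """Unapplied events on the slave after `seconds`.

    Rates are in events per second. Linear scaling with `workers`
    is an idealizing assumption.
    """
    growth = write_rate - apply_rate * workers
    return max(0, growth * seconds)


def lag_seconds(backlog, apply_rate, workers=1):
    """Time the slave needs to drain `backlog` if writes stopped now."""
    return backlog / (apply_rate * workers)


# Master writes 1000 events/s, a single-threaded slave applies 800 events/s.
backlog = backlog_after(60, write_rate=1000, apply_rate=800)
print(backlog)                    # 12000 events behind after one minute
print(lag_seconds(backlog, 800))  # 15.0 seconds to catch up
print(backlog_after(60, 1000, 800, workers=2))  # 0: two workers keep up
```

As long as the write rate exceeds the apply rate, the backlog grows linearly with time, so adding apply capacity or hiding the delay behind a cache are the two levers available.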
Active and standby scheme
Alongside the master-slave deployment there is a similar scheme: active and standby. Here a redundant node serves as a backup, but while the primary node is running normally the backup does not serve any traffic and is a true "spare tire". When the primary node dies, the standby node takes its place and becomes the primary node that provides service.
Active/standby switching can be automated with a simple mechanism similar to keepalive, and in theory no election is needed. Using active/standby mode for database high availability has the following characteristics:
Availability is guaranteed by the keepalive mechanism. The switch is transparent to the business, and the business side does not need to change any code.
All reads and writes go to the primary database, which easily becomes a single-point bottleneck. Because no other node serves traffic from synchronized copies, the data seen by the business stays consistent.
The standby database is only a backup, so overall resource utilization is 50%, because the standby sits idle all the time.
Scalability is relatively poor and horizontal scaling is not possible, although this can be addressed by sharding databases and tables.
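The keepalive-style switch described above can be sketched as a heartbeat counter. This is a minimal illustration with assumed names (`FailoverMonitor`, `record_heartbeat`) and an assumed threshold of three missed heartbeats; it does not reproduce any particular keepalive tool.

```python
class FailoverMonitor:
    """Hypothetical monitor that promotes the standby after missed heartbeats."""

    def __init__(self, primary, standby, max_missed=3):
        self.active = primary
        self.standby = standby
        self.max_missed = max_missed  # assumed threshold
        self.missed = 0

    def record_heartbeat(self, ok):
        """Feed one heartbeat result for the active node; return the active node."""
        if ok:
            self.missed = 0
            return self.active
        self.missed += 1
        if self.missed >= self.max_missed and self.standby is not None:
            # Only one candidate exists, so no election is needed.
            self.active, self.standby = self.standby, None
            self.missed = 0
        return self.active


monitor = FailoverMonitor("primary-db", "standby-db")
for ok in [True, False, False, False]:
    active = monitor.record_heartbeat(ok)
print(active)  # standby-db takes over after three missed heartbeats
```

Requiring several consecutive misses before switching is a design choice that trades a slightly longer outage for fewer false failovers on a brief network hiccup.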
Resource utilization in a one-master-one-slave or one-master-multi-slave setup is low, which is why the multi-master architecture emerged. In a multi-master architecture there are several master databases, each serving both reads and writes, so data has to be synchronized between the masters. Although performance is higher than with a single master, data consistency is hard to achieve, so many Internet companies do not recommend this kind of solution.
That concludes this look at how to understand read-write separation in database clusters. Thank you for reading.
© 2024 shulou.com SLNews company. All rights reserved.