2025-04-07 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces the methods of data segmentation (sharding) in MySQL as a reference for readers who need them.
The methods covered are: 1, vertical data segmentation; 2, horizontal data segmentation; 3, data segmentation and integration with MySQL Proxy; 4, data segmentation with Amoeba; 5, data segmentation and integration with HiveDB.
Methods of data segmentation in MySQL:
What is data segmentation?
Put simply, data segmentation means using specific conditions to spread the data stored in one database across multiple databases (hosts), so that the load of a single device is dispersed. Data segmentation can also improve the overall availability of the system, because when a single device crashes only part of the data becomes unavailable, not all of it.
Data segmentation (sharding) falls into two modes according to the type of segmentation rule. One splits the data into different databases (hosts) by table (or schema) and is called vertical segmentation. The other splits the rows of a single table across multiple databases (hosts) according to conditions based on the logical relationships in the data and is called horizontal segmentation.
The most important feature of vertical segmentation is that the rules are simple and the implementation is more convenient, which is especially suitable for systems with very low coupling between businesses, little interaction and clear business logic. In this system, it is easy to split the tables used by different business modules into different databases. Splitting according to different tables will have less impact on the application, and the splitting rules will be relatively simple and clear.
Horizontal segmentation is slightly more complicated than vertical segmentation. Because you want to split different data in the same table into different databases, the split rule itself is more complex for the application than splitting based on the table name, and later data maintenance will be more complex.
When the data volume and access volume of one table (or a few tables) are so large that a dedicated host obtained through vertical segmentation still cannot meet the performance requirements, vertical and horizontal segmentation must be combined: split vertically first, then horizontally, to solve the performance problem of such very large tables.
The following is the corresponding analysis of the architecture implementation of vertical, horizontal and combined data segmentation and the integration of data after segmentation.
Vertical segmentation of data
Let's first look at how data is segmented vertically. Think of a database as made up of many large "data blocks" (tables); cut these blocks apart vertically and spread them over multiple database hosts. This is vertical data segmentation, which can also be called longitudinal segmentation.
The overall function of a well-designed application system must be composed of many functional modules, and the data needed by each functional module corresponds to one or more tables in the database. In the architecture design, the more unified and fewer interaction points between each functional module, the lower the coupling degree of the system, and the better the maintainability and expansibility of each module of the system. In such a system, it is easier to realize the vertical segmentation of data.
The clearer the functional module is and the lower the degree of coupling is, the easier it is to define the rules of data vertical segmentation. Data can be segmented according to functional modules. Data from different functional modules are stored in different database hosts, which can easily avoid the existence of cross-database Join. At the same time, the system architecture is also very clear.
Of course, it is hard for any system to keep the tables of every functional module completely independent, so that no module ever needs to access another module's tables or join tables across modules. When that happens, the trade-off must be evaluated against the actual application scenario: either the database accommodates the application by storing all tables that need to be joined in the same database, or the application does more work, fetching data from the different databases through module interfaces and completing the join in the program.
Generally speaking, for a system with relatively low load and very frequent table joins, it may be feasible for the database to give way: merging several related modules into one database reduces the work required of the application.
Of course, letting several modules share one data source in this way tacitly accepts tighter coupling between the modules, which may make the architecture worse in the future. In particular, if the system later grows to the point where the database can no longer bear the load of these tables and they have to be split again, the cost of reworking the architecture may be far greater than the cost of segmenting them in the first place.
Therefore, when segmenting a database vertically, deciding how to split and how far to split is a demanding question. Only by weighing the costs and benefits in the actual application scenario can we arrive at a split scheme that really fits.
As an example, let us briefly analyze the example database of the example system used in this article and design a simple rule for a vertical split.
The system function can be divided into four functional modules: users, group messages, photo albums and events, which correspond to the following tables:
User module tables: user, user_profile, user_group, user_photo_album
Group discussion tables: groups, group_message, group_message_content, top_message
Album related tables: photo, photo_album, photo_album_relation, photo_comment
Event information table: event
At first glance, no module can exist independently of the others, since every module is related to some other module. Does that make a split impossible?
Of course not. A slightly deeper analysis shows that although the tables used by the modules are associated, the associations are clear and simple.
The group discussion module is associated with the user module mainly through user or group relationships, generally the user's id or nick_name and the group's id. Handling these associations through an interface between the modules is not much trouble.
The photo album module is associated with the user module only through the user. The association is basically limited to content linked by user id, so it is simple and the interface is clear.
The event module may be associated with each module, but they all only focus on the ID information of the objects in each module, which is also easy to split.
As a first step, therefore, the database can be segmented vertically by functional module: the tables of each module go into a separate database, and table associations between modules are handled through interfaces on the application side, as shown in the diagram of vertical segmentation of data (figure 1):
After this vertical segmentation, a service that previously could be provided by only one database is served by four databases, and service capacity naturally increases several times over.
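The vertical split of the example system can be sketched as a lookup from table name to database. This is only an illustration in Python: the table names come from the lists above, while the database names (user_db, group_db, album_db, event_db) and the helper database_for are assumptions made for the example.

```python
# Hypothetical database names, one per functional module.
MODULE_DATABASES = {
    "user": "user_db",
    "group": "group_db",
    "album": "album_db",
    "event": "event_db",
}

# Tables listed in the article, grouped by the module that owns them.
MODULE_TABLES = {
    "user": ["user", "user_profile", "user_group", "user_photo_album"],
    "group": ["groups", "group_message", "group_message_content", "top_message"],
    "album": ["photo", "photo_album", "photo_album_relation", "photo_comment"],
    "event": ["event"],
}

# Invert the mapping once so a lookup by table name is a single dict access.
TABLE_TO_DATABASE = {
    table: MODULE_DATABASES[module]
    for module, tables in MODULE_TABLES.items()
    for table in tables
}


def database_for(table: str) -> str:
    """Return the database that holds the given table under vertical segmentation."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise ValueError(f"no database configured for table {table!r}") from None
```

Because the rule depends only on the table name, the application needs no knowledge of the row contents to find the right database, which is what keeps vertical segmentation simple.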
Advantages of vertical slicing:
The splitting of the database is simple and clear, and the splitting rules are clear.
The application module is clear and easy to integrate.
Data is easy to maintain and easy to locate.
Disadvantages of vertical slicing:
Some table joins cannot be done at the database level and must be completed in the program.
Tables with extremely frequent access and very large data volumes remain a performance bottleneck, so the requirements may still not be met.
Transaction processing is relatively complex.
After the segmentation reaches a certain degree, the scalability is limited.
Excessive segmentation can make the system too complex to maintain.
For the data segmentation and transaction problems that vertical segmentation may run into, it is very hard to find a good solution at the database level. In practice, the vertical segmentation of a database mostly follows the modules of the application system: the data of one module is stored in one database, which solves data association inside the module. Between modules, the required data is exchanged through the application in the form of service interfaces. This does increase the total number of database operations, but it benefits overall scalability and architectural modularity. The response time of some individual operations may rise slightly, but the overall performance of the system is likely to improve. The scalability bottleneck can only be overcome with the horizontal segmentation architecture described in the next section.
Horizontal segmentation of data
The previous section analyzed vertical segmentation; this section analyzes horizontal segmentation. Vertical segmentation can basically be understood as dividing the data by table or by module, while horizontal segmentation is different. Generally speaking, simple horizontal segmentation distributes a table with extremely frequent access into multiple tables according to some rule on one field, with each table containing part of the data.
Put simply, horizontal segmentation is the segmentation of data rows: some rows of a table go to one database, and other rows go to other databases. To make it easy to determine which database a given row belongs to, segmentation always follows a rule, for example the value of a numeric field modulo a specific number, the range of a time field, or the hash value of a character field. If most of the core tables in the system can be associated through one field, that field is naturally the best choice for horizontal segmentation, unless some very special circumstance prevents its use.
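The three kinds of rule mentioned above (modulo of a numeric field, range of a time field, hash of a character field) can each be written as a small function that maps a field value to a shard number. The following Python sketch is illustrative only; the function names and the choice of CRC32 as the hash are assumptions, not something the article prescribes.

```python
import zlib
from datetime import date


def shard_by_modulo(numeric_id: int, shard_count: int) -> int:
    """Modulo rule: the remainder of a numeric field selects the shard."""
    return numeric_id % shard_count


def shard_by_date_range(day: date, boundaries: list[date]) -> int:
    """Range rule: boundaries are sorted ascending; shard i holds days before boundaries[i]."""
    for index, boundary in enumerate(boundaries):
        if day < boundary:
            return index
    return len(boundaries)  # everything on or after the last boundary


def shard_by_hash(text_key: str, shard_count: int) -> int:
    """Hash rule: CRC32 is stable across processes, unlike Python's built-in hash()."""
    return zlib.crc32(text_key.encode("utf-8")) % shard_count
```

Whatever rule is chosen, it has to be deterministic and stable over time, because the same function is used both when a row is written and when it is looked up later.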
Generally speaking, on sites such as the very popular Web 2.0 websites, most of the data can be related through member (user) information, and many core tables are well suited to horizontal segmentation by member ID. A forum discussion system is even easier to split: it can be segmented horizontally by forum number, and after segmentation there is basically no interaction between the databases.
In the example system, the data is associated with users, so it can be segmented horizontally by user, with the data of different users stored in different databases. The one exception is the groups table in the user module, which is not directly related to a user and so cannot be segmented by user. Tables in this special situation can be separated out and placed in a database of their own. This in fact uses the vertical segmentation method described in the previous section; the next section describes the combined use of vertical and horizontal segmentation in more detail.
So, for the example database, most tables can be segmented horizontally by user ID, with the data of different users stored in different databases. For example, take every user ID modulo 2 and store the rows in two different databases according to the result. Every table associated with the user ID can be split the same way. As a result, basically all the data related to one user sits in the same database, and joins are easy to implement when they are needed.
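The modulo-2 split can be tried out with two in-memory SQLite databases standing in for two MySQL hosts. This is a minimal sketch under stated assumptions: SQLite replaces MySQL only to keep the example self-contained, the city column of user_profile is invented for illustration, and the helper names are hypothetical.

```python
import sqlite3

SHARD_COUNT = 2

# Two in-memory SQLite databases stand in for two MySQL hosts.
shards = [sqlite3.connect(":memory:") for _ in range(SHARD_COUNT)]
for conn in shards:
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, nick_name TEXT)")
    # The city column is a made-up example column.
    conn.execute("CREATE TABLE user_profile (user_id INTEGER, city TEXT)")


def shard_for(user_id: int) -> sqlite3.Connection:
    """Pick the shard by taking the user ID modulo the number of shards."""
    return shards[user_id % SHARD_COUNT]


def add_user(user_id: int, nick_name: str, city: str) -> None:
    """Write both rows for a user to the same shard so they can be joined there."""
    conn = shard_for(user_id)
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO user VALUES (?, ?)", (user_id, nick_name))
        conn.execute("INSERT INTO user_profile VALUES (?, ?)", (user_id, city))


def user_with_profile(user_id: int):
    """The join runs entirely inside one shard; no cross-database work is needed."""
    return shard_for(user_id).execute(
        "SELECT u.nick_name, p.city FROM user u "
        "JOIN user_profile p ON p.user_id = u.id WHERE u.id = ?",
        (user_id,),
    ).fetchone()


add_user(1, "alice", "Hangzhou")
add_user(2, "bob", "Beijing")
```

Since every table keyed by user ID follows the same rule, the rows of one user always land together, which is why the join in user_with_profile never has to leave its shard.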
The information about horizontal segmentation can be shown more intuitively through the horizontal segmentation schematic diagram (figure 2):
Advantages of horizontal slicing:
Table association can basically be completed on the database side.
Tables with very large data volumes and high load no longer become bottlenecks.
Relatively few changes are needed to the overall architecture on the application side.
Transaction processing is relatively simple.
As long as the segmentation rules are well defined, scalability limits are rarely encountered.
Disadvantages of horizontal slicing:
The segmentation rules are relatively complex, so it is difficult to abstract a segmentation rule that can satisfy the whole database.
In the later stage, it is more difficult to maintain the data, and it is more difficult to locate the data manually.
The coupling degree of each module of the application system is high, which may cause some difficulties to the migration and separation of the later data.
The use of combined vertical and horizontal segmentation
The first two sections covered how vertical and horizontal segmentation are implemented, the architecture that results from each, and their respective advantages and disadvantages. In real application scenarios, however, only systems with modest load and relatively simple business logic can solve their scalability problem with just one of the two methods. Most systems with complex business logic and heavy load cannot achieve good scalability with either method alone, so the two must be combined, with different methods used in different scenarios.
This section will combine the advantages and disadvantages of vertical segmentation and horizontal segmentation to further improve the overall architecture and improve the scalability of the system.
In general, it is difficult for all tables in a database to be associated by one (or a few) fields, so horizontal segmentation of the data alone cannot solve all the problems. Vertical sharding can only solve part of the problem. For those systems with very high load, even a single table cannot bear its load through a single database host. We must combine the two segmentation methods of "vertical" and "horizontal" to make full use of their advantages and avoid their disadvantages.
The load of every application system grows step by step. When a performance bottleneck appears, most architects and DBAs choose to segment the data vertically first, because this costs the least and gives the best return on investment at that stage. However, as the business keeps expanding and the system load keeps growing, after a period of stability the vertically segmented database cluster may once again be overwhelmed and run into performance bottlenecks.
How to choose at this time? Is it to further subdivide the module again, or to find other solutions? If we continue to subdivide the module and split the data vertically as we did at the beginning, we may encounter the same problem in the near future. And with the continuous refinement of the module, the architecture of the application system will become more and more complex, and the whole system is likely to get out of control.
At this point we must turn to horizontal segmentation. There is no need to overturn the results of the earlier vertical segmentation first; instead, we build on it, using the advantages of horizontal segmentation to avoid the disadvantages of vertical segmentation and to stop the system from growing ever more complex. The main disadvantage of horizontal segmentation (rules that are hard to unify) has in turn already been addressed by the earlier vertical segmentation, which makes the horizontal split much easier to carry out.
For the example database, assume that vertical segmentation was done at the start but that, as the business kept growing, the database system hit a bottleneck and we decided to rebuild the architecture of the database cluster. How should it be rebuilt? The data has already been segmented vertically and the module structure is clear, yet business growth keeps accelerating, so even splitting the modules further would not last long. We therefore choose to segment horizontally on top of the vertical segmentation.
After vertical segmentation, each database in the cluster holds only one functional module, and basically all tables within a module are associated through a single field. For example, the user module can be segmented by user ID, the group discussion module by group ID, and the photo album module by album ID. The event notification table is time-sensitive (only information about the most recent events is accessed), so it is segmented by time.
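The two-step decision described above (module first, then the module's own key) can be sketched as a single routing function. The shard counts, the database naming scheme and the one-database-per-year rule for events are all assumptions chosen for the example, not values from the article.

```python
from datetime import date

# Hypothetical layout: how each vertically split module is split horizontally.
# "count" is the number of shards; the event module instead splits by year.
MODULE_RULES = {
    "user": {"key": "user_id", "count": 4},
    "group": {"key": "group_id", "count": 2},
    "album": {"key": "album_id", "count": 2},
}


def route(module: str, key_value) -> str:
    """Return the name of the database that should hold the row.

    Step 1 (vertical): the module selects the database group.
    Step 2 (horizontal): the module's own key selects the shard in that group.
    """
    if module == "event":
        # Events are segmented by time; here one database per calendar year.
        if not isinstance(key_value, date):
            raise TypeError("event rows are routed by date")
        return f"event_db_{key_value.year}"
    rule = MODULE_RULES[module]
    return f"{module}_db_{key_value % rule['count']}"
```

For instance, route("user", 10) picks shard 2 of the user databases, while route("event", date(2024, 5, 1)) picks the 2024 event database. Each module keeps its own simple rule, which is how the vertical split removes the need for one rule that fits the whole database.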
The combined segmentation diagram shows the entire architecture after segmentation:
In fact, in many large-scale application systems, vertical segmentation and horizontal segmentation basically coexist, and are often carried out alternately to increase the scalability of the system. When we deal with different application scenarios, we should also fully take into account the limitations and advantages of these two segmentation methods, and use different ways in different periods (load pressure).
Advantages of joint segmentation:
We can make full use of the respective advantages of vertical segmentation and horizontal segmentation to avoid their respective defects.
Maximize the expansibility of the system.
Disadvantages of joint segmentation:
The structure of database system is more complex, and it is more difficult to maintain.
The application architecture is also more complex.
Data segmentation and integration scheme
The previous sections made it clear that data segmentation can greatly improve the scalability of the system. However, once the data has been segmented vertically and/or horizontally and stored on different database hosts, the biggest problem the application system faces is how to integrate these data sources well. This is probably also a question many readers care about. This section analyzes the overall solutions that can help us achieve data segmentation and data integration.
It is hard to achieve data integration by relying on the database itself. MySQL does have the Federated storage engine, which can solve some similar problems, but it is difficult to put to good use in practical application scenarios. So how can data sources scattered across MySQL hosts be integrated?
In general, there are two ways to solve the problem:
Configure and manage one (or more) data sources in each application module, access each database directly, and complete the data integration within the module.
Manage all data sources uniformly through an intermediate proxy layer, so that the back-end database cluster is transparent to the front-end applications.
Probably more than 90% of people would prefer the second approach when faced with these two options, especially as the system grows bigger and more complex. It is a sound choice: the short-term cost may be relatively high, but it greatly helps the scalability of the whole system.
Therefore the first approach is not analyzed further here; the following focuses on several solutions that follow the second approach.
Developing the intermediate proxy layer ourselves
Having decided to integrate data sources through an intermediate proxy layer in front of the database, many companies have developed their own proxy layer applications tailored to their specific application scenarios.
A self-developed proxy layer can fit the characteristics of the application as closely as possible, can be customized to the greatest extent, and can respond flexibly to change. This is the biggest advantage of developing the proxy layer in-house.
Of course, in exchange for this freedom of customization, more must be invested in the initial research and development and in continuous upgrades and improvements later, and the technical threshold may be higher than for a simple Web application. A comprehensive assessment is therefore needed before deciding to develop in-house.
Since in-house development is mostly about adapting to one's own application system and business scenarios, there is little that can be analyzed in general terms here. The following mainly analyzes several popular data source integration solutions.
Using MySQL Proxy to realize data segmentation and integration
MySQL Proxy is the database proxy layer product officially provided by MySQL. Like MySQL Server, it is an open source product released under the GPL. It sits between clients and MySQL servers and can be used to monitor, analyze, or transform the communication between them. Its flexibility allows a wide variety of uses. At present its main functions are connection routing, query analysis, query filtering and modification, load balancing, and a basic HA mechanism.
In fact, MySQL Proxy does not provide all of the above functions out of the box; it provides the basis for implementing them. To obtain these functions we also need to write our own Lua scripts.
MySQL Proxy in effect sets up a connection pool between client requests and MySQL Server. All client requests are sent to MySQL Proxy, which analyzes each one, determines whether it is a read or a write, and dispatches it to the corresponding MySQL Server. For a cluster with multiple slave nodes it can also provide load balancing. See the MySQL Proxy basic architecture diagram (figure 4):
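The read/write dispatch described above can be illustrated with a small router. Note that this is not MySQL Proxy's API: real MySQL Proxy logic is written as Lua scripts, whereas the Python class below, with its hypothetical names ReadWriteRouter and backend_for, only sketches the decision a proxy layer makes for each query.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    # Statements treated as reads; everything else goes to the master.
    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, master: str, slaves: list[str]):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._slaves = itertools.cycle(slaves or [master])

    def backend_for(self, query: str) -> str:
        """Return the host that should run the query."""
        statement = query.lstrip().upper()
        if statement.startswith(self.READ_PREFIXES):
            return next(self._slaves)  # round-robin load balancing
        return self.master
```

A prefix check like this is deliberately naive: a statement such as SELECT ... FOR UPDATE, or a read inside an open transaction, would need to go to the master, so a real proxy has to analyze queries more carefully.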
Through the architecture diagram above, you can clearly see where MySQL Proxy is in the practical application and the basic things that can be done. The detailed implementation rules of MySQL Proxy are introduced and illustrated in great detail in the official documents of MySQL. Interested readers can download them directly from the official MySQL website or read them online, so I won't repeat them here.
Data Segmentation using Amoeba
Amoeba is an open source framework written in Java, released under the GPLv3 license, that focuses on proxy programs for integrating the data sources of a distributed database. At present Amoeba provides query routing, query filtering, read-write splitting, load balancing, an HA mechanism and related features, as shown in figure 5.
Amoeba mainly solves the following problems:
Integration of complex data sources after data segmentation
Provide data segmentation rules and reduce the impact of data segmentation rules on the database
Reduce the number of connections between database and client
Read-write separate routing.
As you can see, what Amoeba does is exactly what is needed to improve the scalability of the database through data segmentation.
Amoeba is not itself a proxy program but a framework for developing database proxy programs. At present there are two proxy programs built on Amoeba: Amoeba For MySQL and Amoeba For Aladin.
Amoeba For MySQL is a solution dedicated to MySQL databases. Both the protocol used by the front-end application and the data source databases connected at the back end must be MySQL. To any client application, Amoeba For MySQL looks no different from a MySQL database, and any client request that uses the MySQL protocol can be parsed and processed by it. The following diagram shows the architecture of Amoeba For MySQL (from the Amoeba developer blog):
Amoeba For Aladin is a more widely applicable and more powerful proxy program. It can connect to data sources in different kinds of databases at the same time to serve front-end applications, but it accepts only client requests that conform to the MySQL protocol. In other words, as long as the front-end application connects through the MySQL protocol, Amoeba For Aladin analyzes the query automatically and, from the data requested in it, identifies which type of database on which physical host holds the data. The Amoeba For Aladin architecture diagram (figure 6) shows the architectural details (from the Amoeba developer blog).
At first glance the two architectures look exactly the same. A closer look shows the main difference: in Amoeba For Aladin, after the MySQL Protocol Adapter has processed a request, the data source database is determined from the analysis result, and a specific JDBC driver and the corresponding protocol are then chosen to connect to the back-end database.
The two architecture diagrams also reveal the nature of Amoeba: it is only a development framework. Besides using the two products it already provides, For MySQL and For Aladin, we can do secondary development based on our own needs to obtain a proxy program better suited to our own application.
When only MySQL databases are used, both Amoeba For MySQL and Amoeba For Aladin work well. However, the more complex a system is, the more performance it loses and the higher its maintenance cost. It is therefore recommended to use Amoeba For MySQL when only MySQL databases are needed.
Amoeba For MySQL is very simple to use. All of its configuration files are standard XML files, four in total:
amoeba.xml: the main configuration file, which configures all data sources and the parameters of Amoeba itself.
rule.xml: configures all query routing rules.
functionMap.xml: configures the Java implementation classes that correspond to the functions parsed in queries.
ruleFunctionMap.xml: configures the implementation classes of the specific functions used in routing rules.
If your rules are not too complex, the first two of these four files are basically enough to do all the work. Common proxy functions such as read-write splitting and load balancing are configured in amoeba.xml. In addition, Amoeba supports automatic routing for both vertical and horizontal data segmentation, and the routing rules are set in rule.xml.
Using HiveDB to realize data segmentation and integration
Like MySQL Proxy and Amoeba above, HiveDB is an open source framework that provides data segmentation and integration for MySQL databases; it is written in Java. The current HiveDB, however, supports only horizontal data segmentation. It mainly addresses database scalability and high-performance data access under large data volumes, while also supporting data redundancy and a basic HA mechanism.
The implementation mechanism of HiveDB is different from that of MySQL Proxy and Amoeba. It does not use the Replication function of MySQL to achieve data redundancy, but implements the data redundancy mechanism on its own, and its underlying layer is mainly based on Hibernate Shards to achieve data segmentation.
In HiveDB, data is distributed across multiple MySQL servers according to user-defined partition keys (which in effect define the segmentation rules). When a query is run, HiveDB automatically analyzes the filter conditions, reads the data from multiple MySQL servers in parallel, merges the result sets, and returns the result to the client application.
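The pattern of reading from several servers in parallel and merging the result sets can be sketched in a few lines. This is not HiveDB code (HiveDB is a Java framework) and does not use its API; the in-memory lists, the sample rows and the function names below are assumptions that only illustrate the pattern.

```python
from concurrent.futures import ThreadPoolExecutor

# Each list stands in for one MySQL server holding a slice of the rows.
SHARDS = [
    [{"user_id": 2, "title": "beach"}, {"user_id": 4, "title": "city"}],
    [{"user_id": 1, "title": "forest"}, {"user_id": 3, "title": "beach"}],
]


def query_shard(rows, predicate):
    """Run the filter on a single shard."""
    return [row for row in rows if predicate(row)]


def query_all(predicate, sort_key):
    """Query every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, predicate), SHARDS)
        merged = [row for partial in partials for row in partial]
    # Sorting after the merge gives one ordered result set across all shards.
    return sorted(merged, key=sort_key)
```

When the filter condition includes the partition key, a real implementation can skip the fan-out and query only the one shard that can hold matching rows.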
Purely in terms of features, HiveDB may not be as powerful as MySQL Proxy and Amoeba, but its approach to data segmentation is not essentially different from theirs. In addition, HiveDB is not just something shared by open source enthusiasts; it is an open source project backed by a commercial company.
The schematic diagram of the HiveDB architecture from the official HiveDB website (figure 7) describes how HiveDB organizes its data. It does not show the architecture in detail, but it does show what is distinctive about HiveDB's approach to data segmentation.
Other solutions for data segmentation and integration
In addition to the overall solutions for data segmentation and integration described above, there are many others, such as HSCALE based on MySQL Proxy, Spock Proxy built through Rails, and Pyshards based on Python.
Whichever solution we choose, the overall design idea should not change: use vertical and horizontal segmentation of data to strengthen the overall service capacity of the database, so that the application system scales as far as possible and scaling is as convenient as possible.
As long as data segmentation and data source integration are handled well by a middle-tier proxy application, linear scaling of the database becomes as convenient as scaling the application: simply adding inexpensive PC servers increases the overall service capacity of the database cluster linearly, so the database no longer easily becomes the performance bottleneck of the application system.
Possible problems in data segmentation and integration
By now we should have a basic understanding of how data segmentation and integration are implemented. Many readers may already have chosen the solution that suits their application scenario based on the advantages and disadvantages of each, and what remains is mainly preparation for implementation.
Before implementing the data segmentation scheme, some possible problems still need to be analyzed. Generally speaking, the main problems that may be encountered are as follows:
The problem of introducing distributed transactions
The problem of cross-node Join
Merge sorting paging problems across nodes.
The problem of introducing distributed transactions
Once data sharding is stored in multiple MySQL Server, no matter how perfect the sharding rules are designed (in fact, there are no perfect sharding rules), it is possible that the data involved in some transactions is no longer in the same MySQL Server.
In such a scenario, if the application still follows the old scheme, then the potential must be solved by introducing distributed transactions. Among the various versions of MySQL, distributed transactions are supported only by versions starting from MySQL 5.0.At present, only Innodb provides distributed transaction support. However, even if we happen to use the MySQL version that supports distributed transactions and also use the Innodb storage engine, distributed transactions themselves consume a lot of system resources, and the performance is not too high. The introduction of distributed transactions will bring a lot of difficult problems in exception handling.
What can be done? In fact, this problem can be solved with a more flexible approach. The first question to ask is whether the database is the only place where transactions can be handled. It is not: the database and the application can solve the problem together. Each database handles its own local transactions, and the application coordinates the transactions across multiple databases.
In other words, if we want, we can split a distributed transaction that spans multiple databases into several small transactions that each touch only a single database, and let the application control the overall outcome. Of course, this requires the application to be sufficiently robust, and it brings some technical difficulty to the application.
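The idea of splitting one cross-database transaction into small local transactions controlled by the application can be sketched roughly as follows. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for two MySQL Servers, and the `account` table and the helpers `make_shard`, `adjust`, `transfer`, and `balance` are hypothetical names invented for this example.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database standing in for one MySQL Server (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account (id, balance) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def adjust(conn, account_id, delta):
    """Run one small local transaction on a single shard."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (delta, account_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"account {account_id} not found")


def transfer(src_conn, src_id, dst_conn, dst_id, amount):
    """Move funds between shards with two local transactions plus compensation."""
    adjust(src_conn, src_id, -amount)      # step 1: debit on the source shard
    try:
        adjust(dst_conn, dst_id, amount)   # step 2: credit on the target shard
    except Exception:
        adjust(src_conn, src_id, amount)   # undo step 1 with a compensating transaction
        raise


def balance(conn, account_id):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Note that this sketch is not atomic in the strict sense: if the application crashes between the two steps, the compensation never runs. That is the robustness burden the text refers to, and a real application would probably need a persistent log of pending steps so they can be retried or undone after a restart.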
The problem of cross-node Join
After describing the problems that distributed transactions may introduce, let's look at cross-node Join. After data sharding, some existing Join statements may no longer work, because the data sources they use may have been split across multiple MySQL Servers.
What can be done? From the point of view of the MySQL database, if the problem has to be solved directly on the database side, it can only be handled with Federated, a special storage engine of MySQL. The Federated storage engine is MySQL's answer to features such as Oracle's DB Link. The main difference from Oracle DB Link is that Federated keeps a copy of the remote table's structure definition locally. At first glance, Federated looks like a very good solution for cross-node Join. But we should also be clear that if the remote table structure changes, the local table definition does not change with it. If the local Federated table definition is not updated when the remote table structure is updated, queries are likely to fail or return incorrect results.
To deal with this kind of problem, it is recommended to handle it in the application: first fetch the driving result set from the MySQL Server where the driving table is located, and then use that result set to fetch the corresponding rows from the MySQL Server where the driven table is located. Many readers may think that this will affect performance. Yes, there is some negative impact, but there are basically not many better solutions. Moreover, because the load of each MySQL Server is easier to control once the database has been scaled out, a single Query may respond faster than it did before segmentation, so the negative impact on performance is not too great. What's more, requirements of this cross-node Join kind are usually not numerous, and they may account for only a small part of the overall workload. Therefore, it is worthwhile to sacrifice a little here for the sake of overall performance. After all, system optimization is itself a process of trade-offs and balance.
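The application-side approach described above (fetch the driving result set first, then look up the driven table with it) might look roughly like the sketch below. It is a minimal illustration, not a definitive implementation: in-memory SQLite databases stand in for the two MySQL Servers, and the `orders` and `users` tables and the function `app_level_join` are hypothetical.

```python
import sqlite3


def app_level_join(orders_conn, users_conn, min_total):
    """Join orders (driving table) with users (driven table) in the application."""
    # Step 1: fetch the driving result set from the server that holds `orders`.
    orders = orders_conn.execute(
        "SELECT id, user_id, total FROM orders WHERE total >= ? ORDER BY id",
        (min_total,),
    ).fetchall()
    if not orders:
        return []

    # Step 2: fetch only the matching rows from the server that holds `users`.
    user_ids = sorted({user_id for _, user_id, _ in orders})
    placeholders = ", ".join("?" for _ in user_ids)
    users = dict(
        users_conn.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})",
            user_ids,
        ).fetchall()
    )

    # Step 3: combine both result sets in memory (inner join semantics).
    return [
        (order_id, users[user_id], total)
        for order_id, user_id, total in orders
        if user_id in users
    ]
```

For a large driving result set, the `IN` list would likely need to be sent in batches, since databases limit statement size and the number of bound parameters.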
The problem of cross-node merging, sorting, and paging
Once the data is split horizontally, cross-node Join is not the only thing that may stop working. The data source of some sorting and paging Query statements may also be split across multiple nodes, and the direct result is that these Queries can no longer run normally. In fact, this is the same situation as cross-node Join: the data sources live on multiple nodes, and solving the problem with a single Query amounts to the same operation as a cross-node Join. Similarly, Federated can partially solve it, but the risks are the same. There is one difference, however. Most of the time a Join has a driving side and a driven side, so the reads against the tables involved usually have to happen in sequence. Sorting and paging are different: the data source is basically one table (or one result set) with no such ordering dependency, so fetching data from the multiple data sources can be done fully in parallel. As a result, fetching sorted, paged data can be more efficient than a cross-database Join, the performance loss is relatively small, and in some cases it may even be faster than in the original database before the data was split. Of course, whether it is cross-node Join or cross-node sorting and paging, the application server consumes more resources, especially memory, because reading and merging the result sets means handling more data than when no merging is needed.
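The parallel fetch-and-merge approach to sorting and paging can be sketched as follows. This is an illustrative sketch under assumptions: in-memory SQLite databases stand in for the MySQL nodes, and the `post` table and the helpers `make_shard`, `fetch_prefix`, and `fetch_page` are hypothetical. Each node is asked for its first `offset + limit` rows in sorted order, and the application merges the sorted prefixes and cuts out the requested page.

```python
import heapq
import itertools
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create an in-memory database standing in for one MySQL node (assumption)."""
    # check_same_thread=False lets a worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO post (id, created_at) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def fetch_prefix(conn, n):
    """Fetch the first n rows of one shard in global sort order."""
    return conn.execute(
        "SELECT created_at, id FROM post ORDER BY created_at, id LIMIT ?", (n,)
    ).fetchall()


def fetch_page(shards, offset, limit):
    """Return one page of (created_at, id) rows sorted across all shards."""
    need = offset + limit  # any shard could hold all rows that precede the page
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        # The shards have no ordering dependency, so they are queried in parallel.
        prefixes = list(pool.map(lambda conn: fetch_prefix(conn, need), shards))
    merged = heapq.merge(*prefixes)  # lazy k-way merge of already sorted lists
    return list(itertools.islice(merged, offset, need))
```

The application has to hold up to `offset + limit` rows per shard, which illustrates why the text says the application server needs more memory, and why deep pages become expensive with this approach.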