2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Many engineers who have not done it before are unsure how to split databases and tables (sharding) in MySQL. This article summarizes the bottlenecks that make a split necessary and the ways to carry one out.
I. Database bottlenecks
Whether the bottleneck is IO or CPU, it eventually increases the number of active database connections until that number approaches or reaches the maximum the database can sustain. From the business service's point of view, few or no database connections are available. The consequences are easy to imagine: reduced concurrency, reduced throughput, and crashes.
1. IO bottleneck
First: disk read IO bottleneck. There is too much hot data to fit in the database cache, so each query generates a large amount of IO and slows down -> database split and vertical table split.
Second: network IO bottleneck. Too much data is requested and network bandwidth is insufficient -> database split.
2. CPU bottleneck
First: SQL problems, such as JOIN, GROUP BY, and ORDER BY clauses or filtering on non-indexed columns, which increase CPU work -> optimize the SQL, create appropriate indexes, and move business computation into the business service layer.
Second: a single table holds too much data, so queries scan too many rows, SQL runs inefficiently, and CPU becomes the first bottleneck -> horizontal table split.
II. Database and table splitting
1. Horizontal database split
Concept: using a field as the basis, split the data of one database into multiple databases according to a policy (hash, range, etc.).
Results:
The structure of every database is the same.
The data in each database is different, with no intersection.
The union of all databases is the full data.
Scenario: the system's absolute concurrency has risen, table splitting alone cannot fundamentally solve the problem, and there is no clear business boundary along which to split the database vertically.
Analysis: with more databases, the pressure on IO and CPU is relieved roughly in proportion to the number of databases.
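The hash policy mentioned in the concept can be sketched as a modulo route on the partition field. This is a minimal sketch, not a prescribed implementation: the database names (`db_user_0` and so on), the count of four databases, and the use of `user_id` as the field are all assumptions made for illustration.

```python
DB_COUNT = 4  # hypothetical number of databases after the split


def route_db(user_id, db_count=DB_COUNT):
    """Return the (hypothetical) database name that holds this user_id."""
    return f"db_user_{user_id % db_count}"


def distribution(user_ids, db_count=DB_COUNT):
    """Count how many ids land in each database, to check the split is even."""
    counts = {f"db_user_{i}": 0 for i in range(db_count)}
    for uid in user_ids:
        counts[route_db(uid, db_count)] += 1
    return counts


if __name__ == "__main__":
    print(route_db(10))
    print(distribution(range(1000)))
```

Every id maps to exactly one database, which is what gives the "no intersection" and "union is the full data" properties listed above.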
2. Horizontal table split
Concept: using a field as the basis, split the data of one table into multiple tables according to a policy (hash, range, etc.).
Results:
The structure of every table is the same.
The data in each table is different, with no intersection.
The union of all tables is the full data.
Scenario: the system's absolute concurrency has not risen, but a single table holds so much data that SQL efficiency suffers and the CPU load grows until the CPU becomes the bottleneck.
Analysis: with less data per table, each SQL statement executes more efficiently, which naturally reduces the CPU load.
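The range policy can be sketched in the same way as the hash policy. The table prefix `t_order_`, the field `order_id`, and the range size of five million rows are assumptions chosen only for illustration.

```python
ROWS_PER_TABLE = 5_000_000  # hypothetical size of each id range


def route_table_by_range(order_id, rows_per_table=ROWS_PER_TABLE):
    """Return the (hypothetical) table name whose id range contains order_id."""
    if order_id < 0:
        raise ValueError("order_id must be non-negative")
    return f"t_order_{order_id // rows_per_table}"


if __name__ == "__main__":
    for oid in (0, 4_999_999, 5_000_000, 12_345_678):
        print(oid, "->", route_table_by_range(oid))
```

A range split keeps neighbouring ids together and makes adding a new table trivial, but recent ids all land in the newest table, whereas a hash split spreads them evenly.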
3. Vertical database split
Concept: using tables as the basis, move different tables into different databases according to the business they belong to.
Results:
The structure of each database is different.
The data in each database is also different, with no intersection.
The union of all databases is the full data.
Scenario: the system's absolute concurrency has risen, and independent business modules can be separated out.
Analysis: at this point the system can essentially be made service-oriented. For example, as the business grows, shared configuration tables, dictionary tables, and the like multiply; these tables can be moved into a separate database, or even exposed as a service. Likewise, once the business has grown into a distinct business model, the related tables can be moved into a separate database, or even turned into a service.
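A vertical database split amounts to a table-to-database assignment by business ownership. The sketch below shows that assignment as a lookup table; every table and database name in it is hypothetical.

```python
# Hypothetical assignment of tables to databases by business ownership.
TABLE_TO_DB = {
    "t_user": "db_user",
    "t_user_address": "db_user",
    "t_order": "db_order",
    "t_order_item": "db_order",
    "t_config": "db_common",
    "t_dictionary": "db_common",
}


def database_for(table):
    """Return the database that owns the given table."""
    if table not in TABLE_TO_DB:
        raise KeyError(f"no database configured for table {table!r}")
    return TABLE_TO_DB[table]


def tables_by_database(mapping):
    """Group table names by the database they were assigned to."""
    grouped = {}
    for table, db in mapping.items():
        grouped.setdefault(db, []).append(table)
    return grouped


if __name__ == "__main__":
    print(database_for("t_config"))
    print(tables_by_database(TABLE_TO_DB))
```

Because each table appears exactly once in the mapping, the databases have different structures and no overlapping data, matching the results listed above.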
4. Vertical table split
Concept: using columns as the basis, split the columns of a table into different tables (a main table and an extension table) according to how actively each column is used.
Results:
The structure of each table is different.
The data in each table is also different. In general, the tables share at least one column, usually the primary key, which is used to associate the rows.
The union of all tables is the full data.
Scenario: the system's absolute concurrency has not risen and the table does not have many rows, but it has many columns, and hot and cold columns are stored together, so each row takes more storage space. As a result, fewer rows fit in the database cache, and queries generate a large number of random read IOs, causing an IO bottleneck.
Analysis: list pages and detail pages help illustrate this. The principle of a vertical table split is to put hot columns (which may be redundant and are often queried together) in the main table and cold columns in the extension table. More hot data then fits in the cache, which reduces random read IO. After the split, getting all the data requires associating the two tables. Do not use JOIN for this: JOIN adds CPU load and also couples the two tables, which must then live on the same database instance. Instead, associate the data in the business service layer: fetch the main-table rows and the extension-table rows separately, then combine them on the shared key.
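The service-layer association described above can be sketched as an in-memory merge on the shared key. The rows here are plain dictionaries standing in for the results of two separate queries; the column names and the key name `id` are assumptions.

```python
def merge_main_and_extension(main_rows, ext_rows, key="id"):
    """Combine main-table and extension-table rows on a shared key.

    Rows are dicts, as a database driver might return them. A main row
    without a matching extension row is returned unchanged.
    """
    ext_by_key = {row[key]: row for row in ext_rows}
    merged = []
    for row in main_rows:
        combined = dict(row)  # copy so the input rows are not mutated
        combined.update(ext_by_key.get(row[key], {}))
        merged.append(combined)
    return merged


if __name__ == "__main__":
    # Stand-ins for "SELECT ... FROM main" and "SELECT ... FROM extension".
    main_rows = [
        {"id": 1, "title": "Phone", "price": 699},
        {"id": 2, "title": "Laptop", "price": 1299},
    ]
    ext_rows = [{"id": 1, "description": "6.1-inch display"}]
    for row in merge_main_and_extension(main_rows, ext_rows):
        print(row)
```

A list page would query only the main table; a detail page would run both queries and merge them, so the two tables can live on different instances.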
III. Tools for database and table splitting
Sharding-Sphere: JAR, formerly known as Sharding-JDBC.
TDDL: JAR, Taobao Distributed Data Layer.
Mycat: middleware.
Note: evaluate the pros and cons of each tool yourself, preferring the official website and community as sources.
IV. Steps for database and table splitting
Estimate the number of databases or tables from capacity (current size and growth) -> choose the partition key (it must distribute data evenly) -> choose the split rule (hash, range, etc.) -> execute (usually with double writes) -> plan for expansion (minimize data movement).
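The first step, sizing from capacity, is simple arithmetic. The sketch below is one way to do it under stated assumptions: all figures are made up, and rounding the result up to a power of two is a choice made here (it keeps later doubling simple), not something the steps above require.

```python
import math


def estimate_table_count(current_rows, monthly_growth_rows, months, max_rows_per_table):
    """Estimate how many tables are needed to hold the projected row count.

    The result is rounded up to a power of two (an assumption of this sketch).
    """
    projected = current_rows + monthly_growth_rows * months
    needed = max(1, math.ceil(projected / max_rows_per_table))
    return 1 << (needed - 1).bit_length()


if __name__ == "__main__":
    # Hypothetical inputs: 20M rows today, 2M new rows per month,
    # a 36-month horizon, and a target of at most 10M rows per table.
    print(estimate_table_count(20_000_000, 2_000_000, 36, 10_000_000))
```

With these inputs the projection is 92 million rows, which needs 10 tables at 10 million rows each and rounds up to 16.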
V. Problems with database and table splitting
1. Queries on non-partition keys
This assumes a horizontal database and table split that uses the common hash strategy.
Note on the gene method: on write, user_id is generated with the gene method. For an x-bit gene: if the data is split into 8 tables, then 2^3 = 8, so x = 3, that is, a 3-bit gene. When querying by user_id, take the modulo directly and route to the corresponding database or table. When querying by user_name, first compute user_name_code with the user_name_code generation function, then take its modulo and route to the corresponding database or table. The snowflake algorithm is commonly used for ID generation.
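The gene method in the note above can be sketched as follows. This is a minimal sketch, not the author's implementation: the user_name_code function is assumed here to be a CRC32 hash, and a plain integer `sequence` stands in for a snowflake-style ID. Both are hypothetical choices.

```python
import zlib

GENE_BITS = 3                  # 2**3 = 8 tables, so a 3-bit gene
TABLE_COUNT = 1 << GENE_BITS


def user_name_code(user_name):
    """Hypothetical user_name_code function: a stable 32-bit hash of the name."""
    return zlib.crc32(user_name.encode("utf-8"))


def make_user_id(sequence, user_name):
    """Build a user_id whose low GENE_BITS bits are the gene taken from user_name."""
    gene = user_name_code(user_name) & (TABLE_COUNT - 1)
    return (sequence << GENE_BITS) | gene


def table_for_user_id(user_id):
    """Route by partition key: plain modulo."""
    return user_id % TABLE_COUNT


def table_for_user_name(user_name):
    """Route by non-partition key: modulo of user_name_code."""
    return user_name_code(user_name) % TABLE_COUNT


if __name__ == "__main__":
    uid = make_user_id(12345, "alice")
    # Both routes must agree, so a lookup by name hits a single table.
    print(table_for_user_id(uid) == table_for_user_name("alice"))
```

Because the low three bits of user_id are copied from user_name_code, both modulo operations give the same table index, and a query by user_name does not need to scan every table.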
Note on the redundancy method: queries by order_id or buyer_id are routed to the db_o_buyer database, and queries by seller_id are routed to the db_o_seller database. This feels like putting the cart before the horse. Is there a better way? What about changing the technology stack?
On the client side, only one non-partition key besides the partition key is used as a query condition:
Mapping method
Gene method
On the client side, more than one non-partition key besides the partition key is used as a query condition:
Mapping method
Redundancy method
On the back-office side, various combinations of non-partition keys besides the partition key are used as query conditions:
NoSQL method
Redundancy method
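The mapping method named in the lists above is commonly understood as keeping a lookup from the non-partition key to the partition key, then routing on the partition key as usual. The sketch below assumes that reading; the class name, the in-memory dictionary (standing in for a mapping table or cache), and the field names are all hypothetical.

```python
TABLE_COUNT = 8  # hypothetical number of tables


class MappingRouter:
    """Route by user_id directly, or by user_name through a name -> id mapping."""

    def __init__(self, table_count=TABLE_COUNT):
        self.table_count = table_count
        self._name_to_id = {}  # stands in for a mapping table or cache

    def register(self, user_id, user_name):
        """Record the mapping when the user row is written."""
        self._name_to_id[user_name] = user_id

    def table_for_user_id(self, user_id):
        return user_id % self.table_count

    def table_for_user_name(self, user_name):
        """Resolve the name to a user_id first; return None if it is unknown."""
        user_id = self._name_to_id.get(user_name)
        if user_id is None:
            return None
        return self.table_for_user_id(user_id)


if __name__ == "__main__":
    router = MappingRouter()
    router.register(42, "alice")
    print(router.table_for_user_name("alice"))
```

Compared with the gene method, this costs one extra lookup per query and a mapping that must be kept in sync, but it places no constraint on how user_id is generated.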
2. Cross-database, cross-table paging queries on non-partition keys
This assumes a horizontal database and table split that uses the common hash strategy.
Note: solve this with the NoSQL method (ES, etc.).
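To see why the note points to a NoSQL store, it helps to look at what a global page costs without one. The sketch below is an illustration of the problem, not the recommended solution: each shard is a plain list standing in for a sorted query result, and every shard has to return its first offset + limit rows before the application can merge them.

```python
import heapq
from itertools import islice


def page_across_shards(shards, offset, limit, key=lambda row: row):
    """Return one globally sorted page by merging per-shard results.

    Each shard must supply its first offset + limit rows, because any of
    them could belong to the requested page.
    """
    per_shard = [sorted(shard, key=key)[: offset + limit] for shard in shards]
    merged = heapq.merge(*per_shard, key=key)
    return list(islice(merged, offset, offset + limit))


if __name__ == "__main__":
    shards = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
    print(page_across_shards(shards, offset=3, limit=3))
```

The rows fetched per shard grow with the page offset, so deep pages become expensive, which is the motivation for moving such queries to a search store.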
3. Capacity expansion
This assumes a horizontal database and table split that uses the common hash strategy.
Horizontal database expansion (promote-replica method)
Note: capacity is expanded in multiples, for example by doubling.
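The reason expansion goes in multiples can be checked with modulo arithmetic. Assuming a modulo hash and a doubling from N to 2N databases (both assumptions of this sketch), every row in old database i lands in new database i or i + N, so a replica of database i already holds everything that new database i + N needs.

```python
def shard_of(user_id, shard_count):
    """Modulo routing, as in the hash strategy."""
    return user_id % shard_count


def new_shard_candidates(old_shard, old_count):
    """After doubling, rows from old_shard can only land on these two shards."""
    return (old_shard, old_shard + old_count)


if __name__ == "__main__":
    old_count = 4
    for uid in (5, 9, 13, 17):
        old = shard_of(uid, old_count)
        new = shard_of(uid, old_count * 2)
        print(uid, old, new, new in new_shard_candidates(old, old_count))
```

Under that assumption no rows need to be copied between databases at cutover; each side only has to delete, later, the rows that now route to its counterpart.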
Horizontal table expansion (double-write migration method)
Step 1: (synchronous double write) modify the application configuration and code to add double writes, then deploy.
Step 2: (synchronous double write) copy the old data from the old database to the new database.
Step 3: (synchronous double write) verify the old data in the new database against the old database.
Step 4: (synchronous double write) modify the application configuration and code to remove the double writes, then deploy.
Note: double writing is a general-purpose approach.
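The four steps can be walked through with two in-memory dictionaries standing in for the old and new stores. This is a sketch of the sequence only: the class and method names are hypothetical, and a real migration would also need to handle concurrent writes during the copy, failures, and retries.

```python
class DoubleWriteStore:
    """Toy model of a double-write migration between an old and a new store."""

    def __init__(self):
        self.old = {}
        self.new = {}
        self.double_write = False  # switched on in step 1, off in step 4

    def write(self, key, value):
        """Always write to the old store; also write to the new one when enabled."""
        self.old[key] = value
        if self.double_write:
            self.new[key] = value

    def backfill(self):
        """Step 2: copy rows that exist in the old store but not in the new one."""
        for key, value in self.old.items():
            self.new.setdefault(key, value)

    def verify(self):
        """Step 3: list keys whose value in the new store differs from the old."""
        return [
            key for key, value in self.old.items()
            if key not in self.new or self.new[key] != value
        ]


if __name__ == "__main__":
    store = DoubleWriteStore()
    store.write("a", 1)           # written before the migration starts
    store.double_write = True     # step 1: enable double writes
    store.write("b", 2)           # goes to both stores
    store.backfill()              # step 2: copy old data
    print(store.verify())         # step 3: an empty list means the stores agree
    store.double_write = False    # step 4: remove double writes
```

Enabling double writes before the copy is what lets the backfill skip keys already present in the new store: anything written after step 1 is already there.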
VI. Summary of database and table splitting
Before splitting databases or tables, first identify where the bottleneck is; only then can you split sensibly (database or table? horizontal or vertical? into how many?). Do not split just for the sake of splitting.
Choosing the partition key is important: it must both distribute data evenly and accommodate queries on non-partition keys.
As long as the requirements are met, keep the split rules as simple as possible.