Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Turn to: summary of Mysql partition and sub-table

2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Quote

Some time ago, the project needs, has been studying mysql sharding, read some of the information in this area, but also personally tested some data. Here, to make a general note, it is convenient to review the knowledge later, in fact, most of them learn from the predecessors on the network, and then practice with a learning attitude and accumulate things of their own.

Split strategy selection

In fact, splitting is very flexible, and there are plenty of vertical splits, splitting a library into two or more, and putting associated tables in one library. There are plenty of horizontal splits that split tables with large amounts of data according to a certain logic. I feel that vertical sharding relatively relieves the bottleneck of IO, while horizontal sharding is designed to reduce the pressure of reading and writing to a single table or some tables. Our project is divided according to individual needs, using the level of segmentation, not to sub-database. Then we'll see what kind of segmentation needs to be used. Learned are: sub-table, partition, MERGE engine sub-table.

Brief introduction of MERGE engine Subtabl

First introduce the merge table, this method is only applicable to MyISAM. The tables in my database are all based on the InnoDB engine, so they were pass at first, but let's briefly introduce them here. It's in the mysql 5.1 manual.

An alternative to a MERGE table is a partitioned table, which stores partitions of a single table in separate files. Partitioning enables some operations to be performed more efficiently and is not limited to the MyISAM storage engine.

Changing to the MERGE engine table means becoming a partitioned table so that the partitions of a single table are stored in separate files. Partitioning can make some operations more efficient and free from the limitations of the MyISAM storage engine. Poor English, please take care of it. )

The above should be the main reasons for using merge tables.

Create and use

The requirement to be able to create MERGE tables starts with a set of tables with exactly the same data structure and the storage engine MyISAM.

Let's create one first.

Mysql > CREATE TABLE T1 (- > an INT NOT NULL AUTO_INCREMENT PRIMARY KEY,-> message CHAR (20)) ENGINE=MyISAM;mysql > CREATE TABLE T2 (- > an INT NOT NULL AUTO_INCREMENT PRIMARY KEY,-> message CHAR (20)) ENGINE=MyISAM;mysql > INSERT INTO T1 (message) VALUES ('Testing'), (' table'), ('T1'); mysql > INSERT INTO T2 (message) VALUES ('Testing'), (' table'), ('T2') Mysql > CREATE TABLE total (- > an INT NOT NULL AUTO_INCREMENT,-> message CHAR (20), INDEX (a))-> ENGINE=MERGE UNION= (T1 ~ T2) INSERT_METHOD=LAST

Query after

Mysql > SELECT * FROM total;+---+-+ | a | message | +-- +-+ | 1 | Testing | | 2 | table | | 3 | T1 | | 1 | Testing | | 2 | table | | 3 | T2 | +-+-+

You created the total table, which is equivalent to creating the table on the basis of T1 T2. It should be noted that the primary key or unique index in a single table can no longer be used as the only index reference in the total table after MERGE, which should be easier to understand but still want to talk about it. At the same time you can drop or ALTER TABLE tbl_name UNION= (...) Change the dataset of the table so that it can change dynamically and eliminate what is not needed.

Working with scen

If your data records show a certain time pattern, such as some logs that need to be recorded every day, you may only need the last month or the last few months, so that you can create a data table every day or at a certain time. When you need to query the data for a certain period of time, you only need to create a total MERGE table for this period of time. So that the data set can be controlled, not bad. So easy .

Sub-table

In fact, the idea of sub-table is very simple, as the name implies, is to split an existing table with a large amount of data. If the performance bottleneck of the database is on several key tables, you can include sub-tables in your consideration.

Problems encountered

Let me talk about the problems I encountered in the experiment and the related solutions.

1. How to divide the table according to what strategy divides the data in the existing table into multiple tables, and also takes into account the future scalability. This discussion in Germany can be used for reference.

"what are the mysql sub-tables and splitting strategies? Under what circumstances do you apply each? "" the strategy of breaking up the net again "

Is to create an index table, and the user id corresponds to the database id. (here he divides the tables with the same structure into different databases to further reduce the pressure, but at the same time, the synchronization of the data also needs to be solved by other means.), its essence is to divide the tables and databases at the same time. The advantage of this is that it is easy to expand in the future, but it loses a little bit of performance, because there will be one more query.

In my personal opinion, index tables may become a new bottleneck unless users don't keep growing. My approach belongs to another way. I write an algorithm that calculates a column of values and distributes the data roughly evenly in each sub-table according to a certain rule. As for scalability, the problem of increasing the number of subtables in the future is taken into account when writing the algorithm. Which strategy to choose depends on the business characteristics of your own table, the method has no absolute advantages and disadvantages, or should be chosen according to your own needs.

two。 After the sub-table, the primary key is automatically incremented bigint before the maintenance of the sub-table. So the format of the primary key has been determined in advance, and something like uuid has been dropped directly by pass. Also thought of writing a primary bond generation program, using the Atomic atomic weight feature of Java, but considering the need to increase the workload and high concurrency, this is likely to be a hidden danger. There is also the management of primary keys on the application layer, such as the atomic increment in redis. The more famous strategy on the Internet is "Ticket Servers: Distributed Unique Primary Keys on the Cheap", which roughly means to use a MyISAM storage engine table named Tickets64, which is specially used to store primary keys. There is only one row of data.

REPLACE INTO Tickets64 (stub) VALUES ('a'); SELECT LAST_INSERT_ID ()

Come and get it. And set up two libraries, the same method, but each growth step is different, to prevent one from falling, but also can run stably. Other good articles "Database Division Table (sharding) Series (II) Global Primary key Generation Strategy", "about Primary key Management", "solution to Global uniqueness of Primary key after Database Division Table (sharding)"

two。 After the dynamic selection of table names and table scores, the problem comes again, our project in the database layer uses the Mybatis framework. The SQL statements are written in the xml file, and now I need to set the table name dynamically. In fact, setting mybatis itself can solve this problem.

One of statementType STATEMENT,PREPARED or CALLABLE. This causes MyBatis to use Statement,PreparedStatement or CallableStatement, respectively, with a default value of PREPARED

As long as the property statementType is set to STATEMENT, the table name can be passed in as a parameter. When passing parameters, pass parameters with the dollar sign ${columnName}. As for the difference between Statement,PreparedStatement, I think everyone should know.

Another solution is to use the "shardbatis plug-in", which is open source and can achieve the function of data level segmentation, which can be learned by interested friends.

Partition table

After mysql5.1, a table of partition engine is provided. Look at this sentence.

In effect, different portions of a table are stored as separate tables in different locations. In fact, parts of a table can be stored in different locations as separate individual tables (slightly crappy)

As far as I understand it, if you partition a table and put different partitions on different disk locations, is it more beneficial to read as a whole?

Advantages and disadvantages of partition tables

Here is mainly read the mysql manual, I also played a role as a translator.

Partitioning makes it possible to store more data in one table than can be held on a single disk or file system partition. Compared to a table, it can only be stored in a hard disk or file system partition. Partitioning makes it possible to store more data.

Data that loses its usefulness can often be easily removed from a partitioned table by dropping the partition (or partitions) containing only that data. Conversely, the process of adding new data can in some cases be greatly facilitated by adding one or more new partitions for storing specifically that data. Invalid data can be removed more easily by removing partitions that contain only this data by dropping. On the contrary, it is easier to store some new data by adding new partitions.

Some queries can be greatly optimized in virtue of the fact that data satisfying a given WHERE clause can be stored only on one or more partitions, which automatically excludes any remaining partitions from the search. Because partitions can be altered after a partitioned table has been created, you can reorganize your data to enhance frequent queries that may not have been often used when the partitioning scheme was first set up. This ability to exclude non-matching partitions (and thus any rows they contain) is often referred to as partition pruning, and was implemented in MySQL 5.1.6. This sentence is very difficult to translate, let me give you a general meaning, when you partition by a column, the query will be optimized if you can specify a specific partition or a range in the query statement where. In fact, it is easy to understand, because you specify a partition in where, the query will only retrieve the partition you specify, not other data. The latter part says that you can modify the partition on the created partition to make it more reasonable.

Queries involving aggregate functions such as SUM () and COUNT () can easily be parallelized. Those aggregate functions, such as SUM () and COUNT (), are easily processed in parallel. (sounds cool.)

These two articles are well written, "advantages and disadvantages of MySQL partitioned tables" and "limitations of mysql partitioned tables on partitioning functions". There is another thing to consider when choosing a mysql partitioning scheme. In mysql's bug, there is a bug about the mysql partitioned table query cache: "Partitioning + Query Cache". Because of this problem, mysql has cached the query of the partitioned table disable, whether you enable query caching or not, query caching will not be enabled. If you care about this, please choose the plan carefully.

Real case

Some good sharding examples on the Internet, attach links, and share "Database Sharding at Netlog, with MySQL and PHP" with you. The design of the database sub-database in the framework of the net again. "Amazon's Dynamo". Ticket Servers: Distributed Unique Primary Keys on the Cheap. Some need to climb down the wall to see, as for how to climb the wall, I believe that every god has his own way.

The above is about the summary of the three split schemes of mysql. The information is all found by ourselves, so it is inevitable that some of them will be inaccurate. If you find anything, please let us know. I hope to grow with you.

Note: later, we will also consider how to actually operate in the database layer, set up partition tables and data import tests.

Https://my.oschina.net/OpenSourceBO/blog/353464

Https://my.oschina.net/u/914897/blog/492421

Http://haitian299.github.io/2016/05/26/mysql-partitioning/

Http://www.itmmd.com/201411/208.html

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report