Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the reason why the partitioning key in the MySQL partition table must be part of the primary key?

2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces the MySQL partition table partition key must be part of the primary key is part of the relevant knowledge, the content is detailed and easy to understand, simple and fast operation, has a certain reference value, I believe that after reading this MySQL partition table partition key must be part of the reason why the article will have a harvest, let's take a look at it.

Foreword:

Partitioning is a design pattern of tables. Generally speaking, table partitioning is to divide a large table into several small tables according to conditions. But for applications, partitioned tables are the same as non-partitioned tables. In other words, partitioning is transparent to the application, but the database reorganizes the data.

With the continuous development of business, there will be more and more data in the database, accordingly, the amount of data in a single table will become larger and larger, up to a critical value, the query performance of a single table will decline.

This critical value can not be generalized, it is related to hardware capability and specific business.

Although in many MySQL operation and maintenance specifications, it is recommended that the single table should not exceed 500w or 1000W.

But in fact, in the production environment, I have also seen the size of more than 2T, record hundreds of millions of tables, at the same time, the business is not affected.

When a single table is too large, the business usually considers two split schemes: horizontal split and vertical split.

Horizontal split VS vertical split

Split horizontally, the split dimension is a row, and the records in the table are generally split into multiple tables according to a certain rule or algorithm.

The split table can be in either one instance or multiple different instances. If it is the latter, distributed transactions will be involved.

Split vertically, the dimension of the split is the column, which is generally split into multiple business modules. This kind of split is more of a split of the upper business.

In terms of the complexity of the transformation, the former is less than the latter.

Therefore, when the amount of data in a single table is too large, horizontal splitting is more commonly used in the industry.

The common horizontal splitting schemes are: sub-database table and partition table.

Although the sub-library and sub-table is a relatively thorough horizontal split scheme, on the one hand, its transformation takes a certain amount of time; on the other hand, it also has certain requirements for the ability of development. Relatively speaking, partitioned tables are relatively simple and do not require business transformation.

Partition table

Many people may think that the advantage of MySQL lies in OLTP applications, which is not suitable for OLAP applications, so partitioned tables are not recommended as partial OLAP features.

But in fact, for some types of business, it is more suitable to use partition tables, especially those that have obvious differences between hot and cold data, and the hot and cold data are time-related.

Let's look at the advantages of partitioned tables:

Improve query performance:

For the query operation of the partition table, if the query condition contains the partition key, the query operation will only be pushed down to the qualified partition, and the irrelevant partition will be filtered out automatically.

In the case of a large amount of data, it can improve the query speed.

Transparent to the business:

The business side does not need to make any transformation to convert the table from a non-partitioned table to a partitioned table.

Easy to manage:

When deleting, migrating, and maintaining a single partition, other partitions are not affected.

In particular, the delete (DROP) operation for a single partition avoids DELETE operations for all records of that partition.

Unfortunately, MySQL partitioned tables do not support parallel queries. In theory, when a query involves multiple partitions, parallel queries should be carried out between partitions in order to make full use of multi-core CPU resources.

But MySQL does not support it, including the early official documentation, which also mentions this issue and puts the implementation of this feature on the priority list.

These features are not currently implemented in MySQL Partitioning, but are high on our list of priorities.- Queries involving aggregate functions such as SUM () and COUNT () can easily be parallelized. A simple example of such a query might be SELECT salesperson_id, COUNT (orders) as order_total FROM sales GROUP BY salesperson_id;. Changes of partitioned tables in By "parallelized," we mean that the query can be run simultaneously on each partition, and the final result obtained merely by summing the results obtained for all partitions.- Achieving greater query throughput in virtue of spreading data seeks over multiple disks.MySQL 8.0

A significant update to the partitioning table in MySQL 5.7is that the InnoDB storage engine natively supports partitioning without having to implement it through the ha_partition interface.

So, in MySQL 5.7.If you want to create a partition table based on the MyISAM storage engine, you will be prompted with warning.

The partition engine, used by table 'sbtest.t_range', is deprecated and will be removed in a future release. Please use native partitioning instead.

In MySQL 8.0, even more thoroughly, the server layer removes the ha_partition interface code.

If you want to use partition tables, you can only use a storage engine that supports native partitions. In MySQL 8.0, there is only InnoDB.

This means that in MySQL 8.0, it is almost impossible to create MyISAM partitioned tables.

This explains from another point of view why MyISAM tables are not recommended in production.

Mysql > CREATE TABLE t_range (- > id INT,-> name VARCHAR (10)->) ENGINE = MyISAM-> PARTITION BY RANGE (id) (- > PARTITION p0 VALUES LESS THAN (5),-> PARTITION p1 VALUES LESS THAN (10)->); ERROR 1178 (42000): why does The storage engine for the table doesn't support native partitioning have to be part of the primary key?

When using partition tables, you often encounter the following error.

Mysql > CREATE TABLE opr (- > opr_no INT,-> opr_date DATETIME,-> description VARCHAR (30),-> PRIMARY KEY (opr_no)->)-> PARTITION BY RANGE COLUMNS (opr_date) (- > PARTITION p0 VALUES LESS THAN ('20210101'),-> PARTITION p1 VALUES LESS THAN ('20210102'),-> PARTITION p2 VALUES LESS THAN MAXVALUE->) ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function (prefixed columns are not considered).

That is, the partitioning key must be part of the primary key.

The opr above is an operation flow table. Where opr_no is the operation serial number, which is generally set as the primary key, and opr_date is the operation time. Partitioning based on operating time is a common partitioning scenario.

To break this limitation, opr_date can be used as part of the primary key.

Mysql > CREATE TABLE opr (- > opr_no INT,-> opr_date DATETIME,-> description VARCHAR (30),-> PRIMARY KEY (opr_no, opr_date)->)-> PARTITION BY RANGE COLUMNS (opr_date) (- > PARTITION p0 VALUES LESS THAN ('20210101'),-> PARTITION p1 VALUES LESS THAN ('20210102'),-> PARTITION p2 VALUES LESS THAN MAXVALUE->) Query OK, 0 rows affected (0.04 sec)

However, this creation will bring a new problem, that is, for the same opr_no, it can be inserted into different partitions.

Mysql > CREATE TABLE opr (- > opr_no INT,-> opr_date DATETIME,-> description VARCHAR (30),-> PRIMARY KEY (opr_no, opr_date)->)-> PARTITION BY RANGE COLUMNS (opr_date) (- > PARTITION p0 VALUES LESS THAN ('20210101'),-> PARTITION p1 VALUES LESS THAN ('20210102'),-> PARTITION p2 VALUES LESS THAN MAXVALUE->) Query OK, 0 rows affected (0.04 sec) mysql > insert into opr values; Query OK, 1 row affected (0.00 sec) mysql > insert into opr values; Query OK, 1 row affected (0.00 sec) mysql > select * from opr partition (p0) +-+ | opr_no | opr_date | description | +-+ | 1 | 2020-12-31 00:00:01 | | abc | +-+ 1 row in set (0.00 sec) mysql > select * from opr partition (p1) | +-+ | opr_no | opr_date | description | +-+ | 1 | 2021-01-01 00:00:01 | | abc | +-+ 1 row in set (0.00 sec) |

This actually violates the uniqueness requirements of the business for opr_no.

Since this is the case, some children's shoes will suggest adding a unique index to opr_no, But, the reality is cruel.

Mysql > create unique index uk_opr_no on opr (opr_no); ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in the table's partitioning function (prefixed columns are not considered)

Even if you add a unique index, the partitioning key must be included in the unique index.

All in all, for MySQL partitioned tables, the uniqueness of non-partitioned columns at the table level cannot be guaranteed at the database level, only within the partition.

This is also one of the areas where MySQL partition tables have been criticized.

But in fact, this pot is not appropriate for MySQL to memorize, and there is also this limitation for Oracle index organization tables (InnoDB is index organization tables).

The Oracle official document (http://docs.oracle.com/cd/E11882_01/server.112/e40540/schemaob.htm#CNCPT1514) explicitly mentions that "the partitioning key must be part of the primary key" when talking about the features of the index organization table (Index-Organized Table, referred to as IOT).

Note the following characteristics of partitioned IOTs:-Partition columns must be a subset of primary key columns. -Secondary indexes can be partitioned locally and globally. -OVERFLOW data segments are always equipartitioned with the table partitions.

Next, let's take a look at the implementation effect of the initial table-building SQL in Oracle.

SQL > CREATE TABLE opr_oracle (opr_no NUMBER, opr_date DATE, description VARCHAR2 (30), PRIMARY KEY (opr_no)) ORGANIZATION INDEX PARTITION BY RANGE (opr_date) (PARTITION p0 VALUES LESS THAN (TO_DATE ('20170713,' yyyymmdd')), PARTITION p1 VALUES LESS THAN (TO_DATE ('20170714),' yyyymmdd'), PARTITION p2 VALUES LESS THAN (MAXVALUE)) PARTITION BY RANGE (opr_date) (* ERROR at line 8:ORA-25199: partitioning key of an index-organized table must be a subset of theprimary key

The same error was reported.

Note: ORGANIZATION INDEX is specified here, and the index organization table is created.

It seems that the fact that the partitioning key must be part of the primary key is not a limitation of MySQL, but a limitation of the index organization table.

In my opinion, the reason why there is such a restriction on index organization tables is based on performance considerations.

Assuming that the partition key and the primary key are two different columns, although the partition key is also specified during the insert operation, all partitions need to be scanned to determine whether the inserted primary key value violates the uniqueness constraint. In this way, the efficiency will be relatively low, which goes against the original intention of the partition table.

There is no such restriction for heap tables.

In the heap table, the primary key and the data in the table are stored separately, and only the primary key index is used to determine whether the inserted primary key value violates the uniqueness constraint.

But unlike MySQL, Oracle implements a global index, so the problem that the same opr_no above is allowed to be inserted into different partitions can be avoided by a globally unique index.

SQL > CREATE TABLE opr_oracle (opr_no NUMBER, opr_date DATE, description VARCHAR2 (30), PRIMARY KEY (opr_no, opr_date)) ORGANIZATION INDEX PARTITION BY RANGE (opr_date) (PARTITION p0 VALUES LESS THAN (TO_DATE ('20170713,' yyyymmdd')), PARTITION p1 VALUES LESS THAN (TO_DATE ('20170714,' yyyymmdd')), PARTITION p2 VALUES LESS THAN (MAXVALUE)) Table created.SQL > create unique index uk_opr_no on opr_oracle (opr_no); Index created.SQL > insert into opr_oracle values (1 row created.SQL > insert into opr_oracle values ('2020-12-31 00 opr_no),' abc'); 1 row created.SQL > insert into opr_oracle values ('2020-12-31 00 opr_no),' abc') Insert into opr_oracle values (abc') * ERROR at line 1:ORA-00001: unique constraint (SCOTT.SYS_IOT_TOP_87350) violated ('2020-12-31 00 abc' 00 ERROR at line 1:ORA-00001)

But MySQL is powerless, because the MySQL partitioned table implements only the local partitioned index (Local Partitioned Index), not the global index in Oracle (Global Index).

Local partition index VS global index

The schematic diagram of the local partition index and the global index is as follows:

Combined with the schematic, let's look at the difference between the two indexes:

The local partition index is also a partition index, and there is an one-to-one correspondence between the partition index and the table partition.

The global index can be either partitioned or unpartitioned.

In the case of a global partitioned index, a partitioned index can correspond to multiple table partitions, just as a table partition can correspond to multiple partitioned indexes.

The management of the local partition index affects only a single partition, not other partitions.

The management of the global partitioned index will invalidate the entire index, which, of course, can be avoided by the UPDATE INDEXES clause.

Local partitioned indexes can only guarantee uniqueness within partitions, not table-level uniqueness, but global partitions can.

In Oracle, whether an index organizes a table or a heap table, if you want to create a locally unique index, the partitioning key must also be part of the unique key.

SQL > create unique index uk_opr_no_local on opr_oracle (opr_no) local;create unique index uk_opr_no_local on opr_oracle (opr_no) local * ERROR at line 1:ORA-14039: partitioning columns must form a subset of key columns of a UNIQUEindex's article on "Why the partitioning key must be part of the primary key in a MySQL partition table" ends here. Thank you for reading! I believe that everyone has a certain understanding of the knowledge of "what is the reason why the partition key must be part of the primary key in the MySQL partition table". If you want to learn more, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report