Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to deeply analyze the Partition function of MySQL Partition

2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article will explain in detail how to deeply analyze the Partition function of MySQL partition. The content of the article is of high quality, so the editor will share it for you as a reference. I hope you will have a certain understanding of the relevant knowledge after reading this article.

Support for partitions (Partition) since 5.1,

= horizontal partition (divided by row according to column properties) =

To take a simple example: a table containing ten-year invoice records can be partitioned into ten different partitions, each containing records for one year.

Several modes of horizontal partition: =

* Range (scope)-this mode allows DBA to divide the data into different ranges. For example, DBA can divide a table into three partitions by year, data from the 1980s (1980s), data from the 1990s (1990s), and any data after 2000 (including 2000).

* Hash (hash)-this mode allows DBA to compute the Hash Key of one or more columns of the table, and finally partition the data regions corresponding to different values of the Hash code. For example, DBA can create a table that partitions the primary key of the table.

* Key (key value)-an extension of the above Hash mode, where Hash Key is generated by the MySQL system.

* List (predefined list)-this mode allows the system to split the row data corresponding to the values of the list defined by DBA. For example, DBA creates a table that spans three partitions, based on the data corresponding to the 2004, 2005 and 2006 values.

* Composite (composite mode)-it's mysterious, , it's actually a combination of the above patterns, so I don't explain it. For example: on a table that initializes a Range range partition, we can perform a hash hash partition on one of the partitions.

= vertical partition (by column) =

To take a simple example: a table that contains large text and BLOB columns, and these text and BLOB columns are not often accessed, so it is necessary to divide these infrequently used text and BLOB into another partition to improve access speed while ensuring their data relevance.

[test procedure for partitioned and unpartitioned tables]

* create a partition table and split it by the year of the date

[sql] view plain copy

Mysql > CREATE TABLE part_tab (C1 int default NULL, c2 varchar (30) default NULL, c3 date default NULL) engine=myisam

PARTITION BY RANGE (year (c3)) (PARTITION p0 VALUES LESS THAN (1995)

PARTITION p1 VALUES LESS THAN (1996), PARTITION p2 VALUES LESS THAN (1997)

PARTITION p3 VALUES LESS THAN (1998), PARTITION p4 VALUES LESS THAN (1999)

PARTITION p5 VALUES LESS THAN (2000), PARTITION p6 VALUES LESS THAN (2001)

PARTITION p7 VALUES LESS THAN (2002), PARTITION p8 VALUES LESS THAN (2003)

PARTITION p9 VALUES LESS THAN (2004), PARTITION p10 VALUES LESS THAN (2010)

PARTITION p11 VALUES LESS THAN MAXVALUE)

Note the last line, considering the maximum possible value

* create an unpartitioned table

[sql] view plain copy

Mysql > create table no_part_tab (C1 int (11) default NULL,c2 varchar (30) default NULL,c3 date default NULL) engine=myisam

* inject 8 million pieces of test data through stored procedures

Mysql > set sql_mode=''; / * if you fail to create a stored procedure, you need to set this variable first, bug? * /

MySQL > delimiter / * set the statement Terminator to / /, because stored procedure statements use; end * /

[sql] view plain copy

Mysql > CREATE PROCEDURE load_part_tab ()

Begin

Declare v int default 0

While v

< 8000000 do insert into part_tab values (v,'testing partitions',adddate('1995-01-01',(rand(v)*36520) mod 3652)); set v = v + 1; end while; end // mysql>

Delimiter

Mysql > call load_part_tab ()

Query OK, 1 row affected (8 min 17.75 sec)

[sql] view plain copy

Mysql > insert into no_part_tab select * from part_tab

Query OK, 8000000 rows affected (51.59 sec)

Records: 8000000 Duplicates: 0 Warnings: 0

* Test SQL performance

[sql] view plain copy

Mysql > select count (*) from part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'; +----------+ | count(*) | +----------+ | 795181 | +----------+ 1 row in set (0.55 sec) [sql] view plain copy mysql>

Select count (*) from no_part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'; +----------+ | count(*) | +----------+ | 795181 | +----------+ 1 row in set (4.69 sec) 结果表明分区表比未分区表的执行时间少90%。 * 通过explain语句来分析执行情况 [sql] view plain copy mysql >

Explain select count (*) from no_part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'\G /* 结尾的\G使得mysql的输出改为列模式 */ *************************** 1. row *************************** id: 1 select_type: SIMPLE table: no_part_tab type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 8000000 Extra: Using where 1 row in set (0.00 sec) [sql] view plain copy mysql>

Explain select count (*) from part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: part_tab type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 798458 Extra: Using where 1 row in set (0.00 sec) explain语句显示了SQL查询要处理的记录数目 * 试验创建索引后情况 [sql] view plain copy mysql>

Create index idx_of_c3 on no_part_tab (c3)

Query OK, 8000000 rows affected (1 min 18.08 sec)

Records: 8000000 Duplicates: 0 Warnings: 0

[sql] view plain copy

Mysql > create index idx_of_c3 on part_tab (c3)

Query OK, 8000000 rows affected (1 min 19.19 sec)

Records: 8000000 Duplicates: 0 Warnings: 0

List of database file sizes after index creation:

2008-05-24 09:23 8608 no_part_tab.frm

2008-05-24 09:24 255999996 no_part_tab.MYD

2008-05-24 09:24 81611776 no_part_tab.MYI

2008-05-24 09:25 0 part_tab#P#p0.MYD

2008-05-24 09:26 1024 part_tab#P#p0.MYI

2008-05-24 09:26 25550656 part_tab#P#p1.MYD

2008-05-24 09:26 8148992 part_tab#P#p1.MYI

2008-05-24 09:26 25620192 part_tab#P#p10.MYD

2008-05-24 09:26 8170496 part_tab#P#p10.MYI

2008-05-24 09:25 0 part_tab#P#p11.MYD

2008-05-24 09:26 1024 part_tab#P#p11.MYI

2008-05-24 09:26 25656512 part_tab#P#p2.MYD

2008-05-24 09:26 8181760 part_tab#P#p2.MYI

2008-05-24 09:26 25586880 part_tab#P#p3.MYD

2008-05-24 09:26 8160256 part_tab#P#p3.MYI

2008-05-24 09:26 25585696 part_tab#P#p4.MYD

2008-05-24 09:26 8159232 part_tab#P#p4.MYI

2008-05-24 09:26 25585216 part_tab#P#p5.MYD

2008-05-24 09:26 8159232 part_tab#P#p5.MYI

2008-05-24 09:26 25655740 part_tab#P#p6.MYD

2008-05-24 09:26 8181760 part_tab#P#p6.MYI

2008-05-24 09:26 25586528 part_tab#P#p7.MYD

2008-05-24 09:26 8160256 part_tab#P#p7.MYI

2008-05-24 09:26 25586752 part_tab#P#p8.MYD

2008-05-24 09:26 8160256 part_tab#P#p8.MYI

2008-05-24 09:26 25585824 part_tab#P#p9.MYD

2008-05-24 09:26 8159232 part_tab#P#p9.MYI

2008-05-24 09:25 8608 part_tab.frm

2008-05-24 09:25 68 part_tab.par

* Test SQL performance again

[sql] view plain copy

Mysql > select count (*) from no_part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'; +----------+ | count(*) | +----------+ | 795181 | +----------+ 1 row in set (2.42 sec) /* 为原来4.69 sec 的51%*/ 重启mysql ( net stop mysql, net start mysql)后,查询时间降为0.89 sec,几乎与分区表相同。 [sql] view plain copy mysql>

Select count (*) from part_tab where c3 > date '1995-01-01' and c3

< date '1995-12-31'; +----------+ | count(*) | +----------+ | 795181 | +----------+ 1 row in set (0.86 sec) * 更进一步的试验 ** 增加日期范围 [sql] view plain copy mysql>

Select count (*) from no_part_tab where c3 > date '1995-01-01' and c3

< date '1997-12-31'; +----------+ | count(*) | +----------+ | 2396524 | +----------+ 1 row in set (5.42 sec) [sql] view plain copy mysql>

Select count (*) from part_tab where c3 > date '1995-01-01' and c3

< date '1997-12-31'; +----------+ | count(*) | +----------+ | 2396524 | +----------+ 1 row in set (2.63 sec) ** 增加未索引字段查询 [sql] view plain copy mysql>

Select count (*) from part_tab where c3 > date '1995-01-01' and c3

< date '1996-12-31' and c2='hello'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.75 sec) [sql] view plain copy mysql>

Select count (*) from no_part_tab where c3 > date '1995-01-01' and c3

< date '1996-12-31' and c2='hello'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (11.52 sec) = 初步结论 = * 分区和未分区占用文件空间大致相同 (数据和索引文件) * 如果查询语句中有未建立索引字段,分区时间远远优于未分区时间 * 如果查询语句中字段建立了索引,分区和未分区的差别缩小,分区略优于未分区。 = 最终结论 = * 对于大数据量,建议使用分区功能。 * 去除不必要的字段 * 根据手册, 增加myisam_max_sort_file_size 会增加分区性能 [分区命令详解] = 分区例子 = * RANGE 类型 [sql] view plain copy CREATE TABLE users ( uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '', email VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY RANGE (uid) ( PARTITION p0 VALUES LESS THAN (3000000) DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 VALUES LESS THAN (6000000) DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx', PARTITION p2 VALUES LESS THAN (9000000) DATA DIRECTORY = '/data4/data' INDEX DIRECTORY = '/data5/idx', PARTITION p3 VALUES LESS THAN MAXVALUE DATA DIRECTORY = '/data6/data' INDEX DIRECTORY = '/data7/idx' ); 在这里,将用户表分成4个分区,以每300万条记录为界限,每个分区都有自己独立的数据、索引文件的存放目录,与此同时,这些目录所在的物理磁盘分区可能也都是完全独立的,可以提高磁盘IO吞吐量。 * LIST 类型 [sql] view plain copy CREATE TABLE category ( cid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY LIST (cid) ( PARTITION p0 VALUES IN (0,4,8,12) DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 VALUES IN (1,5,9,13) DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx', PARTITION p2 VALUES IN (2,6,10,14) DATA DIRECTORY = '/data4/data' INDEX DIRECTORY = '/data5/idx', PARTITION p3 VALUES IN (3,7,11,15) DATA DIRECTORY = '/data6/data' INDEX DIRECTORY = '/data7/idx' ); 分成4个区,数据文件和索引文件单独存放。 * HASH 类型 [sql] view plain copy CREATE TABLE users ( uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '', email VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY HASH (uid) PARTITIONS 4 ( PARTITION p0 DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx', PARTITION p2 DATA DIRECTORY = '/data4/data' INDEX DIRECTORY = '/data5/idx', PARTITION p3 DATA DIRECTORY = '/data6/data' INDEX DIRECTORY = '/data7/idx' ); 分成4个区,数据文件和索引文件单独存放。 例子: [sql] view plain copy CREATE TABLE ti2 (id INT, amount DECIMAL(7,2), tr_date DATE) ENGINE=myisam PARTITION BY HASH( MONTH(tr_date) ) PARTITIONS 6; CREATE PROCEDURE load_ti2() begin declare v int default 0; while v < 80000 do insert into ti2 values (v,'3.14',adddate('1995-01-01',(rand(v)*3652) mod 365)); set v = v + 1; end while; end // * KEY 类型 [sql] view plain copy CREATE TABLE users ( uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '', email VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY KEY (uid) PARTITIONS 4 ( PARTITION p0 DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx', PARTITION p2 DATA DIRECTORY = '/data4/data' INDEX DIRECTORY = '/data5/idx', PARTITION p3 DATA DIRECTORY = '/data6/data' INDEX DIRECTORY = '/data7/idx' ); 分成4个区,数据文件和索引文件单独存放。 * 子分区 子分区是针对 RANGE/LIST 类型的分区表中每个分区的再次分割。再次分割可以是 HASH/KEY 等类型。例如: [sql] view plain copy CREATE TABLE users ( uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '', email VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY RANGE (uid) SUBPARTITION BY HASH (uid % 4) SUBPARTITIONS 2( PARTITION p0 VALUES LESS THAN (3000000) DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 VALUES LESS THAN (6000000) DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx' ); 对 RANGE 分区再次进行子分区划分,子分区采用 HASH 类型。 或者 [sql] view plain copy CREATE TABLE users ( uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30) NOT NULL DEFAULT '', email VARCHAR(30) NOT NULL DEFAULT '' ) PARTITION BY RANGE (uid) SUBPARTITION BY KEY(uid) SUBPARTITIONS 2( PARTITION p0 VALUES LESS THAN (3000000) DATA DIRECTORY = '/data0/data' INDEX DIRECTORY = '/data1/idx', PARTITION p1 VALUES LESS THAN (6000000) DATA DIRECTORY = '/data2/data' INDEX DIRECTORY = '/data3/idx' ); 对 RANGE 分区再次进行子分区划分,子分区采用 KEY 类型。 = 分区管理 = * 删除分区 [sql] view plain copy ALERT TABLE users DROP PARTITION p0; 删除分区 p0。 * 重建分区 o RANGE 分区重建 [sql] view plain copy ALTER TABLE users REORGANIZE PARTITION p0,p1 INTO (PARTITION p0 VALUES LESS THAN (6000000)); 将原来的 p0,p1 分区合并起来,放到新的 p0 分区中。 o LIST 分区重建 [sql] view plain copy ALTER TABLE users REORGANIZE PARTITION p0,p1 INTO (PARTITION p0 VALUES IN(0,1,4,5,8,9,12,13)); 将原来的 p0,p1 分区合并起来,放到新的 p0 分区中。 o HASH/KEY 分区重建 [sql] view plain copy ALTER TABLE users REORGANIZE PARTITION COALESCE PARTITION 2; 用 REORGANIZE 方式重建分区的数量变成2,在这里数量只能减少不能增加。想要增加可以用 ADD PARTITION 方法。 * 新增分区 o 新增 RANGE 分区 [sql] view plain copy ALTER TABLE category ADD PARTITION (PARTITION p4 VALUES IN (16,17,18,19) DATA DIRECTORY = '/data8/data' INDEX DIRECTORY = '/data9/idx'); 新增一个RANGE分区。 o 新增 HASH/KEY 分区 [sql] view plain copy ALTER TABLE users ADD PARTITION PARTITIONS 8; 将分区总数扩展到8个。 [ 给已有的表加上分区 ] [sql] view plain copy alter table results partition by RANGE (month(ttime)) (PARTITION p0 VALUES LESS THAN (1), PARTITION p1 VALUES LESS THAN (2) , PARTITION p2 VALUES LESS THAN (3) , PARTITION p3 VALUES LESS THAN (4) , PARTITION p4 VALUES LESS THAN (5) , PARTITION p5 VALUES LESS THAN (6) , PARTITION p6 VALUES LESS THAN (7) , PARTITION p7 VALUES LESS THAN (8) , PARTITION p8 VALUES LESS THAN (9) , PARTITION p9 VALUES LESS THAN (10) , PARTITION p10 VALUES LESS THAN (11), PARTITION p11 VALUES LESS THAN (12), PARTITION P12 VALUES LESS THAN (13) ); 默认分区限制分区字段必须是主键(PRIMARY KEY)的一部分,为了去除此 限制: [方法1] 使用ID [sql] view plain copy mysql>

ALTER TABLE np_pk

-> PARTITION BY HASH (TO_DAYS (added))

-> PARTITIONS 4

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

However, this statement using the id column for the partitioning column is valid, as shown here:

[sql] view plain copy

Mysql > ALTER TABLE np_pk

-> PARTITION BY HASH (id)

-> PARTITIONS 4

Query OK, 0 rows affected (0.11 sec)

Records: 0 Duplicates: 0 Warnings: 0

[method 2] remove the original competition and generate new competition

[sql] view plain copy

Mysql > alter table results drop PRIMARY KEY

Query OK, 5374850 rows affected (7 min 4.05 sec)

Records: 5374850 Duplicates: 0 Warnings: 0

[sql] view plain copy

Mysql > alter table results add PRIMARY KEY (id, ttime)

Query OK, 5374850 rows affected (6 min 14.86 sec)

Records: 5374850 Duplicates: 0 Warnings: 0

On how to in-depth analysis of MySQL partition Partition function to share here, I hope the above content can be of some help to you, can learn more knowledge. If you think the article is good, you can share it for more people to see.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report