In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
It is believed that many inexperienced people have no idea about how to realize the partition table in MySQL. Therefore, this paper summarizes the causes and solutions of the problem. Through this article, I hope you can solve this problem.
Establishment of MySQL partition
MySQL can create four types of partitions:
RANGE partition: assigns multiple rows to a partition based on column values that belong to a given contiguous interval. For details, see Section 18.2.1, "RANGE Partition".
LIST partitions: similar to partitioning by RANGE, except that LIST partitions are selected based on column values matching a value in a set of discrete values. For more information, see Section 18.2.2, "LIST Partition".
HASH partition: a partition that is selected based on the return value of a user-defined expression that uses the column values of the rows to be inserted into the table. This function can contain any valid expression in MySQL that produces a non-negative integer value. For details, see Section 18.2.3, "HASH Partition".
KEY partitions: similar to partitioning by HASH, except that KEY partitions only support computing one or more columns, and the MySQL server provides its own hash function. One or more columns must contain integer values. For more information, please see 18.2.4. KEY partition.
Subpartition: a subpartition is a re-partition of each partition in the partition table. Writing format reference: 18.2.5. Subpartition
(1) what should be paid attention to about subpartitions: each partition must have the same number of subpartitions.
(2) if you use SUBPARTITION on any partition on a partition table to explicitly define any subpartitions, then all subpartitions must be defined.
When creating a partition, you can specify the data storage location and index location of the partition, so that different data can be saved across disks or file systems. Data split disk storage can increase the data reading speed to a certain extent, because after using multiple disks, the Icano operation of each disk will be reduced. And the storage capacity can be increased by using the specified partition storage location.
It is important to remember that no matter what type of partition is used, the partition is automatically numbered sequentially when it is created and recorded starting at 0. When a new row is inserted into a partition table, these partition numbers are used to identify the correct partition. For example, if your table uses four partitions, those partitions are numbered 0, 1, 2, and 3. For RANGE and LIST partition types, it is necessary to make sure that each partition number defines a partition. For HASH partitions, the user function used must return an integer value greater than 0. For KEY partitions, this problem is automatically handled by the hash function used internally by the MySQL server. Note: partition names are case-insensitive, and for RANGE and LIST partitions, partition names cannot be repeated. These can be selected according to different needs, and the more common one is the RANGE partition.
Commonly used partition management of MySQL:
RANGE and LIST partition management
Partitions are transparent to the program, and only deletions can be operated at the partition level, while others, such as queries, modifications, and additions, cannot specify partitions.
ALTER TABLE... DROPPARTITION... . (delete partition)
ALTER TABLE... ADD PARTITION (PARTITION p3 VALUESLESS THAN (…)) ; [Z3] increase partition
ALTER TABLE... REORGANIZE PARTITION... ,... INTO (
PARTITION p0 VALUES LESS THAN (…)
); [Z4] merge and split partitions.
HASH and KEY partition management
Adding partitions is the same as RANGE or LIST partitions. For modifying partitions, you cannot delete partitions from the table of HASH or LIST partitions in the same way as deleting partitions from the table of HASH or LIST partitions. However, you can use the "ALTERTABLE... COALESCE PARTITION" command to merge HASH or KEY partitions.
If you want to view partition information, you can query it through the sql statement
SELECT * FROM INFORMATION_SCHEMA.partitions WHERETABLE_SCHEMA = schema () AND TABLE_NAME='xxx'
Comparison of partition table efficiency
MySQL Partition Table experiment
Red for zoning and blue for non-zoning
Test environment: CentOS, 1G memory, 20g hard disk
Experiment: test is not partitioned (there is a table RPT_MALEVENTS), test2 (same as test)
Background data:
> SELECT COUNT (*) FROM RPT_MALEVENTS
+-+
| | COUNT (*) |
+-+
| | 17082107 |
+-+
1 row in set (10.84 sec)
Mysql > SELECTCOUNT (*) FROM RPT_MALEVENTS
+-+
| | COUNT (*) |
+-+
| | 17082107 |
+-+
1 row in set (14.63sec)
Data distribution: 2011-8-pound-4-11-8-pound-17
Partition table structure:
CREATETABLE `RPT_ MALEVENTS` (
`DATE` date NOT NULL RECORD_
`RECORD_ HOUR` tinyint (2) NOT NULL
`RECORD_ Minute` tinyint (2) NOT NULL
`DATETIME` datetime NOT NULL
`MC_ IP` int (10) unsigned NOT NULL
`PC_ IP` int (10) unsigned NOT NULL
`NETOBJECT_GROUP_ ID`smallint (5) DEFAULTNULL
`TYPE` tinyint (3) NOT NULL
`TYPE` smallint (5) NOT NULL
`TYPE` smallint (5) NOT NULL
`ALERT_ ID`tinyint (3) NOT NULL
`EVENT_ COUNT` int (10) unsigned DEFAULT NULL
PRIMARY KEY (`RECORD_ Date`, `RECORD_ HOUR`, `RECORD_ MINUTE`, `MC_ IP`, `PC_ IP`, `ALERT_ TYPE`, `SUB_ TYPE`, `ALERT_ ID`)
KEY `RECORD_ DATETIME` (`RECORD_ DATETIME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci / *! 50100 PARTITIONBY RANGE (TO_DAYS (RECORD_DATE) [Z1]) (PARTITIONp2011 VALUESLESS THAN (734503) ENGINE=InnoDB, PARTITION p20110809 VALUESLESS THAN (734724) ENGINE=InnoDB, PARTITION p20110810 VALUESLESS THAN (734725) ENGINE=InnoDB, PARTITIONp20110811 VALUESLESS THAN (734726) ENGINE=InnoDB, PARTITION p20110812 VALUESLESS THAN (734727) ENGINE=InnoDB, PARTITION p20110813 VALUESLESS THAN (734728) ENGINE=InnoDB, PARTITION p20110814 VALUESLESS THAN (734729) ENGINE=InnoDB, PARTITION p20110815 VALUESLESS THAN (734730) ENGINE=InnoDB PARTITION p20110816 VALUESLESS THAN (734731) ENGINE = InnoDB, PARTITIONp20110817 VALUESLESS THAN (734732) ENGINE = InnoDB, PARTITION p20110818 VALUESLESS THAN (734733) ENGINE = InnoDB, PARTITION pMax VALUESLESS THAN MAXVALUE [Z2] ENGINE = InnoDB)
The physical storage of the partition table is as follows. Currently, the storage engine of innodB is used and the sub-table structure is adopted.
The analysis is as follows
(conditional query queries all data)
Mysql > SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-01' AND RECORD_DATE
< '2011-08-19'; +----------+ | COUNT(*) | +----------+ | 17082107 | +----------+ 1 row in set (21.62sec) mysql>SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-01' AND RECORD_DATE
< '2011-08-19'; +----------+ | COUNT(*) | +----------+ | 17082107 | +----------+ 1 row in set (29.20sec) (查询部分数据,不使用分区函数使用的列) mysql>SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATETIME > '2011-08-02' ANDRECORD_DATETIME
< '2011-08-11'; +----------+ | COUNT(*) | +----------+ | 5083194 | +----------+ 1 row in set (2.83sec) mysql>SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATETIME > '2011-08-02' AND RECORD_DATETIME
< '2011-08-11'; +----------+ | COUNT(*) | +----------+ | 5083194 | +----------+ 1 row in set (5.60sec) (使用其他条件查询部分数据) mysql>SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE ALERT_TYPE = 1
+-+
| | COUNT (*) |
+-+
| | 88739 |
+-+
1 row in set (8.49sec)
SELECT COUNT (*) FROMRPT_MALEVENTS WHERE ALERT_TYPE = 1
+-+
| | COUNT (*) |
+-+
| | 88739 |
+-+
1 row in set (12.88sec)
(small-scale query, query within a partition)
Mysql > SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-13' AND RECORD_DATE
< '2011-08-15'; +----------+ | COUNT(*) | +----------+ | 2116249 | +----------+ 1 row in set (1.85sec) mysql>SELECTCOUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-13' AND RECORD_DATE
< '2011-08-15'; +----------+ | COUNT(*) | +----------+ | 2116249 | +----------+ 1 row in set (3.10sec) 分析SQL语句的执行过程 rows表示MySQL根据表统计信息及索引选用情况,估算的找到所需的记录所需要读取的行数。 mysql>EXPLAIN PARTITIONS SELECT * FROMRPT_MALEVENTS WHERERECORD_DATETIME > '2011-08-12' AND RECORD_DATETIME EXPLAIN SELECT * FROMRPT_MALEVENTS WHERERECORD_DATETIME > '2011-08-12' AND RECORD_DATETIME
< '2011-08-13' LIMIT1\G; ***************************1. row *************************** id: 1 select_type: SIMPLE table: RPT_MALEVENTS type: range possible_keys:RECORD_DATETIME key: RECORD_DATETIME key_len: 8 ref: NULL rows: 1002288[z5] Extra: Using where 1 row in set (0.00sec) 与分区函数使用列无关的查询条件 mysql>EXPLAIN PARTITIONS SELECT COUNT (*) FROMRPT_MALEVENTS WHERE ALERT_TYPE = 1\ G
* * 1. Row *
Id: 1
Select_type: SIMPLE
Table: RPT_MALEVENTS
Partitions: p2011,p20110809,p20110810,p20110811,p20110812,p20110813,p20110814,p20110815,p20110816,p20110817,p20110818,pMax [z6]
Type: index
Possible_keys: NULL
Key: RECORD_DATETIME
Key_len: 8
Ref: NULL
Rows: 17084274 [z7]
Extra: Using where; Using index
1 row in set (0.00sec)
Mysql > EXPLAINSELECT COUNT (*) FROM RPT_MALEVENTS WHERE ALERT_TYPE = 1\ G
* * 1. Row *
Id: 1
Select_type: SIMPLE
Table: RPT_MALEVENTS
Type: index
Possible_keys: NULL
Key: RECORD_DATETIME
Key_len: 8
Ref: NULL
Rows: 17082459
Extra: Using where; Using index
1 row in set (0.00sec)
Use columns used by partition functions
Mysql > EXPLAINPARTITIONS SELECT COUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-09' AND RECORD_DATE
< '2011-08-15'\G; ***************************1. row *************************** id: 1 select_type: SIMPLE table: RPT_MALEVENTS partitions: p20110810,p20110811,p20110812,p20110813,p20110814,p20110815[z8] type: range possible_keys:PRIMARY key: PRIMARY key_len: 3 ref: NULL rows: 3767081[z9] Extra: Using where; Using index 1 row in set (0.08sec) mysql>EXPLAINPARTITIONS SELECT COUNT (*) FROM RPT_MALEVENTS WHERE RECORD_DATE > '2011-08-09' AND RECORD_DATE
< '2011-08-15'\G; ***************************1. row *************************** id: 1 select_type: SIMPLE table: RPT_MALEVENTS partitions: NULL type: range possible_keys:PRIMARY key: PRIMARY key_len: 3 ref: NULL rows: 8541229[z10] Extra: Using where; Using index 1 row in set (0.00sec) 删除数据,如果删除1整天的数据,由于我们采用按天分区, mysql>ALTER TABLERPT_MALEVENTS DROP PARTITION p20110809; [Z11]
Query OK, 0 rowsaffected (0.65 sec)
Records: 0 Duplicates: 0 Warnings: 0
After deletion, the RPT_MALEVENTS#P#p20110809.ibd containing the index and data is deleted
If you delete it in a traditional non-partitioned way.
Mysql > DELETE FROMRPT_MALEVENTS WHERE RECORD_DATE
< '2011-08-10'; Query OK, 3929328rows affected (1 min 29.68 sec) 由此可见,删除整个分区内的数据还是很快的, 如果分区表采用传统的方式删除: mysql>DELETEFROM RPT_MALEVENTS WHERE RECORD_DATE
< '2011-08-11'; Query OK, 1153866rows affected (19.72 sec) mysql>DELETE FROMRPT_MALEVENTS WHERE RECORD_DATE < '2011-08-11'
Query OK, 1153866rows affected (18.75 sec)
It takes about the same time to delete a day's data in the traditional way.
After only deleting the data, the data partition p20110810 still exists, and the size remains the same. ALTER TABLE T1 OPTIMIZE PARTITION can be used for recycling, but MySQL5.1.22 has not been implemented yet.
Delete across partitions.
DELETE FROMRPT_MALEVENTS WHERE ALERT_TYPE = 1
Query OK, 63969 rowsaffected (55.20 sec)
DELETE FROMRPT_MALEVENTS WHERE ALERT_TYPE = 1
Query OK, 63969 rowsaffected (50.26 sec)
The deletion of partitioned tables is slightly slower than that of unpartitioned tables.
[Z1] partition function
[Z2] partition information, starting from 2011-08-09
[Z3] columns that are not used by partition functions scan all partitions
[Z4] the amount of data is 681311, and the number of rows scanned after partition is 355911. Although the query condition has no column of partition function, mysql's query optimizer will correspond it to time partition, which can reduce the number of rows scanned.
[Z5] the amount of data is 681311, and the number of scanning rows after partition is 1002288.
[Z6] find all partitions
[Z7] fields that are independent of partition functions traverse almost all rows.
[Z8] scan partial partitions
[Z9] the number of scanning lines is reduced.
[Z10] estimate the number of rows scanned
[Z11] the data for this partition is all data before 2011-8-10, a total of 3929328.
Summary:
Partition table is a new function in MySQL5.1. As of MySQL5.1.22-rc, partition technology is not very mature, and many partition maintenance and management functions have not been realized. For example, the recovery of data storage space in the partition, the repair of the partition, the optimization of the partition, etc., the partition of MySQL can be used in the table that can be deleted by the partition, and the modification operation to the database is small, and the table is frequently queried by the partition field (for example, the statistical tables in malicious code are partitioned by day, often queried by time, grouped, etc., and the partition can be deleted by day). In addition, because MySQL has no global index only partitioned index, when a sheet has 2 unique indexes [Z5], the table cannot be partitioned, and the partitioned column must contain the primary key. Otherwise, MySQL will report an error.
In short, MySQL has a lot of restrictions on partitioning, and I personally think that the partitioning of hash and key is not very meaningful.
Partitioning introduces a new way to optimize queries (of course, there are corresponding disadvantages). The optimizer can use the partition function to trim the partition or remove the partition completely from the query. It achieves this optimization by extrapolating whether data can be found on a particular partition. So at best, trimming allows the query to access less data. It is important to define the partitioning key in the WHERE clause, even if it looks redundant. With the partition key, the optimizer can remove unused partitions, otherwise the execution engine will access all partitions of the table like a merge table, which can be very slow on large tables. Partitioned data is easier to maintain than non-partitioned data, and old data can be removed by deleting partitions. Partition data can be distributed to different physical locations so that servers can use multiple hard drives more efficiently.
[Z1] the return value of the partition function must be an integer, and the return value of the partition function of the new partition should be greater than that of the partition function of any existing partition.
[Z2] error prompt for tables with primary keys: # 1503
A PRIMARY KEY MUST INCLUDE ALL COLUMNS INTHE TABLE'S PARTITIONING FUNCTION, if there is no primary key, no such constraint
[Z3] Note: for tables partitioned by RANGE, you can only use ADD PARTITION to add new partitions to the high end of the partition list. That is, you cannot add a partition that is smaller than the scope of this partition.
[Z4] for tables partitioned by RANGE, only adjacent partitions can be reorganized; RANGE partitions cannot be skipped. You cannot use REORGANIZEPARTITION to change the partition type of a table; that is, for example, you cannot change a RANGE partition to a HASH partition, and vice versa. Nor can you use this command to change partition expressions or columns.
[Z5] pay attention to the difference between primary key and unique index
After reading the above, have you mastered how to implement partitioned tables in MySQL? If you want to learn more skills or want to know more about it, you are welcome to follow the industry information channel, thank you for reading!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.