In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-30 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Overview of MySQL partition tables
We often come across a table that holds hundreds of millions or even billions of records, and these tables hold a large number of historical records. Cleaning up these historical data is a big headache, because all the data are in a common table. So you can only enable one or more delete statements with where conditions to delete (the general where condition is time). This puts a lot of pressure on the database. Even if we delete these, the underlying data files are not getting smaller. The most effective way to deal with this kind of problem is to use partition tables. The most common method of partitioning is to partition by time.
One of the biggest advantages of partitioning is that it can clean up historical data very efficiently.
1. Confirm that the MySQL server supports partitioned tables
Command:
Show plugins
2. Characteristics of MySQL partition table
Logically a table, physically stored in multiple files
HASH partition (HASH)
Characteristics of HASH partition
According to the value of MOD (partition key, number of partitions), data rows are stored in different partitions of the table. The data can be evenly distributed in each partition. The key value of HASH partition must be a value of type INT, or it can be converted to type INT by function.
How to set up a HASH partition table
Use the INT type field customer_id as the partition key
CREATE TABLE `customer_login_ log` (`ID', id` int (10) unsigned NOT NULL COMMENT 'login user ID', `login_ time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT' user login time', `login_ ip` int (10) unsigned NOT NULL COMMENT 'login IP', `login_ type` tinyint (4) NOT NULL COMMENT' login type: 0 unsuccessful 1 success') ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT=' user login log table 'PARTITION BY HASH (customer_id) PARTITIONS 4
Use the non-INT type field login_time as the partition key (need to convert to INT type first)
CREATE TABLE `customer_login_ log` (`ID', id` int (10) unsigned NOT NULL COMMENT 'login user ID', `login_ time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT' user login time', `login_ ip` int (10) unsigned NOT NULL COMMENT 'login IP', `login_ type` tinyint (4) NOT NULL COMMENT' login type: 0 unsuccessful 1 success') ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT=' user login log table 'PARTITION BY HASH (UNIX_TIMESTAMP (login_time)) PARTITIONS 4
If the customer_login_log table is not partitioned, the file on the physical disk is
Customer_login_log.frm # stores table raw data information customer_login_log.ibd # Innodb data file
If you create the HASH partition table as above, there are five files
Customer_login_log.frm customer_login_log#P#p0.ibdcustomer_login_log#P#p1.ibdcustomer_login_log#P#p2.ibdcustomer_login_log#P#p3.ibd
Demo
It looks like there is only one database, but there are actually multiple partition files. For example, if we want to insert a piece of data, we do not need to specify a partition. MySQL will automatically deal with it for us.
Query
Range Partition (RANGE)
Characteristics of RANGE zoning
According to the range of partition key values, data rows are stored in different partitions of the table. The ranges of multiple partitions should be continuous, but they cannot overlap. By default, the VALUES LESS THAN property is used, that is, each partition does not include the specified value.
How to set up RANGE Partition
If no p3 partition is defined, an error will be reported when the inserted customer_id is greater than 29999. If it is defined, the excess data will be stored in p3.
Applicable scenarios for RANGE Partition
The partition key is a date or time type (it can make the data of each partition table more balanced. If the integer id is used as the partition key in the above example, and if the active users are concentrated between 10000 and 19999, then the amount of data in p1 will be much larger than that of other partitions, which loses the meaning of partition. And partition by time type, if the data is to be archived in chronological order, only one partition needs to be archived) all queries include partition keys (to avoid cross-partition queries) to clean up historical data by partition range on a regular basis
LIST partition
Characteristics of LIST partition
Press the list of partition key values to partition like range partitions, the list values of each partition cannot be repeated. Each row of data must be able to find the corresponding partition list, otherwise the data insertion fails.
How to set up LIST Partition
If you insert a data row with a login_type of 10, an error will be reported
3. How to partition a login log table (customer_login_log)
Business scenario
Every time the user logs in, the customer_login_log log is recorded. The user login log is saved for one year, and can be deleted or archived after 1 year.
Partition type and partition key of login log table
Use RANGE partitions with login_time as the partition key
User login log table after partition
It is stored by year, so it is converted by YEAR function.
CREATE TABLE `customer_login_ log` (`login_ id` int (10) unsigned NOT NULL COMMENT 'login user ID', `login_ time` DATETIME NOT NULL COMMENT' login IP', `login_ type` tinyint (4) NOT NULL COMMENT 'login type: 0 unsuccessful 1 success') ENGINE=InnoDB PARTITION BY RANGE (YEAR (login_time)) (PARTITION p0 VALUES LESS THAN (2017), PARTITION p1 VALUES LESS THAN (2018), PARTITION p2 VALUES LESS THAN (2019))
Insert and query data
Query the partition data in the specified table
SELECT table_name,partition_name,partition_description,table_rows FROMinformation_ schema.`PARTITIONS` WHERE table_name = 'customer_login_log'
Insert two more 18-year logs, which will be stored in the p2 table
It was said before that when setting up a partition table, it is best to set up a MAXVALUE partition. The reason why it is not established here is for the convenience of data maintenance. If we establish a MAXVALUE partition, it is easy to ignore a problem. When we insert some data in 2019, it will be automatically stored in that MAXVALUE partition, and then it will be inconvenient to do data maintenance, so there is no MAXVALUE partition.
Instead, by planning tasks, add this partition at the end of each year. For example, we are now at the end of 2018, and we need to establish a log partition in the log table for 2019, otherwise the log will fail to be inserted in 2019.
We can use the following statement
Increase zoning
ALTER TABLE customer_login_log ADD PARTITION (PARTITION p3 VALUES LESS THAN (2020))
Add partitions and insert data
Delete partition
If we want to delete the data for one year between 2016 and 2017, because we have already done the partition, we only need to delete the p0 partition through a statement.
ALTER TABLE customer_login_log DROP PARTITION p0
It can be found that the p0 partition has been deleted and all the logs for 2016 have been cleared.
Archive partition historical data
We may have another need to archive the data.
Mysql version > = 5.7. archiving partition historical data is very convenient, which provides a way to exchange partitions.
Partition data archiving migration conditions:
MySQL > = 5.7 data tables archived with the same structure must be non-partitioned tables and non-temporary tables; there can be no foreign key constraints on the archiving engine if: archive
Build tables and exchange partitions
CREATE TABLE `arch_customer_login_ log` (`customer_ id`logon user ID', `login_ time` DATETIME NOT NULL COMMENT 'user logon time', `login_ ip`logon IP', `login_ type`logon type: 0 failed 1 success') ENGINE=InnoDB; ALTER TABLE customer_login_log exchange PARTITION p1 WITH TABLE arch_customer_login_log
It can be found that the 2017 data in the original customer_login_ log table (the data in the p1 partition) has been transferred to the arch_customer_login_ log table, but the p1 partition has not been deleted, but the data has been transferred, so we also need to execute the DROP command to delete the partition to prevent data from being inserted into it.
Change the storage engine of archived data to archiving engine
Finally, we change the storage engine of archived data to the archiving engine, and the command is
ALTER TABLE customer_login_log ENGINE=ARCHIVE
The advantage of using the archiving engine is that it takes up less space than Innodb, but the archiving engine can only query, not write.
4. The main things to do with partitioned tables
Select the partitioning key according to the business scenario to avoid querying partitioned tables across partitioning queries. It is best to include tables whose partitioning key has a primary key or unique index in the WHERE clause. The primary key or unique index must be part of the partitioning key. (this is why we removed the primary key login log id (login_id) when we partitioned above, otherwise we cannot partition according to the year above. So the partition table is actually more suitable in the MyISAM engine)
About the index difference between MyISAM and Innodb
1. About automatic growth
The auto-growing column of the myisam engine must be an index. If it is a combined index, the auto-growing column may not be the first column, and it can sort and increment according to the previous columns.
The automatic growth of the innodb engine must be an index, and if it is a composite index, it must also be the first column of a composite index.
two。 About primary key
Myisam allows tables without any indexes and primary keys to exist
The index of myisam is the address where the row is saved.
If the innodb engine does not set a primary key or a non-empty unique index, it automatically generates a 6-byte primary key (invisible to the user)
The data of innodb is part of the primary index, and the additional index holds the value of the primary index.
3. On the count () function
Myisam holds the total number of rows of the table. If select count (*) from table; will directly take out this value,
Innodb does not save the total number of rows in the table, and if you use select count (*) from table;, you will traverse the entire table, which is quite expensive, but after adding the wehre condition, myisam and innodb handle it in the same way.
4. Full-text index
Myisam supports full-text indexing of FULLTEXT type
Innodb does not support full-text indexing of type FULLTEXT, but innodb can use the sphinx plug-in to support full-text indexing, and the effect is better. (sphinx is an open source software that provides API interfaces in multiple languages to optimize various queries in mysql.)
5.delete from table
When using this command, innodb will not re-create the table, but delete the data one by one. If you want to empty the table with a large amount of data on the innodb, it is best not to use this command. (truncate table is recommended, but the user needs to have permission to drop this table)
6. Index save location
The index of myisam is saved separately in the table name + .MYI file.
The index of innodb is stored in the table space with the data.
Summary
The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.