In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly introduces what the partition table in MySQL means. It is very detailed and has a certain reference value. Friends who are interested must finish reading it.
For users, the partitioned table is a separate logical table, but at the bottom it consists of multiple physical subtables. The code that implements the partition is actually an encapsulation of the handle object of a set of underlying tables, and the request for the partition table is converted into an interface call to the storage engine through the handle object.
Meaning
MySQL can define the data stored in each partition by using the PARTITION BY clause when creating the table. When executing a query, the optimizer filters partitions that do not have the data we need based on the partition definition, so that the query does not have to scan all partitions-- just look for the partitions that contain the data we need.
One of the main purposes of partitioning is to store data in different tables according to a coarse granularity. By doing so, we can store the relevant data together, and it will be convenient when we want to delete the data of the entire partition in bulk.
Partitioning can play a big role in the following scenarios:
The table is too large to be all in memory, or there is hot data only in the last part of the table, and the rest is historical data.
The data of partitioned tables is easier to maintain
The data of the partition table can be distributed across different physical devices
You can use partitioned tables to avoid some special bottlenecks
If necessary, you can back up and restore separate partitions
The partitioned table itself has some limitations, and the following are particularly important:
A table can only have a maximum of 1024 partitions
In MySQL5.1, the partition expression must be an integer, or an expression that returns an integer. In MySQL5.5, some scenarios can be partitioned using columns directly
Foreign key constraints cannot be used in partitioned tables
If there is a primary key or unique index column in the partition field, then all primary key columns and unique index columns must be included
The principle of partitioned tables
There is no difference between the storage engine managing the underlying tables of the partition and managing ordinary tables (all underlying tables must use the same storage engine)
The index of a partitioned table is simply to add an identical index to each underlying table From the point of view of the storage engine, the underlying table is no different from a normal table, and the storage engine does not need to know whether it is a normal table or part of a partitioned table.
The operation on the partition table follows the following operation logic:
SELECT query
When querying a partition table, the partition layer first opens and locks all the underlying tables. The optimizer first determines whether some partitions can be filtered, and then calls the corresponding storage engine interface to access the data of each partition.
INSERT operation
When writing a record, the partition layer first opens and locks all the underlying tables, then determines which partition receives the record, and then writes the record to the corresponding underlying table
DELETE operation
When deleting a record, the partition layer first opens and locks all the underlying tables, then determines the partition corresponding to the data, and finally deletes the corresponding underlying tables.
UPDATE operation
When updating a record, the partition layer first opens and locks all the underlying tables. MySQL first determines which partition of the record that needs to be updated, then takes out the data and updates it, and then determines which partition the updated data should be placed in. Finally, it writes to the underlying table and deletes the underlying table where the original data is located.
All of these operations support filtering.
Although each operation "opens and locks all the underlying tables first", this does not mean that the partitioned table locks the whole table during processing. If the storage engine can implement row-level locks on its own, the corresponding table locks are released at the partition layer. This locking and unlocking process is similar to a query on a normal InnoDB.
Type of partition table
MySQL supports a variety of partitioned tables, and what we see most is partitioning by range, with each partition storing records that fall within a certain range. A partition expression can be a column or an expression that contains a column.
For example, as shown in the following table, sales for each year are stored in different partitions:
CREATE TABLE sales (order_date DATETIME NOT NULL,....) ENGINE=InnoDB PARTITION BY RANGE (YEAR (order_date)) (PARTITION packs 2010 VALUES LESS THAN (2010), PARTITION packs 2011 VALUES LESS THAN (2011), PARTITION packs 2012 VALUES LESS THAN (2012), PARTITION p_catchall VALUES LESS THAN MAXVALUE;)
Various functions can be used in the PARTITION partition clause. However, there is a requirement that the value returned by the expression must be a definite integer and cannot be a constant.
MySQL also supports keys, hashes, list partitions, and so on.
How to use partitioned tables
If we want to query the records for a period of time from a very large table, how should we query the table and how can we be more efficient?
Because of the large amount of data, it is certainly not possible to scan the whole table at each query, and we do not want to use the index because of the space and maintenance consumption of the index. Even if you do use an index, you will find that the data is not clustered in the way you want, resulting in a large number of fragments, resulting in thousands of random Icano in a query. In fact, when the amount of data is super large, the B-Tree index can no longer pray.
So we can choose some coarse-grained but less expensive ways to retrieve data, such as indexing only a small piece of metadata on a large amount of data.
This is what partitioning does, and understand that partitioning can be used as the initial form of an index. Because partitions do not need additional data structures to record what data each partition has-partitions do not need to pinpoint the location of each piece of data, there is no need for additional data structures-so the cost is very low. You only need a simple expression to express what data is stored in each partition.
In order to ensure the scalability of a large amount of data, there are generally two strategies:
Full scan of data without any indexes: it is efficient as long as you can use WHERE conditions to limit the data you need to a few partitions. Using this strategy assumes that you don't have to put the data completely in memory, and that all the data you need is on disk. Because the memory is relatively small, the data is quickly squeezed out of memory, so caching does nothing. This strategy applies when a large amount of data is accessed in a normal way.
Index data and separate hotspots: if the data has obvious "hot spots", and other data is rarely accessed except this part of the data, then you can put this part of the hot data in a separate partition, so that the data in this partition can have the opportunity to cache in memory. Such a query can access only a small partitioned table, can use indexes, or can effectively use caching.
Under what circumstances will something go wrong?
The two partitioning strategies described above are based on two very important assumptions: queries can filter out a lot of additional partitions, and partitions themselves do not incur many additional costs.
It turns out that these two assumptions can be problematic in some scenarios:
Mismatch between partitioned columns and index columns: if the defined index columns and partitioned columns do not match, the query cannot be partitioned and filtered.
The cost of choosing a partition can be high: different types of partitions are implemented differently, so their performance varies. Range partitions, in particular, can be very expensive to query which partitions of eligible rows, because the server needs to scan the list of all partition definitions to find the correct answer.
The cost of opening and locking all underlying tables can be high: when querying access to partitioned tables, MySQL needs to open and lock all underlying tables, which is another overhead of partitioned tables.
The cost of maintaining partitions can be high: some partition maintenance operations can be very fast, such as adding or deleting partitions. Some operations, such as reorganizing partitions or ALTER-like statements, can be expensive because such operations require data replication.
The above is all the content of the article "what is the meaning of partition table in MySQL". Thank you for reading! Hope to share the content to help you, more related knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.