In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
This article is to share with you what the temporary table of MySQL learning is. The editor thought it was very practical, so I shared it with you as a reference. Let's follow the editor and have a look.
Temporary watch
Temporary tables can be divided into disk temporary tables and memory temporary tables, while temporary files will only exist on disk, not in memory. Specifically, the temporary table memory form has Memory engine and Temptable engine, the main difference is the character type (varchar, blob,text type) storage mode, the former regardless of the actual number of characters, are using fixed-length space storage, the latter will use variable-length space storage, which improves the storage efficiency in memory, there are more data can be stored in memory instead of converted into disk temporary tables. The Memory engine has been available since the early 5.6, and Temptable is a new engine introduced in 8.0. On the other hand, there are three forms of disk temporary table, one is MyISAM table, one is InnoDB temporary table, and the other is Temptable file map table. The last of these is provided by 8.0.
In versions 5.6 and earlier, disk temporary tables are placed in the temporary directory of the database configuration, and the undolog of disk temporary tables is placed together with the undo of regular tables. (note that since disk temporary tables are deleted after database restart, there is no need for redolog to crash recovery to ensure transaction integrity, so it is not necessary to write redolog, but undolog is still needed because rollback is needed.
After MySQL 5.7, the data of the disk temporary table and the undo are separated and placed in a separate tablespace ibtmp1. The main reason for separating temporary tables is to reduce the overhead of maintaining metadata when creating deleted tables.
After MySQL 8.0, the data of the disk temporary table is placed separately in the Session temporary table space pool (ibt file in the # innodb_temp directory), and the undo of the temporary table is placed in the tablespace ibtmp1 of global. Another big improvement is that the space occupied by disk temporary table data in 8.0 can be released to the operating system when the connection is disconnected, while in version 5.7 it needs to be rebooted.
At present, there are two situations where temporary tables are used:
Users explicitly create temporary tables
This is a table created by the user by explicitly executing the command create temporary table, and the type of the engine is either explicitly specified or uses the default configured value (default_tmp_storage_engine). Memory usage follows the memory management style of the specified engine. For example, the table of InnoDB is first cached in Buffer Pool, and then written back to the disk file by brushing the thread.
In tmpdir, the disk temporary table is located under tmpdir, and the file name is similar to # sql4d2b_8_0.ibd, where # sql is the fixed prefix, 4d2b is the hexadecimal representation of the process number, 8 is the hexadecimal representation of the MySQL thread number (id in show processlist), 0 is the incremental value of each connection starting from 0, and ibd is the disk temporary table of innodb (controlled by the parameter default_tmp_storage_engine). In 5.6. after the disk temporary table is created, the corresponding frm and engine files are created under tmpdir, which can be seen through the file system ls command. After the connection is closed, the corresponding files are automatically deleted. Therefore, if we see a lot of similar format file names in tmpdir 5.6, we can use the file name to determine which process and which connection uses the temporary table. This technique is especially useful when troubleshooting tmpdir directories that take up too much space. When the connection is released, the temporary table explicitly created by the user will automatically release the space and release the space back to the operating system. The undolog of the temporary table is stored in the undo tablespace, along with the undo of the regular table. With the undo rollback segment, the user-created temporary table can also support rollback.
The temporary disk table is located in the ibtmp file, and the location and size of the ibtmp file are controlled by the parameter innodb_temp_data_file_path. The data and undo of the explicitly created table are in the ibtmp. After the user is disconnected, the temporary table will be released, but just marking it in the ibtmp file, the space will not be released back to the operating system. If you want to free up space, you need to restart the database. In addition, it is important to note that 5.6 can see the created file directly under tmpdir, but 5.7 is created in the tablespace ibtmp, so you can't see the specific table file. If you need to view it, you need to look at the INFORMATION_SCHEMA.INNODB_TEMP_TABLE_INFO table, which has a column of name, and you can see the table name here. The naming specification is similar to 5.6, so you can quickly find connections that take up a lot of space.
In 8.0, the data of the temporary table is further separated from the undo, the data is stored in the ibt file (controlled by the parameter innodb_temp_tablespaces_dir), and the undo is still stored in the ibtmp file (still controlled by the parameter innodb_temp_data_file_path). The Session temporary table space where ibt files are stored is called the Session temporary table space, and the ibtmp that stores undo is called the Global temporary table space. Here is an introduction to the Session temporary tablespace where the data is stored. Session temporary tablespaces are represented on disk as a set of file pools of ibt files. On startup, the database is recreated in the configured directory and deleted when the database is closed. At startup, 10 ibt files are created by default, with a maximum of two per connection, one for the user-created temporary table and the other for the implicit temporary table created by the optimizer described below. Of course, temporary tables are created only when they are needed, and ibt files are not consumed if they are not needed. When all 10 ibt are used, the database will continue to be created, up to 400, 000. When the connection is released, the ibt file used by the connection is automatically released and the space is reclaimed. If you want to recycle Global temporary tablespaces, you still need to restart. However, because the files storing data have been separated, and they support dynamic recycling (that is, disconnection means releasing space), the space occupation problem that has plagued everyone for a long time on 5.7 has been well alleviated. Of course, there is room for optimization, for example, space needs to be disconnected before it can be released, while in theory, a lot of space can be freed after some SQL (such as the user drop some explicitly created temporary table) is executed. In addition, if you need to check the table name, still look at the INFORMATION_SCHEMA.INNODB_TEMP_TABLE_INFO table. It should be noted that on 8.0, explicit temporary tables cannot be compressed, while 5.6 and 5.7 can.
The optimizer implicitly creates temporary tables
This kind of temporary table is an auxiliary table created by the database to assist the execution of some complex SQL. Whether a temporary table is needed or not is generally decided by the optimizer. Unlike the temporary table explicitly created by the user to create disk files directly, if the optimizer thinks that SQL needs temporary table assistance, it will first use the memory temporary table. If it exceeds the configured memory (min (tmp_table_size, max_heap_table_siz)), it will be converted into a disk temporary table. This kind of disk temporary table is similar to that explicitly created by the user, and the engine type is controlled by the parameter internal_tmp_disk_storage_engine. In general, this implicitly created temporary table is used in slightly more complex queries, including and not limited to order by, group by, distinct, etc. Users can use the explain command to see if there is a word like Using temporary in the Extra column, and if so, be sure to use a temporary table.
In 5. 6, the implicit temporary table is still under tmpdir, which can be seen during complex SQL execution and is deleted as soon as the execution is complete. It is worth noting that in 5. 6, this implicitly created temporary table can only be controlled by the MyISAM engine, that is, there is no internal_tmp_disk_storage_engine parameter to control. Therefore, when we only have innodb tables in our system, we will also see some indicators of MyISAM changing, in this case, it is generally the reason for implicit temporary tables.
The implicit temporary table is created in the ibtmp file and will be marked for deletion after the SQL is completed, but the space is still not returned to the operating system. If it needs to be returned, the database needs to be restarted. In addition, the parameter internal_tmp_disk_storage_engine is supported, and users can select InnoDB or MYISAM table as the disk temporary table.
In 8.0, implicit temporary tables are created in the Session temporary tablespace, that is, together with the data of the temporary tables explicitly created by the user. If a connection requires an implicit temporary table for the first time, the database will pull one from the pool of ibt files for the connection to use until the connection is released. As mentioned above, in 8.0, temporary tables explicitly created by users are also allocated an ibt from the pool for use, and up to two ibt files are used to store temporary tables per connection. We can query INFORMATION_SCHEMA.INNODB_SESSION_TEMP_TABLESPACES to determine where the ibt file is going. In this table, each ibt file is a line, as long as there are several ibt files in the current system. There is a column called ID. If this column is 0, it means that the ibt is not being used. If it is not 0, it means that the connection of this ID is in use. For example, if ID is 8, then the connection with process_id of 8 is using the ibt file. In addition, there is a column purpose, a value of INTRINSIC indicates that the implicit temporary table is in use, and this ibt,USER indicates that the temporary table is in use. In addition, there is a column of size that represents the current size. Users can query this table to determine the use of temporary tables in the entire database, which is very convenient.
In 5. 6 and 5. 7, memory temporary tables can only use the Memory engine, and by 8. 0, there is an additional choice of Temptable engine. Temptable uses variable length storage in the storage format, which can save storage space, further improve memory utilization, and reduce the number of conversion to disk temporary tables. If the disk temporary table set up is InnoDB or MYISAM, the consumption of a conversion copy is required. In order to reduce consumption as much as possible, Temptable proposes an overflow mechanism, that is, if the memory temporary table exceeds the configured size, the disk space map is used, that is, a file is opened and then deleted, leaving a handle for read and write operations. The read-write file format is the same as the in-memory format, which skips the conversion step and further improves performance. Note that this feature is available in version 8.0.16 that has not yet been released, because the code is not yet visible and can only be guessed from the documentation. In 8.0.16, the parameter internal_tmp_disk_storage_engine has been removed, and disk temporary tables can only use the overflow form of InnoDB or TempTable. From the documentation, we seem to see that the official recommendation for TempTable is the new engine. The specific performance improvement needs to be tested after the code has been released before a conclusion can be drawn.
Temporary file
Compared with temporary tables, temporary files may be more unfamiliar to everyone, and temporary files are more used in scenarios where data are cached and sorted. In general, the cached or sorted data is first placed in memory, and temporary disk files will be used only if it cannot be stored in memory. The use of temporary files is not quite the same as the general table, after the general table is created, it begins to read and write data, and after using it, the file is deleted, but the temporary file is used in a different way. After creating it (using the mkstemp system function), immediately call unlink to delete the file, but not the close file, and then use the original handle to operate the file. The advantage is that when the process exception crash, there will be no temporary files left because they have not been deleted, but the disadvantages are also obvious, we use the ls command on the file system can not see this file, we need to use lsof + L1 to view this deleted attribute file.
Currently, we mainly use temporary files in the following scenarios:
Temporary files in DDL
In the process of doing online DDL, many operations need to rebuild the original table. Before rebuilding the table, we need to sort a variety of secondary indexes, but the sorting of a large number of data is unlikely to be completed in memory and needs to rely on external sorting algorithms. MySQL uses merge sorting. Temporary files need to be created during this process. Generally, the amount of space required is similar to that of the original table. But after using it, it will be cleaned up immediately, so you need to leave enough space when doing DDL. You can specify the path to this sort file by specifying innodb_tmpdir. This parameter can be modified dynamically, generally setting it on a path with enough disk space. The name of a temporary file is generally similar to ibXXXXXX, where ib is a fixed prefix and XXXXXX is a random combination of uppercase and lowercase letters and numbers.
In doing online DDL, we allow users to do DML operations on the original table, that is, additions, deletions, modifications and queries. We can't insert the original table directly, so we need a place to record the changes to the original table, and then apply it to the new table after the end of the DDL. The place of this record is online log, of course, if there are few changes, it can be stored directly in memory (the parameter innodb_sort_buffer_size can be controlled, and this parameter also controls the size of each read and write block of online log). This onlinelog is also stored in a temporary file, created in innodb_tmpdir, and the maximum size is controlled by the parameter innodb_online_alter_log_max_size. If this size is exceeded, the DDL will fail. The name of the temporary file is also similar to the name of the sorted temporary file above.
In the final phase of online DDL, all the sorted files and the DML generated halfway need to be applied to an intermediate file with a file name similar to # sql-ib53-522550444.ibd, where # sql-ib is a fixed prefix and 53 is a randomly generated number for the table id,522550444 of the InnoDB layer. At the same time, a frm file is also generated in the server layer (not in 8.0), with a file name similar to # sql-4d2b_2a.frm, where # sql is a fixed prefix, 4d2b is the hexadecimal representation of the process number, and 2a is the hexadecimal representation of the thread number (id in show processlist). So we can also use this naming convention to find out which thread is doing DDL. It should be noted here that the intermediate file here is actually a temporary table, not a temporary file mentioned above. These intermediate files can be viewed through ls. When you take the last step in DDL, the two temporary files are named back to their original table names. Because of this feature, residual and useless files may be left on disk when the database is halfway through crash. In this case, you can first rename the frm file to the same name as the ibd file, and then use DROP TABLE#mysql50##sql-ib53-522550444` to clean up the remaining files. Note that if you delete the ibd file without the drop command, it may result in residual information in the data dictionary, which is not elegant. Of course, in 8.0, due to the use of atomic data dictionary, this kind of residual file will not appear.
Caching operations in BinLog
BinLog is written to the file only when the transaction commits. Before it is committed, it will be placed in memory (controlled by the parameter binlog_cache_size). If the memory slows down, a temporary file will be created. The method is to first create it through mkstemp, and then directly unlink, leaving a handle to read and write. Temporary file names are similar to MLXXXXXX, where ML is a fixed prefix and XXXXXX is a random combination of uppercase and lowercase letters and numbers. The BinLog of a single transaction is too large, which may cause the size of the entire BinLog to be too large, thus affecting synchronization, so we need to control the transaction size as much as possible.
Optimize temporary files created
Some operations, in addition to relying on implicit temporary tables in the engine layer to assist in the calculation of complex SQL, in the Server layer, temporary files will be created to assist, such as order by operations, and the filesort function will be called. This function also uses memory (sort_buffer_size) to sort first, and if not, a temporary file is created to assist sorting. The file name is similar to MYXXXXXX, where MY is a fixed prefix and XXXXXX is a random combination of uppercase and lowercase letters and numbers.
Temporary files used in Load data
In BinLog replication, if the Load Data command is used on the main library, that is, to import data from the file, the database will write the whole file to RelayLog, and then transfer it to the slave database, which parses the RelayLog, extracts the corresponding Load file, and then applies it to the slave database. The location of this file on the repository is controlled by the parameter slave_load_tmpdir. It is recommended in the document that this directory should not be configured in the memory directory of the physical machine or the directory that will be deleted after restart. Because replication depends on this file, if it is accidentally deleted, it will cause the replication to be interrupted.
Other
In addition to the places mentioned above, temporary files are also used in several other places:
In the InnoDB layer, multiple temporary files are created at startup to store: the last foreign key or unique key error; the last deadlock information; and the final innodb status information. The reason for using temporary files instead of memory is guessed that memory usage does not fluctuate by writing these metrics. In the Server layer, temporary files are used when show create table is used in partition tables. Temporary files are also used when sorting within the MYISAM table. Related parameters
* tmpdir: * this parameter is the configuration of the temporary directory, where temporary tables / files will be placed by default in versions 5.6 and earlier. This parameter can be configured with multiple directories so that you can take turns to create temporary tables / files on different directories, and if different directories point to different disks, you can achieve the purpose of streaming.
* innodb_tmpdir: * this parameter is used by sorting temporary files in DDL as long as it is used. It will take up a lot of space and it is recommended to configure it separately. This parameter can be set dynamically and is also a Session variable.
* slave_load_tmpdir: * this parameter is mainly used when configuring the location of temporary files in the slave database for Load Data in BinLog replication. Because the database needs files that rely on Load data after Crash, it is recommended that you do not configure a directory that will delete the data after restart.
* internal_tmp_disk_storage_engine: * * when an implicit temporary table is converted to a disk temporary table, which engine is used? by default, only MyISAM and InnoDB are used. 5.7 and later versions are supported. This parameter is cancelled after version 8.0.16.
* internal_tmp_mem_storage_engine: * the storage engine used for implicit temporary tables in memory. You can choose Memory or Temptable engine. It is recommended to choose a new Temptable engine.
* default_tmp_storage_engine: * the default engine for explicit temporary tables, that is, the engine for temporary tables created by users through SQL statements.
* tmp_table_size: * min (tmp_table_size,max_heap_table_size) is the memory size of the implicit temporary table. If this value is exceeded, it will be converted to a disk temporary table.
* max_heap_table_size: * the memory limit of the Memory memory table created by the user.
* big_tables: * converting a memory temporary table to a disk temporary table requires a conversion operation, which needs to be converted in different engine formats, which requires consumption. If we can know in advance that disk temporary tables are needed to execute a SQL, that is, there is definitely not enough memory, we can set this parameter so that the optimizer skips the use of memory temporary tables and uses disk temporary tables directly, reducing overhead.
* temptable_max_ram: * * this parameter is only available after 8.0. it mainly specifies the memory size for the Temptable engine. After this, it is either converted to a disk temporary table or the built-in overflow mechanism is used.
* temptable_use_mmap: * whether to use Temptable's overflow mechanism.
Thank you for reading! On the temporary table of MySQL learning what is shared here, I hope the above content can have some help to you, so that you can learn more knowledge. If you think the article is good, you can share it and let more people see it.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.