Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the reason why MySQL temporary tables can be renamed?

2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly shows you "what is the reason why MySQL temporary tables can be renamed", the content is easy to understand, clear, hope to help you solve your doubts, the following let the editor lead you to study and learn "what is the reason why MySQL temporary tables can be renamed" this article.

Today, let's start with this question: what are the characteristics of temporary tables and what scenarios are suitable?

Here, I need to help you sort out a problem that is easy to misunderstand: some people may think that temporary tables are memory tables. However, the two concepts are completely different.

Memory table, which refers to the table using the Memory engine, the syntax for building the table is create table. Engine=memory . * * the data of this table is stored in memory and will be emptied when the system is rebooted, but the table structure is still there. * * apart from these two features that look "strange", it is a normal watch in terms of other features.

Temporary tables that can use a variety of engine types. If you are using the temporary table of the InnoDB engine or the MyISAM engine, the data is written to disk. Of course, temporary tables can also use the Memory engine.

After figuring out the difference between a memory table and a temporary table, let's take a look at the characteristics of the temporary table.

Characteristics of temporary tables

For ease of understanding, let's take a look at the following sequence of operations:

As you can see, the temporary table has the following characteristics in use:

The syntax for building a table is create temporary table.

A temporary table can only be accessed by the session that created it and is not visible to other threads. Therefore, the temporary table t created by session An in the figure is not visible to session B.

A temporary table can have the same name as a regular table.

When there are temporary tables and regular tables with the same name in session A, the showcreate statement and the add, delete, change and query statements access the temporary table.

The showtables command does not display temporary tables.

Since the temporary table can only be accessed by the session that created it, the temporary table is automatically deleted at the end of the session.

It is also because of this feature that temporary tables are particularly suitable for scenarios like join optimization in the previous article. Why? The reasons mainly include the following two aspects:

Temporary tables with different session can be renamed. If multiple session perform join optimization at the same time, there is no need to worry about the failure of table creation caused by duplicate table names.

There is no need to worry about data deletion. If a normal table is used, the client is disconnected abnormally during the execution of the process, or the database is restarted abnormally, and you also need to clean up the data tables generated in the intermediate process. Temporary tables are automatically recycled, so this additional operation is not required.

Application of temporary table

Because you don't have to worry about duplicate name conflicts between threads, temporary tables are often used in the optimization of complex queries. Among them, the cross-database query of sub-database and sub-table system is a typical usage scenario.

The general scenario of sub-database and sub-table is to distribute a large logical table to different database instances. such as. Split a large table ht into 1024 sub-tables according to field f, and then distribute them to 32 database instances. As shown in the following figure:

In general, this sub-library sub-table system has an intermediate layer proxy. However, there are some scenarios that allow clients to connect directly to the database, that is, there is no proxy layer.

In this architecture, the choice of partition key is based on "reducing cross-database and cross-table queries". If most statements contain the equivalent condition of f, then use f as the partition key. In this way, after parsing the SQL statement at the proxy layer, you can determine which sub-table to route the statement to for query.

For example, the following sentence:

Select v from ht where fission N

At this point, we can use sub-table rules (for example, N24) to confirm which sub-table the required data is placed on. This kind of statement only needs to access a sub-table, which is the most popular statement form of the sub-database and sub-table scheme.

However, if there is another index k on this table, and the query statement looks like this:

Select v from ht where k > = M order by t_modified desc limit 100

At this time, because the partition field f is not used in the query conditions, we can only go to all the partitions to find all the rows that meet the conditions, and then uniformly do the operation of order by. In this case, there are two more commonly used ideas.

The first idea is to implement sorting in the process code of the proxy layer. The advantage of this method is that the processing speed is fast, and after getting the data from the sub-database, you can directly participate in the calculation in memory. However, the shortcomings of this scheme are also obvious:

The development workload required is relatively large. The statement we give is relatively simple. If complex operations, such as group by or even join, are involved, the development ability of the middle tier is required.

There is a lot of pressure on the proxy side, especially the problem of insufficient memory and CPU bottleneck.

Another idea is to summarize the data obtained by each sub-database into a table of a MySQL instance, and then do logical operations on this summary instance.

For example, in the above statement, the execution process can look like this:

Create a temporary table temp_ht on the summary library that contains three fields v, k, and t_modified

Execute select v.jiggle kjagomo modified from ht_x where k > = M order by t_modified desc limit 100on each sub-library.

Insert the results executed by the sub-library into the temp_ht table

Execute select v from temp_ht order by t_modified desc limit 100

Get the results. The flow chart for this process is shown below:

In practice, we often find that the amount of computation of each sub-library is not saturated, so we will directly put the temporary table temp_ht on one of the 32 sub-libraries.

Why can temporary tables be renamed?

You might ask, how can different threads create temporary tables with the same name?

We're implementing

Create temporary table temp_t (id int primary key) engine=innodb

During this statement, MySQL creates a frm file for the InnoDB table to save the table structure definition and a place to store the table data.

The frm file is placed in the temporary file directory with the suffix .frm and the prefix "# sql {process id} _ {thread id} _ sequence number".

From the prefix rules of the file name, we can see that we actually create an InnoDB temporary table called T1. MySQL thinks that the table name we create is different from the ordinary table T1, so if there is already a regular table T1 under the same library, we can still create another temporary table T1.

Let me give you an example.

The process number of this process is 1234 session A, the thread id is 4, and the thread id of session B is 5. So you can see that the temporary tables created by session An and session B do not have the same name as the files on disk.

MySQL maintains data tables. In addition to physically having files, there is also a mechanism in memory to distinguish different tables, each corresponding to a table_def_key.

The table_def_key value of a regular table is obtained by "library name + table name", so if you want to create two ordinary tables with the same name in the same library, you will find that table_def_key already exists in the process of creating the second table.

For temporary tables, table_def_key adds "server_id+thread_id" on the basis of "library name + table name".

In other words, the two temporary tables T1 created by session An and session B have different table_def_key and disk file names, so they can coexist.

In implementation, each thread maintains its own temporary list list. In this way, every time you operate a table in session, you first go through the linked list to check whether there is a temporary table with that name, and if so, give priority to operating the temporary table, if there is no more regular table; at the end of the session, perform the "DROPTEMPORARY TABLE + table name" operation on each temporary table in the linked list.

At this point, you will find that the command DROPTEMPORARY TABLE is also recorded in binlog. You must be surprised that temporary tables can only be accessed within threads, so why do you need to write them into binlog? Here, we need to talk about active and standby replication.

Temporary table and active / standby replication

Since you write binlog, it means that you need to prepare the library. You can imagine executing the following sequence of statements on the main library:

Create table t_normal (id int primary key, c int) engine=innodb;/*Q1*/create temporary table temp_t like tweets / races / Q2 cycles / inserts into temp_t values (1); / * Q3*/insert into t_normal select * from temp_t;/*Q4*/

If the operations on temporary tables are not recorded, then there are only binlog logs for the statements create table t _ tables and insert intot_normal select * fromtemp_t in the repository. When the slave database executes to insert intot_normal, it will report an error "Table temp_t does not exist".

You might say, wouldn't it be nice to set binlog to row format? Because when binlog is in row format, when recording the binlog of insert intot_normal, the data of this operation is recorded, that is, the logic recorded in write_rowevent is "insert a row of data (1)".

Yes, indeed. If the current binlog_format=row, then statements related to temporary tables will not be recorded in the binlog. That is, the operation of the temporary table is recorded in the binlog only when it is in binlog_format=statment/mixed.

In this case, the statement to create the temporary table is passed to the slave for execution, so the synchronization thread of the standby will create the temporary table. The main library automatically deletes the temporary table when the thread exits, but the standby synchronization thread is running continuously. So, at this point, we need to write another DROPTEMPORARY TABLE on the main database and pass it to the standby database for execution.

It's okay for different threads on the main library to create temporary tables with the same name, but what about passing it to the standby database for execution?

Now, let me give you an example. In the following sequence, instance S is a backup library for M.

The two session on the main library M create a temporary table T1 with the same name, and both create temporary table T1 statements are passed to the standby library S.

However, the application log thread of the repository is shared, that is, the create statement should be executed twice in the application thread. (even if multithreaded replication is turned on, it may be assigned to execute from the same worker of the library). So, will this cause the synchronization thread to report an error?

Obviously not, otherwise the temporary watch would be a bug. In other words, when the repository thread executes, the two T1 tables are treated as two different temporary tables. How did this come true? When MySQL records the binlog, it writes the thread id that the main library executes this statement to binlog. In this way, the application thread in the standby library can know the main library thread id executing each statement, and use this thread id to construct the table_def_key of the temporary table:

The temporary table T1 of session A, and the table_def_key in the library is: library name + T1 + "M's serverid" + "session A's thread_id"

The temporary table T1 of session B is the table_def_key in the library: library name + T1 + "M's serverid" + "session B's thread_id".

Because the table_def_key is different, the two tables will not conflict in the application thread of the repository.

The above is all the contents of the article "Why MySQL temporary tables can be renamed". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report