Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to delete 2T large table by mysql in Innodb

2025-02-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article is about how mysql deletes 2T large tables in Innodb. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.

Suppose you have a table erp, if you directly issue the following command

Drop table erp

At this point, all mysql-related processes will stop, and mysql will not resume execution until the end of the drop. The reason for this is that during drop table, innodb maintains a global lock, and the lock is released when the drop is finished.

This means that if during the day, when the traffic is very high, if you execute the order to delete the large table without doing anything, the whole mysql will hang there, during the deletion of the table, the QPS will decline seriously, and then the product manager will come to you for tea. That's why there's a scene in the cartoon, which you can delete at 12:00 at night, in the dead of night.

Of course, some people may say, "you can write a stored procedure to delete the table and run it once when there is no traffic at night."

I was surprised and thought about it. I could only say, "Let's not argue, or listen to me about the common practice in the industry."

A hypothesis.

Just to be clear, there is a premise here that mysql turns on independent tablespaces, and then MySQL5.6.7 turns on by default.

That is, in my.cnf, there is such a configuration (these are the knowledge of mysql optimization, which will be introduced to you later)

Innodb_file_per_table = 1

To view the status of the tablespace, use the following command

Mysql > show variables like'% per_table'; +-- +-+ | Variable_name | Value | +-- +-+ | innodb_file_per_table | OFF | +-+-+

If the value of innodb_file_per_table is OFF, it means that a shared tablespace is used.

If the value of innodb_file_per_table is ON, it means that a separate tablespace is used.

So, you have to ask me, what's the difference between independent and shared tablespaces?

Shared tablespaces: all the table data and index files of a database are placed in one file. By default, the file path of this shared tablespace is in the data directory. The default file name is: ibdata1 (this file can be expanded to multiple). Note that in this way, the operation and maintenance is super inconvenient. You see, all the data are in one file, so it is very inconvenient to maintain a single table. In addition, when you do delete operation, there will be a lot of gaps in the file, and the ibdata1 file will not shrink automatically. In other words, using shared tablespaces to store data will encounter the problem that space cannot be freed after drop table.

Independent tablespaces: each table is deployed independently, with a .frm table description file and an .ibd file.

.frm file: saves the metadata for each table, including the definition of the table structure, etc., which is independent of the database engine.

.ibd file: a file that holds the data and indexes of each table.

Note that in this way, each table has its own independent tablespace, which is convenient for operation and maintenance, and enables the movement of a single table between different databases. In addition, tablespaces can be automatically reclaimed when performing drop table operations. After performing the delete operation, you can defragment and reclaim part of the tablespace through alter table TableName engine=innodb.

Datadir in ps:my.cnf is used to set the data storage directory.

Well, there's a lot of Barabara up there, and I just want to say one thing:

In most cases, OPS will definitely choose independent tablespace storage for mysql, because using independent tablespace is much better from the point of view of performance optimization and operation and maintenance difficulty.

So, the premise I mentioned at the beginning is that mysql needs to open independent tablespaces. This assumption is true in 90% of the cases. If you really encounter the situation that your company's mysql uses shared tablespaces, please have a heart-to-heart talk with your operation and maintenance staff and ask why shared tablespaces are used.

Correct posture

Suppose we have datadir = / data/mysql/, and we have a database named mytest. In the database mytest, there is a table named erp that executes the following command

Mysql > system ls-1 / data/mysql/mytest/

Get the following output (I filtered it)

-rw-r- 1 mysql mysql 9023 8 18 05:21 erp.frm-rw-r- 1 mysql mysql 2356792000512 8 18 05:21 erp.ibd

The role of frm and ibd has been described above. Now the erp.ibd file is too large, so the deletion is stuck.

How to solve this problem?

Here you need to take advantage of the knowledge of hard links in linux for quick deletion. Now let me add some of the contents of "Brother Bird's Private Cuisine".

In fact, soft links can be analogically understood as shortcuts in windows, so there is no more introduction, mainly about hard links.

As for this hard link, let me say briefly that I don't want to post a lot of words. I look too tired.

That is, for the actual stored file, there is a

Then, there is a file name pointing to the above node Index.

So, the so-called hard link is that more than one file name points to node Index, and several file names point to node Index.

Suppose there will be another file name pointing to the above node Index, that is

At this time, you have done the operation to delete the file name (1), the linux system detected that there is a file name (2) pointing to node Index, so it will not really delete the file, but to delete the reference to step (2), this step is very fast, after all, just delete the reference. So that's what the picture looks like.

Next, you delete the file name (2) operation, the linux system detected that there is no other file name pointing to the node Index, will delete the real storage file, this step is to delete the real file, so it is relatively slow.

OK, we are using the above principle.

First set up a hard link to erp.ibd, using the ln command

Mysql > system ln / data/mysql/mytest/erp.ibd / data/mysql/mytest/erp.ibd.hdlk

At this point, the file directory is as follows

-rw-r- 1 mysql mysql 9023 8 18 05:21 erp.frm

-rw-r- 2 mysql mysql 2356792000512 8 18 05:21 erp.ibd

-rw-r- 2 mysql mysql 2356792000512 8 18 05:21 erp.ibd.hdlk

You will find that there is an extra erp.ibd.hdlk file, and the inode of both erp.ibd and erp.ibd.hdlk is 2.

At this point, you perform the drop table operation

Mysql > drop table erp;Query OK, 0 rows affected (0.99 sec)

You will find that it was deleted in less than a second. Because, at this point, there are two file names (erp.ibd and erp.ibd.hdlk) that point to an inode. At this time, the delete operation is performed, only the reference is deleted, so it is very fast.

So, the deletion at this time, the table has been deleted from the mysql. But the disk space hasn't been freed yet, because there's still one file left, erp.ibd.hdlk.

How to delete erp.ibd.hdlk correctly?

If you don't have any experience, you will answer me and delete it with the rm command. It should be noted that in the production environment, deleting large files directly with the rm command will cause disk IO overhead to soar, and the CPU load is too high, which will affect the operation of other programs.

So, at this point, you should delete it with the truncate command, which is in the coreutils toolset.

For details, you can go to Baidu. Some people have tested the rm and truncate commands, and the truncate command has almost no effect on the disk IO,CPU load.

The delete script is as follows

TRUNCATE=/usr/local/bin/truncatefor I in `seq 2194-10 10 `; do sleep 2$ TRUNCATE-s ${I} G / data/mysql/mytest/erp.ibd.hdlk donerm-rf / data/mysql/mytest/erp.ibd.hdlk

Starting at 2194G, shrink 10G at a time, stop for 2 seconds, continue until there are only 10G left in the file, and finally use the rm command to delete the rest.

Other circumstances

What this means is what if the database is deployed on windows. This question, I will answer, is actually not professional enough. Because since my debut, I have never encountered that mysql is based on windows in the production environment. Suppose you really run into it, there is a tool called mklink under windows, which is used to create hard link locks under windows. It should be able to accomplish similar functions.

Thank you for reading! This is the end of the article on "how to delete the 2T table in mysql in Innodb". I hope the above content can be of some help to you, so that you can learn more knowledge. if you think the article is good, you can share it for more people to see!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report