2025-04-12 Update From: SLTechnology News&Howtos — Shulou (Shulou.com) 06/01 Report
This article discusses what to do when data is mistakenly deleted in MySQL. Many people are unsure how to recover after a database or table has been dropped by accident; the methods below are simple and practical and should answer most of those doubts.
The traditional high-availability architecture cannot prevent accidental deletion of data, because a drop table command of the master database will be passed to all slave libraries and cascading slave libraries through binlog, causing instances of the entire cluster to execute this command.
In most of the accidental-deletion cases we have seen, the operation was performed by operations staff or a DBA. In reality, though, anyone with permission to operate on the data can end up deleting it by mistake.
Today we will talk about what we can do, both beforehand and afterwards, to reduce the risk of mistakenly deleting data and the losses it causes.
To find targeted solutions, we first need to classify the ways data can be mistakenly deleted in MySQL:
Deleting data rows with a delete statement;
Deleting a table with a drop table or truncate table statement;
Deleting a database with a drop database statement;
Deleting the files of an entire MySQL instance with the rm command.
Mistakenly deleting rows
If you accidentally delete a data row using the delete statement, you can use the Flashback tool to restore the data through flashback.
The principle of Flashback data recovery is to modify the content of binlog and replay it back to the original database. The premise that you can use this scheme is that you need to ensure that binlog_format=row and binlog_row_image=FULL are used.
When recovering data, Flashback does the following for each transaction:
For an insert statement, the corresponding binlog event type is a Write_rows event; change it to a Delete_rows event.
Likewise, for a delete statement, change the Delete_rows event to a Write_rows event.
For an Update_rows event, the binlog records the row values both before and after the modification, so simply swap the two row images.
What if the misoperation is not one, but more than one? For example, the following three transactions:
(A) delete... (B) insert... (C) update...
Now to restore the database to the state before these three transaction operations, after parsing the binlog with the Flashback tool, the command to write back to the main library is:
(reverse C) update... (reverse B) delete... (reverse A) insert...
In other words, if multiple transactions are involved in the erroneous deletion of data, the order of the transactions needs to be changed before execution.
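The inversion-plus-reordering logic can be sketched in Python. This is a toy model of the idea, not the actual Flashback tool; the event tuples are simplified stand-ins for real binlog row events:

```python
# Toy model of Flashback: invert each row event, then replay the
# inverted events in reverse transaction order.

def invert_event(event):
    """Return the event that undoes `event`."""
    kind, rows = event
    if kind == "Write_rows":           # undo insert -> delete the rows
        return ("Delete_rows", rows)
    if kind == "Delete_rows":          # undo delete -> re-insert the rows
        return ("Write_rows", rows)
    if kind == "Update_rows":          # undo update -> swap before/after row images
        return ("Update_rows", [(after, before) for before, after in rows])
    raise ValueError(kind)

def flashback(events):
    """Invert every event and reverse their order, as described above."""
    return [invert_event(e) for e in reversed(events)]

# (A) delete, (B) insert, (C) update  ->  reverse C, reverse B, reverse A
ops = [("Delete_rows", [("r1",)]),
       ("Write_rows", [("r2",)]),
       ("Update_rows", [(("old",), ("new",))])]
print(flashback(ops))
```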
To be clear, I do not recommend that you perform these operations directly on the main library.
The safer way to restore data is to restore a backup, or find a slave library as a temporary library, perform these operations on this temporary library, and then restore the data from the confirmed temporary library back to the main database.
Why would you do that?
This is because a master database that runs online business logic is continually changing the state of the data. The problem may be discovered somewhat late, by which time the business logic has already modified other data on top of the misoperation. If you restore just these rows directly, without confirmation, you may cause secondary damage to the data.
Of course, besides knowing how to handle mistakenly deleted data after the fact, it is even more important to prevent it beforehand. I have two suggestions:
Set the sql_safe_updates parameter to on. In this way, if we forget to write the where condition in the delete or update statement, or if the where condition does not contain an index field, the execution of the statement will report an error.
Before the code goes online, it must be audited by SQL.
You might ask: with sql_safe_updates=on set, what should I do if I really do want to delete all the data in a small table?
If you are sure the delete is intended, you can add a where condition to the statement, such as where id >= 0.
However, emptying a table with delete is slow: it generates rollback (undo) logs and writes redo and binlog for every row. So, from a performance point of view, you should prefer the truncate table or drop table commands.
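Sketched as SQL, assuming t is a small table whose primary key is id:

```sql
SET sql_safe_updates = ON;
DELETE FROM t;                 -- fails: safe update mode rejects DELETE without an indexed WHERE
DELETE FROM t WHERE id >= 0;   -- allowed: the condition uses the indexed id column
```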
You can use Flashback to recover data deleted with the delete command, but data deleted with truncate table, drop table, or drop database cannot be recovered this way. Why?
Because even with binlog_format=row configured, the binlog recorded for these three commands is still in statement format: the binlog contains only the single truncate/drop statement, and the deleted data cannot be reconstructed from that.
So, what if we do use these commands to delete data by mistake?
Mistakenly deleting databases / tables
In this case, recovering the data requires a full backup plus incremental logs, which means regular full backups must be taken online and binlog must be backed up in real time.
When both conditions are met, if someone mistakenly drops a database at 12:00, the recovery process is as follows:
Take the most recent full backup. Assuming the database is backed up once a day, the last backup was at 0:00 that day.
Restore a temporary instance from the backup.
From the log backups, take the binlog generated after 0:00.
Apply all these logs to the temporary instance, except for the statement that mistakenly deleted the data.
The schematic diagram of this process is as follows:
Figure 1 data recovery process-mysqlbinlog method
I need to explain to you the following points about this process:
To speed up recovery, if the temporary instance contains multiple databases, you can add the --database parameter to the mysqlbinlog command to specify the database containing the misoperated table. This avoids having to apply the logs of the other databases as well.
When applying the logs, you need to skip the binlog of the 12:00 misoperation:
If the original instance does not use GTID mode, then for the binlog file containing the 12:00 statement you can only use --stop-position to apply the log up to just before the misoperation, and then use --start-position to continue from just after it.
If the instance uses GTID mode, it is much more convenient. Suppose the GTID of the misoperation is gtid1; you only need to execute set gtid_next=gtid1;begin;commit; to add that GTID to the temporary instance's executed-GTID set, and the misoperation will be skipped automatically when the binlog is replayed in order.
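Assuming the relevant binlog file is master.000013 and the misoperation sits between positions 2738 and 2925 (file name and positions are hypothetical, as is the temporary instance on port 3307), the two approaches might look like this:

```shell
# Non-GTID: replay up to just before the bad statement, then from just after it.
mysqlbinlog --database=db1 --stop-position=2738 master.000013 | mysql -h127.0.0.1 -P3307 -uroot
mysqlbinlog --database=db1 --start-position=2925 master.000013 | mysql -h127.0.0.1 -P3307 -uroot

# GTID mode: mark the bad transaction's GTID (gtid1, a placeholder) as already
# executed, then replay the whole file; the server skips that transaction itself.
mysql -h127.0.0.1 -P3307 -uroot -e "SET gtid_next='gtid1'; BEGIN; COMMIT; SET gtid_next=automatic;"
mysqlbinlog --database=db1 master.000013 | mysql -h127.0.0.1 -P3307 -uroot
```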
However, even so, using the mysqlbinlog method to recover data is not fast enough for two main reasons:
If only one table was mistakenly deleted, it would be best to replay only that table's log, but the mysqlbinlog tool cannot parse the log for just a single table.
Applying logs parsed by mysqlbinlog is inherently single-threaded; none of the parallel replication methods described earlier can be used here.
One way to speed things up is to restore the temporary instance from backup and then make it a slave of the online standby. Then:
Before executing start slave, first run the change replication filter replicate_do_table = (tbl_name) command, so that the temporary instance synchronizes only the misoperated table;
Parallel replication can also be used, speeding up the whole recovery process.
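On the temporary instance, the setup described above might look like this (change replication filter is available since MySQL 5.7; db1.tbl_name is a placeholder for the misoperated table):

```sql
STOP SLAVE;   -- if replication was already started
CHANGE REPLICATION FILTER REPLICATE_DO_TABLE = (db1.tbl_name);
START SLAVE;  -- only db1.tbl_name is now synchronized
```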
The schematic diagram of this process is shown below.
Figure 2 data recovery process-master-slave method
As you can see in the figure, there is a dotted line between the binlog backup system and the online standby. It means that if, because too much time has passed, the standby has already purged the binlog that the temporary instance needs, we can fetch the required binlog from the binlog backup system and put it back on the standby.
Suppose we find that the temporary instance needs binlog starting from master.000005, but show binary logs on the standby shows that the smallest binlog file is master.000007: two binlog files are missing. We then need to find these two files in the binlog backup system.
The procedure for putting previously purged binlog back on the standby is as follows:
Download the master.000005 and master.000006 files from the backup system and put them in the standby's log directory.
Open the master.index file in the log directory and add two lines at the beginning of the file: ./master.000005 and ./master.000006.
Restart the standby so that it re-recognizes these two log files.
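The three steps can be sketched as shell operations. The path is hypothetical and the demo only simulates the files locally; on a real standby you would work in the actual log directory and restart mysqld afterwards:

```shell
# Hypothetical sandbox standing in for the standby's log directory.
DATADIR=/tmp/mysql-demo-logdir
mkdir -p "$DATADIR"
printf './master.000007\n' > "$DATADIR/master.index"     # what the index holds now

# Step 1: place the files fetched from the backup system (simulated with touch).
touch "$DATADIR/master.000005" "$DATADIR/master.000006"

# Step 2: prepend the two entries to master.index, oldest first.
printf './master.000005\n./master.000006\n' | cat - "$DATADIR/master.index" > "$DATADIR/master.index.new"
mv "$DATADIR/master.index.new" "$DATADIR/master.index"
cat "$DATADIR/master.index"

# Step 3: restart the standby (e.g. systemctl restart mysqld) to re-read the index.
```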
Now the standby has all the binlog the temporary instance needs; establish the master-slave relationship between them, and synchronization proceeds normally.
Whether you apply binlog parsed by the mysqlbinlog tool to the temporary instance, or attach the temporary instance to the standby as a slave, the two schemes share the same idea: after a database or table is mistakenly deleted, data is recovered from a backup plus replayed binlog.
That is, both schemes require the backup system to take regular full backups and to back up binlog in real time, ensuring binlog is saved before it is purged locally.
However, no system can keep logs forever; you need to set a retention period based on cost and disk space. If your DBA team promises that an instance can be restored to any point in time within the last half month, the backup system must retain at least half a month of logs.
In addition, whichever method you use, I suggest turning this data-recovery procedure into an automated tool and drilling it regularly. Why?
The reasons are twofold:
Although "nobody wants this kind of thing to happen", if data is mistakenly deleted, being able to recover it quickly minimizes the loss, and nobody has to run away.
Whereas improvising manual operations in a panic, and misoperating again and causing secondary damage to the business, would be even worse.
A delayed-replication standby
Although we can speed up the process of recovering data by using parallel replication, this scheme still has the problem of "uncontrollable recovery time".
If a database's backup is particularly large, or the misoperation happened long after the last full backup — for example, an instance backed up once a week is misoperated on day 6 after the backup, so six days of logs must be replayed — the recovery time may be measured in days.
So, what can we do to shorten the time it takes to recover data?
If some core business cannot tolerate such long recovery times, we can consider setting up a delayed-replication standby. This feature was introduced in MySQL 5.6.
The problem with the general master-slave replication structure is that if a table on the master database is mistakenly deleted, this command will soon be sent to all slave databases, resulting in all slave database tables being mistakenly deleted.
A delayed-replication slave is a special kind of standby. With the command CHANGE MASTER TO MASTER_DELAY = N, you can specify that this slave keeps a delay of N seconds behind the master.
For example, with N set to 3600, if any data on the master is mistakenly deleted and the misoperation is discovered within an hour, the command will not yet have been executed on the delayed standby. At that point, execute stop slave on this standby, skip the misoperation command using the method described earlier, and you can recover the needed data.
This way, at any moment you have a standby that is at most one hour of catch-up away from serving as a temporary instance with the data, which greatly shortens the overall recovery time.
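Sketched as SQL (using the pre-8.0 command naming that this article uses throughout):

```sql
-- On the delayed standby:
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;   -- stay one hour behind the master
START SLAVE;

-- After a bad DROP is noticed within the hour:
STOP SLAVE;
-- then skip the erroneous statement (e.g. with set gtid_next in GTID mode)
-- and let the standby catch up.
```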
Methods to prevent mistakenly deleting databases / tables
As the saying goes, if you often walk by the river you can hardly keep your shoes dry — but there are still ways to reduce the risk. So here I will give you some suggestions for lowering the chance of accidental deletion.
The first suggestion is to separate accounts. The purpose of this is to avoid miswriting commands. For example:
Give business developers only DML privileges, not truncate/drop privileges. If business developers have DDL needs, support them through the development management system.
Even members of the DBA team should use read-only accounts day to day, switching to an account with update privileges only when necessary.
The second suggestion is to develop a code of practice. The purpose of this is to avoid miswriting the table name to be deleted. For example:
Before deleting a table, you must first rename it. Then observe for a period of time to make sure the business is unaffected before actually dropping the table.
When renaming, add a fixed suffix to the table name (such as _to_be_deleted), and require that dropping tables is done only through the management system — and that the management system is allowed to drop only tables carrying that fixed suffix.
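A minimal sketch, with orders as a hypothetical table name:

```sql
RENAME TABLE orders TO orders_to_be_deleted;
-- ... observe the business for a while; if nothing breaks, then, via the
-- management system (which only accepts the _to_be_deleted suffix):
DROP TABLE orders_to_be_deleted;
```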
Data deleted with rm
In fact, for a MySQL cluster with a high-availability mechanism, rm-deleted data is the least frightening case. As long as the whole cluster is not deliberately wiped out — only the data of one node is deleted — the HA system will kick in and elect a new master, keeping the whole cluster running normally.
At this point, all you have to do is restore the data on that node and reconnect it to the cluster.
Of course, nowadays not only DBAs have automation systems; SAs (system administrators) do too, so a batch machine-decommissioning operation might wipe out all the nodes of your MySQL cluster at once.
In response to this situation, my advice is to keep your backup across computer rooms as much as possible, or preferably across cities.
Summary
Today, I discussed with you several possibilities of mistakenly deleting data and how to deal with it.
However, I would like to emphasize that prevention is far more important than treatment.
In addition, in MySQL's clustering scenario, backups are used from time to time to restore instances, so it is necessary to check the validity of backups regularly.
If you are a business developer, you can use the show grants command to check your account's privileges. If the privileges are too broad, you can ask the DBA to give you an account with lower privileges. You can also evaluate how important the business is and discuss with the DBA the backup cycle, whether a delayed-replication standby is needed, and so on.
The reliability of data and services is not only the work of the operation and maintenance team, but also the result of the guarantee of all links.
Why can some statements not be killed?
There are two kill commands in MySQL: one is kill query + thread id, which terminates the statement currently executing in that thread; the other is kill connection + thread id (the connection keyword can be omitted), which disconnects the connection — and if the thread has a statement executing, that statement is terminated first.
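For example, assuming the target thread id is 27 (hypothetical):

```sql
KILL QUERY 27;       -- stop the statement running in thread 27, keep the connection
KILL CONNECTION 27;  -- disconnect thread 27 (equivalently: KILL 27)
```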
I don't know if you have ever encountered such a phenomenon when using MySQL: you used the kill command but failed to disconnect the connection. After executing the show processlist command, you can see that the Command column of this statement shows Killed.
You must be wondering what this Killed display means. Shouldn't the thread simply no longer appear in the show processlist output?
Today, let's discuss this problem. In fact, in most cases, the kill query/connection command is valid. For example, when we execute a query and find that it takes too long to continue the query, we can use the kill query command to terminate the query.
In another case, it is also valid to use the kill command directly when the statement is waiting for a lock. Let's take a look at this example:
Figure 1 successful example of kill query
As you can see, after session C executes kill query, session B prompts the statement to be interrupted almost immediately. This is what we expect.
What does the thread do after receiving the kill?
But pause and think here: does the session B thread just terminate and exit, regardless of everything else? Obviously, that cannot work.
Any query or DML on a table takes an MDL read lock on it. So although session B is blocked, it still holds an MDL read lock. If the thread were terminated outright on kill, that MDL read lock would never be released.
So kill does not mean stopping immediately; it tells the executing thread that the statement no longer needs to run and that it is time to "execute the stop logic".
This is in fact similar to Linux's kill command: kill -N pid does not stop the process directly either, but sends it a signal, which the process then handles before entering its termination logic. The only difference is that MySQL's kill command has no signal-number parameter to pass — there is only the one "stop" instruction.
In implementation, when the user executes kill query thread_id_B, the thread that processes the kill command in MySQL does two things:
Set session B's running state to THD::KILL_QUERY (that is, assign THD::KILL_QUERY to the variable killed);
Send a signal to the execution thread of session B.
Why send a signal?
Because in the example of figure 1, session B is blocked waiting for a lock. If you only set its thread state to THD::KILL_QUERY, thread B does not know about the state change and keeps waiting. Sending the signal is what breaks session B out of the wait so it can handle the THD::KILL_QUERY state.
This analysis has three implications:
A statement's execution contains many embedded "checkpoints" at which the thread state is examined; if the state is found to be THD::KILL_QUERY, the statement-termination logic begins.
If the thread is in a waiting state, the wait must be one that can be woken up, otherwise the checkpoint will never be reached.
There is a process between a statement entering its termination logic and that logic fully completing.
Seen this way, you will understand that kill is indeed not "just stop".
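The wake-up mechanism described above can be sketched with a toy simulation. This is not MySQL internals — just the pattern of "set a kill flag, then signal the waiting thread", with the names (Worker, kill_query) invented for illustration:

```python
import threading

class Worker:
    """Toy model of a session thread: it waits on a condition variable and
    re-checks a kill flag at its 'checkpoint' each time it wakes up."""
    def __init__(self):
        self.killed = False              # plays the role of THD::KILL_QUERY
        self.cond = threading.Condition()
        self.result = None

    def run(self):
        with self.cond:
            # Checkpoint inside a wakeable wait, analogous to
            # pthread_cond_(timed)wait: the flag is re-checked on wake-up.
            while not self.killed:
                self.cond.wait()
        self.result = "terminated"       # enter the statement-termination logic

    def kill_query(self):
        with self.cond:
            self.killed = True           # step 1: set the state
            self.cond.notify()           # step 2: send the signal to wake it

w = Worker()
t = threading.Thread(target=w.run)
t.start()
w.kill_query()
t.join(timeout=5)
print(w.result)
```

Without the notify() call, the worker would stay asleep despite the flag being set — exactly the failure mode the next example illustrates.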
Next, let's look at another example where kill does not work: the case of a too-small innodb_thread_concurrency that we mentioned earlier.
First, execute set global innodb_thread_concurrency=2, setting the maximum number of concurrent threads for InnoDB to 2; then, execute the following sequence:
Figure 3 effect after kill connection
At this point, the Command column for the thread with id=12 shows Killed. In other words, although the client has disconnected, the statement is still executing on the server.
Why doesn't this statement exit like the update statement in the first example when executing the kill query command?
In terms of implementation, waiting for a row lock uses the pthread_cond_timedwait function, and that wait can be woken up. In this example, however, thread 12's wait logic is: every 10 milliseconds, check whether it can enter InnoDB; if not, call the nanosleep function and sleep again.
In other words, although thread 12's state has been set to KILL_QUERY, the loop waiting to enter InnoDB never checks the thread state, so it never enters the termination logic at all.
When session E executes the kill connection command, it does two things:
Set thread 12's state to KILL_CONNECTION;
Close thread 12's network connection. It is because of this step that you see session C being told its connection was dropped.
So why do you see the Command column as killed when you execute show processlist? In fact, this is because there is a special logic when performing show processlist:
If the state of a thread is KILL_CONNECTION, the Command column is displayed as Killed.
So even though the client has exited, the thread's state is in fact still "waiting". When will the thread exit?
Only when the condition for entering InnoDB is satisfied will session C's query continue executing, reach a checkpoint, find that the thread state has become KILL_QUERY or KILL_CONNECTION, and then enter the termination logic.
Here, let's make a brief summary.
This example illustrates the first class of kill failures: the thread never runs the logic that checks the thread state. A similar case: under heavy IO pressure, IO read/write calls may fail to return for a long time, so the thread cannot check its state in time.
In the other class, the termination logic itself takes a long time. show processlist also shows Command=Killed, and the statement only truly stops once the termination logic completes. Common scenarios include:
A very large transaction is killed mid-execution. The rollback has to reclaim all the new data versions generated during the transaction, which takes a long time.
A large query is rolled back. If large temporary files were created during the query and the file system is under pressure, deleting them may have to wait for IO resources, which can be slow.
A DDL command is killed at its final stage. The temporary files of the intermediate process must be deleted, which may likewise be slowed down by IO pressure.
Someone once asked me: if you press Ctrl+C in the client, will that terminate the server thread directly?
The answer is no.
There is a misconception here: the client can only operate on the client-side thread; client and server interact only over the network, so the client cannot directly operate on server threads.
And because MySQL uses a stop-and-wait protocol, sending anything over this connection while the current statement has not yet returned is useless. What actually happens on Ctrl+C is that the MySQL client starts another connection and sends a kill query command over it.
So do not assume that pressing Ctrl+C on the client means everything stops. Killing a thread still involves a lot of work on the server side.
Two other misunderstandings about the client
In practical use, I often encounter some students who have misunderstandings about the use of the client. Next, let's look at two of the most common misunderstandings.
The first misconception: if a database contains too many tables, connecting will be slow.
Some production databases contain a great many tables (the most I have seen is 60,000). With such a database, every client connection gets stuck for a while on the connection screen shown below.
If the db1 database has few tables, connecting is fast and you can start typing commands immediately. Hence some people conclude that the number of tables affects connection performance.
But as covered at the start of this series, when a client establishes a connection, the server only performs the TCP handshake, authenticates the user, and fetches privileges — operations that clearly have nothing to do with how many tables the database has.
In fact, as the on-screen hint says, when you connect with default parameters, the MySQL client enables local completion of database and table names. To support this, after the connection succeeds the client additionally needs to:
Execute show databases;
Switch to the db1 database and execute show tables;
Use the results of these two commands to build a local hash table.
Of these operations, the most time-consuming is the third, building the local hash table. So when a database has very many tables, this step makes the connection feel slow.
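As a rough sketch (the real client is C code; the names below are invented), the completion table is essentially a prefix index over all database and table names, which is why its cost grows with the number and length of the names:

```python
# Toy model of the client-side completion table built after connecting.
databases = ["db1", "information_schema"]        # from: show databases
tables = ["orders", "order_items", "users"]      # from: show tables (in db1)

completions = {}
for name in databases + tables:
    for i in range(1, len(name) + 1):            # index every prefix of every name
        completions.setdefault(name[:i], []).append(name)

# Pressing Tab after typing "order" offers both matching tables:
print(completions["order"])   # -> ['orders', 'order_items']
```

With 60,000 tables this loop runs over every prefix of every table name, which is where the perceived connection delay comes from.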
In other words, what feels like a slow connection is neither the connection itself nor the server being slow — it is the client.
As the hint also says, adding -A to the connection command turns off automatic completion, and the client then returns quickly.
Automatic completion means that after you type a prefix of a database or table name, you can press Tab to complete it or display suggestions.
In practice, if you rarely use completion, I suggest adding -A by default every time.
What the hint does not mention is that besides -A, adding the --quick (or -q) parameter also skips this stage. But --quick is a rather misleading parameter, and the source of another common client misconception.
When you see this parameter, do you think it should be a parameter that allows the server to accelerate? In fact, on the contrary, setting this parameter may degrade the performance of the server. Why would you say that?
After the MySQL client sends a request, it can receive the server's result in two ways:
One is to cache locally: the client allocates a piece of memory and stores the full result there first. If you program against the client API, this corresponds to the mysql_store_result method.
The other is not to cache: read a row, process a row. In the API this corresponds to the mysql_use_result method.
The MySQL client uses the first method by default; with the --quick parameter it switches to the second, non-caching method.
With the non-caching method, if local processing is slow, the server blocks while sending results, which can slow the server down. I will explain the server-side behavior in detail in the next article.
Then why is the parameter named quick at all? Because it produces the following three effects:
First, as mentioned earlier, it skips table-name completion.
Second, mysql_store_result has to allocate local memory to cache the query result; if the result set is very large, this consumes more memory and may hurt the performance of the client machine.
Third, commands are not written to the local command-history file.
So you can see that the --quick parameter means making the client quicker.