Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to execute a SQL update statement

2025-04-04 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article is about how to execute a SQL update statement. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.

I. Preface

Earlier, we systematically understood the execution flow of a query statement, and introduced the processing modules involved in the execution process. I believe you remember that the execution of a query statement usually goes through functional modules such as connectors, analyzers, optimizers, and executors, and finally reaches the storage engine.

So, what is the execution process of an update statement?

Second, sentence analysis

Let's start with an update statement for a table. Here is the creation statement for this table, which has a primary key ID and an integer field c:

Mysql > create table T (ID int primary key, c int)

If you want to add the value of the line ID=2 to 1, the SQL statement will write:

Mysql > update T set c=c+1 where ID=2

I have introduced the basic execution link of SQL statement to you earlier. I will bring that picture here again, and you can also take a brief look at this picture and review it first. First of all, it is safe to say that the query statement of the set of processes, update statements will also go through.

It is the job of the connector to connect to the database before you execute the statement.

As we said earlier, when there is an update on a table, the query cache associated with that table is invalidated, so this statement empties all cached results on table T. This is why we generally do not recommend using query caching.

Next, the parser knows through lexical and grammatical parsing that this is an update statement. The optimizer decided to use the index ID. The executor is then responsible for the specific execution, finds this line, and then updates it.

Unlike the query process, the update process also involves two important logging modules, which are the protagonists we are going to discuss today: redo log (redo logs) and binlog (archive logs). If you come into contact with MySQL, then these two words must not be bypassed, I will continue to emphasize with you in the following content. But then again, redo log and binlog have a lot of interesting aspects of design, and these design ideas can also be used in your own programs.

Important logging module: redo log

I don't know if you still remember the article "Kong Yiji". The hotel shopkeeper has a pink board, which is specially used to record guests' credit records. If there are not many people on credit, he can write down the customer's name and account on the board. But if there are more people on credit, there will always be times when the fan board can't remember it. At this time, the shopkeeper must have a special account book to record credit.

If someone wants credit or repayment, the shopkeeper generally has two ways:

One way is to turn out the ledger directly and add or deduct the credit account this time.

Another way is to write down the account on the pink board first, and then check the account book after closing.

When the business is booming, the counter is very busy, the shopkeeper will definitely choose the latter, because the former operation is really too troublesome. First of all, you have to find the record of this person's total credit. If you think about it, there are dozens of pages. If the shopkeeper wants to find that name, he may have to take presbyopic glasses to find it slowly. After finding it, he will come up with an abacus to calculate it, and finally write the results back on the ledger.

It's troublesome to think about the whole process. By contrast, it is convenient to write it down on the pink board first. If you think about it, if the shopkeeper does not have the help of a pink board and has to turn over the account book every time he keeps accounts, is the efficiency unbearably low?

Similarly, there is this problem in MySQL. If every update operation needs to be written to disk, and then the disk has to find the corresponding record, and then update it, the whole process IO cost and search cost are very high. In order to solve this problem, MySQL designers use ideas similar to hotel shopkeeper powder boards to improve update efficiency.

The whole process of matching the pink board with the account book is actually the WAL technology often mentioned in MySQL. The full name of WAL is Write-Ahead Logging, and its key point is to write the log first, then the disk, that is, write the pink board first, and then write the account book when you are not busy.

Specifically, when there is a record that needs to be updated, the InnoDB engine will first write the record to redo log and update the memory, and then the update is complete. At the same time, the InnoDB engine will update the operation record to disk at the appropriate time, and this update is often done when the system is relatively idle, just like the shopkeeper does after closing.

If there is not much credit today, the shopkeeper can wait until closing time to sort it out. But what if one day there is a lot of credit and the chalk board is full? At this time, the shopkeeper had to lay down his work, update some of the credit records in the pink board into the account book, and then erase these records from the pink board to make room for new accounts.

Similarly, the redo log of InnoDB is a fixed size, for example, it can be configured as a set of 4 files, each of which is 1GB, so this "pink board" can record the operation of 4GB in total. Write from the beginning, and then go back to the beginning and loop at the end of the write, as shown in the following figure.

Write pos is the location of the current record, move back (clockwise) while writing, and then go back to the beginning of document 0 after writing to the end of document 3. Checkpoint is the current location to be erased, and it is also pushed back and looped, updating the record to the data file before erasing the record.

Between write pos and checkpoint is the empty part of the "pink board" that can be used to record new operations. If write pos catches up with checkpoint, it means that the "pink board" is full, and you can't perform any new updates at this time. You have to stop and erase some records and push checkpoint.

With redo log,InnoDB, you can ensure that even if the database is abnormally restarted, the previously submitted records will not be lost, which is called crash-safe.

To understand the concept of crash-safe, consider our previous example of credit records. As long as the credit records are recorded on the pink board or in the account book, even if the shopkeeper forgets, such as abruptly suspending business for a few days, the credit account can still be clarified through the data on the account book and the powder board after the business resumes.

Important logging module: binlog

As we mentioned earlier, there are actually two pieces of MySQL as a whole: one is the Server layer, which mainly does things at the functional level of MySQL, and the other is the engine layer, which is responsible for storing specific matters related to it. The pink board redo log we talked about above is a log unique to the InnoDB engine, and the Server layer also has its own log, called binlog (archive log).

I'm sure you'll ask, why are there two journals?

Because at first there was no InnoDB engine in MySQL. MySQL comes with an engine of MyISAM, but MyISAM does not have the capability of crash-safe, and binlog logs can only be used for archiving. InnoDB is another company that introduced MySQL in the form of plug-ins, and since there is no crash-safe capability to rely on binlog alone, InnoDB uses another logging system, that is, redo log, to implement crash-safe capabilities.

There are three differences between the two kinds of logs.

Redo log is specific to the InnoDB engine; binlog is implemented at the Server layer of MySQL and can be used by all engines.

Redo log is a physical log that records "what changes have been made on a data page"; binlog is a logical log that records the original logic of the statement, such as "add 1 to the c field of the ID=2 line."

Redo log is written in a loop, and the fixed space will be used up; binlog can be appended. "append write" means that when the binlog file is written to a certain size, it will switch to the next one and will not overwrite the previous log.

With a conceptual understanding of these two logs, let's look at the internal flow of the executor and the InnoDB engine when executing this simple update statement.

The actuator first goes to the engine to fetch the ID=2 line. ID is the primary key, and the engine uses a tree search to find this line. If the data page of the ID=2 line is already in memory, it is returned directly to the executor; otherwise, you need to read the memory from disk and then return it.

The executor takes the row data given by the engine, adds 1 to this value, for example, it was N, now it is Number1, gets a new row of data, and then calls the engine interface to write this new line of data.

The engine updates the new line of data to memory and records the update operation to redo log, when the redo log is in the prepare state. The executor is then informed that the execution is complete and the transaction can be committed at any time.

The executor generates the binlog for this operation and writes the binlog to disk.

The executor calls the commit transaction interface of the engine, and the engine changes the redo log just written to the commit (commit) state, and the update is completed.

Here I give the flow chart of the execution of this update statement, with a light box indicating that it is executed inside the InnoDB and a dark box indicating that it is executed in the executor.

You may have noticed that the last three steps seem a bit "winding", splitting the writing of redo log into two steps: prepare and commit, which is called "two-phase commit".

Two-phase submission

Why must there be a "two-phase submission"? This is to make the logic between the two logs consistent. To illustrate this, we have to start with the question at the beginning of the article: how to restore the database to any second in half a month?

As we said earlier, binlog records all logical operations in the form of "append writes". If your DBA promises to restore within half a month, then all the binlog of the last half month will be saved in the backup system, and the whole database will be backed up regularly. The "periodicity" here depends on the importance of the system, either once a day or once a week.

When you need to restore to a specified second, for example, when you find that there is an erroneous deletion of the table at 12:00 one afternoon, and you need to retrieve the data, you can do this:

First, find the most recent full backup. If you are lucky, it may be a backup from last night, from this backup to the temporary library.

Then, starting from the point in time of the backup, the backup binlog is taken out in turn and replayed to the moment before the table was mistakenly deleted at noon.

In this way, your temporary library is the same as the online library before it was deleted by mistake, and then you can take the table data out of the temporary library and restore it to the online library as needed.

Well, after talking about the data recovery process, let's come back to why the log needs to be "two-phase commit." Here, we might as well use the method of counterproof to explain.

Still use the previous update statement as an example. Suppose the line of the current ID=2, the value of field c is 0, and what happens if crash occurs during the execution of the update statement after the first log is written and the second log is not finished?

Write redo log first and then binlog. Suppose that the MySQL process restarts abnormally when the redo log is finished and the binlog is not finished. As we said earlier, even after the redo log is written, the system can still recover the data even if it crashes, so the value of c in this line after recovery is 1. However, because the crash is not finished with binlog, there is no record of this statement in binlog at this time. Therefore, when you back up the log later, there is no such statement in the saved binlog. Then you will find that if you need to use this binlog to restore the temporary library, because the binlog of this statement is missing, the temporary library will lose this update, and the value of the recovered line c is 0, which is different from the value of the original library.

Write binlog first and then redo log. If crash is written after binlog, the value of c in this line is 0 because the redo log has not been written and the transaction is invalid after crash recovery. But the log "change c from 0 to 1" has been recorded in binlog. So, when you use binlog to recover later, there is one more transaction, and the value of the recovered row c is 1, which is different from the value of the original library.

As you can see, if you do not use two-phase commit, the state of the database may be inconsistent with that of the library recovered with its logs.

You might say, is this probability very low, and there is usually no scene where you need to restore the temporary library?

Actually, no, it's not just that you need to use this process to recover data after misoperation. When you need to expand capacity, that is, when you need to build more backup libraries to increase the reading capacity of the system, the common practice is to use full backup plus application binlog. This "inconsistency" will lead to inconsistencies between master and slave databases online.

In a nutshell, both redo log and binlog can be used to represent the commit status of a transaction, while two-phase commit is to keep the two states logically consistent.

Thank you for reading! This is the end of the article on "how to execute an SQL update statement". I hope the above content can be of some help to you, so that you can learn more knowledge. if you think the article is good, you can share it for more people to see!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report