2025-02-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
Many people are unfamiliar with what actually happens when a SQL update statement is executed, so this article walks through the process in detail, step by step. Let's take a look at how a SQL update statement is executed.
I. Execution process
In the earlier discussion of how SQL query statements are executed, we covered the basic execution path of a SQL statement. That diagram is brought back here so you can review it briefly. To begin with, it is safe to say that an update statement goes through the same set of stages as a query statement.
Connecting to the database before you execute a statement is the job of the connector.
The first step in using the database is to connect to it, and the connector is what receives you at that point. The connector is responsible for establishing connections with clients, obtaining their permissions, and maintaining and managing connections. The connection command is:
mysql -h <host address> -u <user name> -p
After typing the command, you enter the password at the interactive prompt. The password can also be written directly after -p on the command line, but this may expose your password and is not recommended.
The following screenshots were all taken on my own computer (Mac + MAMP).
If you enter the wrong account name or password, the client reports error 1045 (access denied).
Back to the main topic: let's start with an update statement on a table. Here is the creation statement for the table, which has a primary key id, a varchar field name, and a decimal field score:
CREATE TABLE `s_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'auto-increment id',
  -- The varchar length was lost in the source; 32 is a placeholder assumption.
  `name` varchar(32) NOT NULL DEFAULT '' COMMENT 'name',
  `score` decimal(5,2) NOT NULL DEFAULT '0.00' COMMENT 'score',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Student score table';
Then add a few rows of test data:
INSERT INTO `test`.`s_info` (`name`, `score`) VALUES ('Zhang Yi', 80.00), ('Zhao er', 90.00), ('Wang San', 100.00), ('Li Si', 98.00), ('Ma Wu', 87.00);
The results are as follows:
To add 1 to the score of the row with id=4, the SQL statement is:
UPDATE s_info SET score = score + 1 WHERE id = 4;
The execution result is shown in the figure:
When a table is updated, the query cache entries associated with that table are invalidated, so this statement clears all cached results for the table. This is why using the query cache is generally not recommended (MySQL 8.0 removed it entirely).
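The invalidation behaviour described above can be illustrated with a toy model. This is a minimal sketch and not MySQL's implementation: the QueryCache class and its methods are hypothetical names invented for illustration.

```python
class QueryCache:
    """Toy per-table query cache (hypothetical; not MySQL's actual code)."""

    def __init__(self):
        # table name -> {query text -> cached result}
        self._by_table = {}

    def put(self, table, query, result):
        self._by_table.setdefault(table, {})[query] = result

    def get(self, table, query):
        return self._by_table.get(table, {}).get(query)

    def invalidate(self, table):
        # Any write to the table drops every cached result for that table.
        self._by_table.pop(table, None)


cache = QueryCache()
cache.put("s_info", "SELECT score FROM s_info WHERE id = 4", 98.00)
cache.put("s_info", "SELECT name FROM s_info WHERE id = 1", "Zhang Yi")
cache.put("other", "SELECT 1", 1)

cache.invalidate("s_info")  # what an UPDATE on s_info would trigger
```

After the call, both cached results for s_info are gone even though the update touched only one row, while the entry for the unrelated table survives. On a frequently updated table the cache is therefore emptied almost as fast as it fills.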
Through lexical and syntactic analysis, the parser determines that this is an update statement.
The optimizer decides to use the primary key index on id.
The executor is responsible for the actual execution: it finds the row and then updates it.
Unlike the query process, the update process also involves two important log modules: the redo log (a physical log) and the binlog (the archive log, a logical log). If you work with MySQL, you cannot avoid these two terms.
II. Log modules
Before discussing the log modules, let's clarify what physical logs and logical logs are.
Physical log: put simply, it is written in a private physical format that only its owner can use; other engines cannot share it.
Logical log: it can be used by other engines and is shared by all of them.
1. Physical log: redo log
The redo log is a log specific to the InnoDB engine. It records the changes made by transaction operations: it stores the value of the data after modification, regardless of whether the transaction has committed. It gives MySQL the ability to recover from a crash.
For example, if the MySQL instance crashes or the host goes down, on restart the InnoDB storage engine uses the redo log to recover the data, ensuring its durability and integrity.
Take the bookkeeping of an old-time tavern keeper as an example:
The tavern keeper has a blackboard used specifically to record customers who buy on credit. Because space on the blackboard is limited, when many people buy on credit he also needs a ledger that records all the credit accounts. When someone asks for credit, the shopkeeper generally has two options:
(1) open the ledger, find that person's record, and add the new credit to it;
(2) write the credit on the blackboard first, and transfer it to the ledger later when business is slow.
With the first method, every time someone asks for credit the shopkeeper has to open a thick ledger, look up the customer's name page by page, and then record the entry. If only a few people buy on credit, finding the record is easy enough. But if there are several ledgers to search one by one, the shopkeeper will have a headache.
MySQL has the same problem. If every update had to be written to disk, with the disk first locating the corresponding record and then updating it, the I/O cost and lookup cost of the whole process would be very high. To solve this problem, the designers of MySQL use an idea similar to the tavern keeper's blackboard to improve update efficiency.
The cooperation between the blackboard and the ledger is in fact the WAL (Write-Ahead Logging) technique often mentioned in MySQL. Its key point is to write the log first and write to disk afterwards.
When a record needs to be updated, the InnoDB engine first writes the record to the redo log (the blackboard) and updates memory; at that point the update counts as complete. The InnoDB engine then writes the change to disk at an appropriate time, often when the system is relatively idle, just as the shopkeeper does after closing.
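The write-the-log-first idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not InnoDB's implementation: TinyEngine and its attributes are hypothetical names, and plain dictionaries and lists stand in for data files, the in-memory cache, and the redo log.

```python
class TinyEngine:
    """Toy write-ahead-logging engine (illustrative; all names are assumptions)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # the "ledger": data files on disk
        self.memory = {}         # cached rows, a stand-in for in-memory pages
        self.redo_log = []       # the "blackboard": sequential log appends

    def update(self, key, value):
        # 1. Write the log record first (a cheap sequential append).
        self.redo_log.append((key, value))
        # 2. Update the in-memory copy; the data file is not touched yet.
        self.memory[key] = value

    def flush(self):
        # Later, when idle: apply the logged changes to the data file.
        for key, value in self.redo_log:
            self.disk[key] = value
        self.redo_log.clear()

    def read(self, key):
        # Reads prefer the in-memory copy, falling back to the data file.
        return self.memory.get(key, self.disk.get(key))


engine = TinyEngine({4: 98.00})
engine.update(4, 99.00)
stale_on_disk = engine.disk[4]   # the data file still holds the old value
engine.flush()                   # now the data file catches up
```

Between update and flush, the data file is stale but nothing is lost, because the change already sits in the log. That is the trade the blackboard makes: a fast append now, and the expensive lookup in the ledger later.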
What if there is a lot of credit one day and the blackboard fills up? The shopkeeper then has to put down his work, transfer some of the credit records from the blackboard to the ledger, and erase them from the blackboard to make room for new entries.
Similarly, the InnoDB redo log has a fixed size. For example, it can be configured as a group of 4 files of 1GB each, so the "blackboard" can record 4GB of operations in total. It is written from the beginning, and when the end is reached, writing wraps around to the beginning again.
As shown in the following figure:
write pos is the position of the current record; it moves forward as records are written, and after reaching the end of file 4 it returns to the beginning of file 1. checkpoint is the position up to which records have been written to the data files; it also moves forward and wraps around.
In terms of the blackboard scenario above, write pos is where the shopkeeper writes credit records on the blackboard in sequence; in MySQL, this is write pos moving forward. checkpoint is where the shopkeeper copies records from the blackboard into the ledger, erasing each record from the blackboard once it is in the ledger; in MySQL, this is checkpoint moving forward.
The space between write pos and checkpoint is the empty part of the "blackboard", which can be used to record new operations. If write pos catches up with checkpoint, the "blackboard" is full and no new updates can be performed: MySQL has to stop, erase some records, and push checkpoint forward.
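The interplay of write pos and checkpoint can be sketched as a small ring buffer. This is a simplified illustration, not InnoDB's on-disk format: the CircularRedoLog class is hypothetical, slots hold whole records rather than bytes, and a used counter is an assumption added to tell a full log from an empty one.

```python
class CircularRedoLog:
    """Toy fixed-size circular log with write_pos and checkpoint (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0    # next slot to write
        self.checkpoint = 0   # oldest record not yet applied to the data files
        self.used = 0         # number of records between checkpoint and write_pos

    def is_full(self):
        return self.used == self.capacity

    def append(self, record):
        if self.is_full():
            # write_pos has caught up with checkpoint: make room first.
            self.advance_checkpoint(1)
        self.slots[self.write_pos] = record
        self.write_pos = (self.write_pos + 1) % self.capacity
        self.used += 1

    def advance_checkpoint(self, n):
        """Apply up to n of the oldest records to the data files, then erase them."""
        applied = []
        for _ in range(min(n, self.used)):
            applied.append(self.slots[self.checkpoint])
            self.slots[self.checkpoint] = None
            self.checkpoint = (self.checkpoint + 1) % self.capacity
            self.used -= 1
        return applied


log = CircularRedoLog(capacity=4)
for i in range(4):
    log.append(f"change-{i}")
full_before = log.is_full()
log.append("change-4")   # forces the checkpoint forward by one record
```

After the fifth append, the oldest record has been pushed out and its slot reused, so both positions have wrapped around to slot 1. The forced advance_checkpoint call inside append is the moment where, in the analogy, the shopkeeper has to stop serving and copy entries into the ledger.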
With the redo log, InnoDB can guarantee that even if the database restarts abnormally, previously committed records are not lost. This capability is called crash-safe.
Usage scenarios of redo log
It is used for recovery after a system crash.
Redo log configuration
(1) Buffer size
The default value of innodb_log_buffer_size is 16MB. SQL to view the related configuration:
SHOW GLOBAL VARIABLES LIKE '%innodb_log%';
The result is shown in the figure:
(2) Flush policy
When a transaction commits, the redo log is written and synchronized to disk according to the policy set by innodb_flush_log_at_trx_commit:
0: when a transaction commits, the data in the redo log buffer is not flushed to disk at that moment.
1: when a transaction commits, the log must be flushed to disk, which strictly ensures that no data is lost (the default and recommended policy).
2: when a transaction commits, the log is first written to the OS cache for the disk file, and flushed to disk some time later.
SQL to check the related parameter:
SHOW GLOBAL VARIABLES LIKE '%innodb_flush%';
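What the three redo log flush policies (values 0, 1 and 2 of innodb_flush_log_at_trx_commit) mean for durability can be sketched with a toy model. The function name survives_crash is hypothetical, and the model assumes the crash happens right after the commit, before any background flush has had a chance to run.

```python
def survives_crash(policy, crash):
    """Return True if a just-committed redo record survives the crash.

    policy: 0, 1 or 2, modelling the three flush policies.
    crash:  "mysql" (only the MySQL process dies) or "os" (the whole machine dies).
    Assumption: no background flush ran between the commit and the crash.
    """
    log_buffer, os_cache, disk = ["trx-1"], [], []

    if policy == 1:
        # Written and synced to disk at commit time.
        disk, log_buffer = log_buffer, []
    elif policy == 2:
        # Written to the OS cache at commit time, synced to disk later.
        os_cache, log_buffer = log_buffer, []
    # policy 0: the record is still sitting in the log buffer.

    if crash == "mysql":
        # Process memory is lost, but the OS is alive and will
        # eventually write its cache to disk.
        disk = disk + os_cache
    # crash == "os": anything not already on disk is lost.

    return "trx-1" in disk


results = {
    (policy, crash): survives_crash(policy, crash)
    for policy in (0, 1, 2)
    for crash in ("mysql", "os")
}
```

In this model only policy 1 survives both kinds of crash. Policy 2 survives a crash of the MySQL process but not of the operating system, and policy 0 can lose the record in either case.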
2. Logical log binlog
MySQL as a whole consists of two parts: the Server layer, which handles MySQL's functional logic, and the engine layer, which is responsible for storage-related matters. The redo log is specific to the InnoDB engine, while the Server layer has its own log, called the binlog (archive log).
Why are there two logs?
At first there was no InnoDB engine in MySQL. MySQL's own engine was MyISAM, but MyISAM has no crash-safe capability, and the binlog can only be used for archiving. InnoDB was developed by another company and introduced into MySQL as a plugin. Since the binlog alone cannot provide crash safety, InnoDB uses a separate logging system, the redo log, to implement it.
The binlog belongs to the Server layer of MySQL and is a log module shared by all storage engines. It records the write operations performed on the database; it is written during the transaction commit phase and stored on disk in binary form.
The two logs differ in the following ways:
(1) The redo log is specific to the InnoDB engine. The binlog is implemented in the Server layer of MySQL and can be used by all engines.
(2) The redo log is a physical log that records "what change was made on which data page". The binlog is a logical log that records the original logic of the statement, such as "add 1 to the score field of the row with ID=4".
(3) The redo log is written in a loop, so its fixed space gets used up and reused. The binlog is written by appending: the max_binlog_size parameter sets the binlog file size, and when a file reaches that size, a new file is created to continue the log.
With a conceptual understanding of these two logs, let's look at the internal flow of the executor and the InnoDB engine when executing this simple update statement.
(1) The executor first asks the engine for the row with ID=4. ID is the primary key, so the engine finds the row with a tree search. If the data page containing the row is already in memory, it is returned to the executor directly; otherwise it is first read from disk into memory and then returned. The engine also places an exclusive lock on the row and writes the old value of the row to the undo log (for rollback).
(2) The executor takes the row data from the engine and adds 1 to the value: if it was N, it is now N+1. This yields a new row, and the executor calls the engine interface to write it.
(3) The engine updates the new row in memory and records the update in the redo log, which is now in the prepare state. It then tells the executor that execution is complete and the transaction can be committed at any time.
(4) The executor generates the binlog for this operation and flushes it to the binlog file (on disk) according to the configured policy.
(5) The executor calls the engine's commit-transaction interface, and the engine changes the redo log record it just wrote to the commit state. The update is complete.
The execution flow of the update statement is shown in the flow chart, where yellow boxes indicate steps executed inside InnoDB and green boxes indicate steps executed in the executor.
Looking at the last three steps, the write to the redo log is split into two steps, prepare and commit. This is called "two-phase commit".
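Steps (3) to (5) above can be sketched as a small state machine. This is an illustrative model, not MySQL source code: the Server class, its method names and the crash_after parameter are hypothetical. The recovery rule used here (a record left in the prepare state is committed only if its binlog entry exists, otherwise rolled back) is how the scheme is commonly described; the text above does not spell it out, so treat it as an assumption.

```python
class Server:
    """Toy model of the prepare / binlog / commit sequence (names are assumptions)."""

    def __init__(self):
        self.redo_log = {}   # trx id -> {"change": ..., "state": "prepare" or "commit"}
        self.binlog = []     # trx ids whose logical event reached the binlog

    def update(self, trx_id, change, crash_after=None):
        # Step 3: the engine writes the redo record in the prepare state.
        self.redo_log[trx_id] = {"change": change, "state": "prepare"}
        if crash_after == "prepare":
            return
        # Step 4: the executor writes the binlog.
        self.binlog.append(trx_id)
        if crash_after == "binlog":
            return
        # Step 5: the engine flips the redo record to the commit state.
        self.redo_log[trx_id]["state"] = "commit"

    def recover(self):
        """Decide the fate of each transaction after a restart."""
        committed = []
        for trx_id, record in list(self.redo_log.items()):
            if record["state"] == "commit" or trx_id in self.binlog:
                record["state"] = "commit"    # both logs agree: keep it
                committed.append(trx_id)
            else:
                del self.redo_log[trx_id]     # prepared only: roll back
        return committed


server = Server()
server.update(1, "score+1", crash_after="prepare")  # crash before the binlog write
server.update(2, "score+1", crash_after="binlog")   # crash before the final commit
server.update(3, "score+1")                         # no crash
committed = server.recover()
```

After recovery, the set of committed transactions in the redo log is exactly the set of transactions present in the binlog, which is the consistency the two-phase scheme is meant to provide.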
Two-phase commit
Why is a "two-phase commit" necessary? It keeps the two logs logically consistent.
The binlog records all logical operations and is written by appending. If your DBA promises that the database can be restored to any point within the last half month, then the backup system must hold all binlogs from the last half month, and the whole database is also backed up regularly.
Suppose you need to restore to a specific second, for example because you discover one afternoon that a table was dropped by mistake at noon and you need the data back. You can do this:
First, find the most recent full backup (if you are lucky, it is from last night) and restore it to a temporary database. Then, starting from the time of the backup, take the backed-up binlogs in order and replay them up to the moment just before the table was dropped at noon. The temporary database is now identical to the online database before the mistaken deletion, and you can take the table data out of it and restore it to the online database as needed.
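The restore procedure just described (full backup plus binlog replay up to a chosen moment) can be sketched as follows. This is a toy model: the restore_to function, the event tuple layout, and the integer timestamps are all assumptions made for illustration, and real binlog events are far richer than these.

```python
def restore_to(full_backup, binlog_events, backup_time, stop_time):
    """Rebuild a temporary copy of the data as of just before stop_time.

    binlog_events: list of (timestamp, op, row_id, amount) in append order.
    Supported toy ops: "add" (add amount to a row) and "delete_all".
    """
    temp = dict(full_backup)            # start from the full backup
    for ts, op, row_id, amount in binlog_events:
        if ts <= backup_time:
            continue                    # already contained in the backup
        if ts >= stop_time:
            break                       # stop just before the mistaken operation
        if op == "add":
            temp[row_id] = temp[row_id] + amount
        elif op == "delete_all":
            temp.clear()
    return temp


backup = {4: 98}                        # full backup taken at time 10
events = [
    (5, "add", 4, 1),                   # older than the backup, skipped
    (12, "add", 4, 1),
    (15, "add", 4, 2),
    (20, "delete_all", None, 0),        # the mistaken deletion
]
restored = restore_to(backup, events, backup_time=10, stop_time=20)
```

The restored copy contains both updates made after the backup but not the deletion. The result is only as good as the binlog: an event that is missing from the log, or present but never applied in the original database, makes the restored copy diverge.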
Why do the logs need "two-phase commit"?
The redo log and the binlog are two independent pieces of logic. Without two-phase commit, you would either write the redo log first and then the binlog, or the other way round. Let's see what goes wrong with each approach.
Take the earlier update statement as the example again. Suppose that in the row with ID=4 the current value of the score field is 98.00. What happens if a crash occurs during the update, after the first log has been written but before the second is finished?
(1) Write the redo log first, then the binlog.
Suppose the MySQL process restarts abnormally after the redo log has been written but before the binlog is finished. As said earlier, once the redo log is written the data can be recovered even if the system crashes, so after recovery the value of score in this row is 99.00. But because the crash happened before the binlog was written, the binlog contains no record of this statement, and neither will any binlog backed up later. If you use this binlog to restore a temporary database, the update is missing, so the restored row has the value 98.00, which differs from the original database.
(2) Write the binlog first, then the redo log.
If the crash happens after the binlog is written, the redo log has not been written yet, so the transaction is invalid after crash recovery and the value of score in this row is 98.00. But the binlog already records "change score from 98.00 to 99.00". When you later restore from the binlog, one extra transaction appears and the restored row has the value 99.00, which differs from the original database.
As you can see, without two-phase commit the state of the database may be inconsistent with the state of a database restored from its logs.
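The two failure cases can be reproduced with a few lines of Python. This is a deliberately tiny model: the function crash_between_logs is a hypothetical name, scores are whole numbers for simplicity, and the two logs are reduced to two booleans.

```python
def crash_between_logs(order, score=98):
    """Write two logs one after the other and crash in between.

    order: ("redo", "binlog") or ("binlog", "redo").
    Returns (value in the original database after crash recovery,
             value in a database restored from backup plus binlog).
    """
    redo_written = binlog_written = False

    # Only the first log in the chosen order gets written.
    if order[0] == "redo":
        redo_written = True
    else:
        binlog_written = True
    # The crash happens here: the second log is never written.

    # Crash recovery of the original database is driven by the redo log.
    original = score + 1 if redo_written else score
    # A restored (or replica) database is driven by the binlog.
    restored = score + 1 if binlog_written else score
    return original, restored


redo_first = crash_between_logs(("redo", "binlog"))
binlog_first = crash_between_logs(("binlog", "redo"))
```

With the redo log first, the original database ends at 99 while the restored copy stays at 98; with the binlog first, it is the other way round. In both orders the two databases disagree, which is the inconsistency described above.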
Question: isn't the probability of this very low, since there are few occasions on which a temporary database has to be restored?
This process is not only used to recover data after a mistaken operation. When you need to scale out, that is, build more replicas to increase the read capacity of the system, the common practice is also a full backup plus replaying the binlog. The "inconsistency" would then make the master and slave databases inconsistent online.
In short, both the redo log and the binlog can represent the commit status of a transaction, and two-phase commit keeps these two states logically consistent.
For the InnoDB engine, the binlog is written each time a transaction commits, but at that point the record is still in memory. When is it flushed to disk? MySQL controls the timing of binlog flushing through the sync_binlog parameter, whose value ranges from 0 to N:
0: do not force a flush; the operating system decides when to write to disk.
1: write the binlog to disk on every transaction commit.
N: write the binlog to disk once every N transaction commits.
Note: in MySQL versions before 5.7.7 this value defaults to 0 and relies on the operating system to synchronize buffered data; from MySQL 5.7.7 onward the default is 1. Setting sync_binlog to 1 is recommended, so that the binlog is written to disk on every transaction commit and is not lost after an abnormal restart of MySQL.
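The effect of the three kinds of sync_binlog values can be sketched by counting forced syncs. This is a toy model, not MySQL code: the BinlogWriter class and the run helper are hypothetical, and a real sync to disk is replaced by a counter.

```python
class BinlogWriter:
    """Toy model of sync_binlog (hypothetical class; syncs are only counted)."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.pending = 0       # commits written but not yet forced to disk
        self.fsync_calls = 0   # number of forced syncs so far

    def commit(self):
        self.pending += 1
        # 0 means: never force a sync, leave the timing to the operating system.
        if self.sync_binlog > 0 and self.pending >= self.sync_binlog:
            self.fsync_calls += 1
            self.pending = 0


def run(sync_binlog, commits):
    """Commit the given number of transactions; return (syncs, unsynced commits)."""
    writer = BinlogWriter(sync_binlog)
    for _ in range(commits):
        writer.commit()
    return writer.fsync_calls, writer.pending


never = run(0, 10)
every_commit = run(1, 10)
every_third = run(3, 10)
```

The second number in each result is the count of committed transactions whose binlog entries would be at risk if the machine lost power at that moment. Only the setting 1 keeps it at zero, at the cost of one sync per commit.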
To check the related parameter:
SHOW GLOBAL VARIABLES LIKE '%sync_binlog%';
As shown in the figure:
Binlog usage scenarios
In practice, the binlog has two main uses: master-slave replication and data recovery.
(1) Master-slave replication: binlog is enabled on the master, which sends the binlog to each slave; each slave replays the binlog, keeping the data in the master and slave databases consistent.
(2) Data recovery: use the binlog to recover the data for the time period you need.
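The replication use case can be sketched as a replica that replays the master's logical events from the position it last reached. This is a minimal illustration, not the MySQL replication protocol: the function names, the (row_id, delta) event format, and the integer position are all assumptions.

```python
master = {4: 98}
binlog = []   # append-only list of logical events: (row_id, delta)


def master_update(row_id, delta):
    """Apply a change on the master and record it as a logical event."""
    master[row_id] += delta
    binlog.append((row_id, delta))


def apply_binlog(replica, events, applied_upto):
    """Replay the events the replica has not seen yet; return the new position."""
    for pos in range(applied_upto, len(events)):
        row_id, delta = events[pos]
        replica[row_id] = replica.get(row_id, 0) + delta
    return len(events)


replica = dict(master)   # the replica starts from a full copy of the master
position = 0

master_update(4, 1)
position = apply_binlog(replica, binlog, position)
master_update(4, 2)
position = apply_binlog(replica, binlog, position)
```

Because the replica tracks its position in the log, each event is applied once and the replica converges to the same state as the master.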
That concludes this article on "what is the execution process of a SQL update statement". I hope the content shared here is helpful to you.