In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
Editor to share with you the example analysis of multi-version concurrency control of large objects in MySQL. I believe most people don't know much about it, so share this article for your reference. I hope you will gain a lot after reading this article. Let's learn about it together.
MVCC of large objects in MySQL 8.0:InnoDB
In this article, I will explain the multi-version concurrency control (MVCC) of the large object (LOB) design in the MySQL InnoDB storage engine. MySQL 8.0 has a new feature that allows users to partially update large objects, including JSON documents. With this partial update feature, the way MVCC works with LOB has changed when LOB is partially updated. For normal updates (full updates), MVCC will work like previous versions. Let's take a look at how MVCC works when partial updates are not involved, and then consider the use case for partial updates to LOB.
General updates to MVCC
I use the term regular updates to refer to updates that are not partial updates. I'll use an example to explain how MVCC can be used to routinely update large objects. I will use the following mtr (1) test case for this purpose:
Create table T1 (F1 int primary key, f2 longblob) engine = innodb; insert into T1 values (1, repeat ('a', 65536)); start transaction; update T1 set f2 = repeat ('b', 65536) where F1 = 1;-- echo # Connection con1:-- for users using MySQL clients, you may need to establish a new link by opening another terminal window, as shown below. Connect (con1, localhost, root,);-- echo # Must see the old value 'aaaaaaaaaa' select F1, right (f2,10) from T1 order by F1;-- echo # Connection default: connection default; disconnect con1; commit; drop table T1
In order to understand the following explanation, it is important to understand the above test cases carefully.
The test scenario is as follows:
Initially, table T1 contains a single record (R1).
The transaction trx1 updates the record to the new value.
While trx1 is still active, another transaction trx2 is reading the record. It will read the old value.
Table T1 contains only one record (R1). But trx1 and trx2 will see two different values. The table actually contains only the latest values (values seen by trx1), while the values or records seen by trx2 are obtained from undo logging. Let's look at the picture below to better understand it.
Initial status: before the update operation
The following figure shows what happened before the update operation. The undo log is empty. The clustered index of the table contains one row. There is a LOB in the table. The clustered index record contains a reference to the LOB.
Final status: after the update operation
Now let's look at what happens after the update operation.
Here are some important observations:
There are two LOB in the user table space-the old LOB and the new LOB. The old LOB can only be accessed by undoing the log. The clustered index record points to the new LOB.
The update operation has created an undo log record that contains the update vector. This undo logging points to the old LOB.
Clustered index records point to undo logging through the DB_ROLL_PTR system column. This scroll pointer points to undo logging, which can be used to build previous versions of clustered index records.
The undo record does not contain the LOB itself. Instead, it contains only a reference to the LOB stored in the user's tablespace.
LOB references stored in undo log records are different from LOB references stored in clustered index records.
The steps that the transaction takes in connection 1 are as follows:
The transaction looks at R1 and determines that the transaction that modified the clustered index record has not been committed. This means that it cannot read the record (because the default isolation level is REPEATABLE READ).
It looks at the DB_ROLL_PTR in R1 and finds the undo logging. Use undo logging to build a previous version of R1.
It reads the old version of R1 that was built. Note that this version is not available in clustered index records. But it uses undo records to build on the fly.
When R1 points to the new LOB, the old version of this construct points to the old LOB. So the result contains the old LOB.
This is how LOB's MVCC works when partial updates are not involved.
MVCC partial update
Let's look at another example to see how MVCC works with partial updates. We need another example, because currently only partial updates to JSON documents are supported through the functions json_set () and json_replace ().
Create table T2 (F1 int primary key, j json) engine = InnoDB; set @ elem_a = concat ('", repeat ('a', 200),'"); set @ elem_a_with_coma = concat (@ elem_a,','); set @ json_doc = concat ("[", repeat (@ elem_a_with_coma, 300), @ elem_a, "]") Insert into T2 (F1, j) values (1, @ json_doc); start transaction; update T2 set j = json_set (j,'$[200]', repeat ('b', 200)) where F1 = 1;-- echo # Connection con1: connect (con1, localhost, root,);-- echo # Must see the old value 'aaaaaaaaaa...' Select json_extract (j,'$[20000]) from T2;-- echo # Connection default: connection default; disconnect con1; commit
This scenario is the same as the previous example. Only the longblob field has been changed to an JSON document. The data loaded is also slightly different to conform to the JSON format.
Tip: you can print the LOB index in the server log file by adding the statement set debug ='+ d _ innocent _ index _ print'in the above mtr test case (of the two). The LOB index will print as soon as it is inserted. The LOB index provides you with the structure of the stored LOB object.
Before a partial update operation
The initial conditions before the full or partial update operation are the same and have been given above. But in the following figure, some additional information is provided.
Let's take a look at the other information shown in the figure:
LOB references stored in clustered index records now contain the LOB version number v1. During the initial insert operation, it is set to 1 and incremented each time it is partially updated.
Each LOB data page has an entry in the LOB index. Each entry contains LOB version information. Whenever an LOB data page is modified, it is copied to a new LOB data page with new data, and a new LOB index entry with an incremental LOB version number is created.
The additional information is the LOB version number. This is available in the LOB reference in the clustered index record and in each entry in the LOB index.
After a partial update operation
The following figure illustrates what happens after a partial update operation.
The most important optimization here is that there is still only one LOB in the user tablespace. Update only those LOB data pages that need to be modified. This single LOB after a partial update operation contains both the old version and the new version of the LOB. The v1 and v2 tags on the LOB data page in the figure illustrate this.
Another important observation is that LOB references in undo logs and clustered index records point to the same LOB. However, LOB references contain different version numbers. The LOB reference in the undo log record contains v1 (the old version number) and the LOB reference in the clustered index record contains the new version number v2.
Purpose of the LOB version number
As shown above, different LOB references with different version numbers point to the same LOB. A single LOB contains parts from different versions. The LOB version number is used to get the correct version that various LOB references point to. In this section, we will learn how to do this.
The LOB index contains a list of the LOB pages that make up the LOB. It contains the page number of the LOB data page, the amount of data contained in each LOB data page, and the version number. Each node in this list is called an LOB index entry. Each LOB index entry contains a list of older versions. Let's look at a diagram that illustrates the structure of the update test case in the above section.
Initially, the LOB index contains a total of four entries before completing the partial update. The page numbers of the four entries are 5, 6, 7 and 8. No LOB index entries have older versions. The version number of all four entries is 1.
After the partial update is complete, we notice that page number 9 has replaced page number 7, which is now considered the old version of page number 9. The version number of page 9 is 2, and the version number of page 7 is 1.
After the partial update is complete, when the LOB is accessed through the LOB reference with version number 1, the first index entry on page 5 is viewed. Its version number is 1. If the version number in the index entry is less than or equal to the version number in the LOB reference, the entry is read. Therefore, page 5 will be read. The index entry with page number 6 will then be viewed. Its version number is 1, so it will be read. The index entry with page number 9 will then be viewed. Its version number is 2. But the version number referenced by lob is 1. 0. If the version number in the index entry is greater than the version number in the LOB reference, the entry is not read. Because the entry for page 9 has version 2, the old version will be viewed. The index entry with page number 7 will be checked. Its version number is 1, so it will be read. After that, the index entry with page number 8 is checked. Its version number is 1, so it will also be read. This is the way to access older LOB.
After the partial update is complete, when the LOB is accessed through the LOB reference with version number 2, the first index entry on page 5 is viewed. Its version number is 1. If the version number in the index entry is less than or equal to the version number in the LOB reference, the entry is read. So it will read the page number 5, 6, 9, 8 in order. Because the version number is always
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.