How MySQL looks at InnoDB Line format from binary content 07/19 Update SLTechnology News&Howtos

How MySQL looks at InnoDB Line format from binary content

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "MySQL how to look at InnoDB line format from binary content". In daily operation, I believe many people have doubts about how MySQL looks at InnoDB line format from binary content. Xiaobian consulted all kinds of data and sorted out simple and easy-to-use methods of operation. I hope it will be helpful for you to answer the question of "MySQL how to see InnoDB line format from binary content". Next, please follow the editor to study!

InnoDB is a storage engine that stores data from tables to disk, so even if we shut down and restart our data, it still exists. The real process of processing data takes place in memory, so you need to load the data from the disk into memory, and if you are dealing with write or modify requests, you need to flush the contents of the memory to the disk. And we know that the speed of reading and writing to disk is very slow, which is several orders of magnitude lower than that of memory, so when we want to get some records from the table, does the InnoDB storage engine need to read the records off the disk one by one?

InnoDB takes the approach of dividing the data into several pages, using pages as the basic unit of interaction between disk and memory. The page size in InnoDB is generally 16KB. That is, in general, the contents of 16KB are read from disk to memory at least at one time, and the contents of 16KB in memory are flushed to disk at least at a time.

We usually insert data into the table in units of records, and the way these records are stored on disk is also called row format or record format. The InnoDB storage engine designs four different types of row formats, namely, Compact, Redundant, Dynamic, and Compressed.

Classification and introduction of row record format

In earlier versions of InnoDB, there was only one file format, so you didn't need to name it. With the development of the InnoDB engine, new file formats that are not compatible with previous versions have been developed to support new features. To help manage system compatibility in the event of upgrades and downgrades, and to run different MySQL versions, InnoDB began to use named file formats.

In msyql 5.7.9 and later, the default line format is determined by the innodb_default_row_format variable, whose default value is dynamic:

View the row format used by the current table:

Mysql > show table status like 'dept_emp'\ gateway * 1. Row * * Name: dept_emp Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 331570 Avg_row_length: 36 Data_length: 12075008Max _ data_length: 0 Index_length: 0 Data_free: 0 Auto_increment: NULL Create_time: 2021-08-11 09:04:36 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment:1 row in set (2021 sec)

Specify the row format of the table:

CREATE TABLE table name (column information) ROW_FORMAT= row format name ALTER TABLE table name ROW_FORMAT= row format name

If you want to change the row mode of an existing table to compressed or dynamic, you must first set the file format to Barracuda:set global innodb_file_format=Barracuda;, and then modify it with ALTER TABLE tablename ROW_FORMAT=COMPRESSED; to take effect.

Line format COMPACT

Variable length field list

MySQL supports some variable length data types, such as VARCHAR (M), VARBINARY (M), various TEXT types, and various BLOB types. We can also call columns with these data types variable length fields. The number of bytes of data stored in variable length fields is not fixed, so we need to save the number of bytes occupied by these data when storing real data. If the maximum number of bytes allowed to be stored in the variable field (M × W) exceeds 255 bytes and the actual number of bytes stored (L) exceeds 127 bytes, a 2-byte record is used, otherwise a 1-byte record is used.

Question 1: so why use 128 as the dividing line? A byte can represent up to 255, but when MySQL designs a length representation, in order to distinguish whether a byte represents the length, it is specified that if the highest bit is 1, then two bytes represent the length, otherwise it is one byte. For example, 01111111, this represents a length of 127. if the length is 128, you need two bytes, that is, 10000000 10000000, the highest bit of the first byte is 1, then this is the beginning of the length, the second byte can represent the length with all bits, and it should be noted that MySQL adopts the Little Endian counting method, with the low bit in front and the high bit in the back, so 129is 10000001 10000000. At the same time, the maximum length of this identification method is 2 ^ 15-1mm 32767, that is, 32KB.

Question 2: what if two bytes are not long enough? The innoDB page size defaults to 16KB. For some fields that take up a large number of bytes, for example, if the length of a field is greater than 16KB, if the record cannot be stored on a single page, InnoDB will store part of the data in the so-called overflow page, and only the length left in this page can be stored in the variable length field length list, so you can store it with two bytes. This overflow page mechanism refers to the following data overflow.

NULL values list

Some columns in the table may store null values, and it will take up a lot of space if these null values are stored in the real data of the record, so the Compact row format uniformly manages the columns whose values are NULL and stores them in the null values list. Each column that is allowed to store NULL corresponds to a binary bit, and if the value of the binary bit is 1, the value representing the column is NULL. When the value of the binary bit is 0, the value representing the column is not NULL.

Record header information

Used to describe the record header information of a record, which consists of a fixed 5 bytes. 5 bytes is 40 binary bits, and different bits mean different things.

Field length (bit) indicates that reserved bit 11 does not use reserved bit 21 does not use delete_mask1 to mark whether the record has been deleted the minimum record in each layer of the non-leaf node of the min_rec_mask1B+ tree will add the tag n_owned4 indicates the number of records owned by the current record heap_no13 represents the location information of the current record on the page record_type3 represents the type of current record, 0 represents the general record, 1 indicates the B+ tree non-leaf node record 2 represents the minimum record, 3 indicates the maximum record next_record16 represents the relative position hidden column of the next record

The real data of the record in addition to the data of the columns we define, MySQL adds some columns (also known as hidden columns) by default for each record, including:

DB_ROW_ID (row_id): optional, 6 bytes, representing line ID, which uniquely identifies a record

DB_TRX_ID: must be, 6 bytes, representing transaction ID

DB_ROLL_PTR: must be, 7 bytes, indicating rollback pointer

The primary key generation strategy of the InnoDB table is to give priority to using the user-defined primary key as the primary key, and if the user does not define the primary key, select a Unique key as the primary key. If there is no definition of the Unique key in the table, InnoDB will add a hidden column named row_id as the primary key by default.

DB_TRX_ID (also known as trx_id) and DB_ROLL_PTR (also known as roll_ptr) are required, but row_id is optional (this column will only be added without a custom primary key and Unique key).

Other line formats are not much different from Compact line formats.

Redundant line format

The Redundant line format is a line format that was used before MySQL5.0 and will not be delved into.

Dynamic line format

The default row format for MySQL5.7 is that the Dynamic,Dynamic row format is similar to the Compact row format, except that it is different when dealing with row overflow data.

Compressed line format

The Compressed line format uses a compression algorithm to compress the page based on the Dynamic line format to save space. Zlib's algorithm is used for compression, so long-length data such as BLOB, TEXT and VARCHAR can be effectively stored (reduced by 40%, but with higher requirements for CPU).

Data overflow

If we define a table, there is only one VARCHAR field in the table, as follows:

CREATE TABLE test_varchar (c VARCHAR (60000))

Then insert 60000 characters into this field, what happens? As mentioned earlier, the basic unit of disk-memory interaction in MySQL is pages, that is, MySQL manages storage space on a page-based basis, and our records are allocated to a page for storage. The size of a page is generally 16KB, that is, 16384 bytes, while a column of type VARCHAR (M) can store up to 65532 bytes, which may cause a page not to hold a record.

In Compact and Redundant row formats, for columns that take up a lot of storage space, only the first 768 bytes of the column's data are stored at the real data of the record, and then the remaining data is scattered over several other pages, and the addresses pointing to these pages are stored at the real data of the record with 20 bytes (20 bytes after 768 bytes). This process is also called line overflow, and pages that store more than 768 bytes are also known as overflow pages.

The Dynamic and Compressed row formats do not store the first 768 bytes of the real data of the field at the real data of the record, but store all the bytes in other pages and only store the addresses of other pages at the real data of the record.

Actual combat analysis line format

Prepare tables and data:

Create table row_test (T1 varchar (10), T2 varchar (10), T3 char (10), T4 varchar (10)) engine=innodb charset=latin1 row_format=compact;insert into row_test values

In the Linux environment, use hexdump-C-v mytest.ibd > mytest.txt, open the mytest.txt file, and find the following:

0000c070 73 75 70 72 65 6d 75 6d 03 02 01 0000 00 100 | supremum. | 0000c080 2c 0000 0000 02 0000 0000 0f 61 c8 0000 | .a. | 0000c090 01 d4 01 10 61 62 62 62 20 20 20. Abbbb | 0000c0a0 20 63 63 63 03 02 01 0000 00 18 00 2b 0000 00 | ccc.+... | 0000c0b0 00 02 01 0000 0000 0f 62 c9 0000 01 b2 01 10 | .b. | 0000c0c0 64 65 65 65 20 20 20 66 66 66 | deeee fff | 0000c0d0 03 01 06 0000 20 ff 98 0000 0000 02 0000 |. | | 0000c0e0 0000 0f 67 cc 0000 01 b6 01 10 64 66 66 00 |. G.dfff. |

The record of this line starts with 0000c078, and the first line is sorted out as follows:

03 02 01 / / variable length field length list, in reverse order, T4 column length 3 Magi T2 column length 2 Magi T1 column length 100 / / NULL flag bit, the first row has no null value 00 00 10 00 2c / / record header information, fixed 5 byte length 00 00 00 2b 68 00 / / RowID the table we built has no primary key, so there will be RowID, fixed 6 byte length 00 00 00 06 / transaction ID Fixed 6 bytes 80 00 00 00 32 01 10 / / rollback pointer, fixed 7 bytes 61 / / T1 data 'bb'63 62 62 / / t2'bb'62 62 20 20 20 / / T3 data' bb'63 63 / / T3 data 'ccc'

The second line is sorted out as follows:

03 02 01 / / variable length field length list, in reverse order, T4 column length 3 Magi T2 column length 2 Magi T2 column length 100 / / NULL flag bit, second row no null value 00 00 18 00 2b / / record header information, fixed 5 byte length 00 00 00 02 01 / / RowID the table we built has no primary key, so there will be RowID, fixed 6 byte length 00 00 00 0f 62 / transaction ID Fixed 6-byte c9 00 01 b2 01 10 / / rollback pointer, fixed 7-byte 64 / / T1 data 'dwatches 65 65 / / T2 data' ee'65 65 20 20 20 / T3 data 'ee'66 66 / / T3 data' fff'

The third line is sorted out as follows:

03 01 / / variable length field length list, reverse order, T4 column length 3 NULL column length 106 / / 00000110 NULL flag bits, T2 and T3 column length 0000 20 ff 98 / / record header information, fixed 5 byte length 0000 0000 02 / RowID the table we built has no primary key, so there will be RowID, fixed 6 byte length 0000 0000 0f 67 / transaction ID, fixed 6 byte cc 000001 b60110 / rollback pointer Fixed 7-byte 64 / / T1 data 'dink 66 66 / / T4 data' fff'

Next, update the data:

Mysql > update row_test set t2=null where t1boxes, query OK, 1 row affected (0.02 sec) Rows matched: 1 Changed: 1 Warnings: 0mysql > delete from row_test where t2conversation strategies: query OK, 1 row affected (0.01sec)

Check the binary content (it will take a while, it may only be written to the cache, and the files on disk have not been updated):

0000c070 73 75 70 72 65 6d 75 6d 03 01 02 0000 10 00 58 | supremum.X | 0000c080 0000 02 0000 0000 00 f 68 4d 0000 01 | HM... | 0000c090 9e 04 a9 61 62 20 20 20 63 | abb cc | 0000c0a0 63 63 63 03 02 00 20 00 18 0000 0000 00 | cccc.... | 0000c0b0 00 02 01 0000 0000 0f 6a 4e 0000 01 9f 10 c0 | .jN. | 0000c0c0 64 65 65 65 20 20 20 66 66 | deeee fff | 0000c0d0 03 01 06 0000 20 ff 98 0000 0000 02 02 0000 |. | | 0000c0e0 0000 0f 67 cc 0000 01 b6 01 10 64 66 66 00 |. G.dfff. |

The record of this line starts with 0000c078, and the first line is sorted out as follows:

03 01 / / variable length field length list, reverse order, T4 column length is 3 NULL / / 0000 0010 NULL flag bits, indicating that T2 is null00 0010 00 58 / / record header information, fixed 5-byte length 0000 0000 02 00 / / RowID our table has no primary key, so there will be RowID, fixed 6-byte length 0000 0000 0f 68 / / transaction ID, fixed 6 bytes 4d 0000 01 9e 04 a9 / rollback pointer Fixed 7 bytes 61 / / T1 data 'asides 62 20 20 20 / / T3 data' bb'63 63 / / T1 data 'ccc'

The second line is sorted out as follows:

03 02 01 / / variable length field length list, in reverse order, T4 column length 3 Magi T2 column length 2 Magi T1 column length 100 / / NULL flag bit, second row no null value 20 00 18 00 00 / / 0010 delete_mask=1 mark whether the record is deleted header information, fixed 5 byte length 00 00 01 / RowID the table we built has no primary key, so there will be RowID, fixed 6 byte length 00 00 00 0f 6a / transaction ID Fixed 6-byte 4e 00 00 01 9f 10 c0 / / rollback pointer, fixed 7-byte 64 / / T1 data 'dwatches 65 65 / / T2 data' ee'65 65 20 20 20 / / T3 data 'ee'66 66 / / T3 data' fff'

The third row of data has not changed.

At this point, the study on "how MySQL looks at InnoDB line format from binary content" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.