In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly explains the "InnoDB line storage format is what", the article explains the content is simple and clear, easy to learn and understand, the following please follow the editor's ideas slowly in depth, together to study and learn "what is the InnoDB line storage format" bar!
In earlier versions of InnoDB, there was only one file format, so you didn't need to name it. With the development of the InnoDB engine, new file formats that are not compatible with previous versions have been developed to support new features. Today we'll take a look at the InnoDB line storage format.
The InnoDB storage engine supports four-name formats: REDUNDANT,COMPACT, DYNAMIC, and COMPRESSED.
Overview of InnoDB line format
REDUNDANT line format
The REDUNDANT format provides compatibility with older versions of MySQL.
The format of REDUNDANT lines is made up of two file formats that support InnoDB (Antelope and Barracuda).
The table using REDUNDANT row format stores the first 768 bytes of variable-length column values (VARCHAR, VARBINARY and, BLOB and TEXT types) in the index record within the B-tree node, and the rest on the overflow page. Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns and can be stored off-page. For example, CHAR (255) if the maximum byte length of the character set is greater than 3, the column can exceed 768 bytes utf8mb4.
If the column has a value of 768 bytes or less, the overflow page is not used and may result in some I / O savings because the value is stored entirely in the B-tree node. This applies to relatively short BLOB column values, but may cause B-tree nodes to populate data instead of key values, reducing its efficiency. A table with many BLOB columns can cause the B-tree node to become too full and contain too few rows, making the entire index less efficient than if the rows are shorter or the column values are stored off the page.
REDUNDANT row format storage properties
The REDUNDANT row format has the following storage features:
Each index record contains a 6-byte header. Headers are used to link contiguous records together and for row-level locking.
Records in a clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte scrolling pointer field.
If no primary key is defined for the table, each clustered index record also contains a 6-byte row ID field.
Each secondary index record contains all primary key columns defined for the clustered key, which are not in the secondary index.
The record contains a pointer to each field of the record. If the total length of the fields in the record is less than 128 bytes, the pointer is one byte; otherwise, two bytes. An array of pointers is called a record directory. The area that the pointer points to is the data part of the record.
Internally, fixed-length character columns, such as CHAR (10), are stored in fixed-length format. Trailing spaces are not truncated from the VARCHAR column. Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns and can be stored off-page. For example, CHAR (255) if the maximum byte length of the character set is greater than 3, the column can exceed 768 bytes utf8mb4.
The SQL NULL value retains one or two bytes in the record directory. If NULL is stored in a variable-length column, the SQL value retains zero bytes in the data portion of the record. For fixed-length columns, the fixed length of the column is retained in the data portion of the record. Keeping fixed space for NULL values allows columns to be updated from NULL non-null values to non-values without causing index page fragmentation.
COMPACT line format
The COMPACT line format reduces row storage space by about 20% compared to the REDUNDANT line format, at the expense of increased CPU use for certain operations. If your workload is a typical workload limited by cache hit ratio and disk speed, the COMPACT format may be faster. If the workload is limited by CPU speed, the compact format may slow down.
The format of COMPACT lines is made up of two file formats that support InnoDB (Antelope and Barracuda).
The table using COMPACT row format stores the first 768 bytes of variable-length column values (VARCHAR, VARBINARY and, BLOB and TEXT types) in the index record within the B-tree node, and the rest on the overflow page.
Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns and can be stored off-page. For example, CHAR (255) can exceed 768 bytes if the maximum byte length of the character set is greater than 3, if the column is a utf8mb4 character type.
If the column has a value of 768 bytes or less, the overflow page is not used and may result in some savings in Icano because the value is completely stored in the B-tree node. This applies to relatively short BLOB column values, but may cause B-tree nodes to populate data instead of key values, reducing its efficiency. A table with many BLOB columns can cause the B-tree node to become too full and contain too few rows, making the entire index less efficient than if the rows are shorter or the column values are stored off the page.
COMPACT row format storage properties
The COMPACT row format has the following storage features:
Each index record contains a 5-byte header, which can precede a variable-length header. Headers are used to link contiguous records together and for row-level locking.
The variable length portion of the record header contains a bit vector indicating the NULL column. If the number of columns in the index can be NULL N, the bit vector occupies bytes. (for example, if you can have any position between 9 and 16 columns, the bit vector uses two bytes. Columns that do not occupy space other than the bits in this vector The variable length section of the title also contains the length of the variable length column. Each length requires one or two bytes, depending on the maximum length of the column. If all the columns in the index are CEILING (* Numplash 8) NULLNULLNOT NULL and have a fixed length, the header does not have a variable length part.
For each non-NULL variable length field, the record header contains a column length of one or two bytes. If part of the column is stored outside the overflow page, or if the maximum length is more than 255 bytes and the actual length is more than 127 bytes, only two bytes are needed. For external storage columns, the 2-byte length represents the length of the internal storage portion plus a 20-byte pointer to the external storage portion. The inner part is 768 bytes, so the length is 768 + 20. The 20-byte pointer stores the true length of the column.
After the record header is the data content of the non-NULL column.
Records in a clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte scrolling pointer field.
If no primary key is defined for the table, each clustered index record also contains a 6-byte row ID field.
Each secondary index record contains all primary key columns defined for the clustered key, which are not in the secondary index. If any primary key column is of variable length, the record header of each secondary index has a variable length portion to record their length, even if a secondary index is defined on the fixed length column.
Internally, for non-variable length character sets, fixed-length character columns (for example, stored in CHAR (10) fixed-length format). Trailing spaces are not truncated from the VARCHAR column.
Internally, for variable-length character sets, such as utf8mb3 and utf8mb4, InnoDB attempts to store in bytes by trimming trailing spaces. If the byte length of the column value exceeds the byte length, adjust the trailing space to the minimum byte length of the column value. The maximum length of the column is the maximum character byte length × CHAR (* N*) NCHAR (* N*) NCHAR (* N*)
N retains the minimum number of bytes. In many cases, keeping the minimum space can complete column updates without causing index page fragmentation. By contrast, when row format is used, columns occupy the maximum character byte length × CHAR (* N*) NCHAR (* N*) NREDUNDANT
Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length fields that can be stored off-page. For example, CHAR (255) can exceed 768 bytes if the maximum byte length of the character set is greater than 3, if the column is a utf8mb4 character type.
COMPACT line format storage feature diagram
MySQL InnoDB COMPAT line format structure
MySQL InnoDB COMPAT line format structure header information
MySQL InnoDB COMPAT line format structure header information description
| | name | size (bit) | description |-|-| reserved bit | 1 | unknown | | reserved bit | 1 | unknown | | delete_flag | 1 | whether the line has been deleted | | min_rec_flag | 1 | 1 | If the record is pre-defined as the smallest record | | n_owned | 4 | number of records owned by the record | | heap_no | 13 | sorted record of the record in the index heap | | record_type | 3 | record type: normal, 001: Btree node pointer, 010: supermum,1xx: reserved | | next_record | 16 | relative position of the next record on the page | | next_record | 16 |
In fact, InnoDB adds three hidden columns to each piece of data, which are
DYNAMIC line format
The DYNAMIC row format provides the COMPACT row format with the same storage features, but adds enhanced storage for long variable-length columns and supports prefixes for large index keys.
The Barracuda file format supports the DYNAMIC line format.
When using the creation table ROW_FORMAT=DYNAMIC,InnoDB, you can store long variable-length column values (for VARCHAR, VARBINARY and, BLOB and TEXT types) completely off-page, and clustered index records contain only 20-byte pointers to the overflow page. Fixed-length fields greater than or equal to 768 bytes are encoded as variable-length fields. For example, CHAR (255) can exceed 768 bytes if the maximum byte length of the character set is greater than 3, if the column is a utf8mb4 character type.
Whether columns are stored off the page depends on the page size and the total size of the rows. When the row is too long, select the longest column for off-page storage until the clustered index record is suitable for the B-tree page. Columns with TEXT and BLOB that are less than or equal to 40 bytes are stored on the line.
The DYNAMIC row format maintains the efficiency of storing the entire row in the index node where it fits (such as COMPACT and REDUNDANT formats), but the DYNAMIC row format avoids the problem of populating B-tree nodes with a large number of long data bytes. The DYNAMIC row format is based on the idea that if part of a long data value is stored on a closed page, it is usually the most efficient way to store the entire value of the closed page. For DYNAMIC format, shorter columns may remain in the B-tree node, minimizing the number of overflow pages required for a given row.
The DYNAMIC row format supports index keys with prefixes of up to 3072 bytes. This feature is controlled by the innodb_large_prefix variable, which is enabled by default. For more information about innodb_large_prefix, see variable descriptions.
Tables that use the DYNAMIC row format can be stored in system tablespaces, per tablespaces, and general tablespaces. To DYNAMIC store tables in the system tablespace, disable innodb_file_per_table and use regular CREATE TABLE or ALTER TABLE statements, or use the TABLESPACE [=] innodb_system table option with CREATE TABLE or ALTER TABLE. When the innodb_file_per_table and innodb_file_format variables do not apply to general tablespaces and are not used, they are applicable to the TABLESPACE [=] innodb_system table option to store DYNAMIC tables in the system tablespace.
DYNAMIC row format storage properties
The DYNAMIC line format is a deviated COMPACT line format.
COMPRESSED line format
COMPRESSED row format provides DYNAMIC row format with the same storage features and functions, but adds support for table and index data compression.
The Barracuda file format supports the COMPRESSED line format.
COMPRESSED row format uses similar internal details to close pages stored in DYNAMIC row format, compressed from additional storage of table and index data and performance considerations, and uses smaller page sizes. Using the COMPRESSED row format, this KEY_BLOCK_SIZE option controls the amount of column data stored in the clustered index and how much is placed on the overflow page.
The COMPRESSED row format supports index keys with prefixes of up to 3072 bytes. This feature is controlled by the innodb_large_prefix variable, which is enabled by default.
COMPRESSED can create tables that use row formatting in either the file table space or the common table space of each table. The system tablespace does not support COMPRESSED row format. To store the COMPRESSED table in the file tablespace of each table, innodb_file_per_table must enable this variable and innodb_file_format must set it to Barracuda. The innodb_file_per_table and innodb_file_format variables do not apply to general table spaces. General tablespaces support all row formats, but it is important to note that compressed and uncompressed tables cannot coexist in the same common tablespace due to different physical page sizes. For more information, see Section 14.6.3.3, "General tablespaces."
COMPRESSED row format storage properties
The COMPRESSED line format is a deviated COMPACT line format. It's just that there are some differences in dealing with row overflow data, instead of storing the first 768 bytes of the string at the real data of the record, all the bytes are stored on other pages, only the addresses of other pages are stored at the real data of the record. In addition, the Compressed row format uses a compression algorithm to compress the page.
Thank you for your reading, these are the contents of "what is the InnoDB line storage format?" after the study of this article, I believe you have a deeper understanding of what the InnoDB line storage format is, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.