[Di B Classroom] is a series of training courses for database developers, database OPS staff and cloud OPS staff by Di B, Senior Product Manager for Tencent Cloud Database, designed to help you go from getting started to proficiency in learning and using databases.
This issue is the second installment of the Di B Classroom special [MySQL Classic Case Analysis Series]. Search and follow the official "Tencent Cloud Database" WeChat account and reply "Di B Classroom" to view the previous ten Di B Classroom tutorials.
I. Starting from a common error
Let's begin the story with a common SQL error message:
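The original post showed the error as a screenshot. Below is a minimal repro sketch of the same class of error (table and column names are hypothetical; three latin1 varchar(22288) columns need 3 x 22290 = 66870 bytes, over the limit):

CREATE TABLE t_row_size_demo (
  c1 VARCHAR(22288),
  c2 VARCHAR(22288),
  c3 VARCHAR(22288)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- ERROR 1118 (42000): Row size too large. The maximum row size for the used
-- table type, not counting BLOBs, is 65535. This includes storage overhead,
-- check the manual. You have to change some columns to TEXT or BLOBs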
I'm sure we have all run into this kind of error many times, and "large data" is an unavoidable topic in production. The data being "large" means not only taking up more storage space overall: it also covers larger storage for a single (table) field, long data retention, heavy data redundancy, bloat caused by poorly separated hot and cold data, access peaks that shift noticeably with hotspots, and storage pressure magnified by complex processing logic. Coming back to the error at hand, let's first take a look at the table structure:
Seeing this, I'm sure readers will think of different remedies. We won't compare the pros and cons of every approach here, and will only describe the two most frequently used ones:
1. Follow the error message's advice: change the two large varchar(22288) columns to text or blob.
2. Based on business characteristics, reduce the varchar storage length, or split the column into several smaller varchar or char columns according to some rule.
Each approach has its own trade-offs. Changing the field to text or blob not only increases the data storage footprint, it also means only prefix or full-text indexes can be built on the column. If the business stores data in JSON format, a native JSON data type is a good choice, since individual sub-fields can be queried and returned. Shrinking or splitting the column, on the other hand, depends heavily on the business scenario and logical requirements: the business's access logic must be modified and the engineering effort evaluated. A sketch of the first approach follows.
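A minimal sketch of the first approach (column names hypothetical; converting a column to JSON assumes the stored values are valid JSON):

-- Option 1: follow the error's advice and move the big columns to TEXT/BLOB
ALTER TABLE t_row_size_demo MODIFY c1 TEXT, MODIFY c2 BLOB;
-- If the payload is JSON, the native JSON type lets you query sub-fields
ALTER TABLE t_row_size_demo MODIFY c3 JSON;
SELECT c3->>'$.device.model' FROM t_row_size_demo;  -- requires MySQL 5.7.13+ for ->>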
II. In-depth exploration
Next, let's dig into some of the confusing concepts around the "65535" size limit.
1. "65535" is not the maximum limit of N in a single varchar (N), but the bytes sum of fields of the non-large field type of the entire table.
Every table (regardless of storage engine) has a maximum row size of 65535 bytes. Storage engines may place additional constraints on this limit, reducing the effective maximum row size.
2. The character set affects the maximum a field can store. For example, UTF8 characters need 3 bytes of storage each, so a VARCHAR(255) CHARACTER SET UTF8 column occupies 255 x 3 = 765 bytes, and a table cannot contain more than 65535 / 765 = 85 such columns. GBK is double-byte, and so on.
3. For variable-length columns, the bytes that record the value's actual length also count toward the field size. For example, a VARCHAR(255) CHARACTER SET UTF8 column needs two additional bytes to store the length information, so the column requires up to 767 bytes. A single such column can store at most 65533 bytes of data, with the remaining two bytes holding the length.
4. BLOB, TEXT and JSON columns differ from varchar, char and similar fields: their length information is stored independently of the row length, so each can genuinely store up to 65535 bytes.
5. Defining columns as NULL-able reduces the maximum number of columns allowed.
For InnoDB tables, NULL and NOT NULL columns take the same storage space.
For MyISAM tables, a NULL column requires extra space to record whether its value is NULL: one extra bit per NULL column, rounded up to the nearest byte. The maximum row length is calculated as follows:
row length = 1 + (sum of column lengths) + (number of NULL columns + delete_flag + 7) / 8 + (number of variable-length columns)
For static tables, delete_flag = 1: the table marks whether a row has been deleted with a bit in the row.
For dynamic tables, delete_flag = 0: the flag is stored at the beginning of the dynamic row.
6. InnoDB allows at most 1000 columns in a single table.
7. A varchar primary key only supports fields of no more than 767 bytes, i.e. 768/2 = 384 double-byte characters or 767/3 = 255 three-byte characters; GBK is double-byte and UTF8 is three-byte.
8. Different engines place different restrictions on indexes:
For innodb, each indexed column cannot exceed 767 bytes, and the columns making up a composite index cannot sum to more than 3072 bytes.
For myisam, each indexed column cannot exceed 1000 bytes, and the columns making up a composite index also cannot sum to more than 1000 bytes.
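A quick sketch exercising points 1, 3 and 5 above (latin1, one byte per character; table names hypothetical): 65533 bytes of data plus 2 length bytes exactly fills the 65535-byte row limit, and making the column NULL-able is expected to push it over.

CREATE TABLE t_max1 (v VARCHAR(65533) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- succeeds: 65533 data bytes + 2 length bytes = 65535
CREATE TABLE t_max2 (v VARCHAR(65534) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- ERROR 1118: Row size too large
CREATE TABLE t_max3 (v VARCHAR(65533) NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- also expected to fail: the NULL bitmap costs one more byte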
III. A real failure
Now for the business failure encountered today. The online service produced a flood of the following errors, leaving the program unable to write data:
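The error appeared as a screenshot in the original post; judging from the analysis that follows, it was presumably InnoDB's classic row-overflow message, along these lines:

-- ERROR 1118 (42000): Row size too large (> 8126). Changing some columns to
-- TEXT or BLOB or using ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED may help.
-- In current row format, BLOB prefix of 768 bytes is stored inline.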
Following the hint and normal intuition, our first reaction was that the business had one of the following problems:
1. The fields defined in the table structure exceed the limit.
2. The data inserted into some field exceeds that field's configured maximum length.
Then look at the database table structure of the business, as follows:
The first cause was quickly ruled out: the business error does not occur when the table is created. If the non-large fields in the table summed past 65535, the CREATE TABLE itself would have failed, whereas the business hits the error on writes. The table structure also shows a large number of mediumblob fields, and the non-large fields sum to far less than 65535.
Then, checking the specific SQL the business provided, the non-large fields appversion, datadata, elt_stamp and id do not exceed the limit either. A mediumblob field can store up to 16MB, and the business's data is nowhere near that order of magnitude. Changing the non-large fields appversion, datadata, elt_stamp and id to blob, as the error message suggests, still did not resolve it (per the earlier analysis, it was never the root cause).
After calming down, I noticed a detail that had been overlooked: the business's failure rate was not 100%, so some requests were succeeding. Comparing the successful and failed SQL showed that the difference in data volume was again in the mediumblob fields. The first thought then was whether the max_allowed_packet parameter had been lowered, so that any single request over that size was being rejected. The configured value turned out to be 1GB (as shown in the figure below), and the SQL's data length was far below that, so this cause was ruled out too.
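Checking that limit is a one-liner (the 1GB figure is from this case):

SHOW VARIABLES LIKE 'max_allowed_packet';  -- 1073741824 (1GB) here, far above the failing SQL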
With the common problems basically ruled out, let's look at the limit imposed by another parameter: innodb_page_size. Its default is 16K, and each page must hold at least two rows of data, so a single row can carry at most about 8K of data.
Seeing that the table's Row_format is Compact, we can infer the cause of the problem: InnoDB's default Compact row format stores the first 768 bytes of each blob field in the data page, so once there are enough blob columns, the single-row size exceeds 8K and the write errors out. Comparing the successful and failed SQL corroborated this inference. So how do we solve the problem?
1. Split the table on the business side, storing the large fields in a separate table.
2. Change the Row_format storage format.
Because the single table does not hold many rows and the business logic is not amenable to splitting, we chose to solve the problem through Row_format.
Under the Barracuda file format there are two newer row record formats, Compressed and Dynamic. Both use full row overflow for storing BLOB data: only a 20-byte pointer is kept in the data page, and the actual data lives in BLOB pages. An additional feature of the Compressed row format is that its stored data is compressed with the zlib algorithm.
The relevant change operations are relatively simple:
Modify the MySQL global variable:
SET GLOBAL innodb_file_format='Barracuda';
Then smoothly change the original table's row format:
ALTER TABLE tablename ROW_FORMAT=COMPRESSED;
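One caveat worth a sketch: the Barracuda row formats only apply to tables stored in their own .ibd files, so innodb_file_per_table must be on (the table name below is hypothetical):

SET GLOBAL innodb_file_format = 'Barracuda';
SET GLOBAL innodb_file_per_table = 1;          -- Barracuda formats require file-per-table tablespaces
ALTER TABLE t_business ROW_FORMAT=COMPRESSED;
SHOW TABLE STATUS LIKE 't_business'\G          -- verify Row_format: Compressed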
IV. Further study
This case surfaces two points worth studying in depth:
1. About innodb_page_size
Starting with MySQL 5.6, innodb_page_size can set the InnoDB data page to 8K (the default is 16K). The parameter must be added to my.cnf before the instance is initialized; if tables have already been created and the value is then changed, MySQL will report an error at startup.
So what if you wanted to change this value before version 5.6? The only option was to patch the source code and rebuild MySQL.
UNIV_PAGE_SIZE is the data page size, 16K by default, and must be a power of two; it can be set to 4K, 8K, 16K, 32K or 64K. When changing UNIV_PAGE_SIZE you must also change UNIV_PAGE_SIZE_SHIFT so that 2 to the power UNIV_PAGE_SIZE_SHIFT equals UNIV_PAGE_SIZE (for example, a 16K page corresponds to a shift of 14, since 2^14 = 16384).
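On 5.6 and later the source patch is unnecessary; set innodb_page_size=8K in my.cnf before initialization, then verify from SQL (a sketch):

SHOW VARIABLES LIKE 'innodb_page_size';  -- 16384 by default; 8192 after an 8K initialization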
Now let's look at the impact on MySQL performance when innodb_page_size is set to different values. The table tested contained 100 million records, with a file size of 30GB.
① read-write scenario (50% read, 50% write)
At 16K, CPU pressure is lower, averaging around 20%; at 8K, CPU pressure rises to roughly 30%-40%, but select throughput is higher than at 16K.
② read scenario (100% read)
There is no significant difference between 16K and 8K.
Managing pages in the InnoDB buffer pool has its own cost: for the same total size, the more pages there are, the longer the management lists become. So when rows are inherently long (bulk inserts of large values), larger pages favor speed: one page fits more rows, each IO writes a larger chunk, and fewer IOPS can write the same amount of data. When the row length exceeds 8K on a 16K page, some string columns are forcibly converted to TEXT and the string body is moved to an overflow (extension) page, which means reading that column costs an extra IO. A larger page supports a longer row: a 64K page can hold roughly a 32K row without resorting to overflow pages. For short rows under random reads and writes, however, such a large page is unsuitable; IO efficiency drops, because each large IO reads only a small useful portion.
2. About Row_format
The InnoDB storage engine stores records in rows. Prior to InnoDB 1.0.x, it provided the Compact and Redundant formats for storing row record data. The InnoDB plugin in MySQL 5.1 introduced a new file format, Barracuda, which carries two new row formats, Compressed and Dynamic; Compact and Redundant are collectively called Antelope. You can view the row format used by the current table with SHOW TABLE STATUS LIKE 'table_name';, where the Row_format column shows the row record format currently in use.
In MySQL 5.6 the default is Compact; from MySQL 5.7.9 onward, the default row format is determined by the innodb_default_row_format variable, whose default value is DYNAMIC. You can also specify ROW_FORMAT=DYNAMIC at CREATE TABLE time (and the table's storage format can be adjusted later). To change an existing table's row format to Compressed or Dynamic, you must first set the file format to Barracuda (SET GLOBAL innodb_file_format=Barracuda;), then run ALTER TABLE tablename ROW_FORMAT=COMPRESSED; to make it take effect; otherwise the change is silently invalid, with no warning.
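A quick sketch of checking the default and a table's actual format (table name hypothetical):

SHOW VARIABLES LIKE 'innodb_default_row_format';  -- DYNAMIC on 5.7.9+
SHOW TABLE STATUS LIKE 'tablename'\G              -- Row_format column shows the format in use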
① compact
If a blob column value is longer than 768 bytes, the first 768 bytes are still kept on the data page, while the rest is placed on an overflow page (off-page), as shown in the following figure:
The blob and variable-length field types mentioned above include blob, text and varchar; a varchar column value also spills to an overflow page once its length exceeds some value N. Under the latin1 character set, N can be derived as follows: InnoDB's block size defaults to 16KB, and because an InnoDB table is an index-organized table whose leaf nodes form a doubly linked list, each page must hold at least two rows of records. This caps a single row at roughly 8K, and subtracting the bytes occupied by the other column values yields N.
② compressed or dynamic
These use full row overflow for blobs: the clustered index record (data page) keeps only a 20-byte pointer to the address of the overflow segment where the data is actually stored:
Under the Dynamic row format, whether a column is stored off-page depends mainly on the row size: the longest columns in the row are moved off-page until the data page can hold the next two rows. TEXT/BLOB columns of 40 bytes or less are always stored in the data page.
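A closing sketch contrasting the two behaviors (table name hypothetical; on 5.6 the ALTER also needs the Barracuda steps shown earlier):

CREATE TABLE t_blob_demo (id INT PRIMARY KEY, payload MEDIUMBLOB) ROW_FORMAT=COMPACT;
-- Compact keeps a 768-byte prefix of each blob value in the clustered index page
ALTER TABLE t_blob_demo ROW_FORMAT=DYNAMIC;
-- Dynamic/Compressed keep only a 20-byte pointer in-page and move the whole value off-page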