2025-04-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article covers MySQL's logical layering, storage engines, SQL optimization, index optimization and the underlying index implementation. These topics come up often in real-world work, so the sections below walk through each of them in turn.
First, logical layering
Connection layer: connection handling and threading. This layer is not unique to MySQL; most systems built on a C/S architecture have similar components for connection handling, authorization and authentication, security and so on.
Service layer: the query cache, parser and optimizer. This is the core of MySQL: it parses and optimizes SQL statements, looks up the query cache (removed in MySQL 8.0), and implements the built-in functions (date, time, encryption, etc.).
Engine layer: responsible for storing and retrieving data. Different storage engines use different storage methods, data formats and retrieval methods, so this layer strongly affects storage and retrieval performance; it is an abstraction over the storage layer.
Storage layer: stores the data on the file system.
Second, storage engine
View the storage engines supported by the database: show engines
To see which engine the database uses by default: show variables like '%storage_engine%'
Specify the engine for a database object:
create table tb (id int(4) auto_increment, name varchar(5), dept varchar(5), primary key (id)) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
View the table creation statement: show create table default_table
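As a small sketch of how the commands above fit together, the statements below check the engine of an existing table and convert it. They assume the tb table from the CREATE TABLE example exists in the current database; the conversion step is an illustration, not something the original article prescribes.

```sql
-- Check which engine the table currently uses (see the Engine column).
SHOW TABLE STATUS LIKE 'tb';

-- Convert the table to InnoDB. The table is rebuilt, so this can be slow on large tables.
ALTER TABLE tb ENGINE = InnoDB;

-- Alternative check through information_schema.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'tb';
```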
How to choose between the MyISAM and InnoDB storage engines?
MyISAM and InnoDB are not the only storage engines in MySQL, but they are the most commonly used. Some webmasters may never have paid attention to the storage engine, yet it is an important part of database design. So which storage engine should a blog system use?
Let's take a look at the differences between the two storage engines.
1. InnoDB supports transactions and MyISAM does not, which is very important. A transaction groups a series of inserts, updates and deletes so that if any one of them fails, the whole group can be rolled back; MyISAM cannot do this.
2. MyISAM suits applications dominated by queries and inserts; InnoDB suits applications with frequent modifications or higher safety requirements.
3. InnoDB supports foreign keys, but MyISAM does not.
4. MyISAM was the default engine before MySQL 5.5, and InnoDB had to be specified explicitly; from MySQL 5.5 onward, InnoDB is the default.
5. InnoDB did not support FULLTEXT indexes before MySQL 5.6.
6. InnoDB does not store the row count of a table, so for select count(*) from table it has to scan the table (or an index) to count the rows, whereas MyISAM simply reads the stored row count. Note that MyISAM also has to scan when the count(*) statement contains a where condition.
7. For an AUTO_INCREMENT column, InnoDB requires the column to be the first or only column of some index, whereas in a MyISAM table it can be a later column of a composite index built with other fields.
8. When emptying an entire table with delete from table, InnoDB deletes the rows one by one, which is very slow, whereas MyISAM rebuilds the table.
9. InnoDB supports row locks, but in some cases it locks far more than the matching rows, effectively the entire table, for example update table set a=1 where user like '%lee%'.
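The transaction difference in point 1 can be sketched as follows. The account table and its values are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE account (
  id INT PRIMARY KEY,
  balance INT NOT NULL
) ENGINE = InnoDB;

INSERT INTO account VALUES (1, 100), (2, 100);

START TRANSACTION;
UPDATE account SET balance = balance - 50 WHERE id = 1;
UPDATE account SET balance = balance + 50 WHERE id = 2;
-- Something went wrong: undo both updates together.
ROLLBACK;

-- With InnoDB, both rows are back to their original balance.
-- With ENGINE = MyISAM, the ROLLBACK would not undo the updates.
SELECT id, balance FROM account;
```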
Given these nine differences and the characteristics of a personal blog, MyISAM is recommended for a personal blog system: the main operations in a blog are reads and inserts, with few chained operations, so a blog on MyISAM can serve pages more efficiently than one on InnoDB. Of course, this is only personal advice; choose carefully according to your actual situation.
Third, sql optimization
3.1.1 How MySQL implements indexes internally (B+Tree)
3.1.1.1, binary tree
3.1.1.2, B-Tree
Each node in a B-Tree can hold many keywords and branches, depending on the actual configuration. The following figure shows a third-order B-Tree:
Each node occupies one disk block. A node holds two keywords sorted in ascending order and three pointers to the root nodes of its subtrees; each pointer stores the address of the disk block where the child node resides. The three ranges delimited by the two keywords correspond to the data ranges of the subtrees that the three pointers point to. Taking the root node as an example, its keywords are 17 and 35: the subtree pointed to by P1 holds values less than 17, the subtree pointed to by P2 holds values between 17 and 35, and the subtree pointed to by P3 holds values greater than 35.
Simulate the process of finding keyword 29:
Find disk block 1 from the root node and read it into memory. [first disk I/O operation]
Keyword 29 falls in the interval (17, 35), so follow pointer P2 of disk block 1.
Locate disk block 3 through the P2 pointer and read it into memory. [second disk I/O operation]
Keyword 29 falls in the interval (26, 30), so follow pointer P2 of disk block 3.
Locate disk block 8 through the P2 pointer and read it into memory. [third disk I/O operation]
Find keyword 29 in the keyword list of disk block 8.
This process takes 3 disk I/O operations and 3 in-memory lookups. Because the keywords in memory form an ordered list, binary search can be used to speed up the in-memory lookups. The 3 disk I/O operations are the decisive factor in the efficiency of the whole B-Tree search.
Compared with an AVL tree, a B-Tree has fewer nodes, so the data brought into memory by each disk I/O is put to use, which improves query efficiency.
3.1.1.3, B+Tree (every lookup takes the same number of steps, n)
B+Tree is an optimization based on B-Tree to make it more suitable for the implementation of external storage index structure. InnoDB storage engine uses B+Tree to implement its index structure.
As the B-Tree structure diagram in the previous section shows, each node contains not only the key values of the data but also the data values. The storage space of each page is limited, however, so if the data values are large, each node (that is, each page) can hold only a small number of keys. When the amount of data stored is very large, this makes the B-Tree deep, which increases the number of disk I/O operations per query and hurts query efficiency. In a B+Tree, all data records are stored on leaf nodes of the same level in key order, and non-leaf nodes store only key values, which greatly increases the number of keys each node can hold and reduces the height of the B+Tree.
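To make the effect of keeping only keys in non-leaf nodes concrete, here is a rough back-of-the-envelope estimate. The figures are assumptions for illustration, not from the article: a 16 KB page (the InnoDB default), an 8-byte key, a 6-byte child pointer and rows of about 1 KB, ignoring page headers and fill factor.

```latex
\begin{align*}
\text{keys per non-leaf page} &\approx \frac{16384}{8 + 6} \approx 1170 \\
\text{rows per leaf page}     &\approx \frac{16384}{1024} = 16 \\
\text{rows in a 3-level tree} &\approx 1170 \times 1170 \times 16 = 21\,902\,400
\end{align*}
```

Under these assumptions a tree only three levels deep already addresses roughly twenty million rows, so a lookup needs about three page reads.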
B+Tree differs from B-Tree in several ways:
Non-leaf nodes only store key value information.
All leaf nodes are connected by linked-list pointers.
The data records are stored in the leaf node.
Applying this optimization to the B-Tree from the previous section: since the non-leaf nodes of a B+Tree store only key values, and assuming each disk block can hold 4 key values plus pointer information, the B+Tree looks like the following figure:
A B+Tree usually has two head pointers, one pointing to the root node and the other to the leaf node with the smallest keyword, and all the leaf nodes (that is, the data nodes) are linked together in a chain. A B+Tree therefore supports two kinds of lookup: range and paging lookups on the primary key, which walk the leaf chain, and random lookups, which start from the root node.
With the B+Tree in mind: whenever we build an index, a B+Tree is generated. If a query touches only the indexed fields, the SQL statement is answered directly from the B+Tree without reading the data table, which matters a great deal for performance. Also, do not index fields that are modified all the time, because each modification forces the B+Tree to be updated and possibly restructured, which degrades performance.
3.1.2 Index classification:
MySQL has four types of index: primary key index, unique index, general index and full-text index. Adding an index to a field speeds up reads and improves the concurrency and load tolerance of the project, which makes indexing one of the main optimization techniques in MySQL. An index works like the table of contents of a book: the content you need can be found quickly from the page number in the contents.
Primary key index: the primary key is a unique index that must be declared as PRIMARY KEY; its values cannot be NULL, and each table can have only one primary key.
alter table table_name add primary key (`field_name`)
Unique index: every value in the indexed column can appear only once, that is, it must be unique, although the value can be NULL.
alter table table_name add unique (`field_name`)
General index: the basic index type; the value can be NULL and there is no uniqueness restriction.
alter table table_name add index (`field_name`)
Full-text index:
The index type of a full-text index is FULLTEXT. Full-text indexes can be created on columns of type varchar, char and text, through either the ALTER TABLE or the CREATE INDEX command. For large datasets, loading the data first and then creating the full-text index with ALTER TABLE (or CREATE INDEX) is faster than inserting records into an empty table that already has a full-text index. MyISAM supports full-text indexing, and InnoDB has supported it since MySQL 5.6. The default full-text parser does not support Chinese; Chinese text needs to be handled with Sphinx (coreseek) or Swift search technology.
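A minimal sketch of creating and querying a full-text index, assuming MySQL 5.6 or later. The article table, its columns and the index name are hypothetical.

```sql
-- Hypothetical table for illustration.
CREATE TABLE article (
  id INT AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(200),
  body TEXT
) ENGINE = InnoDB;

-- Add the full-text index after the table exists (ALTER TABLE form).
ALTER TABLE article ADD FULLTEXT INDEX ft_title_body (title, body);

-- Equivalent CREATE INDEX form (use one or the other, not both):
-- CREATE FULLTEXT INDEX ft_title_body ON article (title, body);

-- Full-text queries go through MATCH ... AGAINST rather than LIKE.
SELECT id, title
FROM article
WHERE MATCH (title, body) AGAINST ('storage engine' IN NATURAL LANGUAGE MODE);
```

The column list in MATCH has to be the same as the column list of the FULLTEXT index.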
3.2.1 Mechanism of indexing
1. Why is the query faster after we add the index?
Without an index, the query traverses the table in order. No matter how few rows the query matches, MySQL has to scan the table's data from beginning to end.
After we add an index, MySQL builds an index structure using the BTREE algorithm. When querying, it searches the index (with an efficiency comparable to binary search) to find the matching key and then fetches the data.
2. The cost of the index
2.1 Creating an index generates index data that takes up disk space
2.2 The index is a tree-structured (B-Tree) file, so every DML operation also has to modify the index, which degrades write performance.
3. On which column do you use indexes?
3.1 Fields that are frequently used as query conditions should be indexed
3.2 Fields with poor uniqueness are not suitable for indexing even if they are frequently used as query conditions, for example a gender field
3.3 Fields that are updated very frequently are not suitable for indexing
3.4 Fields that do not appear in the where clause should not be indexed
Summary: create an index on a field only if all of the following conditions are met.
a: the field is frequently used in where conditions. b: the field's content is not limited to just a few distinct values. c: the field's content does not change frequently.
3.2.2, SQL parsing order
Next, let's follow a SQL statement through its life cycle.
First, take a look at how the statement is written.
SELECT DISTINCT ... FROM ... JOIN ... ON ... WHERE ... GROUP BY ... HAVING ... ORDER BY ... LIMIT ...
However, the logical order in which it is processed is as follows
FROM ... ON ... JOIN ... WHERE ... GROUP BY ... HAVING ... SELECT DISTINCT ... ORDER BY ... LIMIT ...
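One visible consequence of this ordering is where a SELECT alias can be used. The sketch below uses a hypothetical emp table; the threshold values are arbitrary.

```sql
-- Hypothetical table for illustration.
CREATE TABLE emp (id INT PRIMARY KEY, dept VARCHAR(20), salary INT);

-- Fails: WHERE is evaluated before SELECT, so the alias does not exist yet.
-- SELECT salary * 12 AS yearly FROM emp WHERE yearly > 100000;

-- Works: repeat the expression in WHERE ...
SELECT salary * 12 AS yearly FROM emp WHERE salary * 12 > 100000;

-- ... while ORDER BY runs after SELECT and can use the alias.
SELECT dept, SUM(salary) AS total
FROM emp
GROUP BY dept
HAVING SUM(salary) > 50000
ORDER BY total DESC
LIMIT 3;
```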
3.2.3 How to build an index
In general, build indexes on the fields used in JOIN conditions, WHERE conditions and ORDER BY sorting. Try not to index a field that contains a large number of duplicate values; a field of type ENUM, for example, is likely to contain many duplicates.
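3.2.4 Some techniques for using indexes:
1. Be careful with columns that contain NULL
In MySQL, NULL values are stored in B-Tree indexes and a condition such as IS NULL can use the index, so the often-repeated claim that a column containing NULL is left out of the index (or invalidates a composite index) does not hold for MyISAM or InnoDB. Still, declare indexed columns NOT NULL where you can: this allows better use of indexes and avoids the overhead of testing each value for NULL.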
2. Do not index fields that are modified frequently. An index is a B+Tree, so every change to an indexed field also has to update the tree, which is bad for performance.
3. Composite index: do not skip columns or use them out of order (the leftmost-prefix rule).
4. LIKE: in general, LIKE is discouraged; if it must be used, use it correctly. like '%aaa%' does not use an index, while like 'aaa%' can.
5. Do not perform any operation on an indexed column (calculation, function, type conversion), otherwise the index will not be used.
6. Avoid NOT IN and !=, which often keep the index from being used; =, BETWEEN and IN can use the index.
7. The index should be based on fields that often appear in select operations.
This is because, if a column is rarely used, the presence or absence of an index makes no significant difference to query speed. On the contrary, the extra index slows down maintenance of the system and increases the space requirement.
8. The index should be based on fields with unique, or at least highly distinct, values.
9. Columns defined with the text, image and bit data types should not be indexed, because the amount of data in these columns is either very large or takes very few distinct values.
10. In join operations (when data has to be extracted from several tables), MySQL can use the index only if the primary key and foreign key have the same data type; otherwise the index will not be used even if it has been created.
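Tips 3 to 5 can be checked with EXPLAIN. The sketch below uses a hypothetical person table and index; the plan actually chosen depends on the MySQL version and the data, so no specific EXPLAIN output is claimed here.

```sql
-- Hypothetical table and composite index for illustration.
CREATE TABLE person (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50),
  age INT,
  city VARCHAR(50),
  INDEX idx_name_age_city (name, age, city)
);

-- Leftmost prefix: these conditions start at the first index column.
EXPLAIN SELECT id FROM person WHERE name = 'lee';
EXPLAIN SELECT id FROM person WHERE name = 'lee' AND age = 30;

-- Skips the leading column: the index cannot be used to seek;
-- at best it is scanned in full.
EXPLAIN SELECT id FROM person WHERE age = 30 AND city = 'Beijing';

-- A leading wildcard prevents an index range scan; a trailing one allows it.
EXPLAIN SELECT id FROM person WHERE name LIKE '%lee%';
EXPLAIN SELECT id FROM person WHERE name LIKE 'lee%';

-- Wrapping the column in a function hides it from the index,
-- so prefer an equivalent condition on the bare column (as in the previous query).
EXPLAIN SELECT id FROM person WHERE LEFT(name, 3) = 'lee';
```

Comparing the type and key columns across these statements shows which conditions the index can serve.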
Fourth, SQL performance issues
a. Analyze the execution plan of a SQL statement with explain, which simulates how the SQL optimizer executes the statement so that developers can see how well the SQL they have written performs.
b. The MySQL query optimizer can interfere with our own optimization: the MySQL service layer contains a SQL optimizer that may rewrite the SQL we write, which is beyond our control.
Query execution plan: explain + SQL statement
explain select * from book
id: number
select_type: query type
table: table
type: access type, from best to worst: system > const > eq_ref > ref > range > index > all. Improving type requires an index; with one, you can generally reach at least range.
possible_keys: the indexes that might be used
key: the index actually used
key_len: the length of the index actually used
ref: references between tables
rows: the estimated number of rows examined through the index
Extra: additional information. The values it may show include:
i). Using filesort: high performance cost; an "extra" sort (query) is required. It is common in order by statements. Fix: order by the same fields that appear in where.
ii). Using temporary: high performance cost; a temporary table is used. It generally appears in group by statements. Fix: group by the same columns that you query.
iii). Using index: performance improvement; a covering index is used. Reason: the original table is not read, the data comes from the index alone (no need to go back to the table).
As long as all the columns used are in the index, the index covers the query and Extra shows Using index.
iv). Using where (here, the query needs to go back to the table).
Suppose age is an indexed column.
For the query select age, name from ... where age = ..., the statement must go back to the original table to look up name, hence Using where. Fix: add name to the index as well.
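The age and name case can be tried out as follows. The student table and index names are hypothetical, and the Extra values shown by EXPLAIN may differ between MySQL versions.

```sql
-- Hypothetical table for illustration; age is the only indexed column at first.
CREATE TABLE student (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50),
  age INT,
  INDEX idx_age (age)
);

-- Only the indexed column is selected, so the index alone can answer the query.
EXPLAIN SELECT age FROM student WHERE age = 20;

-- name is not in idx_age, so each matching row must be fetched from the table.
EXPLAIN SELECT age, name FROM student WHERE age = 20;

-- Put name in the index as well so the same query can be answered from the index alone.
ALTER TABLE student DROP INDEX idx_age, ADD INDEX idx_age_name (age, name);
EXPLAIN SELECT age, name FROM student WHERE age = 20;
```

Watching the Extra column across the three EXPLAIN statements shows when the query is covered by the index.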
That's all for "MySQL logical layering, storage engine, SQL optimization, index optimization and underlying implementation". Thank you for reading.