This article covers the interview questions we often meet around the MySQL database. The material is simple, direct, and practical; let's walk through the questions one by one.
1. Common normal forms of databases:
First normal form (1NF): the columns of a table cannot be subdivided; every column in a database table is an indivisible basic data item, and a single column cannot hold multiple values.
Second normal form (2NF): building on 1NF, it adds two requirements: the table must have a primary key, and every non-key column must depend on the whole primary key, not merely part of it.
Third normal form (3NF): building on 2NF, it eliminates transitive dependence of non-key columns on the primary key; non-key columns must depend directly on the primary key.
Boyce-Codd normal form (BCNF): building on 3NF, it eliminates partial and transitive dependence of prime attributes on the candidate keys.
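To make the normal forms concrete, here is a minimal sketch with a hypothetical order table that violates 2NF (product_name depends on product_id alone, only part of the composite key) and its normalized form:

    -- Hypothetical schema: violates 2NF because product_name depends
    -- only on product_id, which is just part of the composite key.
    CREATE TABLE order_item_bad (
        order_id     INT,
        product_id   INT,
        product_name VARCHAR(100),  -- partial dependency on product_id
        quantity     INT,
        PRIMARY KEY (order_id, product_id)
    );

    -- Normalized: product_name moves to its own table keyed by product_id.
    CREATE TABLE product (
        product_id   INT PRIMARY KEY,
        product_name VARCHAR(100)
    );
    CREATE TABLE order_item (
        order_id   INT,
        product_id INT,
        quantity   INT,
        PRIMARY KEY (order_id, product_id)
    );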
2. The execution process of a SQL statement:
2.1. The client's database driver and connection pool:
(1) Before the client communicates with the database, it establishes a connection to MySQL through the database driver; once the connection is established, the SQL statement is sent.
(2) To reduce the performance degradation caused by frequently creating and destroying connections, a database connection pool maintains a certain number of connection threads: when a connection is needed, it is taken directly from the pool and returned to the pool after use. Common database connection pools include Druid, C3P0, and DBCP.
2.2. The execution process in the Server layer of the MySQL architecture:
(1) Connector: mainly responsible for establishing connections with clients, checking permissions, and maintaining and managing connections.
(2) Query cache: a query is first looked up in the cache, and if found, the result is returned directly; only if it is not in the cache does the query go to the database.
MySQL's query cache is turned off by default, which suggests it is not recommended, and the query cache module was removed entirely in MySQL 8.0. This is mainly due to the limitations of its usage scenarios:
First, consider the storage format of cached data: key (the SQL statement) - value (the result set). If the SQL statement (the key) differs even slightly, the query goes straight to the database.
Second, table data is rarely immutable; most tables change frequently, and whenever data in a table changes, all cached entries associated with that table must be invalidated.
(3) Parser: the parser performs lexical and syntactic analysis of the SQL statement to be executed, producing an abstract syntax tree; the preprocessor then performs semantic checks on the tree, verifying that the tables in it exist and that the columns in the select list exist in those tables.
(4) Optimizer: taking the syntax tree produced by lexical and syntactic analysis, together with the data dictionary and statistics, the optimizer performs a series of operations to arrive at an execution plan, including the choice of which index to use.
Whether an index is used is decided through dynamic data sampling and statistical analysis; anything based on statistics can be wrong, so this factor should be kept in mind when a SQL statement fails to use the expected index.
(5) Executor: following the execution plan, the executor calls the API provided by the storage engine to operate on the data and complete the execution of the SQL.
2.3. The execution process in the InnoDB storage engine:
(1) First, the MySQL executor calls the storage engine's API to query data according to the execution plan.
(2) The storage engine first looks for the data in the buffer pool; if it is not there, it reads from disk, and once found, loads the data into the buffer pool.
(3) As the data is loaded into the buffer pool, the original version of the record is saved into the undo log file.
(4) InnoDB performs the update operation in the buffer pool.
(5) The updated data is recorded in the redo log buffer.
(6) When the transaction commits, three things happen:
(7) First, the data in the redo log buffer is flushed into the redo log file.
(8) Second, the operation record is written to the binlog file.
(9) Third, the binlog file name and the position of the update within the binlog are recorded in the redo log, and a commit marker is appended to the end of the redo log.
(10) A background thread will, at some point, flush the updated data from the buffer pool to the MySQL data files on disk, bringing memory and the database into agreement.
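As a hedged illustration of the flow above, here is a minimal transaction with comments mapping each statement to the steps it triggers (the account table and values are hypothetical):

    -- Hypothetical table and values, for illustration only.
    BEGIN;
    UPDATE account SET balance = balance - 100
     WHERE id = 1;   -- old row version saved to the undo log (3),
                     -- page modified in the buffer pool (4),
                     -- change recorded in the redo log buffer (5)
    COMMIT;          -- redo log buffer flushed to the redo log file,
                     -- binlog written, commit marker appended (7)-(9)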
3. What are the common storage engines? What are the differences between InnoDB and MyISAM?
The storage engine is the component that performs the actual operations on the underlying physical data and provides the Server layer with APIs for manipulating data. The commonly used storage engines are InnoDB, MyISAM, and Memory. Here we focus on the differences between InnoDB and MyISAM:
(1) Transactions: MyISAM does not support transactions; InnoDB does.
(2) Lock granularity: MyISAM supports only table-level locks, while InnoDB supports both row-level and table-level locks and uses row-level locks by default. However, row locks are used only when data is accessed through an index; otherwise table locks are used. Acquiring and releasing a row lock consumes more resources than a table lock each time, and row locks can deadlock, whereas table-level locks are deadlock-free.
(3) Primary and foreign keys: MyISAM allows tables with no index or primary key at all, and does not support foreign keys. InnoDB's primary key cannot be null and supports auto-increment; if no primary key or non-null unique index is defined, InnoDB automatically generates a 6-byte primary key. InnoDB supports foreign key integrity constraints.
(4) Index structure: both MyISAM and InnoDB use B+ tree indexes. In MyISAM, the data field of both the primary key index and secondary indexes stores the address of the row record. In InnoDB, the data field of the primary key index stores not the row's address but the entire contents of the row, while the data field of a secondary index stores the primary key value.
Because InnoDB's secondary indexes store the primary key value, using a secondary index requires two index lookups: first search the secondary index to obtain the primary key, then use the primary key to find the record in the primary index; see the sketch below. This is also why overly long fields are not recommended as primary keys: every secondary index contains the primary key columns, so a long primary key makes all the other secondary indexes larger; keep the primary key as small as possible.
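A minimal sketch of the double lookup, assuming a hypothetical user table (names invented for illustration): a query through the secondary index first yields the primary key and then fetches the row from the primary (clustered) index, while a query that only needs columns stored in the secondary index can skip the second lookup:

    -- id is the clustered primary key index; idx_name is a secondary
    -- index whose leaf entries store (name, id).
    CREATE TABLE user (
        id   BIGINT PRIMARY KEY,
        name VARCHAR(50),
        age  INT,
        INDEX idx_name (name)
    );

    -- Two lookups: idx_name yields the id, then the clustered index
    -- yields the full row.
    SELECT age FROM user WHERE name = 'alice';

    -- One lookup: the id is already stored in idx_name, so the
    -- secondary index alone covers this query.
    SELECT id FROM user WHERE name = 'alice';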
(5) Full-text indexing: MyISAM supports full-text indexes. InnoDB did not support them before version 5.6; version 5.6 and later do.
(6) Exact row count of the table:
① MyISAM stores the table's total row count, so select count(*) from table returns the stored value directly, with no need for a full table scan.
② InnoDB does not store the total row count; select count(*) from table must traverse the whole table, which is quite expensive.
(7) Storage structure:
① MyISAM is stored as three files on disk: the .frm file stores the table definition, the .MYD file stores the data, and the .MYI file stores the indexes.
② InnoDB: data and indexes are stored together in the tablespace, and all tables may be stored in the same data file; the size of an InnoDB table is limited only by the operating system's file size limit, generally 2GB.
(8) Storage space:
① MyISAM: can be compressed and has a small storage footprint. It supports three different storage formats: static tables (the default; note that trailing spaces in the data are removed), dynamic tables, and compressed tables.
② InnoDB: requires more memory and storage; it establishes its own dedicated buffer pool in main memory for caching data and indexes.
(9) Applicable scenarios:
① If you need ACID transactions with rollback and crash recovery, together with row-level locking for concurrency control, InnoDB is a good choice.
② If the table is used mainly to query records, reads vastly outnumber writes, and database transactions are not needed, MyISAM can offer higher processing efficiency.
Note: as of MySQL 8.0, MyISAM is effectively deprecated; the system tables have been moved to InnoDB and its use is discouraged.
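The engine is chosen per table; a minimal sketch following the guidance above (the table definitions are hypothetical):

    -- Transactional workload: InnoDB (the default engine since MySQL 5.5).
    CREATE TABLE orders (
        id     BIGINT PRIMARY KEY AUTO_INCREMENT,
        status VARCHAR(20)
    ) ENGINE = InnoDB;

    -- Read-heavy, non-transactional workload: MyISAM.
    CREATE TABLE page_view_log (
        id  BIGINT PRIMARY KEY AUTO_INCREMENT,
        url VARCHAR(255)
    ) ENGINE = MyISAM;

    -- List the engines this server supports.
    SHOW ENGINES;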
4. The ACID properties of transactions and how they are implemented?
A database transaction is the basic unit of concurrency control: a logically grouped set of operations that either all succeed or all fail.
4.1. The ACID properties of transactions:
(1) Atomicity: a transaction is an indivisible unit of work; the operations in it either all succeed or all fail, and a failed transaction must be rolled back.
(2) Isolation: the degree to which the data a transaction has modified is visible to other transactions before it commits.
(3) Durability: once a transaction commits, its changes to the data in the database are permanent.
(4) Consistency: a transaction must not break data integrity or business consistency. For example, in a transfer, whether the transaction succeeds or fails, the total amount of money held by the two parties stays the same.
4.2. How ACID is implemented:
4.2.1. Atomicity: atomicity is implemented through MySQL's rollback log, the undo log: when a transaction modifies the database, InnoDB generates a corresponding undo log; if the transaction fails or rollback is called and the transaction must be rolled back, the information in the undo log is used to roll the data back to its state before the modification.
4.2.2. Isolation:
(1) Transaction isolation levels:
To guarantee the integrity and consistency of data read in a concurrent environment, the database provides four transaction isolation levels. The higher the isolation level, the stronger the guarantee of data integrity and consistency, but the greater the impact on concurrent performance and the lower the execution efficiency. (The four levels below rise in order from top to bottom.)
Read uncommitted: a transaction is allowed to read data that other transactions have not yet committed.
Read committed: a transaction is allowed to read only data that other transactions have already committed.
Repeatable read (the default level): within the same transaction, the same query returns consistent results at any point in time.
Serializable: all transactions execute one at a time; every read takes a table-level shared lock, and reads and writes block each other.
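The isolation level can be inspected and set per session; a minimal sketch:

    -- Inspect the current level (the variable is transaction_isolation
    -- in MySQL 8.0; older versions call it tx_isolation).
    SELECT @@transaction_isolation;

    -- Change it for the current session, or for the next transaction only.
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;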
(2) Concurrency problems of transactions:
If transaction isolation is not considered, the following problems can arise in a concurrent environment:
Lost update: when two or more transactions select the same row and then update it based on the value originally selected, lost updates occur because each transaction is unaware of the others: the last update overwrites the updates made by the other transactions.
Dirty read: transaction A is accessing and modifying data without having committed; transaction B then reads this data, after which A rolls back and restores the data to its original value. The data B read is now inconsistent with the database; B has read dirty data.
Non-repeatable read: within one transaction, the same data is read several times, but the values read differ because another transaction modified and committed the data in between.
Phantom read: within one transaction, the same query (usually a range query) is run twice, but the results differ because another transaction inserted or deleted rows in between.
Different isolation levels permit different concurrency problems: read uncommitted permits dirty reads, non-repeatable reads, and phantom reads; read committed prevents dirty reads; repeatable read additionally prevents non-repeatable reads; serializable prevents all three.
(3) How transaction isolation is implemented:
InnoDB's transaction isolation levels are implemented with MVCC and locking:
① MVCC (Multi-Version Concurrency Control) is the specific mechanism by which MySQL's InnoDB storage engine implements the read committed and repeatable read isolation levels. Read uncommitted always reads the latest row version, so it has no need for MVCC; serializable must lock every row it reads, which MVCC alone cannot achieve.
MVCC is implemented by storing two hidden columns with each row of records: the ID of the transaction that wrote the row, and a rollback segment pointer for the row. Each newly started transaction is assigned a new, automatically incremented transaction ID. When a transaction modifies a row, its ID is placed in the row's transaction-ID field, and the rollback segment pointer leads to all previous versions of the row, which are organized as a linked list in the undo log; the pointer effectively points to the row's version history in the undo log.
Under concurrent access, MVCC's multi-version management of data within a transaction keeps write operations from blocking reads, and version comparison solves the phantom read problem for snapshot reads; but MVCC alone cannot solve phantom reads for current reads, which require next-key locks.
② Locking mechanism:
The basic working principle of MySQL's locking is: before modifying data, a transaction must acquire the corresponding lock, and only the transaction holding the lock may modify the data; while the transaction is operating, that data stays locked, and other transactions that need to modify it must wait for the current transaction to commit or roll back and release the lock.
Exclusive locks solve dirty reads.
Shared locks solve non-repeatable reads.
Next-key locks solve phantom reads.
4.2.3. Durability:
Durability relies on the redo log: as SQL executes, the changes it makes are saved into the redo log file, but for efficiency the data is first written to the redo log buffer in memory before being written to the redo log on disk. The write path is as follows: when data is written to the database, the modification is first written into the redo log buffer, and the redo log buffer is periodically flushed to the redo log file on disk; this process is called flushing the log to disk (that is, the redo log buffer writing the log into the disk's redo log file).
The redo log buffer greatly improves read/write efficiency, but it brings a new problem: if MySQL crashes while modified data in the redo log buffer has not yet been flushed to disk, that data is lost and the transaction's durability cannot be guaranteed. To guarantee durability, fsync is called when the transaction commits to flush the redo log to disk. The flushing behavior is controlled by the innodb_flush_log_at_trx_commit variable:
0: nothing is flushed at commit; the redo log buffer is written and flushed to disk roughly once per second, so up to a second of transactions can be lost on a crash.
1: every time a transaction commits, the redo log buffer is flushed to disk.
2: when a transaction commits, the redo log buffer is written to the operating system's cache for the log file rather than directly to the disk file; the OS cache is flushed to the disk file roughly once per second.
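A minimal sketch of inspecting and changing the flush policy (changing a global variable requires sufficient privileges):

    -- Check the current flush policy.
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

    -- 1 is the fully durable default; 0 and 2 trade durability for
    -- throughput as described above.
    SET GLOBAL innodb_flush_log_at_trx_commit = 1;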
4.2.4. Consistency:
Consistency means that transactions must not undermine data integrity or business consistency:
Data integrity: entity integrity, column integrity (for example, the field type, size, and length meeting requirements), foreign key constraints, and so on.
Business consistency: for example, in a bank transfer, whether the transaction succeeds or fails, the total amount of money held by the two parties stays the same.
5. The locking mechanism in the database:
When multiple transactions access the same data concurrently, uncontrolled concurrent operations may read and store incorrect data and break the consistency of the database. The basic working principle of MySQL's locking is the same as described above: before modifying data, a transaction must acquire the corresponding lock, and only the transaction holding the lock may modify the data; while the transaction is operating, that data stays locked, and other transactions that need to modify it must wait for the current transaction to commit or roll back and release the lock.
Locks can be divided into the following categories according to different classification criteria:
By lock granularity: table-level locks, row-level locks, page-level locks.
By lock type: shared locks (S locks) and exclusive locks (X locks).
By locking strategy: optimistic locks and pessimistic locks.
5.1. Table-level, row-level, and page-level locks:
Table-level lock: the coarsest granularity, with the highest probability of lock conflicts and the lowest concurrency, but low overhead, fast locking, and no deadlocks.
Row-level lock: the finest granularity, with the lowest probability of lock conflicts and the highest concurrency, but high overhead, slow locking, and the possibility of deadlocks.
Page-level lock: granularity between table-level and row-level locks, a compromise between the two; overhead and locking time also fall between table locks and row locks, and deadlocks can occur.
Different storage engines support different locking mechanisms:
The InnoDB storage engine supports both row-level and table-level locks and uses row-level locks by default, but row-level locks are used only when data is accessed through an index.
The MyISAM and MEMORY storage engines use table-level locks.
The BDB storage engine uses page locks, and also supports table-level locks.
5.2. InnoDB's row locks:
InnoDB has two types of row locks:
Shared lock (S lock, read lock): multiple transactions can share an S lock on the same row, but may only read the row, not modify it.
Exclusive lock (X lock, write lock): after a transaction acquires an exclusive lock, it may write to the rows within the locked range; while it holds the lock, no other transaction can acquire any lock (shared or exclusive) on those rows, and only the transaction holding the exclusive lock may update the data.
For update, delete, and insert operations, InnoDB automatically places exclusive locks on the rows involved; for ordinary SELECT statements, InnoDB takes no locks at all.
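S and X locks can also be taken explicitly inside a transaction; a minimal sketch against a hypothetical account table (LOCK IN SHARE MODE is the pre-8.0 spelling of FOR SHARE):

    BEGIN;
    -- Shared (S) lock: other transactions may still read the row,
    -- but not modify it.
    SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;

    -- Exclusive (X) lock: blocks other transactions' S and X requests
    -- on the row until this transaction ends.
    SELECT * FROM account WHERE id = 1 FOR UPDATE;
    COMMIT;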
5.3. InnoDB's table locks and intention locks:
Because InnoDB allows row locks and table locks to coexist, implementing a multi-granularity locking scheme, table locks and row locks can conflict with each other even though their lock scopes differ. Before taking a table lock, every record in the table would have to be traversed to check for exclusive row locks; since such a traversal is obviously inefficient, MySQL introduces intention locks to detect conflicts between table locks and row locks.
An intention lock is itself a table-level lock, divided into intention shared locks (IS locks) and intention exclusive locks (IX locks). When a transaction wants to place a row lock on a record, it first places the corresponding intention lock on the table. Afterwards, if another transaction wants to lock the whole table, it only needs to check whether an intention lock exists: if one does, it waits; if not, it can take the table lock immediately without scanning the rows, which improves efficiency.
5.4. How InnoDB implements row locks and next-key locks:
InnoDB implements row locking by locking the index entries on the index. Row locks can be used only when data is retrieved through an index; otherwise table locks are used.
To solve phantom reads, InnoDB introduces the next-key lock, which divides the index into intervals that are open on the left and closed on the right. In a range query, if the index is hit and data is retrieved, the interval containing the record and the next interval are both locked. In effect, next-key lock = record lock (Record Locks) + gap lock (Gap Locks).
Gap lock: when data is retrieved with a range query rather than an exact match and a shared or exclusive lock is requested, InnoDB locks the index entries of existing records that satisfy the range condition; key values that lie within the range but correspond to no existing record are called the gap (GAP), and the gap is locked as well.
Record lock: a record lock is used with a unique index and an exact query for a record that exists.
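A hedged sketch of a range query taking next-key locks under the default repeatable read level (the table, index, and values are hypothetical):

    -- Suppose rows with age 10, 20, and 30 exist.
    CREATE TABLE person (
        id  BIGINT PRIMARY KEY,
        age INT,
        INDEX idx_age (age)
    );

    BEGIN;
    -- Locks the matching index records plus the gaps around them, so
    -- another transaction cannot insert a row whose age falls in the
    -- locked range until this transaction ends; such an insert would
    -- otherwise produce a phantom.
    SELECT * FROM person WHERE age BETWEEN 15 AND 25 FOR UPDATE;
    COMMIT;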
5.5. Using the locking mechanism to solve concurrency problems:
X locks solve dirty reads.
S locks solve non-repeatable reads.
Next-key locks solve phantom reads.
For the details of the locking mechanisms of the InnoDB and MyISAM storage engines, see: MySQL database: locking mechanism - Zhang Weipeng's blog (CSDN).
6. The implementation principles of MySQL indexes:
An index is essentially a data structure that speeds up queries by reducing the number of rows a query must examine, avoiding full table scans; like the table of contents of a book, it lets you find content faster. (A table can have at most 16 indexes.)
6.1. Advantages and disadvantages of indexes:
(1) Advantages of indexes:
They reduce the number of rows a query must retrieve, speeding up queries and avoiding full table scans; this is the main reason for creating an index.
If the index's data structure is a B+ tree, queries that group or sort can have their grouping and sorting time significantly reduced.
By creating a unique index, you can guarantee the uniqueness of each row of data in the table.
(2) Disadvantages of indexes:
When data in the table is inserted, deleted, or modified, the indexes must be updated as well, and maintenance time grows with the amount of data.
Indexes take up physical space, and a clustered index requires even more.
6.2. Usage scenarios for indexes:
(1) Which columns to create indexes on:
Columns that frequently appear in the WHERE clause, to speed up condition evaluation.
Columns accessed by range, or used in group by or order by: since the index is already sorted, it can speed up sorting in queries.
Columns frequently used in joins, mainly foreign keys, where an index can speed up the join.
The primary key column, to enforce the column's uniqueness and organize the physical arrangement of the table's data.
(2) Which columns not to index:
Columns with low selectivity. Such columns have few distinct values (gender, for example), so the matching rows form a large fraction of the table's rows, and adding an index does not significantly speed up retrieval.
Columns that rarely appear in queries. Since these columns are seldom used, an index on them only slows down maintenance and increases space requirements.
When adding an index would raise modification costs far more than it improves retrieval performance, the index should not be created. Adding an index improves retrieval but hurts modification; removing one improves modification but hurts retrieval.
Columns of text, image, and bit data types should not be indexed: their data is either very large or has very few distinct values.
6.3. Classification of indexes:
(1) Ordinary index, unique index, primary key index, full-text index, composite index.
Ordinary index: the most basic index, with no restrictions.
Unique index: the values in the index column must be unique; null values are allowed, and there may be multiple nulls. For a composite unique index, the combination of column values must be unique.
Primary key index: a special unique index that does not allow null values.
Full-text index: historically available only on MyISAM tables (InnoDB supports it from version 5.6, as noted above), and only on CHAR, VARCHAR, or TEXT columns. It replaces the less efficient like fuzzy-matching, and a combined multi-column full-text index can fuzzy-match several fields at once.
Composite index: created mainly to improve MySQL efficiency. When creating a composite index, place the column most frequently used as a constraint on the far left, in decreasing order of use; see the sketch below.
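A minimal sketch of the leftmost-prefix rule for a composite index (table and index names are hypothetical):

    -- Composite index on (country, city, age): queries must constrain
    -- the columns from the left to make use of it.
    CREATE TABLE customer (
        id      BIGINT PRIMARY KEY,
        country VARCHAR(30),
        city    VARCHAR(30),
        age     INT,
        INDEX idx_ccy (country, city, age)
    );

    -- Can use idx_ccy: constrains country, then city.
    SELECT * FROM customer WHERE country = 'CN' AND city = 'Beijing';

    -- Generally cannot seek on idx_ccy: the leftmost column is skipped.
    SELECT * FROM customer WHERE city = 'Beijing' AND age = 30;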
(2) Clustered and non-clustered indexes:
If indexes are classified by whether the physical order of data storage matches the order of index values, they divide into clustered and non-clustered indexes:
Clustered index: the physical order of the data in the table matches the order of the index values. A table can have at most one clustered index. Updating clustered index columns often changes the physical order of records in the table, which is costly, so frequently updated columns are not suitable for a clustered index.
Non-clustered index: an index organization in which the physical order of the data does not match the order of the index values. A table can have multiple non-clustered indexes.
6.4. Index data structures:
The common index data structures are the B+ tree and the hash index.
(1) Hash index: in MySQL, only the Memory storage engine supports hash indexes, which are the default index type for Memory tables. A hash index organizes data by hash value, so queries are very efficient: a single probe locates the data.
Disadvantages of hash indexes:
A hash index can only satisfy equality queries, not range queries or sorting, because the ordering of the data may change after hashing.
With a composite hash index, you cannot query using only some of the index's columns: the hash is computed over the combined column values, so hashing an individual column's value is meaningless.
When hash collisions occur, a hash index cannot avoid scanning the table data: comparing hash values is not enough, and the actual values must be compared to determine whether they match.
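A minimal sketch of a Memory table with an explicit hash index (names are hypothetical):

    -- HASH is the default index type for the Memory engine;
    -- USING HASH just makes the choice explicit.
    CREATE TABLE session_cache (
        session_id VARCHAR(64),
        user_id    BIGINT,
        INDEX idx_session (session_id) USING HASH
    ) ENGINE = MEMORY;

    -- An equality lookup can use the hash index ...
    SELECT user_id FROM session_cache WHERE session_id = 'abc123';
    -- ... but a range predicate cannot: hashing loses ordering.
    SELECT user_id FROM session_cache WHERE session_id > 'abc';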
(2) B+ tree index: the B+ tree is the most frequently used index data structure in MySQL and is the index type of the InnoDB and MyISAM storage engines. Searching a B+ tree index takes several IO operations from the root node down to a leaf node, so lookups are not as fast as with a hash index, but the B+ tree is much better suited to sorting and similar operations.
Advantages of the B+ tree index:
The internal nodes do not store row contents, so each IO reads more keys, which greatly reduces the number of disk I/O reads.
B+ tree with sequential access pointers: all index data is stored in the leaf nodes, and sequential access pointers are added so that each leaf node points to its adjacent leaf node; this is done to improve the efficiency of range queries.
6.5. Why use the B+ tree as the index:
An index is usually so large that it cannot be held entirely in memory, so it is stored on disk as an index file. Searching the index therefore incurs disk I/O, which is several orders of magnitude slower than memory access. The most important criterion for evaluating a data structure as an index is therefore the asymptotic complexity of the number of disk I/O operations during a search; in other words, the index's data structure should minimize disk I/O during lookups.
(1) The locality principle and read-ahead:
Because disk access is much slower than main memory access, with the added cost of mechanical movement, improving efficiency requires minimizing disk I/O. To this end the disk is usually not read strictly on demand but with read-ahead: even if only one byte is needed, the disk reads a run of data of a certain length starting from that position into memory. The theoretical basis is the famous locality principle of computer science: when a piece of data is used, nearby data is usually used soon afterwards, and the data a program needs during a run is usually clustered.
Because sequential disk reads are very efficient (no seek time and only a little rotational delay), read-ahead improves I/O efficiency for programs with good locality. The read-ahead length is generally an integer multiple of the page size. When data a program needs is not in main memory, a page fault is raised; the system sends a read signal to the disk, which finds the start of the data and reads one or more consecutive pages into memory, after which the fault handler returns and the program continues.
(2) Performance analysis of the B+ tree index:
As mentioned above, disk I/O count is generally used to evaluate an index structure. Consider the B-tree first: a single search visits at most h nodes, where h is the height of the tree. The database cleverly exploits the disk read-ahead principle by setting the size of one node equal to one page, so each new node directly takes a full page; this guarantees that a node is physically stored in one page and, with page-aligned allocation, that loading a node requires only one I/O. A single B-tree search therefore needs at most h-1 I/Os (the root node resides in memory), with time complexity O(h) = O(log_d N), where d is the fan-out. In practice d is a very large number, usually more than 100, so h is very small. In short, the B-tree is very efficient as an index structure.
By contrast, although a red-black tree's search is also O(h), its height h is obviously far greater, and since logically adjacent nodes may be physically distant, locality cannot be exploited; the red-black tree's I/O efficiency is thus much lower than the B-tree's.
Going further, the B+ tree is even better suited as the index's data structure, and the reason lies in the fan-out d of the internal nodes. From the analysis above, the larger d is, the better the index performs, and the upper bound on d depends on the sizes of the keys and data within a node. Because B+ tree internal nodes omit the data field, they can have a larger fan-out, and the number of disk IOs is correspondingly smaller.
(3) Comparing the B+ tree index with the B-tree index:
Comparing the structures of the B-tree and the B+ tree, the B+ tree has more advantages than the B-tree in file systems and database systems, for the following reasons:
(1) The B+ tree is better for scanning the database: the B-tree improves disk IO performance but does not fix the inefficiency of element traversal, while the B+ tree only needs to traverse its leaf nodes to scan all key information; for range queries, sorting, and similar operations, the B+ tree therefore performs better.
(2) The disk IO cost of the B+ tree is lower: the internal nodes of the B+ tree have no data fields, so they are smaller than the B-tree's. If all the keys of one internal node are stored in the same disk page, the page can hold more keys, more keys are read into memory in a single read, and the number of reads and writes decreases.
(3) The query efficiency of the B+ tree is more stable: because the internal nodes of the B+ tree are only indexes to the keys in the leaf nodes and store no data, every key lookup must follow a path from the root node to a leaf node. All key lookups have the same path length, so every lookup has the same query efficiency.
(4) How is the B+ tree index implemented in MySQL's InnoDB and MyISAM storage engines?
Both MyISAM and InnoDB use the B+ tree index. In MyISAM, the data field of both the primary key index and secondary indexes stores the address of the row; in InnoDB, the primary key index stores not the address of the row but all of the row's data, while the data field of a secondary index stores the primary key value.
Length limits on indexes:
For an InnoDB composite index, any column longer than 767 bytes becomes a prefix index on that column; for an InnoDB single-column index, if the column exceeds 767 bytes, a prefix index is created (taking the first 255 characters).
For a MyISAM composite index, the total index length cannot exceed 1000 bytes, otherwise an error is raised and creation fails; for a MyISAM single-column index, the maximum length likewise cannot exceed 1000 bytes, otherwise a warning is raised but a prefix index is created successfully (taking the first 333 characters).
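A prefix index can also be created explicitly, staying well under the length limits; a minimal sketch (the table and the 20-character prefix are hypothetical choices):

    -- Index only the first 20 characters of a long column: shorter
    -- index entries and more keys per page, at the cost of some
    -- selectivity.
    CREATE TABLE article (
        id    BIGINT PRIMARY KEY,
        title VARCHAR(500),
        INDEX idx_title (title(20))
    ) ENGINE = InnoDB;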
7. SQL optimization, index optimization, and table structure optimization:
(1) SQL optimization and index optimization of MySQL: https://blog.csdn.net/a745233700/article/details/84455241
(2) Table structure optimization of MySQL: https://blog.csdn.net/a745233700/article/details/84405087
8. Database parameter optimization:
MySQL is an IO-intensive application whose main responsibility is data management and storage. Reading from memory takes on the order of microseconds, while a single read IO from an ordinary hard disk takes milliseconds: a difference of about three orders of magnitude. So the first step in optimizing the database is optimizing IO, converting disk IO into in-memory IO as far as possible. MySQL parameter optimization therefore mainly targets parameters that reduce disk IO: for example, query_cache_size adjusts the size of the query cache, and innodb_buffer_pool_size adjusts the size of the buffer pool.
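A minimal sketch of inspecting and adjusting these parameters (the 4 GB figure is a hypothetical example; size the buffer pool to your machine, and note that query_cache_size no longer exists in MySQL 8.0):

    -- Inspect the current values.
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW VARIABLES LIKE 'query_cache_size';  -- absent in MySQL 8.0+

    -- Resize the buffer pool (dynamic from MySQL 5.7.5 onward).
    SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;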
9. The explain execution plan:
An execution plan is the query scheme that the query optimizer produces automatically, based on the abstract syntax tree obtained from the query analyzer together with the relevant table statistics. Because it is the result of dynamic data sampling and statistical analysis, analysis errors are possible; that is, the chosen execution plan may not be optimal. Through the explain keyword we can see how MySQL executes a SQL query statement and analyze the performance bottlenecks of a select statement to improve our queries. The explain output contains the following columns, of which the important ones are id, type, key, key_len, rows, and extra:
(1) id: the id column can be understood as an identifier of the order in which SQL executes; there are as many ids as there are selects.
Different id values: the larger the id, the higher the priority, and the earlier it executes.
Same id values: executed from top to bottom.
id is null: the row represents a result set and needs no query of its own.
(2) select_type: the type of the query, mainly used to distinguish ordinary queries, union queries, subqueries, and other complex queries.
(3) table: indicates which table this row of explain output is accessing.
(4) type: the access type, i.e., how MySQL decides to find rows in the table. From best to worst: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL. Except for ALL, every type can use an index; except for index_merge, every type can use at most one index. In general, type should reach at least the ref level, and range lookups should reach the range level.
system: the table has only one matching row (a system table); it can be regarded as a special case of const.
const: found with a single index lookup, meaning a primary key or unique index was used.
eq_ref: a primary key or unique index column is used in a join, and only one matching row is returned.
ref: an ordinary index scan that may return multiple rows matching the query condition.
fulltext: retrieval through a full-text index. Full-text indexes have very high priority; if a full-text index and an ordinary index both apply, MySQL prefers the full-text index regardless of cost.
ref_or_null: like ref, but with an additional comparison against null values.
index_merge: the query uses two or more indexes, merged by intersection or union at the end; commonly seen when and/or conditions use different indexes.
unique_subquery: used for in-form subqueries in where, when the subquery returns unique, non-duplicated values.
index_subquery: used for in-form subqueries against a secondary index or an in list of constants; the subquery may return duplicate values, and the index can be used to de-duplicate it.
range: an index range scan, commonly seen with >, <, between, in, and like prefix comparisons.
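A minimal sketch of explain in action, reusing the hypothetical user table from the earlier index example:

    -- range: index range scan on the secondary index idx_name.
    EXPLAIN SELECT id FROM user WHERE name > 'alice';

    -- const: a single primary key lookup.
    EXPLAIN SELECT * FROM user WHERE id = 1;

    -- ALL: no usable index on age, so a full table scan.
    EXPLAIN SELECT * FROM user WHERE age = 30;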