What are the knowledge points related to MySQL

2025-04-26 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces the main knowledge points of MySQL. It should be a useful reference for interested readers, and I hope you learn a lot from it.

1. Database architecture

1.1. Describe the logical architecture diagram of MySQL

Explain the logical structure of MySQL to the interviewer. If there is a whiteboard, draw the architecture diagram (the original picture comes from the Internet).

The MySQL logical architecture is mainly divided into three layers:

(1) the first layer is responsible for connection processing, authorization, security, etc.

(2) the second layer is responsible for parsing and optimizing SQL

(3) the third layer is the storage engine.

1.2. How is a SQL query executed in MySQL?

First, MySQL checks whether the user has permission to run the statement. Without permission, an error message is returned directly. With permission, the query cache is checked first (only in versions before MySQL 8.0, which removed the query cache).

If the cache is not hit, the parser performs lexical analysis to extract select and the other key elements of the SQL statement, and then checks the statement for syntax errors, such as whether the keywords are correct.

Finally, the optimizer determines the execution plan and the executor verifies permissions on the tables involved. If permission is missing, an error message is returned directly; otherwise the storage engine interface is called and the execution result is returned.

2. SQL Optimization

2.1. How do you optimize SQL in your daily work?

You can answer this question from these dimensions:

2.1.1. Optimize the table structure

(1) use numeric fields as much as possible

Fields that contain only numeric information should not be designed as character types, because that reduces the performance of queries and joins and increases storage overhead. When processing queries and joins, the engine compares a string character by character, whereas a numeric value needs only one comparison.

(2) use varchar instead of char whenever possible

A variable-length field stores only the characters actually present, which saves storage space.

(3) when an indexed column contains many duplicate values, the index can be dropped

For example, a gender column holds almost nothing but male, female, and unknown, so an index on it is of little use.

2.1.2. Optimize query

Avoid using the != or <> operator in the where clause as much as possible

Try to avoid using or to join conditions in the where clause

Do not use select * in any query

Avoid checking a field for null in the where clause

2.1.3. Index optimization

Index the fields used as query conditions and in order by

Avoid building too many indexes, and prefer composite indexes

2.2. How do you read the execution plan (explain), and what do its fields mean?

Adding the explain keyword before a select statement returns information about its execution plan.

(1) id column: the serial number of the select statement. MySQL divides select queries into simple queries and complex queries.

(2) select_type column: indicates whether the corresponding row is a simple or complex query.

(3) table column: indicates which table this row of the explain output is accessing.

(4) type column: one of the most important columns. It represents the association type or access type, that is, how MySQL decides to find the rows in the table. From the best to the worst: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL.

(5) possible_keys column: shows which indexes the query might use to find rows.

(6) key column: shows which index MySQL actually uses to optimize access to the table.

(7) key_len column: shows the number of bytes of the index that MySQL uses, from which you can work out which columns of the index are used.

(8) ref column: shows the columns or constants that are compared with the index named in the key column to look up values. Common values are const (constant), func, NULL, and a field name.

(9) rows column: the number of rows that MySQL estimates it must read and examine. Note that this is not the number of rows in the result set.

(10) Extra column: displays additional information, for example Using index, Using where, Using temporary, etc.
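To make the key_len column more concrete, here is a minimal Python sketch that estimates key_len for a few common column types. It assumes the usual rules (the column's byte width, plus 2 bytes for the length prefix of a varchar, plus 1 byte if the column is nullable); the function name, the small type tables, and the example index are hypothetical and cover only a handful of types.

```python
# Bytes per character for a few MySQL character sets.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

# Storage size in bytes of some fixed-width integer types.
FIXED_WIDTH = {"tinyint": 1, "int": 4, "bigint": 8}


def key_len(col_type, length=0, charset="utf8mb4", nullable=False):
    """Estimate the key_len contribution of one index column."""
    if col_type in FIXED_WIDTH:
        size = FIXED_WIDTH[col_type]
    elif col_type == "char":
        size = length * BYTES_PER_CHAR[charset]
    elif col_type == "varchar":
        # Variable-length columns carry a 2-byte length prefix.
        size = length * BYTES_PER_CHAR[charset] + 2
    else:
        raise ValueError(f"unsupported type: {col_type}")
    # A nullable column needs one extra byte for the NULL flag.
    return size + (1 if nullable else 0)


# Hypothetical composite index: (user_id INT NOT NULL, name VARCHAR(20) NULL)
only_first = key_len("int")
both_columns = key_len("int") + key_len("varchar", 20, nullable=True)
print(only_first, both_columns)
```

Under these assumptions, a key_len equal to only_first would suggest that just the first column of the index is used, while a key_len equal to both_columns would suggest that both columns are used.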

2.3. Do you pay attention to time-consuming SQL in your business system? Do you collect slow-query statistics? How have you optimized slow queries?

When writing SQL, we should get into the habit of analyzing it with explain. Slow-query statistics are sent to us regularly by the operations team.

Ideas for optimizing slow queries:

Analyze the statement to see whether unnecessary fields / data are loaded

Analyze the SQL execution plan to see whether it hits an index, etc.

If the SQL is complex, optimize its structure

If the amount of data in the table is too large, consider splitting the table.

3. Index

3.1. The difference between clustered index and nonclustered index

You can answer according to the following four dimensions:

(1) a table can have only one clustered index, while it can have more than one non-clustered index.

(2) in a clustered index, the logical order of the key values in the index determines the physical order of the corresponding rows in the table; in a non-clustered index, the logical order of the index differs from the physical storage order of the rows on disk.

(3) the index is described by a tree data structure. We can understand the clustered index as follows: the leaf node of the index is the data node. The leaf node of a non-clustered index, on the other hand, is still an index node, but it holds a pointer to the corresponding data.

(4) clustered index: physical storage is sorted by index; nonclustered index: physical storage is not sorted by index.
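The difference can be sketched with two Python dictionaries. This is a simplified model, not how a storage engine is implemented: the table, its columns, and the function name are hypothetical, and the "pointer" held by the non-clustered index is modelled as the primary key value, which is how InnoDB secondary indexes refer back to the row.

```python
# Clustered index: primary key -> full row (the leaf node is the data itself).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 30},
}

# Non-clustered index on age: key value -> primary keys of matching rows.
secondary_age = {}
for pk, row in clustered.items():
    secondary_age.setdefault(row["age"], []).append(pk)


def find_by_age(age):
    """Look up the index first, then go back to the clustered index for rows."""
    primary_keys = secondary_age.get(age, [])
    return [clustered[pk] for pk in primary_keys]


names = [row["name"] for row in find_by_age(30)]
print(names)
```

A lookup through the non-clustered index needs two steps (index, then row), whereas a lookup by primary key reaches the row directly.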

3.2. Why use B+ tree and why not use ordinary binary tree?

You can look at this question from several dimensions: whether the query is fast enough, whether the efficiency is stable, how much data is stored, and how many disk reads a lookup needs. In other words: why not an ordinary binary tree, why not a balanced binary tree, and why not a B tree but a B+ tree?

3.2.1. Why is it not an ordinary binary tree?

If a binary search tree degenerates into a linked list, a lookup is equivalent to a full table scan. Compared with an ordinary binary search tree, a balanced binary tree has more stable search efficiency and a faster overall search speed.
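The degeneration is easy to reproduce. The following Python sketch inserts keys into a plain (unbalanced) binary search tree and compares the resulting height with the height a perfectly balanced tree would have; the class and function names are only illustrative.

```python
import math
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree (iteratively)."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                break
            cur = cur.right
    return root


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


n = 1000
sorted_height = height(build(range(n)))        # keys arrive in order
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)
shuffled_height = height(build(shuffled))      # keys arrive in random order
balanced_height = math.ceil(math.log2(n + 1))  # best possible height
print(sorted_height, shuffled_height, balanced_height)
```

With keys inserted in sorted order, every node has only a right child, so the height equals the number of keys and a search has to walk the whole chain.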

3.2.2. Why not a balanced binary tree?

We know that queries are much faster in memory than on disk. If a tree is used as the index, then each step of a lookup reads one node from disk, that is, one disk block, but a balanced binary tree stores only one key and its data per node. A B-tree node can store many keys, so the tree is shorter, the number of disk reads is reduced, and queries are faster.

3.2.3. Why not B-tree but B+ tree?

The non-leaf nodes of a B+ tree store only key values, not data, whereas B-tree nodes store both key values and data. The default page size in InnoDB is 16KB. If data is not stored, a page holds more keys, so the order of the tree (the number of child nodes per node) is larger and the tree is shorter and fatter. This further reduces the number of disk IOs needed to find data, and makes queries faster.

All the data of a B+ tree index is stored in the leaf nodes, which are arranged in order and linked together in a linked list. This makes range lookups, sorted lookups, grouping lookups, and deduplicating lookups extremely easy.
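A rough back-of-the-envelope calculation shows why a "short and fat" tree matters. The Python sketch below uses the 16KB page size from the text; the key size, pointer size, and average row size are assumptions chosen for illustration, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024   # default InnoDB page size (from the text)
KEY_BYTES = 8           # assumption: bigint primary key
POINTER_BYTES = 6       # assumption: size of a child-page pointer
ROW_BYTES = 1024        # assumption: average row size of 1 KB

# Keys per non-leaf page, i.e. how many children each internal node has.
fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)
# Rows per leaf page, because leaves hold the full rows.
rows_per_leaf = PAGE_SIZE // ROW_BYTES


def max_rows(height):
    """Rows reachable by a B+ tree with `height` levels, leaf level included."""
    return fanout ** (height - 1) * rows_per_leaf


for levels in (1, 2, 3):
    print(levels, max_rows(levels))
```

Under these assumptions a three-level tree already addresses about twenty million rows, so a lookup by key needs only a handful of page reads.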

3.3. What is the difference between Hash index and Btree index? How do you decide when designing an index?

B+ trees can do range queries, but Hash indexes cannot.

The B+ tree supports the leftmost-prefix principle of composite indexes, while Hash indexes do not.

Order by sorting is supported by B+ tree, but not by Hash index.

Hash indexes are more efficient than B+ trees for equality queries.

When a B+ tree index is used with like for fuzzy queries, a pattern with a fixed prefix (such as 'abc%') can be optimized with the index, whereas a pattern that begins with % cannot; Hash indexes cannot do fuzzy queries at all.
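The contrast between the two index types can be sketched in Python with a dict standing in for a hash index and a sorted list standing in for the ordered leaf level of a B+ tree. The column values and function names are hypothetical.

```python
import bisect

ages = [31, 18, 25, 42, 25, 37]  # hypothetical column values; row id = position

# Hash index: value -> row ids. Fast equality lookup, but no ordering.
hash_index = {}
for row_id, age in enumerate(ages):
    hash_index.setdefault(age, []).append(row_id)

# Ordered index (stand-in for B+ tree leaves): sorted (value, row id) pairs.
ordered_index = sorted((age, row_id) for row_id, age in enumerate(ages))


def equal_hash(value):
    """Equality query: one hash lookup."""
    return hash_index.get(value, [])


def range_ordered(low, high):
    """Row ids with low <= value <= high, found by two binary searches."""
    start = bisect.bisect_left(ordered_index, (low, -1))
    end = bisect.bisect_right(ordered_index, (high, len(ages)))
    return [row_id for _, row_id in ordered_index[start:end]]


print(equal_hash(25))
print(range_ordered(20, 35))
```

The hash structure answers the equality query directly, but a range query over it would have to visit every key, while the ordered structure returns the range (already sorted by value) from one contiguous slice.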

3.4. What is the leftmost prefix principle? What is the leftmost matching principle?

The leftmost prefix principle means leftmost first: when creating a multi-column index, place the column used most frequently in where clauses on the far left, according to business requirements.

When we create a composite index, for example (a1, a2, a3), it is equivalent to creating the indexes (a1), (a1, a2), and (a1, a2, a3). This is the leftmost matching principle.
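As a minimal sketch of leftmost matching, the Python function below takes the column order of a composite index and the set of columns that appear in equality conditions, and returns the leading part of the index that can be used. The function name and the three-column index are hypothetical, and the sketch deliberately ignores range conditions and optimizer features such as index skip scan.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality conditions can use."""
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break  # a gap stops the match; later index columns go unused
        used.append(column)
    return used


index = ["a1", "a2", "a3"]  # hypothetical composite index

print(usable_prefix(index, {"a1"}))
print(usable_prefix(index, {"a1", "a3"}))
print(usable_prefix(index, {"a2", "a3"}))
print(usable_prefix(index, {"a1", "a2", "a3"}))
```

A query that filters only on the second and third columns cannot use the index at all in this model, because the leftmost column is missing.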

3.5. Which scenarios are not suitable for indexing?

Tables with too little data are not suitable for indexing

Fields that are updated frequently are not suitable for indexing

Fields with low selectivity are not suitable for indexing (such as gender)

3.6. What are the advantages and disadvantages of the index?

(1) advantages:

A unique index ensures the uniqueness of the data in each row of the database table.

An index can speed up data queries and reduce query time.

(2) disadvantages:

It takes time to create and maintain the index

Indexes take up physical space: in addition to the data space occupied by the table, each index also occupies a certain amount of physical space

When rows in the table are inserted, deleted, or updated, the index must also be maintained dynamically.

4. Lock

4.1. Have you ever encountered a deadlock problem in MySQL, and how did you solve it?

Yes, I have. My general procedure for troubleshooting deadlocks is as follows:

(1) view the deadlock log with show engine innodb status

(2) find the SQL involved in the deadlock

(3) analyze the locks that the SQL takes

(4) reproduce the deadlock case

(5) analyze the deadlock log

(6) analyze the deadlock result

4.2. What are optimistic locks and pessimistic locks in the database and the difference between them?

(1) pessimistic lock:

A pessimistic lock is "pessimistic" in that it always worries that the data it holds may be modified by other transactions. Once a transaction has acquired a pessimistic lock, no other transaction can modify the data; the others can only wait for the lock to be released before they execute.

(2) optimistic lock:

The "optimism" of the optimistic lock is that it assumes the data will not change too often. Therefore, it allows multiple transactions to work on the data at the same time.

Implementation: optimistic locks are generally implemented using version number mechanism or CAS algorithm.
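The version number mechanism can be sketched as follows. The example uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the account table, its columns, and the function names are hypothetical. The idea is that the update carries the version that was read, and succeeds only if that version is still current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (id integer primary key, balance integer, version integer)"
)
conn.execute("insert into account values (1, 100, 0)")
conn.commit()


def read_account(account_id):
    """Return (balance, version) as seen at read time."""
    return conn.execute(
        "select balance, version from account where id = ?", (account_id,)
    ).fetchone()


def update_balance(account_id, new_balance, expected_version):
    """Write only if nobody has changed the row since it was read."""
    cur = conn.execute(
        "update account set balance = ?, version = version + 1 "
        "where id = ? and version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows affected means the version was stale


# Two callers read the same row, then both try to write.
balance_a, version_a = read_account(1)
balance_b, version_b = read_account(1)
first = update_balance(1, balance_a - 10, version_a)   # version still matches
second = update_balance(1, balance_b - 20, version_b)  # version is now stale
print(first, second, read_account(1))
```

The second caller would normally re-read the row and retry; no lock is held between the read and the write.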

4.3. Are you familiar with MVCC? Do you know its underlying principle?

MVCC (Multiversion Concurrency Control) is a multi-version concurrency control technology.

The main purpose of MVCC in MySQL InnoDB is to improve the concurrency of the database and to handle read-write conflicts better, so that even when reads and writes conflict, reads need no locks and do not block.

5. Transaction

5.1. Four characteristics and implementation principles of MySQL transactions

Atomicity: transactions are executed as a whole, and either all or none of the operations on the database are performed.

Consistency: the data is not corrupted before or after the transaction. If account A transfers 10 yuan to account B, then whether or not the transfer succeeds, the total amount in A and B remains the same.

Isolation: when multiple transactions access the database concurrently, they are isolated from each other, that is, one transaction does not affect the running of other transactions. In short, transactions do not interfere with one another.

Durability: once a transaction is completed, the changes it made to the database are persisted in the database.

5.2. What is the isolation level of the transaction? What is the default isolation level for MySQL?

Read uncommitted (Read Uncommitted)

Read committed (Read Committed)

Repeatable read (Repeatable Read)

Serializable (Serializable)

The default transaction isolation level for MySQL is repeatable read (Repeatable Read)

5.3. What are phantom reads, dirty reads, and non-repeatable reads?

Transactions A and B execute alternately, and transaction A is disturbed by transaction B because transaction A reads uncommitted data of transaction B. This is a dirty read.

Within one transaction, two identical queries read the same record but return different data. This is a non-repeatable read.

Transaction A queries the result set of a range, another concurrent transaction B inserts / deletes data in this range and quietly commits, and then transaction A queries the same range again and gets a different result set from the two reads. This is a phantom read.

6. In practice

6.1. The CPU of the MySQL database soars. How do you deal with it?

Troubleshooting process:

(1) use the top command to observe and determine whether it is caused by mysqld or something else.

(2) if it is caused by mysqld, run show processlist and check the sessions to determine whether a resource-consuming SQL statement is running.

(3) find the high-consumption SQL, and check whether its execution plan is accurate, whether an index is missing, and whether the amount of data is too large.

Handling:

(1) kill these threads (and observe whether CPU usage drops)

(2) adjust accordingly (such as adding an index, changing the SQL, changing memory parameters)

(3) rerun these SQL statements.

Other circumstances:

It is also possible that no single SQL statement consumes many resources, but a sudden surge of session connections causes the CPU to soar. In this case, work with the application team to find out why the number of connections surged, and then adjust accordingly, for example by limiting the number of connections.

6.2. How do you solve master-slave delay in MySQL?

Master-slave replication is divided into five steps (the picture comes from the Internet):

Step 1: update events (update, insert, delete) on the master are written to the binlog

Step 2: the slave initiates a connection to the master

Step 3: the master then creates a binlog dump thread and sends the contents of the binlog to the slave

Step 4: after the slave starts, it creates an I/O thread that reads the binlog content sent by the master and writes it to the relay log

Step 5: the slave also creates a SQL thread that reads the relay log, executes the update events starting from the Exec_Master_Log_Pos position, and writes the updates to the slave's db

The cause of master-slave synchronization delay

A master server opens N connections for clients, so it can handle a large number of concurrent update operations, but on the slave only one thread reads the binlog. When a SQL statement takes a long time to execute on the slave, or has to lock a table, the master's SQL backs up and is not synchronized to the slave in time. This leads to master-slave inconsistency, that is, master-slave delay.

The solution of master-slave synchronization delay

The master is responsible for update operations and has higher safety requirements than the slave, so on the slave some settings can be relaxed, such as sync_binlog=1 and innodb_flush_log_at_trx_commit=1.

Choose better hardware for the slave.

Use one slave purely as a backup that serves no queries; with its load reduced, it naturally executes the SQL in the relay log more efficiently.

Add more slave servers to spread the read pressure, thereby reducing the load on each server.

6.3. If you were asked to design database and table sharding, what would you do?

Database and table sharding schemes:

Horizontal database split: split the data in one database into multiple databases based on fields and in accordance with certain strategies (hash, range, etc.).

Horizontal table split: split the data in one table into multiple tables based on fields and in accordance with certain strategies (hash, range, etc.).

Vertical database split: based on tables, different tables are placed in different databases according to the business they belong to.

Vertical table split: based on fields, the fields of a table are split into different tables (main table and extension table) according to how actively they are used.
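The hash and range strategies used for horizontal splitting can be sketched in a few lines of Python. The function names, the shard count, the boundaries, and the orders table name are all hypothetical; zlib.crc32 is used only because it is a stable hash that gives the same result across processes.

```python
import bisect
import zlib


def hash_shard(key, shard_count):
    """Hash strategy: a stable hash of the key, modulo the number of shards."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def range_shard(key, boundaries):
    """Range strategy: boundaries are sorted, exclusive upper bounds per shard."""
    return bisect.bisect_right(boundaries, key)


def table_for(user_id, shard_count=4):
    """Name of the physical table that holds this user's rows."""
    return f"orders_{hash_shard(user_id, shard_count)}"


print(table_for(12345))
print(range_shard(999, [1000, 2000]), range_shard(1000, [1000, 2000]))
```

Hash routing spreads keys evenly but makes adding shards harder, because most keys map to a different shard when the shard count changes; range routing makes expansion simple but can concentrate recent data on one shard.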

Commonly used database and table sharding middleware:

Sharding-jdbc

Mycat

Problems that may be encountered with database and table sharding

Transaction problem: distributed transactions are needed

Cross-node join problem: this can be solved by querying twice.

Cross-node count, order by, group by and aggregate function problems: get the results on each node and merge them on the application side.

Data migration, capacity planning, capacity expansion and other issues

ID problem: after the database is split, you can no longer rely on the database's own primary key generation mechanism. The simplest option is to consider UUID.

The problem of sorting and paging across shards
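Merging on the application side can be sketched as follows. The per-node result lists, the created_at field, and the function name are hypothetical; the sketch assumes each node has already sorted its own rows and has returned at least offset + limit of them, so that the merged page is correct.

```python
import heapq
import itertools
import uuid

# Hypothetical per-node results, each already sorted by created_at on its node.
node_results = [
    [{"id": "a", "created_at": 1}, {"id": "d", "created_at": 7}],
    [{"id": "b", "created_at": 3}, {"id": "e", "created_at": 9}],
    [{"id": "c", "created_at": 4}],
]


def merged_page(results, offset, limit):
    """Merge sorted per-node results and cut out one page."""
    merged = heapq.merge(*results, key=lambda row: row["created_at"])
    return list(itertools.islice(merged, offset, offset + limit))


# count: sum the per-node counts.
total_count = sum(len(rows) for rows in node_results)

# order by + paging: merge first, then apply offset and limit.
page_ids = [row["id"] for row in merged_page(node_results, 1, 3)]

# ID problem: a UUID is unique without coordination between nodes.
new_id = str(uuid.uuid4())

print(total_count, page_ids, len(new_id))
```

Deep pages are expensive with this approach, since every node has to return offset + limit rows before the application can discard the first offset of them.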
