What are the high-frequency interview questions for MySQL?

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article introduces high-frequency MySQL interview questions. Many people have doubts about these questions in day-to-day work, so the editor consulted various materials and sorted out simple, practical answers. Hopefully it helps you answer them. Let's go through them one by one.

What data structures are used in MySQL indexes? Why use a B+ tree for indexing?

MySQL indexes use the B+ tree.

You can think about this question in your mind first. If you were asked to design the index of the database, how would you design it?

As usual, look at the problem in three steps: Why? What? How?

Why do you need an index? What is an index? How do you use an index?

Then think again: why do you need a B+ tree? What is a B+ tree? How do you use a B+ tree?

Answer: the main job of most programs is to process data: writing, querying, transforming, and outputting it. The most vivid analogy for an index is the relationship between a book's content and its table of contents. The table of contents is the index: it lets us quickly find the page number of the content we want.

There are several reasons why it is a B+ tree:

If you use a balanced binary tree such as an AVL tree, the tree becomes too tall. An index query has to access the disk, and each access performs one disk I/O per node, so the tree height must be kept low. In practice, a B+ tree storing tens of millions of rows is typically no more than 4 or 5 levels high.

The B+ tree is often compared with the B tree. Its key difference is that every key of a B+ tree appears in the linked list of leaf nodes (a dense index), and the keys in that linked list are in sorted order. For a range search, a B tree has to perform an in-order traversal of the tree, whereas a B+ tree can simply be read sequentially along the leaf nodes.
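The two reasons above can be illustrated with a minimal Python sketch. It is not how MySQL is implemented. The fanout of 1170 keys per internal node and 16 rows per leaf are assumptions (a commonly quoted estimate for 16KB pages), and the Leaf class and helper names are hypothetical. The first part estimates tree height, the second walks a linked list of leaves for a range search.

```python
import math


def binary_tree_height(rows: int) -> int:
    """Minimum height of a balanced binary tree holding `rows` keys."""
    return math.ceil(math.log2(rows + 1))


def bplus_tree_height(rows: int, fanout: int, rows_per_leaf: int) -> int:
    """Levels of a B+ tree: the leaf level plus the internal levels above it."""
    nodes = math.ceil(rows / rows_per_leaf)   # number of leaf nodes
    height = 1
    while nodes > 1:
        nodes = math.ceil(nodes / fanout)     # parents needed for this level
        height += 1
    return height


class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted keys stored in this leaf
        self.next = None      # pointer to the next leaf (the linked list)


def build_leaves(sorted_keys, per_leaf):
    """Split sorted keys into leaves and chain them together."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(first_leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain.

    A real B+ tree would descend from the root to the starting leaf;
    this sketch starts at the first leaf to stay short.
    """
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


# Assumed sizes: 20 million rows, fanout 1170, 16 rows per leaf.
print(binary_tree_height(20_000_000))           # levels in a balanced binary tree
print(bplus_tree_height(20_000_000, 1170, 16))  # levels in the B+ tree
```

Under these assumed numbers the binary tree needs about 25 levels while the B+ tree needs 3, which is why the number of disk reads per lookup stays small.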

What is the leftmost matching principle?

First of all, let me make one point:

Leftmost prefix matching principle: when using a composite (joint) index, MySQL follows the leftmost prefix matching principle, that is, leftmost first: data retrieval starts matching from the leftmost column of the composite index.

For example, we have a student table, and we create a composite index index_magor_class (magor,class) on college number + class, which consists of two fields.

The underlying structure of an index is a B+ tree, so a composite index is also a B+ tree, but each key in a node of the composite index holds multiple values, one per indexed column.

For example: create a composite index index_magor_class (magor,class); its index tree is shown in the figure (figure omitted).

Entries are sorted first by magor, then by class, and so on for any further columns in the index.

If the where condition of a query contains only class, the query cannot use the composite index; if it contains only the college number, it may use the composite index. (Why only "may"? MySQL decides the actual execution plan itself. For example, if the amount of data in the table is very small, scanning the whole table directly may be faster, so the index is not used.)
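A minimal Python sketch of why the leftmost column matters, assuming a handful of made-up rows. A sorted list of tuples stands in for the composite index on (magor, class); the function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical rows of the student table: (magor, class, student_id)
rows = [(2, "B", 104), (1, "A", 101), (1, "B", 102), (2, "A", 103), (3, "A", 105)]

# The composite index keeps its entries sorted by magor first, then class.
index = sorted(rows)


def find_by_magor(magor):
    """Leftmost column known: binary search narrows to one contiguous slice."""
    lo = bisect_left(index, (magor,))
    hi = bisect_left(index, (magor + 1,))
    return index[lo:hi]


def find_by_class(cls):
    """Leftmost column unknown: matches are scattered, so every entry is checked."""
    return [entry for entry in index if entry[1] == cls]


print(find_by_magor(1))    # contiguous range found by binary search
print(find_by_class("A"))  # found only by scanning the whole index
```

Entries with the same class are spread across the sorted order, so a condition on class alone cannot be answered with a binary search, which mirrors why the composite index is not used in that case.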

How do you design the index when creating the table? Have you ever done any index optimization?

1. Use a covering index for queries to avoid going back to the table.

Explanation: if you want to know the title of Chapter 11 of a book, do you need to turn to the page where Chapter 11 starts? Just browse the table of contents, which plays the role of a covering index.

What does this mean? For example, if your primary key index is the student number, a select statement that only needs the student number can select just that column from the table without selecting other fields. Unless it is really necessary, select only the fields you need and use select * as little as possible, otherwise the query has to go back to the table.

Let me explain "going back to the table" here. For example, the primary key index of our table is the student number, and we have also built an index on the mobile phone number. If our where condition is the mobile phone number, there are two cases:

If the only field we select is the student number, the data can be obtained directly from the mobile phone number index. There is no need to go back to the table.

If we select other fields as well, the query works like this: first look up the student number by mobile phone number, then go to the primary key index to fetch the data by student number. This process is called going back to the table.

Positive example: the types of index that can be created are primary key index, unique index, and normal index, whereas a covering index is an effect of a query rather than an index type. In the explain result, the extra column will show: using index.
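The two cases can be modelled with a minimal Python sketch. Two dictionaries stand in for the primary key index and the mobile phone index; the rows, names, and the lookup counter are all hypothetical and only illustrate the idea.

```python
# Hypothetical table: the primary key index maps student_no -> full row.
primary_index = {
    1001: {"student_no": 1001, "phone": "13800000001", "name": "Li"},
    1002: {"student_no": 1002, "phone": "13800000002", "name": "Wang"},
}
# The secondary index on phone stores only the phone and the primary key.
phone_index = {"13800000001": 1001, "13800000002": 1002}


def select_by_phone(phone, columns):
    """Return (selected columns, number of index lookups performed)."""
    lookups = 1                                  # lookup in the phone index
    student_no = phone_index[phone]
    covered = {"phone": phone, "student_no": student_no}
    if set(columns) <= set(covered):
        # Covering index: every requested column is already in the phone index.
        return {c: covered[c] for c in columns}, lookups
    # Going back to the table: a second lookup in the primary key index.
    lookups += 1
    row = primary_index[student_no]
    return {c: row[c] for c in columns}, lookups


print(select_by_phone("13800000001", ["student_no"]))          # one lookup
print(select_by_phone("13800000001", ["student_no", "name"]))  # two lookups
```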

2. For fields that are unique in business terms, even combined fields, build a unique index. Note: do not assume that a unique index slows down insert; the loss is negligible, while the gain in lookup speed is obvious. In addition, even if the application layer performs very thorough validation, as long as there is no unique index, Murphy's law says dirty data will eventually appear.

3. Do not join more than three tables. The data types of the joined fields must be exactly the same, and in a multi-table query the joined fields must be indexed. Note: even for a two-table join, pay attention to table indexes and SQL performance.

4. When building an index on a varchar field, you must specify the index length. There is no need to index the whole field; determine the index length from the actual selectivity of the text. Note: index length and selectivity pull against each other. For string data, an index of length 20 generally reaches a selectivity of 90% or more, which can be checked with count(distinct left(column name, index length)) / count(*).

5. Page searches must not use left-fuzzy or full-fuzzy matching; if this is really needed, use a search engine instead. Note: index files have the leftmost prefix matching property of the B-Tree, so if the leftmost part of the value is not fixed, the index cannot be used.

6. The goal of SQL performance optimization: reach at least the range level, aim for the ref level, and const is best if possible. Description:

1) const: there is at most one matching row in a single table (primary key or unique index), and the data can be read during the optimization phase.

2) ref: a normal index is used.

3) range: a range scan on the index. Counterexample: the explain result shows type=index, which is a full scan of the index's physical file and is very slow; this index level is lower than range and is only marginally better than a full table scan.

7. When building a composite index, put the column with the highest selectivity on the far left. Positive example: for where a=? and b=?, if column a is almost unique, a single-column index idx_a is enough. Note: when a condition mixes equality and non-equality predicates, put the equality column first when building the index. For example, for where c>? and d=?, even if c has higher selectivity, d must be placed first in the index, that is, the composite index idx_d_c.

8. Prevent implicit conversions caused by mismatched field types, which make the index unusable.
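The selectivity check from item 4, count(distinct left(column name, index length)) / count(*), can be tried out with a minimal Python sketch. The sample values and both function names are hypothetical; against a real table you would run the SQL expression itself.

```python
def prefix_selectivity(values, length):
    """In-memory version of count(distinct left(col, length)) / count(*)."""
    return len({v[:length] for v in values}) / len(values)


def shortest_prefix(values, target):
    """Smallest prefix length whose selectivity reaches `target`."""
    max_len = max(len(v) for v in values)
    for length in range(1, max_len + 1):
        if prefix_selectivity(values, length) >= target:
            return length
    return max_len


# Hypothetical sample of a varchar column.
emails = ["alice@example.com", "alina@example.com",
          "bob@example.com", "bobby@example.org"]

for n in (2, 4):
    print(n, prefix_selectivity(emails, n))
print(shortest_prefix(emails, 0.9))
```

A longer prefix raises selectivity but also makes the index larger, so the shortest length that reaches the target is a reasonable starting point.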

Have you ever used MyBatis? Do you understand its first-level and second-level caches?

First-level cache

The first-level cache of MyBatis is scoped to the SqlSession, and MyBatis enables it by default. Within the same SqlSession, when the same SQL query is executed twice, the first execution queries the database and writes the result to the cache, and the second reads it directly from the cache. Every query goes to the cache first; on a miss it queries the database and then writes the result to the cache. If an insert, update, or delete (or another operation such as commit) happens between the two queries, the SqlSession cache is cleared. Internally the cache is a HashMap: the key is hashcode + statementId + SQL statement, and the value is the Java object mapped from the query's result set.
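The behaviour described above can be modelled with a small Python toy. This is not MyBatis code and does not use its API; ToySession, its methods, and the simplified cache key (statement id plus parameter) are all hypothetical, chosen only to show the read-through and clear-on-write pattern.

```python
class ToySession:
    """Toy model of a session-scoped cache; not MyBatis code."""

    def __init__(self, database):
        self.database = database   # a dict standing in for the real database
        self.cache = {}            # (statement_id, parameter) -> result
        self.db_hits = 0           # how many times the "database" was queried

    def select(self, statement_id, user_id):
        key = (statement_id, user_id)
        if key not in self.cache:
            # Cache miss: query the database, then write the result to the cache.
            self.db_hits += 1
            self.cache[key] = self.database.get(user_id)
        return self.cache[key]

    def update(self, user_id, name):
        self.database[user_id] = name
        self.cache.clear()         # any write empties the session cache


session = ToySession({1: "Tom"})
session.select("findUser", 1)
session.select("findUser", 1)      # served from the cache
print(session.db_hits)
session.update(1, "Jerry")
print(session.select("findUser", 1), session.db_hits)
```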

Second-level cache

The second-level cache is mapper-level, and MyBatis does not enable it by default. The first time a SQL statement under a mapper is called to query the user's information, the result is stored in the second-level cache area belonging to that mapper. The second time the same SQL in the mapper file under the same namespace is called to query the user's information, the result is taken from that second-level cache.

How does MySQL master-slave synchronization work? Do you understand the binlog?

Whenever the master database changes, the change is recorded in the binary log file.

The slave database starts an I/O thread that connects to the master database and requests the binary log of the master's changes.

The binary log obtained by the slave's I/O thread is saved to the slave's own relay log file.

The slave has a SQL thread that regularly checks whether the relay log has changed, and applies the changes to its data if it has.
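The four steps above can be sketched as a small in-memory simulation in Python. It only models the flow binlog, I/O thread, relay log, SQL thread; the Master and Slave classes, their fields, and the (key, value) event format are hypothetical and far simpler than real replication.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []                      # ordered list of change events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))      # every change is recorded


class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []
        self.fetched = 0                      # binlog position already copied
        self.applied = 0                      # relay log position already replayed

    def io_thread(self, master):
        """Copy new binlog events from the master into the local relay log."""
        new_events = master.binlog[self.fetched:]
        self.relay_log.extend(new_events)
        self.fetched += len(new_events)

    def sql_thread(self):
        """Replay relay log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.write("a", 1)
master.write("b", 2)
slave.io_thread(master)     # events reach the relay log, data not yet changed
slave.sql_thread()          # events are replayed, data now matches the master
print(slave.data == master.data)
```

Keeping the two positions separate shows why a slave can lag: events may already be in the relay log but not yet applied.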

Have you split databases and tables (sharding) in MySQL? How was it designed?

Why?

When a table grows to tens of millions of rows, each query takes longer. If there is also a join, the query may well hang. The purpose of splitting tables is to reduce the load on the database and shorten query time.

MySQL uses table locks and row locks to protect data integrity. A table lock means no other session can operate on the table until the current operation on it finishes. A row lock works the same way at row level: other SQL statements must wait until the current operation on that row finishes before they can operate on it.

When? (When do you need to split tables?):

Splitting databases and tables is recommended only when a single table exceeds 5 million rows or 2GB in size. Note: if the data volume is not expected to reach this level within three years, do not split databases and tables when creating the tables.

Counterexample: a business has a total of only 20,000 rows after three years, yet it is split into 1024 tables. Question: why was it designed this way? Answer: isn't splitting into 1024 tables the standard?

How? (There are several strategies for splitting databases and tables):

Vertical split or horizontal split

Splitting middleware. Some options:

Sharding-sphere: formerly known as sharding-jdbc, Dangdang's database and table splitting middleware

TDDL: a jar, Taobao Distributed Data Layer

Mycat: middleware.

Note: please investigate the pros and cons of each tool yourself; the official website and community are the preferred sources.

Split by the userId dimension: the common scheme Angela has seen is userId % 64, giving 64 tables numbered 0 to 63.

Fixed-position split: take two specified digits of the userId, such as the second- and third-to-last digits, to form 00 to 99, for a total of 100 tables (hundred databases, hundred tables).

Hash: hash the userId, then take the result % table count.

Range: split by a specified range of userId, for example userId 0 to 1 thousand in one table; this is used less often and easily produces hot spots.

Separate the tables of different business domains into different databases, for example order-related tables, user-information tables, and marketing-related tables each in their own database.

Store large fields in a separate table.

Move rarely used fields out into a separate table.
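The userId-based routing rules listed above (modulo, fixed digits, hash, range) can be sketched in Python as follows. The table count of 64 comes from the text; the range size of 1000 userIds per table and the use of crc32 as the hash function are assumptions, and all function names are hypothetical.

```python
import zlib

TABLE_COUNT = 64     # number of tables for the modulo and hash rules
RANGE_SIZE = 1000    # assumed number of userIds per table for the range rule


def by_modulo(user_id: int) -> int:
    """userId % 64 gives a table number from 0 to 63."""
    return user_id % TABLE_COUNT


def by_fixed_digits(user_id: int) -> str:
    """Second- and third-to-last digits of userId, giving "00" to "99"."""
    return f"{(user_id // 10) % 100:02d}"


def by_hash(user_id: int) -> int:
    """Hash the userId, then take the result modulo the table count.

    crc32 is used only because it is stable across runs; it is an assumption.
    """
    return zlib.crc32(str(user_id).encode()) % TABLE_COUNT


def by_range(user_id: int) -> int:
    """Consecutive userIds share a table, which is why hot spots appear."""
    return user_id // RANGE_SIZE


for uid in (130, 1234567):
    print(uid, by_modulo(uid), by_fixed_digits(uid), by_hash(uid), by_range(uid))
```

With the range rule, newly registered users all land in the newest table, while the modulo and hash rules spread them across all tables.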

If the databases and tables are split by userId, what do you do when you need to query by phone number?

Similar to the back-to-table logic, build a separate phone number index table that stores the phone number and userId. When querying, first look up the userId by phone number, then query the data by userId.
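A minimal Python sketch of this two-step lookup, assuming tables split by userId % 64. The dictionaries standing in for the split tables and the phone index table, and the function names, are hypothetical.

```python
TABLE_COUNT = 64   # assumed number of tables split by userId

# Hypothetical split user tables: table number -> {userId: row}
user_tables = {n: {} for n in range(TABLE_COUNT)}
# Separate index table: phone number -> userId
phone_to_user_id = {}


def insert_user(user_id, phone, name):
    """Write the row to its table and keep the phone index table in step."""
    row = {"userId": user_id, "phone": phone, "name": name}
    user_tables[user_id % TABLE_COUNT][user_id] = row
    phone_to_user_id[phone] = user_id


def find_by_phone(phone):
    user_id = phone_to_user_id.get(phone)      # step 1: phone number -> userId
    if user_id is None:
        return None
    # Step 2: route by userId to the right table and fetch the row.
    return user_tables[user_id % TABLE_COUNT].get(user_id)


insert_user(130, "13800000001", "Li")
print(find_by_phone("13800000001"))
```

The extra table costs one more lookup per query, in exchange for not having to scan every split table for the phone number.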

This concludes the study of "what are the MySQL high-frequency interview questions". Hopefully it resolves your doubts. Combining theory with practice helps you learn better, so go and try it!
