How many tree search operations does a SQL query need to perform?


2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article looks at how many tree search operations a SQL query needs to perform. Many people have doubts about this in day-to-day work, so the editor has consulted various sources and sorted out simple, easy-to-use explanations. I hope it helps answer the question "how many tree search operations does a SQL query need to perform?" Please follow along with the editor.

1. Interview topic: what is an index?

An index is a data structure that improves the efficiency of database queries. It can be compared to the table of contents of a dictionary, helping you find the corresponding records quickly.

An index is generally stored in a file on disk, so it takes up physical space.

As the saying goes, water can carry a boat, but it can also overturn it. Proper indexes improve query efficiency, while too many indexes slow down inserts and updates on the table.

2. What are the types of indexes?

Data structure dimension

B+ tree index: all data is stored in the leaf nodes, lookups have a complexity of O(log n), and it is suitable for range queries.

Hash index: suitable for equality queries, with high retrieval efficiency, since a lookup completes in a single step.

Full-text index: supported in both MyISAM and InnoDB, and generally created on text columns of type char, varchar, or text.

R-Tree index: used to create SPATIAL indexes on GIS data types.

Physical storage dimension

Clustered index: an index built on the primary key, whose leaf nodes store the table's row data.

Nonclustered index: an index built on non-primary-key columns, whose leaf nodes store the index columns and the primary key.

Logical dimension

Primary key index: a special unique index that does not allow null values.

Normal index: the basic index type in MySQL, allowing null and duplicate values.

Composite index: an index created on multiple columns, which follows the leftmost prefix principle.

Unique index: the values in the index column must be unique, but null values are allowed.

Spatial index: supported since MySQL 5.7, following the rules of the OpenGIS geometric data model.

3. Interview topic: why choose a B+ tree as the index structure?

You can look at this question from several dimensions: whether queries are fast enough, whether efficiency is stable, how much data can be stored, how many disk reads a lookup takes, and so on. Why not a hash structure? Why not a binary tree, a balanced binary tree, or a B tree, but a B+ tree?

When we write a business SQL query, in most cases it is a range query, such as the following SQL:

Select * from employee where age between 18 and 28

Why not use a hash structure?

We know that a hash structure is a key-value (K-V) structure, in which key and value have a one-to-one relationship. It works well for equality queries, but it cannot help with range queries, because hashing does not keep keys in order.
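To make the contrast concrete, here is a minimal Python sketch, not MySQL's actual implementation. It compares a hash-style index with an ordered index, where a sorted list stands in for the ordered leaf level of a B+ tree. The ages and all function names are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right

# Illustrative ages only (assumed data, not the article's table).
ages = [43, 48, 36, 32, 37, 49, 28]

# Hash-style index: key -> list of row positions. Good for equality.
hash_index = {}
for pos, age in enumerate(ages):
    hash_index.setdefault(age, []).append(pos)

def hash_equal(age):
    """Equality lookup: a single probe of the hash table."""
    return hash_index.get(age, [])

def hash_range(low, high):
    """Range lookup on a hash index: keys are unordered, so every key is checked."""
    return sorted(pos for age, positions in hash_index.items()
                  if low <= age <= high for pos in positions)

# Ordered index: sorted (age, position) pairs, standing in for B+ tree leaves.
sorted_index = sorted((age, pos) for pos, age in enumerate(ages))
sorted_keys = [age for age, _ in sorted_index]

def ordered_range(low, high):
    """Range lookup on an ordered index: binary-search the start, then read in order."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted(pos for _, pos in sorted_index[start:end])
```

Both range functions return the same rows, but `hash_range` has to visit every key, while `ordered_range` touches only the entries inside the range.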

Why not use a binary tree?

First, recall the basics of a binary tree (here, a binary search tree). Its characteristics are as follows:

Each node has at most two subtrees, called the left subtree and the right subtree.

The value of the left child node is less than that of the current node, and the value of the current node is less than that of the right child node.

The top node is called the root node, and a node without child nodes is called a leaf node.

It is easy to picture this binary tree structure in our minds:

However, some binary trees are special cases and may look like this:

If the binary tree degenerates into a linked list, for example when keys are inserted in ascending order so that every node has only a right child, a lookup is equivalent to a full table scan. Then why have an index at all? Therefore, an ordinary binary tree is not suitable as an index structure.
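The degenerate case is easy to reproduce. The following is a small sketch of a plain, unbalanced binary search tree (the class and function names are chosen here for illustration). It shows that the same keys give very different tree heights depending on insertion order.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert into a plain (unbalanced) binary search tree, iteratively."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def height(root):
    """Number of levels in the tree, computed level by level."""
    levels = 0
    frontier = [root] if root is not None else []
    while frontier:
        levels += 1
        frontier = [child for n in frontier
                    for child in (n.left, n.right) if child is not None]
    return levels

def build(keys):
    """Build a tree by inserting the keys in the given order."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root

chain = build(range(1, 8))              # ascending order: every node has only a right child
bushy = build([4, 2, 6, 1, 3, 5, 7])    # same keys, inserted in a balanced order
```

With seven keys, `height(chain)` is 7 while `height(bushy)` is 3, so a lookup in the first tree walks the keys one by one, just like a list scan.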

Why not use a balanced binary tree?

Characteristics of a balanced binary tree: it is also a binary search tree, and the height difference between the two subtrees of any node is at most 1, so it cannot degenerate into a linked list.

But:

When a balanced binary tree is inserted into or updated, it needs left and right rotations to stay balanced, which makes it expensive to maintain.

If the amount of data is large, the tree becomes very tall. Because the data lives on disk, using it as an index structure means reading one node from disk at a time, so a lookup takes many I/O operations.
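A rough sketch of the arithmetic behind the height concern. It assumes one key per node and, pessimistically, one disk read per level visited with no caching. The function names are illustrative.

```python
def min_binary_tree_height(key_count):
    """Fewest levels any binary tree needs to hold key_count keys.

    A binary tree with h levels holds at most 2**h - 1 keys, so the minimum
    height is the bit length of key_count.
    """
    return key_count.bit_length()

def worst_case_disk_reads(key_count):
    """Assumption: each level visited costs one disk read."""
    return min_binary_tree_height(key_count)
```

Under these assumptions, even a perfectly balanced binary tree over 10 million keys needs 24 levels, so a single lookup may touch 24 nodes on disk.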

Why not use the B tree?

If the amount of data is large, a balanced binary tree becomes very tall, which increases I/O. So why not choose the B tree, which is shorter for the same amount of data?

Compared with a balanced binary tree, a B tree stores more data per node and is shorter. But why choose the B+ tree in the end? Because the B+ tree is an upgraded version of the B tree:

Non-leaf nodes of a B+ tree store only key values and no data, whereas B tree nodes store both key values and data. The default page size in InnoDB is 16KB. Without the data, a page holds more keys, so the order of the tree (the number of child nodes per node) is larger and the tree is shorter and fatter. This further reduces the number of disk I/O operations needed to find data, and queries become faster.

All the data of a B+ tree index is stored in the leaf nodes, arranged in order and connected by a linked list. This makes range lookups, sorted lookups, grouped lookups, and deduplicating lookups very easy for a B+ tree.
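The "shorter and fatter" claim can be illustrated with a back-of-the-envelope estimate. Only the 16KB page size comes from the text. The 8-byte key, 6-byte child pointer, and 1 KB row size are assumptions for illustration, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size, as stated in the text
KEY_BYTES = 8           # assumption: 8-byte integer key
POINTER_BYTES = 6       # assumption: 6-byte child page pointer
ROW_BYTES = 1024        # assumption: about 1 KB per full row

def fan_out():
    """Approximate number of child pointers that fit in one non-leaf page."""
    return PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)

def rows_per_leaf():
    """Approximate number of full rows that fit in one leaf page."""
    return PAGE_SIZE // ROW_BYTES

def capacity(height):
    """Approximate number of rows a B+ tree of the given height can hold."""
    return fan_out() ** (height - 1) * rows_per_leaf()

def height_needed(row_count):
    """Smallest height whose capacity covers row_count rows."""
    height = 1
    while capacity(height) < row_count:
        height += 1
    return height
```

With these numbers, each non-leaf page points to about 1170 children, and a tree of height 3 covers roughly 21.9 million rows. A lookup then reads about three pages, far fewer than a binary tree would need for the same data.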

4. Interview topic: the search process of a B+ tree index

Interviewer: suppose you have the following table structure and these rows of data:

CREATE TABLE `employee` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `date` datetime DEFAULT NULL,
  `sex` int(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Most INSERT values are missing from the source; the surviving fragments are:
insert into employee values ...;
insert into employee values (300, ...);
insert into employee values (400, ...);
insert into employee values (500, ...);
insert into employee values (..., 37, ...);
insert into employee values (600, ...);
insert into employee values (700, 'Xiaoyan', 28, '2021-01-21', ...);

Interviewer: if you execute the following query SQL, how many tree search operations need to be performed? You can draw the corresponding index structure diagrams.

Select * from employee where age=32

Analysis: the interviewer is actually checking whether the candidate is familiar with the structure of a B+ tree index. You can answer like this.

First draw the structure diagram of the idx_age index, roughly as follows:

Then draw the structure diagram of the id primary key index, that is, the clustered index, as follows:
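As a minimal sketch, the two index trees can be modeled with sorted Python structures to count descents. This is a simplified model, not InnoDB's implementation. The ids and names of Xiaolun, Xiaojun, and Xiaoyan and Xiaoyan's age of 28 follow the text. The row with age 32 and the other ages are hypothetical.

```python
from bisect import bisect_left

# Assumed sample rows: (id, name, age). The (400, "Lihong", 32) row and the
# ages of Xiaolun and Xiaojun are hypothetical.
ROWS = [
    (100, "Xiaolun", 43),
    (400, "Lihong", 32),
    (600, "Xiaojun", 49),
    (700, "Xiaoyan", 28),
]

# Clustered index: primary key -> full row (leaf nodes hold the row data).
clustered = {row[0]: row for row in ROWS}

# Secondary index idx_age: sorted (age, id) pairs (leaf nodes hold age + primary key).
idx_age = sorted((age, pk) for pk, _, age in ROWS)

def select_by_age(age):
    """Model of: select * from employee where age = ?

    Returns (rows, tree_searches) under the simplified counting rule that each
    descent of an index tree counts as one tree search.
    """
    tree_searches = 1                       # one descent of the idx_age tree
    pos = bisect_left(idx_age, (age,))      # first entry with this age, if any
    rows = []
    while pos < len(idx_age) and idx_age[pos][0] == age:
        pk = idx_age[pos][1]
        rows.append(clustered[pk])          # back to the table: one primary key tree descent
        tree_searches += 1
        pos += 1                            # next leaf entry is read in order, no new descent
    return rows, tree_searches
```

In this model a query for age 32 performs two tree searches: one on idx_age to find the primary key, and one on the primary key index to fetch the full row.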

Therefore, the approximate process of the execution of this SQL query statement is:


Search the idx_age index tree and load disk block 1 into memory. When 32 ...

... not in) may cause the index to fail.

Using is null or is not null on an indexed field may cause the index to fail.

If the fields joined in a left join or right join query use different encoding formats, the index may fail.

If MySQL estimates that a full table scan is faster than using the index, it does not use the index.

7. Interview topic: the leftmost prefix principle of a composite index

Interviewer: if I now add a composite index on the name and age fields, how many tree searches will the following SQL perform? Draw the index tree first.

Select * from employee where name like 'Xiao%' order by age desc

Analysis: this examines the leftmost prefix principle of the composite index and whether like can use the index. The schematic diagram of the composite index tree is as follows:

The composite index entries are sorted by name in ascending order and, when names are equal, by age in ascending order. The interviewer asks for everyone whose name starts with "Xiao", and the SQL condition like 'Xiao%' can use the idx_name_age composite index.

The query descends the idx_name_age index tree and finds the first index entry whose name starts with "Xiao", then reads on to find Xiaojun, Xiaolun, and Xiaoyan in turn, obtaining id=600, 100, and 700 respectively, and then returns to the table three times to fetch the corresponding records. Here the leftmost prefix is "Xiao", the leftmost M characters of the string index. In fact:

The leftmost prefix can be the leftmost N fields of the composite index. For example, a composite index (a,b,c) is equivalent to having built the indexes (a), (a,b), and (a,b,c).

The leftmost prefix can also be the leftmost M characters of a string index.
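The leftmost prefix behavior follows from the sort order of the index entries. Here is a small sketch in which a sorted list stands in for the leaf level of idx_name_age. The ids of Xiaojun, Xiaolun, and Xiaoyan follow the text. The other rows and every age except Xiaoyan's 28 are hypothetical.

```python
from bisect import bisect_left

# (name, age, id) index entries. Junjie, Ziqi, and all ages except 28 are hypothetical.
ENTRIES = [
    ("Xiaojun", 49, 600),
    ("Xiaolun", 43, 100),
    ("Xiaoyan", 28, 700),
    ("Junjie", 48, 200),
    ("Ziqi", 36, 300),
]
idx_name_age = sorted(ENTRIES)   # ordered by name first, then by age

def prefix_scan(prefix):
    """Model of name like 'prefix%': binary-search the first match, then read in order."""
    pos = bisect_left(idx_name_age, (prefix,))
    ids = []
    while pos < len(idx_name_age) and idx_name_age[pos][0].startswith(prefix):
        ids.append(idx_name_age[pos][2])
        pos += 1
    return ids

def age_only_scan(age):
    """Model of age = ? alone: the entries are not ordered by age, so all are examined."""
    return [pk for _, entry_age, pk in idx_name_age if entry_age == age]
```

`prefix_scan("Xiao")` can jump straight to the first matching entry because names are sorted, and it yields the ids 600, 100, and 700 in index order. A condition on age alone gets no such starting point, which is why it cannot use the leftmost prefix.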

8. Interview topic: index condition pushdown

Interviewer: we are still using the composite index idx_name_age. How many tree searches does the following SQL perform?

Select * from employee where name like 'Xiao%' and age=28 and sex='0'

Analysis: this examines index condition pushdown. Before MySQL 5.6, the server finds in the idx_name_age index tree everyone whose name starts with "Xiao", takes their primary key ids, returns to the table to fetch the data rows, and only then compares the other fields such as age and sex. As shown in the figure:

Some readers may find this strange: isn't (name,age) a composite index? Wouldn't it be more efficient to check age right after matching the "Xiao" prefix, and only then return to the table? That is why MySQL 5.6 introduced index condition pushdown, which checks the fields contained in the index during index traversal, filters out the records that do not meet the conditions, and reduces the number of returns to the table.

Therefore, from MySQL 5.6 onward, after matching the "Xiao" prefix the server filters on age=28 inside the index, so it only needs to return to the table once.
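The difference in back-to-table lookups can be sketched with a toy model. This simulates the idea only and is not MySQL's executor. The ages of Xiaojun and Xiaolun and all sex values are hypothetical.

```python
# (name, age, id) index entries. Only Xiaoyan's age of 28 comes from the text.
IDX_NAME_AGE = sorted([
    ("Xiaojun", 49, 600),
    ("Xiaolun", 43, 100),
    ("Xiaoyan", 28, 700),
])
SEX_BY_ID = {600: 0, 100: 0, 700: 0}   # hypothetical: sex lives only in the table rows

def query(prefix, age, sex, pushdown):
    """Return (matching ids, number of back-to-table lookups)."""
    lookups = 0
    result = []
    for name, entry_age, pk in IDX_NAME_AGE:
        if not name.startswith(prefix):
            continue
        # With pushdown, the age condition is checked on the index entry itself,
        # before any row is fetched from the table.
        if pushdown and entry_age != age:
            continue
        lookups += 1                    # fetch the full row via the primary key index
        if entry_age == age and SEX_BY_ID[pk] == sex:
            result.append(pk)
    return result, lookups

without_icp = query("Xiao", 28, 0, pushdown=False)
with_icp = query("Xiao", 28, 0, pushdown=True)
```

Both calls return the same row (id 700), but the version without pushdown goes back to the table three times, while the version with pushdown goes back only once.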

9. Interview topic: adding an index to a large table

Interviewer: if a table holds more than 10 million rows, how would you add an index to it?

Analysis: adding an index to a table can lock the table, and a careless operation may cause a production incident. You can refer to the following method:

1. First create a new table B with the same structure as the original table A.

2. Add the required new index to the new table B.

3. Import the data from the original table A into the new table B.

4. Rename the new table B to the original table name A, and rename the original table A to another table name.
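The four steps above can be walked through end to end in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table layout, the index name, and the sample rows are assumptions for illustration.

```python
import sqlite3

def add_index_by_rebuild(conn):
    """Apply the four steps to a table named a, adding an index on age."""
    cur = conn.cursor()
    # 1. Create table b with the same structure as a.
    cur.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    # 2. Add the new index to b while it is still empty.
    cur.execute("CREATE INDEX idx_age ON b (age)")
    # 3. Copy the data from a into b.
    cur.execute("INSERT INTO b SELECT id, name, age FROM a")
    # 4. Swap the names: a becomes a_old, b becomes a.
    cur.execute("ALTER TABLE a RENAME TO a_old")
    cur.execute("ALTER TABLE b RENAME TO a")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
# Assumed sample rows, for illustration only.
conn.executemany("INSERT INTO a VALUES (?, ?, ?)",
                 [(700, "Xiaoyan", 28), (100, "Xiaolun", 43)])
add_index_by_rebuild(conn)

indexes = [row[1] for row in conn.execute("PRAGMA index_list('a')")]
row_count = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
```

In MySQL the swap in step 4 is typically written as a single `RENAME TABLE a TO a_old, b TO a` statement. This sketch also does not handle writes that reach the original table while the copy in step 3 is running, which a production procedure would need to account for.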

Summary and practice

This article has covered nine key interview topics on indexes. I hope it is helpful to you. To finish, here is a question about an indexed SQL statement that I recently encountered in business development. See what your answer is; if you are interested, you can contact me to discuss it. The question is as follows:

Select * from A where type ='1' and status ='s' order by create_time desc

Suppose type has 9 distinct values, so its selectivity is acceptable, while status has low selectivity (3 distinct values). How would you index this?

Add a single-column index on type,

or a composite index on (type,status,create_time),

or a composite index on (type,create_time)?

That concludes this look at how many tree search operations a SQL query needs to perform. I hope it has resolved your doubts. Theory works best when combined with practice, so go and try it!
