
Database index

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Indexes (Index):

Indexes allow MySQL to quickly find and retrieve a set of records from a table that may contain millions or even hundreds of millions of rows. If you have used MySQL for any length of time, you have probably created indexes in the hope of getting faster query results. You may also have found that MySQL does not always use the index you expected.

For many users, indexing is something of a black art. Sometimes indexes seem to work magically, and at other times they seem only to slow down inserts and get in the way.

In this chapter, we introduce some of the concepts behind indexing and the different types of index that MySQL provides.

Indexing Basics (Index Base):

To understand how MySQL uses indexes, it is best to first understand how indexes work and what their characteristics are. Once you have that basic understanding, you can make more sensible choices about how to use them.

Index Concepts (concept of index):

To understand how an index helps MySQL find records quickly, think about how MySQL works to get the result of a query. Imagine that phone_book is a table holding a combined phone book for the state of California, with about 35 million entries. Keep in mind that the records in a table are not stored in any particular order, and consider this query:

Select * from phone_book where last_name = 'Zawodny'

With no index of any kind to consult, the database must read every record in the phone_book table and compare its last_name field with the string "Zawodny". Clearly, this is inefficient. As the number of records grows, so does the effort needed to find a given record. In computer science, we call this an O(n) problem.

Given a real phone book, however, we all know how to quickly locate anyone named Zawodny: turn to the back of the book and start with the Zs. Because the second letter is "a", we know that any matches will be at or near the front of the list of names beginning with Z. This method relies on knowing how the data is sorted.

That's cheating, isn't it? Not at all. The reason you can find the Zawodnys quickly is that the entries are sorted alphabetically by last name. Finding any name is easy, provided you know your ABCs.

Most technical books, such as this one, provide an index at the back. Because the index lists terms in sorted order along with their page numbers, you can quickly find the terms and concepts in the book. Do you want to know where hot copies of the database are discussed? Just look up the page number in the index.
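The contrast between reading every record and exploiting sorted data can be sketched in a few lines of Python. This is only an illustration of the idea, not how MySQL is implemented; the sample names and the helpers scan_lookup and sorted_lookup are invented for the example.

```python
from bisect import bisect_left

def scan_lookup(names, target):
    """Full scan: compare every entry, as a server must do without an index."""
    comparisons = 0
    matches = []
    for position, name in enumerate(names):
        comparisons += 1
        if name == target:
            matches.append(position)
    return matches, comparisons

def sorted_lookup(sorted_names, target):
    """Binary search over sorted names, like flipping straight to the Zs."""
    start = bisect_left(sorted_names, target)
    matches = []
    while start < len(sorted_names) and sorted_names[start] == target:
        matches.append(start)
        start += 1
    return matches

# Hypothetical sample data.
names = ["Smith", "Zawodny", "Woodward", "Adams", "Zawodny", "Lee"]
unsorted_hits, comparisons = scan_lookup(names, "Zawodny")
sorted_hits = sorted_lookup(sorted(names), "Zawodny")
```

The scan always touches all six entries, while the sorted lookup needs only a handful of comparisons however long the list grows, which is the O(n) versus O(log n) difference.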

Database indexes are similar. Just as the author or publisher of a book chooses which important concepts and terms to index, you can choose specific columns of a database table on which to create an index. Using the previous example, you might create an index on the last name to make looking up phone numbers faster:

ALTER TABLE phone_book ADD INDEX (last_name)

In doing so, you ask the database to create an ordered list of all the last names in the phone_book table. Along with each name, it records the positions of the matching records, just as the index at the back of a book lists the page numbers for each entry [1].

From the database server's point of view, an index exists so that the server can quickly eliminate possible rows from the result set when executing a query. Without an index, MySQL (like any database server) must examine every row in the table. This is not only time consuming, it also uses a lot of disk I/O and can effectively pollute the disk cache.

In the real world, it is rare to find dynamic data that just happens to be sorted and stays sorted. Books are a special case, because they tend to remain static.

Because MySQL must maintain a separate list of index values and keep it up to date as the data changes, you do not want to index every column in a table. An index is a trade-off between space and time. You sacrifice some extra disk space, plus a little CPU overhead on each INSERT, UPDATE, and DELETE query, to make most (if not all) of your queries much faster.

Much of the MySQL documentation uses the terms index and key interchangeably. Saying that last_name is a key in the phone_book table is the same as saying that the last_name field of the phone_book table is indexed.
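As a rough mental model, an index on one column can be pictured as a sorted list of (value, row position) pairs kept next to the unsorted table. The sketch below assumes exactly that simplified structure; the sample rows and the helpers build_index and index_lookup are hypothetical, and real MySQL indexes are more elaborate.

```python
from bisect import bisect_left

# Hypothetical table: rows are kept in arbitrary (insertion) order.
phone_book = [
    {"last_name": "Smith", "phone_number": "555-0101"},
    {"last_name": "Zawodny", "phone_number": "555-7271"},
    {"last_name": "Adams", "phone_number": "555-0199"},
]

def build_index(rows, column):
    """Return a sorted list of (value, row_position) pairs for one column."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))

def index_lookup(index, rows, value):
    """Binary-search the index, then follow each position back to its row."""
    i = bisect_left(index, (value, -1))
    found = []
    while i < len(index) and index[i][0] == value:
        found.append(rows[index[i][1]])
        i += 1
    return found

last_name_index = build_index(phone_book, "last_name")
zawodny_rows = index_lookup(last_name_index, phone_book, "Zawodny")
```

The table itself never has to be sorted; only the small (value, position) list is kept in order.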

Partial indexes (partial index):

An index trades space for performance. But sometimes you would rather not trade too much space for the performance you want. Fortunately, MySQL gives you a lot of control over how much space an index uses. Suppose you have a phone_book table with 2 billion rows in it. Adding an index on last_name will require a lot of space. If the average last_name is 8 bytes long, the data portion of the index needs about 16 GB of space. The row pointers are there no matter what you do, and they add another 4-8 bytes per record [2].

Instead of indexing the entire last name, you can index only its first four bytes:

ALTER TABLE phone_book ADD INDEX (last_name (4))

When you do this, you reduce the space needed for the data portion of the index by about half. The trade-off is that MySQL cannot eliminate as many rows using this index. For example, a query like this:
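The size estimate can be checked with simple arithmetic. The sketch below assumes 8 bytes per last name and 4 to 8 bytes per row pointer, and uses decimal gigabytes; the helper index_size_bytes is hypothetical.

```python
ROWS = 2_000_000_000      # rows in the phone_book table
AVG_NAME_BYTES = 8        # assumed average length of last_name
GB = 10**9                # decimal gigabytes

def index_size_bytes(rows, key_bytes, pointer_bytes):
    """Approximate index size: one key plus one row pointer per row."""
    return rows * (key_bytes + pointer_bytes)

data_portion = ROWS * AVG_NAME_BYTES                        # keys only
with_small_pointers = index_size_bytes(ROWS, AVG_NAME_BYTES, 4)
with_large_pointers = index_size_bytes(ROWS, AVG_NAME_BYTES, 8)
```

Under these assumptions the keys alone take 16 GB, and the row pointers bring the total to somewhere between 24 GB and 32 GB.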

SELECT * FROM phone_book WHERE last_name = 'Smith'

retrieves all rows whose last_name begins with Smit, including names such as Smith and Smitty. The query must then discard Smitty and any other rows you do not want.
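The extra filtering step that a partial index forces can be sketched as follows. This is a simplified model that works on characters rather than bytes; the sample names and the helpers build_prefix_index and prefix_lookup are invented for the illustration.

```python
from bisect import bisect_left

PREFIX_LEN = 4

def build_prefix_index(names, prefix_len=PREFIX_LEN):
    """Index only the first prefix_len characters of each last name."""
    return sorted((name[:prefix_len], pos) for pos, name in enumerate(names))

def prefix_lookup(index, names, target, prefix_len=PREFIX_LEN):
    """Return (number of candidate rows, positions of exact matches)."""
    key = target[:prefix_len]
    i = bisect_left(index, (key, -1))
    candidates = []
    while i < len(index) and index[i][0] == key:
        candidates.append(index[i][1])
        i += 1
    # The index cannot tell Smith from Smitty, so re-check the full value.
    matches = [pos for pos in candidates if names[pos] == target]
    return len(candidates), matches

# Hypothetical sample data.
last_names = ["Smith", "Smitty", "Jones", "Smithson", "Smith"]
prefix_index = build_prefix_index(last_names)
candidate_count, matches = prefix_lookup(prefix_index, last_names, "Smith")
```

Here the index narrows five rows down to four candidates, and the full comparison then discards two of them. A shorter prefix saves more space but leaves more rows to re-check.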

Multicolumn indexes (multiple columns):

Like many relational database engines, MySQL allows you to create indexes that consist of multiple columns.

ALTER TABLE phone_book ADD INDEX (last_name, first_name)

Such an index can improve query speed if you often query all of the columns together in the WHERE clause, or if a single column does not have enough variety in its values. Of course, you can also use partial indexes to reduce the space required:

ALTER TABLE phone_book ADD INDEX (last_name (4), first_name (4))

In either case, a query to find Josh Woodward executes quickly:

SELECT * FROM phone_book WHERE last_name = 'Woodward' AND first_name = 'Josh'

Having last_name and first_name indexed together means that MySQL can eliminate rows based on both fields, which greatly reduces the number of rows it must consider. After all, there are far more people in the phone book whose last name starts with Wood than people whose last name starts with Wood and whose first name starts with Josh.

When discussing multicolumn indexes, you may see the individual indexed columns referred to as key parts or "parts of the key". A multicolumn index is also called a composite index or a compound index.

So why not just create two indexes, one on last_name and the other on first_name? You can do that, but MySQL won't use them both at the same time. In fact, MySQL uses only one index per table per query, except for UNIONs [3]. This fact is important enough to repeat: MySQL will only ever use one index per table per query. (This describes the MySQL versions the chapter covers; MySQL 5.0 and later can combine several indexes on one table in some cases through the index merge optimization.)

With separate indexes on first_name and last_name, MySQL chooses one or the other. It does so by making an educated guess about which index will match fewer rows. We call it an educated guess because MySQL keeps track of some index statistics that let it infer what the data looks like. These statistics are, of course, generalizations. Although they often let MySQL make smart decisions, if you have very clumpy data, MySQL may make a suboptimal choice about index use. We call data clumpy if the indexed key is sparse in some areas (such as names that start with X) and highly concentrated in others (such as the name Smith in English-speaking countries). This is an important point, which we will return to later in the book.
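One way to picture a multicolumn index is as a list of entries sorted by last name first and first name second, so that a search on both columns lands on a much narrower range than a search on the last name alone. The sketch below assumes that simplified model; the sample people and the lookup helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical sample data: (last_name, first_name) per row.
people = [
    ("Woodward", "Josh"),
    ("Woodward", "Amy"),
    ("Woods", "Josh"),
    ("Zawodny", "Josh"),
    ("Woodward", "Zed"),
]

# Composite index: entries sorted by last_name, then first_name.
composite = sorted((last, first, pos) for pos, (last, first) in enumerate(people))

def lookup(index, last, first=None):
    """Return row positions matching last_name and, optionally, first_name."""
    key = (last,) if first is None else (last, first)
    i = bisect_left(index, key)
    hits = []
    while i < len(index) and index[i][:len(key)] == key:
        hits.append(index[i][2])
        i += 1
    return hits

by_last_name = lookup(composite, "Woodward")
by_both = lookup(composite, "Woodward", "Josh")
```

Searching on the last name alone leaves three candidate rows, while searching on both key parts leaves one.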
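The following toy model shows how a choice based on averaged statistics can go wrong with clumpy data. It is not MySQL's optimizer: the "average rows per distinct value" estimate, the sample data, and the helpers estimated_rows and choose_index are all assumptions made for the illustration.

```python
def estimated_rows(values):
    """Average rows per distinct value: a crude stand-in for index statistics."""
    return len(values) / len(set(values))

def choose_index(columns):
    """Pick the column whose index is estimated to match the fewest rows."""
    return min(columns, key=lambda name: estimated_rows(columns[name]))

# Hypothetical clumpy data: Smith dominates the last names.
last_names = ["Smith"] * 6 + ["Xu", "Xavier", "Xie", "Young"]
first_names = ["Ann", "Bob", "Cy", "Dee", "Ann", "Bob", "Cy", "Dee", "Ann", "Bob"]

chosen = choose_index({"last_name": last_names, "first_name": first_names})
rows_via_last_name = last_names.count("Smith")    # rows actually read for 'Smith'
rows_via_first_name = first_names.count("Ann")    # rows actually read for 'Ann'
```

On average the last_name index looks more selective (2.0 rows per value against 2.5), so it is chosen, yet for a search on Smith and Ann it reads six rows where the first_name index would have read three.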

Index order (order of indexes):

How does MySQL order the values in an index? If you have used another RDBMS, you might expect MySQL to have syntax for specifying that an index be sorted in ascending, descending, or some other order. MySQL gives you no control over its internal sorting of index values, and it has little reason to. As of version 4.0, it does a good job of optimizing the cases that cause slower performance in other database systems.

For example, some databases might execute this query quickly:

SELECT * FROM phone_book WHERE last_name = 'Zawodny' ORDER BY first_name DESC

And this query slowly:

SELECT * FROM phone_book WHERE last_name = 'Zawodny' ORDER BY first_name ASC

Why? Because some databases store indexes in descending order and are optimized for reading them in that order. In the first example, the database uses a multicolumn index to locate all the matching records, and because they are already stored in descending order, there is no need to sort them. In the second example, the server finds all the matching records and then performs a second pass over those rows to sort them.

MySQL is smart enough to traverse the index backwards when necessary. It executes both queries very quickly, and in neither case does it need to sort the records.
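Reading an index backwards can be pictured with a sorted list: the same range of entries yields ascending order when walked forwards and descending order when walked in reverse, with no separate sort. This is a simplified model, and the sample entries and the first_names helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical (last_name, first_name) index entries, kept in sorted order.
entries = sorted([
    ("Zawodny", "Mary"),
    ("Adams", "Al"),
    ("Zawodny", "Ann"),
    ("Zawodny", "Jeremy"),
])

def first_names(index, last_name, descending=False):
    """Walk the matching index range forwards or backwards; never sort."""
    start = bisect_left(index, (last_name,))
    end = start
    while end < len(index) and index[end][0] == last_name:
        end += 1
    matching = index[start:end]
    walk = reversed(matching) if descending else iter(matching)
    return [first for _, first in walk]

ascending = first_names(entries, "Zawodny")
descending = first_names(entries, "Zawodny", descending=True)
```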

Indexes as constraints (index as constraint):

Indexes are not always used to locate matching rows for a query. A unique index specifies that a particular value may appear only once in a given column [4]. In the phone book example, you can create a unique index on the phone number to ensure that each phone number appears only once [5]:

ALTER TABLE phone_book ADD UNIQUE (phone_number)

A unique index serves a dual purpose. It works just like any other index when you run a query based on a phone number:

SELECT * FROM phone_book WHERE phone_number = '555-7271'

However, it also checks every value when a record is inserted or updated, to make sure that the value does not already exist. In this way, a unique index acts as a constraint.

Unique indexes use as much space as non-unique indexes. The value of every column is stored along with the record's location. This can be a waste if you use the unique index as a constraint and never as an index. In other words, you may rely on the unique index to enforce uniqueness but never write a query that uses the unique key. In that case, MySQL has no need to store the location of every record in the index, because you will never use them.

Unfortunately, there is no way to signal your intentions to MySQL. In the future, a feature may be introduced specifically for this case. The MyISAM storage engine already supports unique columns without an index (it uses a hash-based system), but the mechanism is not yet exposed at the SQL level.
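The dual role of a unique index, as a lookup structure and as a constraint, can be sketched with a small class. This is a simplified model backed by a dictionary rather than a sorted structure; the class name UniqueIndex and its methods are hypothetical.

```python
class UniqueIndex:
    """Toy unique index: maps each value to the position of its one row."""

    def __init__(self):
        self._positions = {}

    def insert(self, value, position):
        # Constraint role: reject a value that is already present.
        if value in self._positions:
            raise ValueError(f"duplicate entry {value!r}")
        self._positions[value] = position

    def find(self, value):
        # Index role: locate the row without scanning the table.
        return self._positions.get(value)

phone_number_index = UniqueIndex()
phone_number_index.insert("555-7271", 0)
```

A second insert of "555-7271" would raise ValueError, which is the constraint at work, while find("555-7271") returns the row position as any other index would.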

Clustered and secondary indexes (Cluster and Secondary Index):

With MyISAM tables, the indexes are kept in a completely separate file that contains a list of primary (and possibly secondary) keys, each with a value that represents the byte offset of the record. This lets MySQL find the key and then quickly jump to that point in the data to locate the record. MySQL has to store indexes this way because the records are stored in essentially random order.

With clustered indexes, the primary key and the record itself are "clustered" together, and the records are stored in primary key order. InnoDB uses clustered indexes. In the Oracle world, clustered indexes are known as index-organized tables, which may help you remember the relationship between the primary key and the row order.

When your data is almost always searched by its primary key, a clustered index can make lookups incredibly fast. With a standard MyISAM index, there are two lookups: one in the index, and a second in the table itself at the location the index specifies. With a clustered index, a single lookup leads directly to the record in question.

Some operations make clustered indexes less effective. Consider what happens when a secondary index is in use. Going back to the phone book example, suppose last_name is the primary index and phone_number is a secondary index, and you run the following query:

SELECT * FROM phone_book WHERE phone_number = '555-7271'

MySQL scans the phone_number index to find the entry for 555-7271, which contains the primary key value Zawodny, because the primary index of phone_book is the last name. MySQL then jumps to the relevant entry in the table itself.

In other words, lookups based on the primary key are extremely fast, and lookups based on a secondary index run at essentially the same speed as MyISAM index lookups.

But under the right (or rather, the wrong) circumstances, a clustered index can actually degrade performance. When you use one together with a secondary index, you must consider the combined effect on storage. A secondary index points to the primary key rather than to the row. So if you index on a very large value and also have several secondary indexes, you end up with many duplicate copies of the primary key: first in the clustered index stored alongside the records, and then once more in every secondary index that points to it. With a small value as the primary key this may not be so bad, but if you use something potentially long, such as a URL, the repeated storage of the primary key on disk can cause storage problems.

Another less common but equally problematic situation arises when the data is altered in such a way that the primary key of a record changes. This is the most expensive operation on a clustered index. Several things must happen, each of which adds to the performance cost:
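The difference between the two access paths can be sketched with dictionaries standing in for the storage structures. This is a simplified model, not InnoDB's actual layout; the sample rows and the helpers lookup_by_primary_key and lookup_by_phone_number are hypothetical.

```python
# Clustered table: rows stored with (and found by) the primary key last_name.
clustered = {
    "Adams": {"last_name": "Adams", "phone_number": "555-0199"},
    "Zawodny": {"last_name": "Zawodny", "phone_number": "555-7271"},
}

# Secondary index: phone_number maps to a primary key value, not a row location.
phone_number_index = {row["phone_number"]: pk for pk, row in clustered.items()}

def lookup_by_primary_key(last_name):
    """One lookup: the clustered index leads straight to the row."""
    return clustered[last_name], 1

def lookup_by_phone_number(phone_number):
    """Two lookups: secondary index to primary key, then primary key to row."""
    primary_key = phone_number_index[phone_number]
    return clustered[primary_key], 2

row, lookups = lookup_by_phone_number("555-7271")
```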
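A back-of-the-envelope calculation shows how quickly the duplicated primary key adds up. All of the figures below (one million rows, three secondary indexes, a 4-byte integer key against a 100-byte URL key) are assumed for illustration, and the helper primary_key_bytes is hypothetical.

```python
def primary_key_bytes(rows, key_bytes, secondary_indexes):
    """Bytes spent on primary key values: one copy in the clustered index
    plus one copy in each secondary index, for every row."""
    return rows * key_bytes * (1 + secondary_indexes)

ROWS = 1_000_000
SECONDARY_INDEXES = 3

small_key_total = primary_key_bytes(ROWS, 4, SECONDARY_INDEXES)    # integer key
large_key_total = primary_key_bytes(ROWS, 100, SECONDARY_INDEXES)  # URL-sized key
```

Under these assumptions the integer key costs 16 MB in total, while the URL-sized key costs 400 MB for the same table.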

Alter the record in question according to the query that was issued.

Determine the new primary key for that record, based on the altered data.

Move the record in question to its proper location in the tablespace.

Update all secondary indexes that point to that primary key.

As you might imagine, if you alter the primary key of many records, the UPDATE command can take a considerable amount of time, especially on larger tables. Choose your primary keys wisely. Use values that are unlikely to change, such as a Social Security number instead of a last name, or a serial number instead of a product name.
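The steps above can be sketched on a toy clustered table. The model keeps (last_name, phone_number) records in a list sorted by primary key and a dictionary as the secondary index; the sample data and the helper change_primary_key are hypothetical, and a real storage engine does far more work per step.

```python
from bisect import bisect_left, insort

def change_primary_key(table, secondary, old_key, new_key):
    """table: (last_name, phone_number) tuples sorted by last_name.
    secondary: dict mapping phone_number to last_name."""
    # Step 1: locate the record to alter.
    i = bisect_left(table, (old_key,))
    if i == len(table) or table[i][0] != old_key:
        raise KeyError(old_key)
    _, phone_number = table[i]
    # Step 2: the altered record carries a new primary key.
    new_record = (new_key, phone_number)
    # Step 3: move the record to its proper place in primary key order.
    del table[i]
    insort(table, new_record)
    # Step 4: repoint the secondary index entry at the new primary key.
    secondary[phone_number] = new_key

table = [("Adams", "555-0199"), ("Smith", "555-0101"), ("Zawodny", "555-7271")]
secondary = {phone: name for name, phone in table}
change_primary_key(table, secondary, "Zawodny", "Baker")
```

Even in this toy version, one changed value forces a delete, a re-insert at a different position, and a secondary index update.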

Unique indexes versus primary keys (unique index and primary key):

If you have used other relational databases, you may wonder what the difference is between a primary key and a unique index in MySQL. As usual, it depends. In MyISAM tables, there is almost no difference. The only special thing about a primary key is that it cannot contain NULL values. The primary key is simply a NOT NULL UNIQUE INDEX named PRIMARY. MyISAM tables do not require you to declare a primary key.

InnoDB and BDB tables require a primary key for every table, but you are not required to specify one. If you do not, the storage engine automatically adds a hidden primary key for you. In both cases, the hidden primary key is simply an incrementing numeric value, similar to an auto-increment column. If you decide to add your own primary key later, just use ALTER TABLE to add one. Both storage engines will discard their internally generated keys in favor of yours. Heap tables do not require a primary key but will create one for you. In fact, you can create a Heap table with no indexes at all.

Indexing NULLs:

It is easy to forget that SQL uses three-valued logic when performing logical operations. Unless a column is declared NOT NULL, a logical comparison has three possible outcomes. The comparison may be true because the values are equal, it may be false because the values are not equal, or it may not match because one of the values is NULL. Whenever one of the values being compared is NULL, the result is also NULL.

Programmers often think of NULL as meaning undefined or unknown. It is a way of telling the database server that "an unknown value goes here". So how do NULL values affect indexes?

NULL values may be used in normal (that is, non-unique) indexes, as is the case with all database servers. However, unlike many database servers, MySQL also allows NULL values in unique indexes [6]. You can store as many NULL values as you want in a unique index. This may seem counterintuitive, but that is the nature of NULL. Because NULL represents an undefined value, MySQL would have to assert that all NULL values are the same in order to allow only a single one in a unique index.

To make things a bit more interesting, a NULL value may not appear in a primary key at all. Why? Because the SQL standard dictates this behavior. It is one of the few ways in which a primary key differs from a unique index in MySQL. And in case you are wondering, allowing NULL values in an index does not affect performance.
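Three-valued logic can be mimicked in Python by letting None stand in for NULL. The helpers sql_equals and where below are hypothetical and only model equality comparison and row filtering.

```python
def sql_equals(a, b):
    """SQL-style equality: True, False, or None (standing in for NULL)."""
    if a is None or b is None:
        return None
    return a == b

def where(rows, predicate):
    """Keep rows whose predicate is True; False and NULL are both filtered out."""
    return [row for row in rows if predicate(row) is True]

kept = where([1, None, 2, 1], lambda value: sql_equals(value, 1))
```

Note that sql_equals(None, None) is also None: even two NULLs do not compare as equal.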
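Why a unique index can hold many NULLs follows from the comparison rule: a NULL never equals anything, so it can never count as a duplicate. The sketch below uses None for NULL; the helper violates_unique is hypothetical and only models the duplicate check.

```python
def violates_unique(existing_values, new_value):
    """Return True if inserting new_value would break uniqueness."""
    # NULL (None) never compares equal to anything, including another NULL,
    # so it can never be a duplicate.
    if new_value is None:
        return False
    return new_value in existing_values

stored = ["555-7271", None, None]
second_null_rejected = violates_unique(stored, None)
duplicate_rejected = violates_unique(stored, "555-7271")
```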

[1] This is a slight simplification. The index does not store only the locations of the matching records, as we will soon see.

[2] This is also an oversimplification. MySQL uses some strategies to reduce the size of the index, but they come at a cost.

[3] In a UNION, each logical query is run separately and the results are then merged.

[4] Except for NULL, of course. NULL is always a special case.

[5] In the real world, however, this may be a poor example, as anyone who has shared a phone with several roommates can tell you.

[6] MySQL version 3.23 and earlier do not support this feature. Version 4.0 and later do.

The original text is from: https://www.safaribooksonline.com/library/view/high-performance-mysql/0596003064/ch04.html#ftn.hpmysql-CHP-4-FNOTE-4

Another question:

1. Given a multicolumn index such as ALTER TABLE phone_book ADD INDEX (last_name (4), first_name (4)), can the index be used for a fast search in each of the following cases?

(a) SELECT * FROM phone_book WHERE last_name = 'Woodward' AND first_name = 'Josh'

Yes. This query can use the multicolumn index for a fast lookup.

(b) SELECT * FROM phone_book WHERE last_name = 'Woodward'

Yes. This query can also use the multicolumn index, because last_name is its leading column.

(c) A query that filters only on first_name cannot use this multicolumn index for a fast lookup, because the index is ordered by last_name first and finds a first_name only within a given last_name.

2. How does a database index work internally? This question is more complicated. So far I have only a general idea, and I need more time to study it.
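The three cases in question 1 follow one rule of thumb: an equality search can use only the leading columns of a multicolumn index, in order. The helper usable_key_parts below is a hypothetical sketch of that rule, not the logic of MySQL's optimizer.

```python
def usable_key_parts(index_columns, equality_columns):
    """Count the leading index columns covered by equality conditions."""
    used = 0
    for column in index_columns:
        if column not in equality_columns:
            break
        used += 1
    return used

INDEX = ("last_name", "first_name")

case_a = usable_key_parts(INDEX, {"last_name", "first_name"})
case_b = usable_key_parts(INDEX, {"last_name"})
case_c = usable_key_parts(INDEX, {"first_name"})
```

A result of zero, as in case (c), means the index gives no help and the table has to be scanned.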
