SQL Server index principles and points to note when creating indexes

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how SQL Server indexes work and what to watch for when creating them.

In a clustered index, the data is stored in key order, and the data pages themselves form the leaf level of the index. It is like a reference manual that arranges all its topics in order: once you find the entry you are looking for, the search is complete. A nonclustered index, by contrast, is entirely separate from the structure of the data itself. The key is found in the index, and the actual data is then located through a pointer.

Indexes in SQL Server store their information in standard B-trees. A B-tree provides fast access to data by searching for a key in the index, and it groups records with similar keys together. The B stands for balanced, not binary, and keeping the tree balanced is one of a B-tree's core functions. A lookup traverses the tree from the root down to locate a record. Because the tree is balanced, every leaf is at the same depth from the root, so finding any record takes about the same amount of work and the same time. The number of intermediate levels in the index depends on the number of rows in the table. If you build the index on a longer key, fewer entries fit on a page, so the index needs more pages (and possibly more levels). The more pages there are, the longer it takes to find the information you need, and the less useful the index may be.
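
To make the effect of key length concrete, here is a rough sketch of how the number of B-tree levels grows as the key gets wider. It is a simplified model, not SQL Server's exact storage layout: the 8096 usable bytes per page reflects the 8 KB page minus its header, while the per-entry overhead of 11 bytes and the function name estimate_index_levels are assumptions made purely for illustration.

```python
def estimate_index_levels(row_count, key_bytes, page_bytes=8096, entry_overhead=11):
    """Roughly estimate B-tree levels for an index (illustrative model only).

    page_bytes: usable bytes per 8 KB page.
    entry_overhead: assumed per-entry overhead in bytes (hypothetical value).
    """
    entries_per_page = page_bytes // (key_bytes + entry_overhead)
    if entries_per_page < 2:
        raise ValueError("key too wide for this simplified model")

    # Number of leaf pages needed, using ceiling division.
    pages = -(-row_count // entries_per_page)
    levels = 1
    # Each upper level holds one entry per page of the level below it.
    while pages > 1:
        pages = -(-pages // entries_per_page)
        levels += 1
    return levels


if __name__ == "__main__":
    for key_bytes in (4, 16, 200):
        levels = estimate_index_levels(10_000_000, key_bytes)
        print(f"key of {key_bytes} bytes: about {levels} levels")
```

Under these assumptions, a 4-byte key over ten million rows needs fewer levels than a 200-byte key, which is the point the paragraph above makes about long keys.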

Clustered index

The leaf level of a clustered index contains not only the index keys but also the data pages. Put another way, the data itself is part of the clustered index, which keeps the table's data in order by key value. The data pages of the table are maintained in a doubly linked list called the page chain. Because the actual data pages can be sorted in only one way, a table can have only one clustered index.

There is a common myth here: many documents that introduce SQL Server indexes tell readers that a clustered index physically stores data in sorted order. That is misleading if "physical storage" is taken to mean the disk itself. Imagine how expensive any modification would be if the clustered index had to keep the data on the actual disk in a particular order: whenever a page filled up and had to be split in two, the data on all subsequent pages would have to be moved back. Sorted order in a clustered index simply means that the data page chain is logically ordered.
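
The difference between logical and physical order can be sketched with a tiny doubly linked page chain. This is only a toy model: the Page class, the page ids, and the split rule (move the upper half of the keys to a newly allocated page) are assumptions for illustration, not SQL Server's actual page-split algorithm.

```python
class Page:
    """One data page: a physical page id plus links to its logical neighbours."""

    def __init__(self, page_id, keys):
        self.page_id = page_id
        self.keys = keys
        self.prev = None
        self.next = None


def link(left, right):
    """Connect two pages in the doubly linked page chain."""
    left.next = right
    right.prev = left


def split_page(page, new_page_id):
    """Move the upper half of the keys to a new page and relink the chain.

    No other page is touched: only pointers change.
    """
    mid = len(page.keys) // 2
    new_page = Page(new_page_id, page.keys[mid:])
    page.keys = page.keys[:mid]
    new_page.prev = page
    new_page.next = page.next
    if page.next is not None:
        page.next.prev = new_page
    page.next = new_page
    return new_page


def walk_chain(first_page):
    """Follow the chain in logical order, returning (page_id, keys) pairs."""
    result = []
    page = first_page
    while page is not None:
        result.append((page.page_id, list(page.keys)))
        page = page.next
    return result


p1 = Page(1, [10, 20, 30, 40])
p2 = Page(2, [50, 60])
link(p1, p2)
split_page(p1, new_page_id=3)  # page 3 is allocated after page 2 physically
chain = walk_chain(p1)
print(chain)  # [(1, [10, 20]), (3, [30, 40]), (2, [50, 60])]
```

After the split, the page ids along the chain read 1, 3, 2, so the pages are no longer in physical order, yet the keys are still returned in sorted order by following the links.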

Most tables should have a clustered index. The optimizer strongly favors clustered indexes because the data can be found directly at the leaf level. Because the logical order of the data is defined, a clustered index handles range queries quickly: the query optimizer can determine that only a certain range of data pages needs to be scanned.

Nonclustered index

For nonclustered indexes, the leaf level does not contain all the data. In addition to the key value, each index row at the leaf level (the bottom of the tree) contains a bookmark that tells SQL Server where to find the data row corresponding to the index key. A bookmark takes one of two forms. If the table has a clustered index, the bookmark is the clustered index key of the corresponding data row. If the table is a heap (it has no clustered index), the bookmark is a row identifier (RID) that locates the actual row in the format "file number:page number:slot number".
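
The two bookmark forms can be pictured with plain dictionaries. This is a conceptual sketch only: the sample rows, the column names id and name, and the lookup helper are hypothetical, and real index pages are of course not Python dicts.

```python
# Heap: rows are addressed by a RID of the form (file, page, slot).
heap_rows = {
    (1, 100, 0): {"id": 7, "name": "Ann"},
    (1, 100, 1): {"id": 3, "name": "Bob"},
}
# Nonclustered index on name over the heap: key -> RID bookmark.
nc_index_on_heap = {"Ann": (1, 100, 0), "Bob": (1, 100, 1)}

# Clustered table: rows are addressed by the clustered index key (id).
clustered_rows = {
    3: {"id": 3, "name": "Bob"},
    7: {"id": 7, "name": "Ann"},
}
# Nonclustered index on name over the clustered table: key -> clustered key.
nc_index_on_clustered = {"Ann": 7, "Bob": 3}


def lookup(nc_index, base_rows, name):
    """Find the bookmark in the nonclustered index, then fetch the row with it."""
    bookmark = nc_index[name]
    return bookmark, base_rows[bookmark]


print(lookup(nc_index_on_heap, heap_rows, "Bob"))
print(lookup(nc_index_on_clustered, clustered_rows, "Bob"))
```

In both cases the lookup takes two steps, index first and then base data; only the kind of bookmark differs.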

Primary key (PRIMARY KEY) and clustered index (CLUSTERED INDEX)

Strictly speaking, a primary key has nothing to do with a clustered index. The only connection is that when a table has no clustered index, a primary key is created as the clustered index by default (unless it is explicitly declared NONCLUSTERED). When working with primary keys and clustered indexes, keep the following in mind: (1) do not separate the primary key from the clustered index; (2) where possible, avoid data types other than int for clustered index key columns; (3) where possible, avoid composite primary keys.

Considerations when creating an index

1. Always include clustered indexes

When a table has no clustered index, its data is unordered, which reduces the efficiency of data retrieval. Even though a nonclustered index narrows the range of data to retrieve, the rows themselves are unordered, so fetching the actual data from the table requires frequent repositioning. As a result, SQL Server will rarely use the nonclustered indexes on such a table to retrieve data.

2. Ensure that the clustered index is unique

Because the clustered index key is the row locator in every nonclustered index, a non-unique clustered key forces SQL Server to add auxiliary data (a uniquifier) to the row locator. That auxiliary data must also be used to locate the row when data is fetched from the table, which reduces processing efficiency.

3. Keep the clustered index key as small as possible

Every clustered key value is stored in the leaf records of all nonclustered indexes. The smaller the key, the more useful data each nonclustered index leaf page holds, which improves index efficiency.
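
A back-of-the-envelope calculation shows why the clustered key size matters. The figures below are a simplified sketch: 8096 is the usable space on an 8 KB page, while the 7-byte row overhead and the function name leaf_rows_per_page are assumptions chosen only to illustrate the trend.

```python
def leaf_rows_per_page(nc_key_bytes, clustered_key_bytes, page_bytes=8096, row_overhead=7):
    """Approximate nonclustered leaf rows per page (illustrative model only).

    Each leaf row carries the nonclustered key plus the clustered key as its
    row locator; row_overhead is an assumed per-row cost in bytes.
    """
    return page_bytes // (nc_key_bytes + clustered_key_bytes + row_overhead)


if __name__ == "__main__":
    # Same 8-byte nonclustered key, different clustered key widths.
    for label, clustered_bytes in (("4-byte int", 4), ("16-byte GUID", 16), ("50-byte string", 50)):
        rows = leaf_rows_per_page(8, clustered_bytes)
        print(f"clustered key {label}: about {rows} leaf rows per page")
```

In this model, moving from a 4-byte to a 16-byte clustered key cuts the rows per nonclustered leaf page by more than a third, and that cost is paid again in every nonclustered index on the table.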

4. Covering index

A covering index is one whose columns include every column involved in a given piece of data processing. It is effectively a subset of the original table, and because that subset contains all the columns the processing needs, operating on the index alone is enough to satisfy the query. In general, if most processing touches only certain columns of a large table, consider a covering index on those columns. To build one, use the columns that are searched on as the index key columns and add the remaining columns as included columns (with the INCLUDE clause of the index creation statement).
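
Whether an index covers a query comes down to a simple set check, sketched below. The function name is_covering and the sample column names are hypothetical, and the check deliberately ignores everything else the optimizer considers when choosing an index.

```python
def is_covering(key_columns, included_columns, query_columns):
    """Return True if every column the query touches is present in the index."""
    available = set(key_columns) | set(included_columns)
    return set(query_columns) <= available


# Hypothetical index: key column customer_id, included columns order_date and total.
key_cols = ["customer_id"]
include_cols = ["order_date", "total"]

# The query filters on customer_id and returns order_date and total: covered.
print(is_covering(key_cols, include_cols, ["customer_id", "order_date", "total"]))  # True

# The query also needs ship_address, which is not in the index: not covered.
print(is_covering(key_cols, include_cols, ["customer_id", "ship_address"]))  # False
```

When the check fails, the missing columns have to be fetched from the base table, which is exactly the extra step a covering index is meant to avoid.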

5. An appropriate number of indexes

When data changes, SQL Server synchronously maintains the data in the related indexes, so too many indexes slow down data modification. Indexes should therefore be built only on frequently used columns.

Keeping the number of indexes appropriate also means controlling how index columns are combined. For example, two columns col1 and col2 can be used in three ways: col1 alone, col2 alone, and col1 and col2 together. One index per case would mean three indexes. However, a single composite index on (col1, col2) can serve queries on col1+col2, on col1, and on col2, in decreasing order of effectiveness. Of these, a query on col2 alone makes only marginal use of the index (and also relies on separate statistics for col2), so decide from the actual workload whether col2 needs its own index.

Special note: do not create duplicate indexes. At present, the most common duplicate is creating both a primary key and a separate clustered index on the same column.

Compared with retrieving data directly from the table, retrieving it through an index adds an index lookup step. That step must narrow the range of data to retrieve as much as possible, in as little time as possible, for the index to genuinely improve retrieval efficiency. To achieve this, the choice of index key columns should follow these principles:

Selectivity principle

Selectivity is the number of records that meet the search condition as a percentage of the total number of records. It should be as low as possible, so that very little data has to be fetched from the underlying table after the index scan. If this ratio is high, the column is not worth indexing.
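
The selectivity ratio described above is easy to compute on sample data. The snippet is a minimal sketch: the selectivity helper and the made-up orders list (980 closed orders, 20 open ones) are assumptions for illustration only.

```python
def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate (lower is better for indexing)."""
    matching = sum(1 for row in rows if predicate(row))
    return matching / len(rows)


# Hypothetical sample: 980 closed orders and 20 open ones.
orders = [{"status": "Closed"}] * 980 + [{"status": "Open"}] * 20

open_ratio = selectivity(orders, lambda row: row["status"] == "Open")
closed_ratio = selectivity(orders, lambda row: row["status"] == "Closed")
print(f"status = 'Open':   {open_ratio:.2%}")
print(f"status = 'Closed': {closed_ratio:.2%}")
```

With this data, a search for open orders touches 2% of the rows and benefits from an index, while a search for closed orders touches 98% of them and would be served just as well by scanning the table.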

Data density principle

Data density is the number of records with unique column values as a percentage of the total number of records. The higher this ratio, the better suited the column is to indexing. When considering data density, also pay attention to how the data is distributed: an index is worthwhile only when the density of the frequently retrieved data is high. For example, suppose a table has 100,000 records and a column has 90,000 non-duplicated values. If the 20,000 records that are retrieved most often contain only a few dozen distinct values in that column, the column is still a poor candidate for an index.

The opposite case also occurs: the overall data density is low, but the density of the frequently retrieved data is high. Order status is a typical example. There are usually only a handful of status values, and closed orders often make up the vast majority of the data, yet processing almost always retrieves orders that are not closed. In this case an index on the order status column is effective (in SQL Server 2008 and later, a filtered index is an even better choice for this column).
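
The first example above, a column that looks dense overall but is not dense in the rows that are actually retrieved, can be reproduced at a small scale. This sketch uses the article's definition of density (distinct values divided by total rows); the density helper and the synthetic values are assumptions for illustration.

```python
def density(values):
    """Distinct values divided by total values, per the definition used above."""
    return len(set(values)) / len(values)


# 20 frequently retrieved rows that share only 3 distinct values...
hot_values = [f"H{i % 3}" for i in range(20)]
# ...and 80 rarely retrieved rows that are all unique.
cold_values = [f"C{i}" for i in range(80)]

overall = density(hot_values + cold_values)
hot_only = density(hot_values)
print(f"overall density: {overall:.2f}")
print(f"density of frequently retrieved rows: {hot_only:.2f}")
```

The overall figure suggests a good index candidate, but the density of the rows that queries actually ask for is far lower, so both numbers are worth measuring before deciding.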

6. Index key column size

It is generally not advisable to index columns wider than 100 bytes.

7. Key column order of composite index

The sort order of an index is determined by its key columns, taken in order. For a composite index, the column order is therefore very important: columns with high data density and good selectivity should come first. Columns that take little storage space should also be placed toward the front of the index key.
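
Why column order matters can be seen by modelling a composite index as a sorted list of (col1, col2) tuples. This is a toy sketch under stated assumptions: the data is synthetic, the helper names are hypothetical, and a binary search over a Python list only stands in for a B-tree seek.

```python
from bisect import bisect_left, bisect_right

# Composite index on (col1, col2): entries are sorted by col1 first, then col2.
index = sorted((c1, c2) for c1 in range(1, 101) for c2 in range(1, 11))


def seek_on_col1(entries, value):
    """Binary-search the leading column; only the matching range is read."""
    lo = bisect_left(entries, (value,))
    hi = bisect_right(entries, (value, float("inf")))
    return entries[lo:hi], hi - lo  # (matches, entries examined)


def scan_on_col2(entries, value):
    """col2 is not the leading column, so every entry has to be checked."""
    matches = [entry for entry in entries if entry[1] == value]
    return matches, len(entries)  # (matches, entries examined)


col1_matches, col1_examined = seek_on_col1(index, 42)
col2_matches, col2_examined = scan_on_col2(index, 7)
print(f"col1 = 42: {len(col1_matches)} matches, {col1_examined} entries examined")
print(f"col2 = 7:  {len(col2_matches)} matches, {col2_examined} entries examined")
```

A filter on the leading column reads only the entries it returns, while a filter on the second column alone has to walk the whole index, which is why the most selective, most frequently searched column belongs first.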
