Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Case Analysis of establishing efficient Index by mysql

2025-01-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

This paper gives an example of how to build an efficient index by mysql. Share with you for your reference, the details are as follows:

How to build an ideal index?

Query frequency differentiation index length overlay field

Distinguishing degree

Assuming 1 million users, the gender is basically 50W for each male / female, and the degree of distinction is low.

Small length

The index length directly affects the size of the index file, the speed of addition, deletion and modification, and indirectly affects the query speed (taking up more memory).

High differentiation and small length

Question: what if the differentiation is high and the length is small?

Answer: you can index the values in the column by intercepting them from left to right.

(1) the shorter the truncation is, the higher the repetition is, and the smaller the discrimination is, the worse the indexing effect is.

(2) the longer the section is, the lower the repetition is, the higher the discrimination is, the better the index effect is, but the greater the influence is-the slow change of additions and deletions, and affects the query speed.

Therefore, we have to strike a balance between differentiation and length. Customary technique: intercept different lengths and test their differentiation.

Suppose we have a table: a CET-4 vocabulary list with 13324 records. How can we index the name field?

If you calculate the degree of differentiation?

Intercept the unrepeated number of the first bit of the word:

Select count (distinct left (name,1)) from dict

Total quantity:

Select count (*) from dict

Degree of differentiation: do not repeat the number / total number, the sql statement is as follows:

Select (select count (distinct left (name,1)) from dict) / (select count (*) from dict) as rate

Then follow these steps to find out the degree of distinction corresponding to other lengths. Looking at this chart, we can see that when the length is 11, the repetition is only 1%. We can consider setting up an 11-bit index.

Alter table dict add index name name (11)

Cases in which the left prefix is difficult to distinguish

Techniques for indexing columns that are indistinguishable from the left prefix

Such as url column

Http://www.baidu.com

Http://www.web-bc.cn

The first 11 characters of the column are all the same and are not easy to distinguish. You can solve them in the following two ways

(1) store the column contents upside down and establish an index

Moc.udiab.www//:ptth

Nc.cb-bew.www//://ptth

In this way, the left prefix is highly differentiated.

(2) pseudo-hash indexing effect

Store both url and url_hash columns

# create table create table T10 (id int primary key,url char (60) not null default''); # insert data insert into T10 values (1meme 'http://www.baidu.com'),(2,'http://www.sina.com'),(3,'http://www.sohu.com.cn'),(4,'http://www.onlinedown.net'),(5,'http://www.gov.cn'); # modify the table structure and add the urlcrc column alter table T10 add urlcrc int unsigned not null

When storing, insert the crc32 code corresponding to url into the database, and then build the index according to the urlcrc field, and then when searching, we convert the corresponding url to crc32 in the business layer for lookup, and you can use the upper index.

Because the result of crc is a 32-bit int unsigned number, when the data exceeds 4 billion, there will be repetition, but it is worth it. (index length is int4 bytes)

Multi-column index

Considerations of multi-column index-query frequency of columns, degree of distinction of columns, be sure to be combined with actual business scenarios

Take ecshop Mall as an example, cat_id,brand_id in goods table does multi-column index. from the perspective of differentiation, brand_ id has a higher degree of differentiation, but from the actual business of the mall, customers generally choose big category-> small category-> brand first, and finally choose to set up two indexes:

(1) index (cat_id,brand_id)

(2) index (cat_id,shop_price)

You can even add (3) index (cat_id,brand_id,shop_price), three redundant indexes

But the first two columns in (3) are the same as the first two columns in (1), so you can remove (1) and build two indexes.

Index (cat_id,price) and index (cat_id,brand_id,shop_price)

Readers who are interested in MySQL-related content can check out this site's special topics: "Summary of MySQL Index Operation skills", "Summary of MySQL Common functions", "Collection of MySQL Log Operation skills", "MySQL transaction Operation skills Summary", "MySQL stored procedure skills Collection" and "MySQL Database Lock related skills Summary"

It is hoped that what is described in this article will be helpful to everyone's MySQL database design.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report