
How to optimize SQL database correctly

2025-04-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

The following article explains how to optimize an SQL database correctly. The material differs slightly from what you will find in books: it was compiled by professional engineers in the course of hands-on work with users, and we hope it proves useful to readers.

To optimize SQL effectively, we need to locate performance bottlenecks quickly, that is, find out where a SQL statement spends most of its time. In most cases the slowest device involved is the bottleneck: when downloading, the network is usually the limiting factor; when copying files locally, the hard disk is. We can identify these everyday bottlenecks quickly because we have a rough idea of the performance of each slow device, for example that the network bandwidth is 2 Mbps or that the disk spins at 7,200 rpm. Likewise, to find SQL performance bottlenecks quickly, we need to understand the basic performance characteristics of a computer system's hardware. The figure below shows performance indicators for current mainstream hardware.

As you can see from the chart, basically every device has two indicators:

Latency (response time): indicates the hardware's ability to handle a sudden burst of work;

Bandwidth (throughput): indicates the hardware's sustained processing capacity.

As the figure shows, ranked from fastest to slowest, computer system hardware is:

CPU--Cache(L1-L2-L3)--Memory--SSD Hard Disk--Network--Hard Disk

Based on database knowledge, we can list the main tasks of each hardware:

CPU and memory: cache data access, comparison, sorting, transaction detection, SQL parsing, functions or logical operations;

Network: result data transfer, SQL request, remote database access (dblink);

Hard disk: data access, data writing, logging, large data sorting, large table connection.

Combining the basic performance characteristics of current computer hardware with each device's main duties in a database, we can derive the basic performance optimization rules shown in the following figure:

These optimization rules fall into five levels:

1. Reduce data access (reduce disk access)

2. Return less data (reduce network transfer or disk access)

3. Reduce the number of interactions (reduce network round trips)

4. Reduce server CPU overhead (reduce CPU and memory overhead)

5. Use more resources (add capacity)

Each level of the rules addresses the performance problems of its corresponding hardware, so the attainable improvement differs from level to level. Traditional database systems were designed to provide as many optimizations as possible for slow devices, so there are more techniques available for the low-speed layers and they tend to be cheaper to apply. When optimizing any SQL, we should diagnose the problem and propose solutions following these rules from top to bottom, rather than reaching first for more hardware.

The table below gives an empirical reference for the performance effect and cost of each optimization level:

Optimization rule             Performance gain    Optimization cost
Reduce data access            1~1000x             low
Return less data              1~100x              low
Reduce interaction times      1~20x               low
Reduce server CPU overhead    1~5x                low
Use more resources            1~10x               high

Next, we list common optimization methods for the five optimization rules:

a: Rationalization of the design of tables (in accordance with 3NF)

b: Optimize SQL statements (indexes)

c: sub-table technology (horizontal division, vertical division), partition technology

d: read/write [write: update/delete/add] separation

e: Stored procedures [modular programming, faster execution]

f: Optimize the MySQL configuration [maximum connections, cache sizes]

g: Upgrade the MySQL server hardware

h: Regularly clear unwanted data and defragment regularly
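Of these methods, read/write separation (d) is the most architectural. Below is a minimal sketch in Python, using two SQLite in-memory databases to stand in for a primary and a replica; the `ReadWriteRouter` class and the statement-prefix routing are illustrative assumptions, not a standard API:

```python
import sqlite3

class ReadWriteRouter:
    """Minimal sketch of read/write separation: writes go to the primary,
    reads go to a replica. Hypothetical class, for illustration only."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def execute(self, sql, params=()):
        # Route by statement type: SELECT -> replica, everything else -> primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = self.replica if is_read else self.primary
        cur = target.execute(sql, params)
        target.commit()
        return cur.fetchall()

# Demo with two in-memory databases standing in for primary/replica.
primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for db in (primary, replica):
    db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
primary.execute("INSERT INTO t VALUES (1, 'x')")
replica.execute("INSERT INTO t VALUES (1, 'x')")  # pretend replication happened

router = ReadWriteRouter(primary, replica)
rows = router.execute("SELECT v FROM t WHERE id = 1")
print(rows)  # [('x',)]
```

A production setup would of course rely on real replication (e.g. MySQL binlog replication) rather than duplicated writes.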

1. Rationalization of table design (in accordance with 3NF)

1NF (first normal form)

1NF is defined as follows (a relational database automatically satisfies 1NF):

1. Each column must have a unique name

2. The order of rows and columns does not matter

3. Each column must have a single data type

4. No two identical rows are allowed

5. Each column must contain a single value (a column cannot hold multiple data values)

6. Columns must not contain repeating groups

Tables that satisfy only first normal form still suffer from update, delete, and insert anomalies.
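As a concrete illustration of rule 5 (a single value per column), the sketch below uses Python's built-in sqlite3 module; the table and column names are invented for the example:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Violates 1NF: one column stores multiple phone numbers.
db.execute("CREATE TABLE person_bad (id INTEGER PRIMARY KEY, phones TEXT)")
db.execute("INSERT INTO person_bad VALUES (1, '555-0100,555-0199')")

# 1NF-compliant: one value per column, one row per phone number.
db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
db.execute("CREATE TABLE phone (person_id INTEGER REFERENCES person(id), number TEXT)")
db.execute("INSERT INTO person VALUES (1)")
for num in "555-0100,555-0199".split(","):
    db.execute("INSERT INTO phone VALUES (1, ?)", (num,))

# Individual numbers can now be queried, indexed, and updated directly.
numbers = [r[0] for r in db.execute(
    "SELECT number FROM phone WHERE person_id = 1 ORDER BY number")]
print(numbers)  # ['555-0100', '555-0199']
```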

2NF (second normal form)

2NF is defined as follows (it is usually satisfied by designing a primary key):

1. It fits the first normal form.

2. Every non-key field must depend on the whole key, not on only part of a composite key

Tables in second normal form can still exhibit update, delete, and insert anomalies.
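The 2NF fix can be sketched with sqlite3: in an order line table keyed by (order_id, product_id), a product name would depend only on product_id, part of the key, so it belongs in a separate product table. All names here are illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# 2NF design: the product name depends only on product_id (part of the
# composite key), so it is moved into its own table and stored once.
db.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
db.execute("""CREATE TABLE order_item (
    order_id INTEGER,
    product_id INTEGER REFERENCES product(product_id),
    qty INTEGER,
    PRIMARY KEY (order_id, product_id))""")
db.execute("INSERT INTO product VALUES (10, 'widget')")
db.execute("INSERT INTO order_item VALUES (1, 10, 3)")
db.execute("INSERT INTO order_item VALUES (2, 10, 5)")

# The name is stored once, however many orders reference the product.
row = db.execute("""SELECT p.name, SUM(oi.qty)
                    FROM order_item oi
                    JOIN product p USING (product_id)
                    GROUP BY p.name""").fetchone()
print(row)  # ('widget', 8)
```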

3NF (third normal form)

3NF is defined as follows:

1. Conform to 2NF

2. No transitive dependencies (i.e., the value of one non-key field depends on the value of another non-key field), no redundant data

Denormalization (relaxing 3NF): a database without redundancy is not necessarily the best database. Sometimes, to improve query efficiency, it is necessary to relax the normal-form standard and deliberately retain redundant data.

Specific practices:

Follow third normal form in the conceptual data model; consider relaxing the normal-form standard in the physical data model. Relaxing a normal form means adding fields and allowing redundancy.
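Both ideas can be sketched with sqlite3: the 3NF design keeps the department name out of the employee table (removing the transitive dependency), while a deliberately denormalized reporting table repeats it for a hot read path. All names are invented for the example:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# 3NF: dept_name depends on dept_id, not on the employee key, so it
# lives in its own table instead of being repeated per employee.
db.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
db.execute("""CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, name TEXT,
    dept_id INTEGER REFERENCES dept(dept_id))""")
db.execute("INSERT INTO dept VALUES (1, 'Sales')")
db.execute("INSERT INTO employee VALUES (100, 'Ann', 1)")

# Controlled denormalization: a reporting table that repeats dept_name
# to avoid the join on a hot read path (it must be kept in sync on writes).
db.execute("""CREATE TABLE employee_report AS
              SELECT e.emp_id, e.name, d.dept_name
              FROM employee e JOIN dept d USING (dept_id)""")
row = db.execute("SELECT name, dept_name FROM employee_report").fetchone()
print(row)  # ('Ann', 'Sales')
```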

2. Optimize SQL statements

(1) Quickly locate slow statements

a Enable slow query logging

b Set the slow query time threshold

c Record slow queries to the log file

d Aggregate and classify the slow log with the mysqldumpslow tool

(2) Analyze SQL statements

a Analyze query execution with EXPLAIN

b Obtain more detailed timing information with profiling
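The steps above refer to MySQL's EXPLAIN and profiling. As a self-contained stand-in, SQLite's EXPLAIN QUERY PLAN (via Python's sqlite3) shows the same kind of information: whether a query searches an index or scans the whole table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
db.executemany("INSERT INTO t VALUES (?, ?)",
               [(i, str(i)) for i in range(1000)])

# EXPLAIN QUERY PLAN plays the role MySQL's EXPLAIN plays in the text:
# a primary-key lookup should report a SEARCH, not a full-table SCAN.
plan = db.execute("EXPLAIN QUERY PLAN SELECT v FROM t WHERE id = 42").fetchall()
detail = plan[0][3]
print(detail)  # e.g. "SEARCH t USING INTEGER PRIMARY KEY (rowid=?)"
```

The exact plan text varies by SQLite version, but the SEARCH/SCAN distinction is stable.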

(3) SQL statement optimization

a Create index (primary index/unique index/full-text index/regular index)

b Avoid Select * (do not query redundant columns and rows)

c In WHERE clauses, avoid NOT, !=, <>, NOT EXISTS, NOT IN, NOT LIKE, and OR; these can prevent the optimizer from using an index and cause full table scans

d Prefer a WHERE clause over HAVING where possible; HAVING filters the result set only after all records have been retrieved and grouped

e Use views (column data that is frequently queried, and which is not frequently modified, deleted)
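Point c can be demonstrated directly. With SQLite standing in for MySQL, an equality predicate on an indexed column is answered with an index search, while a != predicate falls back to a scan; the exact plan text varies by SQLite version, and the table is invented for the example:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
db.execute("CREATE INDEX idx_status ON orders(status)")
db.executemany("INSERT INTO orders(status) VALUES (?)",
               [("open",)] * 50 + [("closed",)] * 50)

def plan(sql):
    # Return the human-readable detail column of the query plan.
    return db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

# Equality predicate: the planner can search the index.
good = plan("SELECT id FROM orders WHERE status = 'open'")
# Negation predicate: the planner falls back to a full scan.
bad = plan("SELECT id FROM orders WHERE status != 'open'")
print(good)
print(bad)
```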

The principle of database indexes is very simple, but very few people can really use indexes correctly in complex tables, and even professional DBAs may not be able to achieve complete optimization.

Indexes significantly increase the cost of DML (INSERT, UPDATE, DELETE) on a table's records. A well-chosen index can improve query performance by a factor of 100 or even 1,000, while an ill-chosen one can degrade performance by a factor of 100. Which indexes to create on a table therefore has to balance the competing needs of the business.

If we think of the contents of a table as a dictionary, then the index is equivalent to the dictionary directory, as shown in the following figure:

The figure shows a dictionary directory by radical + stroke number, which is equivalent to creating a combined index by radical + stroke for the dictionary.

Multiple indexes can be created in a table, just as a dictionary can create multiple directories (by pinyin, stroke, radical, etc.).

An index can also be composed of multiple fields, called a composite index, such as the above figure is a composite directory by radical + stroke.
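The leftmost-prefix behaviour of a composite index can be checked with sqlite3, mirroring the radical + stroke directory: conditions on the leading column can use the index, but a condition on the second column alone cannot. Table and index names are invented:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE word (radical TEXT, strokes INTEGER, ch TEXT)")
# Composite index, like a dictionary directory ordered by radical, then strokes.
db.execute("CREATE INDEX idx_radical_strokes ON word(radical, strokes)")
db.executemany("INSERT INTO word VALUES (?, ?, ?)",
               [(rad, s, f"{rad}{s}")
                for rad in ("water", "fire", "earth")
                for s in range(1, 101)])

def plan(sql):
    return db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

# Leading column(s) of the composite index can be searched...
both = plan("SELECT ch FROM word WHERE radical = 'water' AND strokes = 8")
lead = plan("SELECT ch FROM word WHERE radical = 'water'")
# ...but the second column alone cannot use the index's ordering.
tail = plan("SELECT ch FROM word WHERE strokes = 8")
print(both)
print(lead)
print(tail)
```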

What fields do we index on?

This is a very complex topic that requires thorough analysis of the business and data before results can be obtained. Primary keys and foreign keys usually have indexes, and other fields that need to be indexed should meet the following conditions:

a The field appears in query conditions, and those conditions can use an index;

b The statement is executed frequently, on the order of thousands of times a day or more;

c The field's condition filters the record set down to a very small subset. So what filtering ratio is appropriate?

There is no fixed value; it must be evaluated against the table's data volume. The following empirical rule allows a quick estimate:

Small table (table with less than 10000 rows): filter proportion
