Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Summary of SQL optimization

2025-01-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

SQL is a skill that every Java programmer must return. For all kinds of complex business in the project, whether you can write efficient and concise SQL plays an important role in the efficiency and stability of the project. Through personal understanding and online materials, this paper summarizes the common knowledge of SQL optimization:

(1) choose the most efficient order of table names (valid only in rule-based optimizers):

ORACLE's parser processes the table names in the FROM clause from right to left, and the last table written in the FROM clause (the underlying table driving table) will be processed first. in the case of multiple tables in the FROM clause, you must select the table with the least number of records as the base table. If there are more than 3 table join queries, you need to select an intersection table as the underlying table, which refers to the table referenced by other tables.

(2) the connection order in the WHERE clause.

ORACLE parses the WHERE clause in a bottom-up order. According to this principle, the join between tables must be written before other WHERE conditions, and the conditions that can filter out the maximum number of records must be written at the end of the WHERE clause.

(3) avoid using'*'in the SELECT clause:

In the process of parsing, ORACLE converts'* 'into all column names in turn, which is done by querying the data dictionary, which means it will take more time.

(4) reduce the number of visits to the database:

ORACLE does a lot of work internally: parsing SQL statements, estimating index utilization, binding variables, reading data blocks, etc.

(5) reset the ARRAYSIZE parameter in SQL*Plus, SQL*Forms and Pro*C to increase the amount of retrieval data per database access. The recommended value is 200.

(6) use the DECODE function to reduce processing time:

Use the DECODE function to avoid repeatedly scanning the same records or repeatedly joining the same tables.

(7) simple and unrelated database access:

If you have several simple database queries, you can integrate them into a single query (even if there is no relationship between them).

(8) Delete duplicate records:

Examples of the most efficient ways to delete duplicate records (because of the use of ROWID):

1 DELETE FROM EMP E WHERE E.ROWID > (SELECT MIN (X.ROWID) 2 FROM EMP X WHERE X.EMP_NO = E.EMP_NO)

(9) replace DELETE with TRUNCATE:

When deleting records in a table, the rollback section (rollback segments) is usually used to store information that can be recovered. If you do not have a COMMIT transaction, ORACLE will restore the data to the state it was before the deletion (to be exact, the state before the delete command) and when using TRUNCATE, the rollback segment no longer holds any recoverable information. When the command runs, the data cannot be recovered. As a result, few resources are called and the execution time is very short. (translator's note: TRUNCATE is only applicable for deleting the whole table. TRUNCATE is DDL, not DML).

(10) use COMMIT as much as possible:

Whenever possible, use COMMIT in your program as much as possible, so that the performance of the program is improved and the requirements are reduced because of the resources released by COMMIT:

Resources released by COMMIT:

a. The information used to recover data on the rollback segment.

b. A lock acquired by a program statement

C. Space in redo log buffer

D. ORACLE is used to manage the internal expenses of the above three resources

(11) replace the HAVING clause with Where clause:

Avoid using the HAVING clause, HAVING will filter the result set only after all records have been retrieved. This process requires sorting, totals, and so on. If you can limit the number of records through the WHERE clause, you can reduce the overhead. (in non-oracle) among the three clauses on, where and having that can be conditioned, on is the first to execute, where is the second, and having is the last. Because on first filters the records that do not meet the conditions before counting, it can reduce the data to be processed by intermediate operations, which is supposed to be the fastest, and where should also be faster than having, because it does sum after filtering data, and on is used only when the two tables are joined. So in a table, only where is left to compare with having. In the case of querying statistics in this single table, if the conditions to be filtered do not involve the fields to be calculated, then their results are the same, except that where can use rushmore technology, but having cannot. If the latter is slow in speed, if the calculated field is involved, it means that the value of this field is uncertain before calculation. According to the workflow written in the previous article, the action time of where is completed before calculation. Having works only after calculation, so in this case, the results of the two will be different. On works earlier than where in multi-table join queries. First of all, according to the join conditions between the tables, the system combines several tables into a temporary table, then filters it by where, then calculates it, and then filters it by having after calculation. It can be seen that for the filter condition to work correctly, we must first understand when the condition should work, and then decide where to put it.

(12) reduce queries on tables:

In SQL statements with subqueries, special attention should be paid to reducing queries on tables. Example:

1 SELECT TAB_NAME FROM TABLES WHERE (TAB_NAME,DB_VER) = (SELECT2 TAB_NAME,DB_VER FROM TAB_COLUMNS WHERE VERSION = 604)

(13) improve the efficiency of SQL through internal functions.

Complex SQL often sacrifices executive efficiency. To be able to master the above methods of using functions to solve problems is very meaningful in practical work.

(14) use the alias of the table (Alias):

When joining multiple tables in a SQL statement, use the alias of the table and prefix it on each Column. In this way, parsing time and syntax errors caused by Column ambiguity can be reduced.

(15) replace IN with EXISTS and NOT IN with NOT EXISTS:

In many queries based on underlying tables, it is often necessary to join another table in order to meet one condition. In this case, using EXISTS (or NOT EXISTS) will usually improve the efficiency of the query. In a subquery, the NOT IN clause performs an internal sort and merge. In either case, NOT IN is the least efficient (because it performs a full table traversal on the tables in the subquery). To avoid using NOT IN, we can rewrite it as an external connection (Outer Joins) or NOT EXISTS.

Example:

1 (efficient) SELECT * FROM EMP (base table) WHERE EMPNO > 0 AND EXISTS (SELECT'X' FROM DEPT WHERE DEPT.DEPTNO = EMP.DEPTNO AND LOC = 'MELB') 2 (inefficient) SELECT * FROM EMP (base table) WHERE EMPNO > 0 AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOC =' MELB')

(16) identify the SQL statement of 'inefficient execution':

Although there are a variety of graphical tools for SQL optimization, writing your own SQL tool to solve the problem is always the best way:

1 SELECT EXECUTIONS, DISK_READS, BUFFER_GETS,2 ROUND ((BUFFER_GETS-DISK_READS) / BUFFER_GETS,2) Hit_radio, 3 ROUND (DISK_READS/EXECUTIONS,2) Reads_per_run, 4 SQL_TEXT 5 FROM V$SQLAREA 6 WHERE EXECUTIONS > 07 AND BUFFER_GETS > 0 8 AND (BUFFER_GETS-DISK_READS) / BUFFER_GETS

< 0.8 9 ORDER BY 4 DESC; (17) 用索引提高效率: 索引是表的一个概念部分,用来提高检索数据的效率,ORACLE使用了一个复杂的自平衡B-tree结构. 通常,通过索引查询数据比全表扫描要快. 当ORACLE找出执行查询和Update语句的最佳路径时, ORACLE优化器将使用索引. 同样在联结多个表时使用索引也可以提高效率. 另一个使用索引的好处是,它提供了主键(primary key)的唯一性验证.。那些LONG或LONG RAW数据类型, 你可以索引几乎所有的列. 通常, 在大型表中使用索引特别有效. 当然,你也会发现, 在扫描小表时,使用索引同样能提高效率. 虽然使用索引能得到查询效率的提高,但是我们也必须注意到它的代价. 索引需要空间来存储,也需要定期维护, 每当有记录在表中增减或索引列被修改时, 索引本身也会被修改. 这意味着每条记录的INSERT , DELETE , UPDATE将为此多付出4 , 5 次的磁盘I/O . 因为索引需要额外的存储空间和处理,那些不必要的索引反而会使查询反应时间变慢.。定期的重构索引是有必要的.: ALTER INDEX REBUILD (18) 用EXISTS替换DISTINCT: 当提交一个包含一对多表信息(比如部门表和雇员表)的查询时,避免在SELECT子句中使用DISTINCT. 一般可以考虑用EXIST替换, EXISTS 使查询更为迅速,因为RDBMS核心模块将在子查询的条件一旦满足后,立刻返回结果. 例子: 1 (低效): 2 SELECT DISTINCT DEPT_NO,DEPT_NAME FROM DEPT D , EMP E 3 WHERE D.DEPT_NO = E.DEPT_NO 4 (高效): 5 SELECT DEPT_NO,DEPT_NAME FROM DEPT D WHERE EXISTS ( SELECT 'X' 6 FROM EMP E WHERE E.DEPT_NO = D.DEPT_NO); 1 (19) sql语句用大写的;因为oracle总是先解析sql语句,把小写的字母转换成大写的再执行 2 (20) 在java代码中尽量少用连接符"+"连接字符串! 3 (21) 避免在索引列上使用NOT 通常,  4 我们要避免在索引列上使用NOT, NOT会产生在和在索引列上使用函数相同的影响. 当ORACLE"遇到"NOT,他就会停止使用索引转而执行全表扫描. 5 (22) 避免在索引列上使用计算. 6 WHERE子句中,如果索引列是函数的一部分.优化器将不使用索引而使用全表扫描. 7 举例: 1 低效: 2 SELECT … FROM DEPT WHERE SAL * 12 >

25000; 3 efficient: 4 SELECT... FROM DEPT WHERE SAL > 25000 Universe 12

(23) replace > with > =

1 High efficiency: 2 SELECT * FROM EMP WHERE DEPTNO > = 4 3 low efficiency: 4 SELECT * FROM EMP WHERE DEPTNO > 3

The difference between the two is that the former DBMS will jump directly to the first record with DEPT equal to 4, while the latter will first locate the record of DEPTNO=3 and scan forward to the first record with DEPT greater than 3.

(24) replace OR with UNION (for index columns)

In general, replacing OR in the WHERE clause with UNION will have a better effect. Using OR on index columns will result in a full table scan. Note that the above rules are valid only for multiple index columns. If column is not indexed, query efficiency may be reduced because you did not choose OR. In the following example, indexes are built on both LOC_ID and REGION.

1 efficient: 2 SELECT LOC_ID, LOC_DESC, REGION 3 FROM LOCATION 4 WHERE LOC_ID = 10 5 UNION 6 SELECT LOC_ID, LOC_DESC, REGION 7 FROM LOCATION 8 WHERE REGION = "MELBOURNE" 9 inefficient: 10 SELECT LOC_ID, LOC_DESC, REGION 11 FROM LOCATION 12 WHERE LOC_ID = 10 OR REGION = "MELBOURNE"

If you insist on using OR, you need to return the index column with the least number of records at the top.

(25) replace OR with IN

This is a simple and easy-to-remember rule, but the actual execution effect still needs to be tested. Under ORACLE8i, the two execution paths seem to be the same.

1 inefficient: 2 SELECT … . FROM LOCATION WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30 3 efficient 4 SELECT... FROM LOCATION WHERE LOC_IN IN (10, 10, 20, 30)

(26) avoid using IS NULL and IS NOT NULL on index columns

Avoid using any nullable column in the index. ORACLE will not be able to use the index. For a single-column index, if the column contains a null value, the record will not exist in the index. For a composite index, if each column is empty, the record also does not exist in the index. if at least one column is not empty, the record exists in the index. for example: if the uniqueness index is established on columns An and B of the table, and there is a record in the table with a value of An and B (123 null), ORACLE will not accept the next record (insert) with the same A, B value (123 null). However, if all index columns are empty, ORACLE will assume that the entire key value is empty and empty is not equal to empty. So you can insert 1000 records with the same key value, but of course they are all empty! Because null values do not exist in index columns, comparing index columns with null values in the WHERE clause causes ORACLE to deactivate the index.

1 inefficient: (index invalidation) 2 SELECT … FROM DEPARTMENT WHERE DEPT_CODE IS NOT NULL; 3 efficient: (index valid) 4 SELECT … FROM DEPARTMENT WHERE DEPT_CODE > = 0

(27) always use the first column of the index:

If the index is built on multiple columns, the optimizer will choose to use the index only if its first column (leading column) is referenced by the where clause. This is also a simple but important rule. When referencing only the second column of the index, the optimizer uses a full table scan and ignores the index.

28) replace UNION with UNION-ALL (if possible):

When the SQL statement requires two query result sets to UNION, the two result sets are merged as UNION-ALL, and then sorted before the final result is output. If UNION ALL is used instead of UNION, sorting is not necessary. As a result, efficiency will be improved. It is important to note that UNION ALL will repeatedly output the same records in both result sets. Therefore, you still need to analyze the feasibility of using UNION ALL from the business requirements. UNION will sort the result set, which will use the SORT_AREA_SIZE memory. It is also very important to optimize this piece of memory. The following SQL can be used to query the sorted consumption

1 low efficiency: 2 SELECT ACCT_NUM, BALANCE_AMT 3 FROM DEBIT_TRANSACTIONS 4 WHERE TRAN_DATE ='31 SELECT ACCT_NUM DECMI 95'5 UNION 6 SELECT ACCT_NUM, BALANCE_AMT 7 FROM DEBIT_TRANSACTIONS 8 WHERE TRAN_DATE ='31 SELECT ACCT_NUM DECMI 95'9 High efficiency: 10 SELECT ACCT_NUM, BALANCE_AMT 11 FROM DEBIT_TRANSACTIONS 12 WHERE TRAN_DATE = '31-DEC-95'13 UNION ALL14 SELECT ACCT_NUM, BALANCE_AMT 15 FROM DEBIT_TRANSACTIONS 16 WHERE TRAN_DATE =' 31 FROM DEBIT_TRANSACTIONS 95'

(29) replace ORDER BY with WHERE:

The ORDER BY clause uses the index only under two strict conditions.

All columns in the ORDER BY must be contained in the same index and kept in order in the index.

All columns in the ORDER BY must be defined as non-empty.

The index used in the WHERE clause and the index used in the ORDER BY clause cannot be juxtaposed.

For example:

The table DEPT contains the following:

DEPT_CODE competes for NOT NULL

DEPT_DESC NOT NULL

DEPT_TYPE NULL

1 inefficient: (index not used) 2 SELECT DEPT_CODE FROM DEPT ORDER BY DEPT_TYPE 3 efficient: (using index) 4 SELECT DEPT_CODE FROM DEPT WHERE DEPT_TYPE > 0

(30) avoid changing the type of index column.

When comparing data of different data types, ORACLE automatically performs simple type conversions on columns.

Suppose EMPNO is an index column of a numeric type.

SELECT... FROM EMP WHERE EMPNO = '123'

In fact, after ORACLE type conversion, the statement is converted to:

SELECT... FROM EMP WHERE EMPNO = TO_NUMBER ('123')

Fortunately, type conversion does not occur on index columns, and the purpose of the index has not been changed.

Now, suppose EMP_TYPE is an index column of character type.

SELECT... FROM EMP WHERE EMP_TYPE = 123

This statement is converted by ORACLE to:

SELECT... FROM EMP WHERETO_NUMBER (EMP_TYPE) = 123

This index will not be used because of the internal type conversion! To prevent ORACLE from implicitly casting your SQL, it's best to show the conversion explicitly. Note that when comparing characters with numeric values, ORACLE gives priority to converting numeric types to character types

(31) the WHERE clause that needs to be careful:

The WHERE clause in some SELECT statements does not use an index. Here are some examples.

In the following example, (1)'! = 'will not use the index. Remember, indexes can only tell you what exists in the table, not what doesn't exist in the table. (2)'| |'is a character concatenation function. Like other functions, the index is disabled. (3)'+'is a mathematical function. Like other mathematical functions, the index is disabled. (4) the same index columns cannot be compared with each other, which will enable full table scanning.

(32) a. If you retrieve the number of records in a table with more than 30% of the data. There will be no significant efficiency gains in the use of indexes.

b. In general, using an index is several or even thousands of times more block than a full table scan!

(33) avoid resource-consuming operations:

The SQL statement with DISTINCT,UNION,MINUS,INTERSECT,ORDER BY starts the SQL engine

Perform resource-consuming sorting (SORT) functions. DISTINCT requires one sort operation, while others need to perform at least two sort operations. Usually, SQL statements with UNION, MINUS, and INTERSECT can be rewritten in other ways. If the SORT_AREA_SIZE of your database is well provisioned, you can also consider using UNION, MINUS and INTERSECT. After all, they are highly readable.

(34) optimize GROUP BY:

To improve the efficiency of GROUP BY statements, you can filter out unwanted records before GROUP BY. The following two queries return the same results, but the second one is significantly faster.

1 low efficiency: 2 SELECT JOB, AVG (SAL) 3 FROM EMP 4 GROUP by JOB 5 HAVING JOB = 'PRESIDENT' 6 OR JOB =' MANAGER' 7 High efficiency: 8 SELECT JOB, AVG (SAL) 9 FROM EMP 10 WHERE JOB = 'PRESIDENT' 11 OR JOB =' MANAGER'12 GROUP by JOB

SQL avoids index invalidation:

For example, there are millions of pieces of data in a table, and a field is indexed, but the performance of the query does not improve much, which may be caused by the failure of the oracle index. Oracle indexes have some restrictions, and if you violate these index restrictions, oracle will still perform a full table scan even if you have added the index, and the query performance will not be better than that without indexing, but may be worse due to the system overhead of maintaining the index in the database.

1. In order to optimize the query, we should avoid full table scanning as far as possible, and we should first consider establishing indexes on the columns involved in where and order by.

two。 Try to avoid judging the null value of a field in the where clause, otherwise it will cause the engine to abandon the use of the index and perform a full table scan, such as:

Select id from t where num is null

You can set the default value of 0 on num to ensure that there is no null value for the num column in the table, and then query it like this:

Select id from t where num=0

3. The use of the! = or operator in the where clause should be avoided as much as possible, otherwise the engine will abandon the use of indexes and perform a full table scan.

4. Try to avoid using or to join conditions in the where clause, otherwise it will cause the engine to abandon the use of indexes and perform full table scans, such as:

Select id from t where num=10 or num=20

You can query it like this:

Select id from t where num=10

Union all

Select id from t where num=20

5.in and not in should also be used with caution, otherwise it will lead to full table scanning, such as:

Select id from t where num in (1, 2, 3)

For consecutive values, use between instead of in:

Select id from t where num between 1 and 3

6. The following fuzzy query will also cause a full table scan:

Select id from t where name like'% abc%'

To improve efficiency, consider full-text retrieval.

7. Using parameters in the where clause also results in a full table scan. Because SQL parses local variables only at run time, the optimizer cannot defer the choice of an access plan until run time; it must be selected at compile time. However, if an access plan is established at compile time, the value of the variable is still unknown and cannot be used as an input to the index selection. A full table scan will be performed as follows:

Select id from t where num=@num

You can force the query to use the index instead:

Select id from t with (index (index name)) where num=@num

8. Expression manipulation of fields in the where clause should be avoided as far as possible, which will cause the engine to abandon the use of indexes and perform full table scans. Such as:

Select id from t where num/2=100

It should be changed to:

Select id from t where num=100*2

9. Functional manipulation of fields in the where clause should be avoided as far as possible, which will cause the engine to abandon the use of indexes and perform full table scans. Such as:

Select id from t where substring (name,1,3) = 'abc'--name id that begins with abc

Select id from t where datediff (day,createdate,'2005-11-30') = id generated by Murray 2005-11-30'

It should be changed to:

Select id from t where name like 'abc%'

Select id from t where createdate > = '2005-11-30' and createdate

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report