2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces the main strategies for optimizing the performance of SQL statements. The editor walks you through the operations with practical cases; the methods are simple, fast, and practical. I hope this article helps you solve your own performance problems.
52 SQL statement performance optimization strategies:
1. To optimize queries, avoid full table scans as far as possible; first consider building indexes on the columns involved in WHERE and ORDER BY clauses.
2. Try to avoid testing fields for NULL in the WHERE clause. NULL is the default when creating a table, but most of the time you should declare columns NOT NULL, or use a special value, such as 0 or 1, as the default.
3. Try to avoid using the != or <> operator in the WHERE clause. MySQL uses indexes only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and sometimes LIKE.
4. Try to avoid joining conditions with OR in the WHERE clause; otherwise the engine may give up using the index and do a full table scan. You can merge the queries with UNION instead: select id from t where num=10 union all select id from t where num=20.
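As a quick sanity check for tip 4, here is a minimal sketch using SQLite through Python's sqlite3 module (the table t and its data are the hypothetical ones from the tip; MySQL behavior is analogous): the OR condition and the UNION ALL rewrite return the same rows, and with an index on num each UNION ALL branch can use that index on its own.

```python
import sqlite3

# Tip 4 sketch (hypothetical table t): the OR condition and the UNION ALL
# rewrite return the same rows; with an index on num, each UNION ALL
# branch can use the index independently.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
conn.execute("CREATE INDEX idx_num ON t(num)")
conn.executemany("INSERT INTO t (num) VALUES (?)",
                 [(10,), (20,), (30,), (10,)])

with_or = conn.execute(
    "SELECT id FROM t WHERE num = 10 OR num = 20").fetchall()
with_union = conn.execute(
    "SELECT id FROM t WHERE num = 10 "
    "UNION ALL "
    "SELECT id FROM t WHERE num = 20").fetchall()

assert sorted(with_or) == sorted(with_union) == [(1,), (2,), (4,)]
```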
5. Use IN and NOT IN with caution too; otherwise they may lead to a full table scan. For consecutive values, use BETWEEN instead of IN: select id from t where num between 1 and 3.
6. The following queries will also lead to a full table scan: select id from t where name like '%abc%' and select id from t where name like '%abc'. Consider full-text search to improve efficiency. By contrast, select id from t where name like 'abc%' does use the index.
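Tip 6 can be observed directly in an execution plan. Here is a minimal sketch using SQLite via Python's sqlite3 (table and index names are illustrative; SQLite only applies the LIKE-to-index rewrite when LIKE is case-sensitive, hence the PRAGMA, while MySQL applies it by default for its usual collations): the prefix pattern is answered with an index search, while the leading-wildcard pattern forces a scan.

```python
import sqlite3

# Tip 6 sketch (hypothetical table t): a leading-wildcard LIKE forces a
# scan, while a prefix LIKE can use the index on name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_name ON t(name)")
# SQLite needs case-sensitive LIKE for the index rewrite to apply.
conn.execute("PRAGMA case_sensitive_like = ON")

def plan(sql):
    # Collapse the EXPLAIN QUERY PLAN detail strings into one line.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

prefix = plan("SELECT id FROM t WHERE name LIKE 'abc%'")
infix = plan("SELECT id FROM t WHERE name LIKE '%abc%'")
assert "SEARCH" in prefix   # index range search
assert "SCAN" in infix      # full scan of the table or index
```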
7. If you use parameters in the WHERE clause, it will also cause a full table scan.
8. Try to avoid expression operations and function operations on fields in the WHERE clause.
9. In many cases, using EXISTS instead of IN is a good choice: replace select num from a where num in (select num from b) with select num from a where exists (select 1 from b where num=a.num).
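To see that the EXISTS rewrite is equivalent, here is a small sketch with SQLite via Python's sqlite3, using the a and b tables from the tip (the sample values are made up):

```python
import sqlite3

# Tip 9 sketch: the IN form and the EXISTS rewrite select the same rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (num INTEGER);
CREATE TABLE b (num INTEGER);
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (2), (3), (4);
""")

in_rows = conn.execute(
    "SELECT num FROM a WHERE num IN (SELECT num FROM b)").fetchall()
exists_rows = conn.execute(
    "SELECT num FROM a WHERE EXISTS "
    "(SELECT 1 FROM b WHERE b.num = a.num)").fetchall()

assert sorted(in_rows) == sorted(exists_rows) == [(2,), (3,)]
```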
10. Although the index can improve the efficiency of the corresponding SELECT, it also reduces the efficiency of INSERT and UPDATE. Because the index may be rebuilt during INSERT or UPDATE, how to build the index needs to be carefully considered, depending on the circumstances. It is best to have no more than 6 indexes in a table, and if there are too many, consider whether it is necessary to build indexes on some infrequently used columns.
11. Updating clustered index data columns should be avoided as much as possible, because the order of clustered index data columns is the physical storage order of table records. Once the value of this column changes, it will lead to the adjustment of the order of the whole table records, which will consume a lot of resources. If the application system needs to update the clustered index data column frequently, it needs to consider whether the index should be built as a clustered index.
12. Try to use numeric fields; do not design character fields that contain only numeric information. This reduces query and join performance and increases storage overhead.
13. Use varchar and nvarchar instead of char and nchar where possible. Variable-length fields take less storage space, and searching within a smaller field is obviously more efficient for queries.
14. Do not return everything with select * from t; replace "*" with a specific list of fields, and do not return any fields you do not need.
15. Try to avoid returning a large amount of data to the client. If the amount of data is too large, you should consider whether the corresponding requirements are reasonable.
16. Use table aliases (Alias): when joining multiple tables in a SQL statement, use the table alias and prefix the alias on each Column. In this way, you can reduce parsing time and reduce syntax errors caused by Column ambiguity.
17. Use "temporary Table" to temporarily store the intermediate results:
An important way to simplify SQL statements is to use temporary tables to store intermediate results, but the benefits go beyond that. With the intermediate results held in a temporary table, subsequent queries run in tempdb, which avoids scanning the main table many times during the program, greatly reduces "shared lock" blocking of "update lock" during execution, lowers blocking, and improves concurrency.
18. Adding nolock to some SQL queries can improve concurrency, because reads and writes otherwise block each other. For some queries you can add nolock so that writes can proceed during reads, but the disadvantage is that you may read uncommitted dirty data.
There are three principles for using nolock:
Queries whose results are used for "insert, delete, or update" must not use nolock; be careful using nolock on tables with frequent page splits; a temporary table can save a snapshot of the data, serving a function similar to Oracle's undo tablespace, so where a temporary table can improve concurrency, use it instead of nolock.
19. Common simplified rules are as follows:
Do not join more than 5 tables (JOIN); consider using temporary tables or table variables to store intermediate results instead. Use fewer subqueries, and do not nest views too deeply; generally do not nest more than 2 levels.
20. Pre-calculate the results of the query and put them in the table, and then Select the query. This was the most important method before SQL7.0, such as the calculation of hospital hospitalization fees.
21. A query whose conditions use OR can be decomposed into multiple queries connected with UNION. Their speed depends only on whether indexes are used; if the query needs a composite index, UNION ALL executes more efficiently. Multiple OR conditions do not use the index, so rewrite them as a UNION and then try to match the indexes; the key question is whether indexes are used.
22. In the value list after IN, put the most frequently occurring values at the front and the least frequent at the back, to reduce the number of comparisons.
23. Try to put the data processing work on the server to reduce the overhead of the network, such as using stored procedures.
A stored procedure is a collection of SQL statements and control-flow logic that is compiled, optimized, organized into an execution plan, and stored in the database, so of course it is fast. For dynamic SQL that is executed repeatedly, you can use temporary stored procedures, which (like temporary tables) are placed in tempdb.
24. When the server has enough memory, configure the number of threads = maximum number of connections + 5 to maximize efficiency; otherwise, configure a lower number of threads and let SQL SERVER's thread pool solve the problem. If the number is still equal to the maximum number of connections + 5, server performance will be seriously damaged.
25. The association of the query is the same as the writing order:
select a.personMemberID, * from chineseresume a, personmember b where a.personMemberID = b.referenceid and a.personMemberID = 'JCNPRH39681' (A = B, B = 'number')
select a.personMemberID, * from chineseresume a, personmember b where a.personMemberID = b.referenceid and a.personMemberID = 'JCNPRH39681' and b.referenceid = 'JCNPRH39681' (A = B, B = 'number', A = 'number')
select a.personMemberID, * from chineseresume a, personmember b where b.referenceid = 'JCNPRH39681' and a.personMemberID = 'JCNPRH39681' (B = 'number', A = 'number')
26. Use EXISTS instead of select count(1) to determine whether a record exists. The count function is only for counting all rows in a table, and count(1) is more efficient than count(*).
27. Try to use "> =" instead of ">".
28. Specifications for the use of the index:
Index creation should be considered together with the application. It is recommended that a large OLTP table have no more than 6 indexes.
Use index fields as query conditions wherever possible, especially clustered indexes; if necessary, specify the index with index index_name.
Avoid table scans when querying large tables; consider creating new indexes if necessary.
When using an index field as a condition, if the index is a composite index, the first field of the index must appear in the condition for the system to use the index; otherwise the index will not be used.
Pay attention to index maintenance: periodically rebuild indexes and recompile stored procedures.
29. The columns in the following SQL conditional statements are properly indexed, but the execution speed is very slow:
SELECT * FROM record WHERE substring(card_no, 1, 4) = '5378' -- 13 seconds
SELECT * FROM record WHERE amount/30 < 1000 -- 11 seconds
SELECT * FROM record WHERE convert(char(10), date, 112) = '19991201' -- 10 seconds
Analysis:
Any operation on a column in the WHERE clause is computed row by row at execution time, so the engine has to do a table scan instead of using the index on that column.
If these results are available at query compilation time, they can be optimized by the SQL optimizer, using indexes to avoid table searches, so rewrite the SQL as follows:
SELECT * FROM record WHERE card_no LIKE '5378%'
SELECT * FROM record WHERE amount < 1000 * 30
SELECT * FROM record WHERE date = '1999/12/01'
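The before-and-after queries of tip 29 can be compared directly in an execution plan. Here is a sketch with SQLite via Python's sqlite3 (the card numbers are made up, and SQLite's planner stands in for SQL Server's): the function on the column forces a scan, while the LIKE form is sargable and uses the index, and both return the same rows.

```python
import sqlite3

# Tip 29 sketch: a function applied to the indexed column forces a table
# scan; rewriting the predicate so the bare column is compared lets the
# planner use the index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (card_no TEXT, amount INTEGER)")
conn.execute("CREATE INDEX idx_card ON record(card_no)")
conn.execute("PRAGMA case_sensitive_like = ON")  # enables the LIKE rewrite
conn.executemany("INSERT INTO record VALUES (?, ?)",
                 [("5378001", 100), ("5378002", 200), ("1234567", 300)])

def plan(sql):
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

func_sql = "SELECT * FROM record WHERE substr(card_no, 1, 4) = '5378'"
like_sql = "SELECT * FROM record WHERE card_no LIKE '5378%'"

assert conn.execute(func_sql).fetchall() == conn.execute(like_sql).fetchall()
assert "SCAN" in plan(func_sql)     # function on the column: full scan
assert "SEARCH" in plan(like_sql)   # sargable predicate: uses idx_card
```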
30. When you have a batch of inserts or updates, use batch inserts or batch updates; never update records one by one.
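A minimal illustration of tip 30 with SQLite via Python's sqlite3 (the table name is hypothetical): one batched statement inside one transaction instead of thousands of individual autocommitted inserts.

```python
import sqlite3

# Tip 30 sketch: one batched, transactional insert instead of 10,000
# row-by-row statements.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")

rows = [(i,) for i in range(10_000)]
with conn:  # one transaction for the whole batch
    conn.executemany("INSERT INTO t (num) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
assert count == 10_000
```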
31. In stored procedures, never use a loop to implement what a single SQL statement can do.
For example, to list every day of the last month, use a connect by recursive query rather than looping from the first day of the month to the last.
32. Select the most efficient table name order (valid only in rule-based optimizers):
Oracle's parser processes the table names in the FROM clause from right to left, so the last table written in the FROM clause (the driving table) is processed first. When there are multiple tables in the FROM clause, choose the table with the fewest records as the base table.
If more than three tables are joined, choose the intersection table, that is, the table referenced by the other tables, as the base table.
33. To improve the efficiency of GROUP BY statements, you can filter out unwanted records before GROUP BY. The following two queries return the same results, but the second one is significantly faster.
Inefficient:
SELECT JOB, AVG(SAL) FROM EMP GROUP BY JOB HAVING JOB = 'PRESIDENT' OR JOB = 'MANAGER'
Efficient:
SELECT JOB, AVG(SAL) FROM EMP WHERE JOB = 'PRESIDENT' OR JOB = 'MANAGER' GROUP BY JOB
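The HAVING and WHERE forms above return identical results, as this small sketch with SQLite via Python's sqlite3 shows (the EMP data is made up):

```python
import sqlite3

# Tip 33 sketch: filtering with WHERE before GROUP BY returns the same
# result as filtering whole groups afterwards with HAVING.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (job TEXT, sal REAL);
INSERT INTO emp VALUES
  ('PRESIDENT', 5000), ('MANAGER', 2450),
  ('MANAGER', 2850), ('CLERK', 800);
""")

having = conn.execute(
    "SELECT job, AVG(sal) FROM emp GROUP BY job "
    "HAVING job = 'PRESIDENT' OR job = 'MANAGER' ORDER BY job").fetchall()
where = conn.execute(
    "SELECT job, AVG(sal) FROM emp "
    "WHERE job = 'PRESIDENT' OR job = 'MANAGER' "
    "GROUP BY job ORDER BY job").fetchall()

assert having == where == [('MANAGER', 2650.0), ('PRESIDENT', 5000.0)]
```

The WHERE form is faster because rows that can never contribute to the result are discarded before grouping, not after.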
34. SQL statements are in uppercase, because Oracle always parses SQL statements first, converting lowercase letters to uppercase before execution.
35. Use aliases: giving table names and column names a one-letter alias in the query is a large-database technique; such a query runs about 1.5 times faster than one built with full table-name prefixes.
36. Avoid deadlocks: always access tables in the same order in your stored procedures and triggers; keep transactions as short as possible and reduce the amount of data involved in a transaction; never wait for user input inside a transaction.
37. Avoid temporary tables unless really necessary; use table variables instead. Most of the time (99%), table variables reside in memory, so they are faster than temporary tables. Temporary tables reside in the TempDb database, so operations on them need to communicate across databases and are naturally slower.
38. It is best not to use triggers:
Firing a trigger and executing a trigger event is itself a resource-consuming process; if you can use constraints, try not to use triggers; do not use the same trigger for different trigger events (Insert, Update, and Delete); do not use transactional code in triggers.
39. Index creation rules:
Tables with primary and foreign keys must have indexes.
Tables with more than 300 rows should have indexes.
Tables that are often joined with other tables should have indexes on the join fields.
Fields that often appear in WHERE clauses, especially on large tables, should be indexed.
Build indexes on highly selective fields, and on small fields; do not index large text fields or very long fields.
Composite indexes need careful analysis, and single-field indexes should be preferred where possible. The leading column of a composite index should be a field with good selectivity. Do the fields of the composite index often appear together, joined with AND, in WHERE clauses? Are single-field queries on them rare or nonexistent? If so, build a composite index; otherwise consider single-field indexes. If the fields of a composite index often appear separately in WHERE clauses, split it into multiple single-field indexes. If a composite index contains more than 3 fields, carefully consider its necessity and consider reducing the number of fields. If there is both a single-field index and a composite index on the same fields, the composite index can generally be deleted.
Do not build too many indexes on tables with frequent data operations, and delete useless indexes to avoid a negative impact on execution plans.
Each index on a table increases storage overhead, and indexes also add processing overhead to insert, delete, and update operations. Too many composite indexes generally add no value when single-field indexes exist; on the contrary, they reduce performance when data is added or deleted, especially for frequently updated tables.
Try not to index a field that contains a large number of duplicate values.
40. Summary of MySQL query optimization:
Use slow query logs to find slow queries, use execution plans to determine if queries are running properly, and always test your queries to see if they are running at their best.
Performance changes over time. Avoid count(*) on an entire table; it may lock the whole table. Keep queries consistent so that subsequent similar queries can use the query cache. Use GROUP BY instead of DISTINCT where appropriate. Use indexed columns in WHERE, GROUP BY, and ORDER BY clauses. Keep indexes simple and do not include the same column in multiple indexes.
Sometimes MySQL will use the wrong index; in that case use USE INDEX. Check for problems with SQL_MODE=STRICT. For index fields with fewer than 5 records, use LIMIT instead of OR when using UNION.
To avoid a SELECT before updating, use INSERT ON DUPLICATE KEY or INSERT IGNORE instead of an UPDATE. Do not use MAX; use an indexed field and the ORDER BY clause instead. LIMIT M,N can actually slow down the query in some cases; use it sparingly. Use UNION instead of a subquery in the WHERE clause. After restarting MySQL, remember to warm up your database so that the data is in memory and queries are fast. Consider persistent connections rather than multiple connections to reduce overhead.
Benchmark queries, including the load on the server; sometimes a simple query can affect other queries. When the load on the server increases, use SHOW PROCESSLIST to view slow and problematic queries. Test all suspicious queries in a development environment on mirrored production data.
41. MySQL backup process:
Back up from a secondary replication server. Stop replication during backup to avoid inconsistencies in data dependencies and foreign key constraints. Stop MySQL completely and back up from the database files. If you use mysqldump for backup, back up the binary log files at the same time and make sure replication is not interrupted. Do not trust LVM snapshots; they are likely to cause data inconsistencies and cause you trouble later. To make single-table recovery easier, export data table by table, if the data is isolated from other tables. Use --opt with mysqldump. Check and optimize tables before backup. For faster import, temporarily disable foreign key constraints during import. Also for faster import, temporarily disable uniqueness checks during import. Calculate the size of databases, tables, and indexes after each backup to better monitor data growth. Monitor errors and delays in replication instances with automatically scheduled scripts. Perform backups regularly.
42. The query cache does not normalize spaces, so minimize extra spaces when writing SQL statements, especially at the beginning and end of the SQL (the query cache does not strip leading and trailing spaces).
43. Is it appropriate for member to use mid as the key for sub-table queries? General business requirements are basically username-based queries, so normally username should be hashed to split the tables.
This is what MySQL's partitioning feature does when splitting tables, transparently to the code; implementing it at the code level seems unreasonable.
44. Every table in the database should have an ID column as its primary key, preferably of type INT (UNSIGNED is recommended) with the AUTO_INCREMENT flag set.
45. Set SET NOCOUNT ON at the beginning of all stored procedures and triggers and SET NOCOUNT OFF at the end. This avoids sending a DONE_IN_PROC message to the client after each statement of a stored procedure or trigger executes.
46. MySQL queries can enable the query cache, one of the effective MySQL optimization methods for improving database performance. When the same query is executed multiple times, fetching the data from the cache is much faster than querying the database directly.
47. EXPLAIN SELECT query is used to track the results:
Use the EXPLAIN keyword to let you know how MySQL handles your SQL statements. This can help you analyze the performance bottlenecks of your query or table structure. EXPLAIN's query results will also tell you how your index primary key is used and how your data tables are searched and sorted.
48. Use LIMIT 1 when only one row of data is needed:
Sometimes you already know that a lookup can return only one result, but the database engine does not, so it may keep searching, fetch through a cursor, or check the number of records returned.
In this case, adding LIMIT 1 improves performance: the MySQL database engine stops searching as soon as it finds one matching row, instead of continuing to look for the next record that matches.
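A minimal existence-check sketch for tip 48, using SQLite via Python's sqlite3 (the users table and its data are hypothetical):

```python
import sqlite3

# Tip 48 sketch: with LIMIT 1 the engine can stop at the first match
# instead of finding every matching row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("alice",), ("bob",), ("alice",)])

row = conn.execute(
    "SELECT 1 FROM users WHERE name = ? LIMIT 1", ("alice",)).fetchone()
exists = row is not None
assert exists is True
```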
49. Select the appropriate storage engine for the table:
**MyISAM:** mainly for read and insert operations, with only a few updates and deletes, where requirements for transaction integrity and concurrency are not high.
**InnoDB:** for transaction processing, where data consistency is required under concurrency. Besides inserts and queries, there are many updates and deletes; InnoDB effectively reduces the locking caused by deletes and updates.
For InnoDB tables, which support transactions, the main factor affecting speed is that AUTOCOMMIT is on by default; if the program does not explicitly call BEGIN to start a transaction, every insert is committed automatically, which seriously hurts speed. Call begin before executing the SQL so that multiple statements form a single transaction (even with autocommit on), which greatly improves performance.
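The effect described above can be sketched with SQLite via Python's sqlite3 (SQLite stands in for InnoDB here; the principle of grouping statements into one explicit transaction instead of autocommitting each one is the same):

```python
import sqlite3

# Autocommit sketch: isolation_level=None puts the connection in
# autocommit mode, so we issue BEGIN/COMMIT explicitly to group 1,000
# inserts into a single transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (num INTEGER)")

conn.execute("BEGIN")
for i in range(1000):
    conn.execute("INSERT INTO t (num) VALUES (?)", (i,))
conn.execute("COMMIT")

total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
assert total == 1000
```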
50. Optimize the data type of the table and select the appropriate data type:
**Principle:** smaller is usually better, and keep it simple. All fields should have default values, and NULL should be avoided as much as possible.
For example, database tables are designed to take up as little disk space as possible using smaller integer types. (mediumint is more suitable than int.)
For example, the time fields datetime and timestamp: datetime takes 8 bytes while timestamp takes 4, only half as much. The range of timestamp is 1970 to 2037, which is fine for an update time.
MySQL can well support access to a large amount of data, but in general, the smaller the table in the database, the faster the query will be executed on it.
Therefore, in order to achieve better performance when creating the table, we can set the width of the fields in the table as small as possible.
For example, when defining a zip code field, setting it to an oversized CHAR obviously adds unnecessary space to the database; even VARCHAR is superfluous, because CHAR(6) does the job well.
Similarly, where possible we should use MEDIUMINT instead of BIGINT to define integer fields, and we should set fields to NOT NULL so that the database does not have to compare NULL values when executing queries.
For some text fields, such as "province" or "gender", we can define them as ENUM types. Because in MySQL, ENUM types are treated as numeric data, and numeric data is processed much faster than text types. In this way, we can improve the performance of the database.
51. String data types: char, varchar, text selection differences.
52. Any operation on a column leads to a table scan, including database functions and computed expressions. When querying, move operations to the right-hand side of the equal sign wherever possible.
This concludes the introduction to "what are the strategies for optimizing the performance of SQL statements". Thank you for reading.