2025-01-15 Update. Source: SLTechnology News&Howtos (Database)
Shulou (Shulou.com), 05/31 report
This article looks at how MySQL delete and update operations affect performance, and describes simple, practical ways to reduce their cost.
Delete and update operations tend to be more expensive than inserts, so a good design minimizes the number of updates and deletions against the database.
3.1 Update operation
An update brings a series of side effects: it must be logged (so it can be rolled back on error); updating variable-length fields (such as varchar columns) changes the physical storage of the data and can move records; updating indexed fields forces index maintenance; and updating the primary key can reorganize the data. All of this makes the update itself slow, and the resulting disk fragmentation degrades later query performance as well. There are two strategies for dealing with this: first, reduce the number of updates by combining updates to multiple fields into a single statement; second, avoid updates altogether. The two strategies suit different situations, as the examples below illustrate.
3.1.1 Reduce the number of updates

The integration library includes a code-cleaning step that assigns values to the self-encoded fields of business data by joining against code tables. Code cleaning is essentially a process of updating business tables through joins with several code tables, filling in several self-encoded fields. There are two ways to write the update: issue one SQL statement per self-encoded field, or combine all the updates into a single statement. The statement that updates only the bank code looks like this:
UPDATE TBL_INCOME_TMP A
SET BANKCODESELF = (SELECT SELFCODE FROM TBL_BANKINFO B WHERE A.BANKCODE = B.BANKCODE)
The statement that updates all the self-encoded fields in one pass looks like this (schematically):

UPDATE TBL_INCOME_TMP
SET code1_self = (self-code obtained by joining code table 1),
    code2_self = (self-code obtained by joining code table 2),
    ...,
    codeN_self = (self-code obtained by joining code table N)
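The consolidation idea above can be sketched with a small runnable example. This is a minimal illustration using SQLite from Python; the second self-coded field and its code table (AREACODE/TBL_AREAINFO) are hypothetical names added for the demo, not from the original system.

```python
import sqlite3

# Sketch: one UPDATE fills several self-encoded fields at once,
# instead of one UPDATE statement per field.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE TBL_INCOME_TMP (BANKCODE TEXT, BANKCODESELF TEXT,
                             AREACODE TEXT, AREACODESELF TEXT);
CREATE TABLE TBL_BANKINFO (BANKCODE TEXT, SELFCODE TEXT);
CREATE TABLE TBL_AREAINFO (AREACODE TEXT, SELFCODE TEXT);
INSERT INTO TBL_INCOME_TMP VALUES ('B01', NULL, 'A01', NULL);
INSERT INTO TBL_BANKINFO VALUES ('B01', 'SB01');
INSERT INTO TBL_AREAINFO VALUES ('A01', 'SA01');
""")

# Each row is touched once; both self-coded fields are assigned together.
cur.execute("""
UPDATE TBL_INCOME_TMP
SET BANKCODESELF = (SELECT SELFCODE FROM TBL_BANKINFO B
                    WHERE TBL_INCOME_TMP.BANKCODE = B.BANKCODE),
    AREACODESELF = (SELECT SELFCODE FROM TBL_AREAINFO C
                    WHERE TBL_INCOME_TMP.AREACODE = C.AREACODE)
""")
conn.commit()
row = cur.execute("SELECT BANKCODESELF, AREACODESELF FROM TBL_INCOME_TMP").fetchone()
print(row)  # ('SB01', 'SA01')
```

With real data volumes the saving comes from scanning and rewriting each business row once instead of once per field.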
Both approaches were tested on 20 million rows of test data; the results are shown below. The single-statement method is roughly ten times faster.

Process          Multiple update statements   Single update statement
Code cleaning    0:29:48                      0:02:59
3.1.2 Avoid updates
Here is a common scenario. A company runs an employee attendance system. To improve query and reporting performance, tables containing redundant information were built on top of the original system. Taking the employee table as an example, figure 12 shows the data-loading process: first the employee information is copied into the new table, and then a join on the "Department ID" field fills in the department name via an update.
Figure 12. Association update
Fields such as department names are usually declared variable-length to save storage. Updating such a field can therefore reorganize the data on disk, create fragmentation, and hurt query performance.
To avoid this, we can use the approach shown in figure 13: build the redundant table in a single step, joining to the department table to obtain the "department name" at insert time, so that no update is needed.
Figure 13. Avoid updates
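The insert-with-join approach from figure 13 can be sketched as follows. All table and column names here (employee, department, employee_report) are illustrative stand-ins for the tables described in the text.

```python
import sqlite3

# Sketch: build the redundant reporting table in one INSERT ... SELECT
# with a join, so the variable-length department name is never UPDATEd.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE employee (emp_id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
CREATE TABLE employee_report (emp_id INTEGER, name TEXT, dept_name TEXT);
INSERT INTO employee VALUES (1, 'Alice', 10);
INSERT INTO department VALUES (10, 'Payroll');
""")

# One step: the department name arrives with the row, no follow-up update.
cur.execute("""
INSERT INTO employee_report (emp_id, name, dept_name)
SELECT e.emp_id, e.name, d.dept_name
FROM employee e JOIN department d ON e.dept_id = d.dept_id
""")
conn.commit()
report_row = cur.execute("SELECT * FROM employee_report").fetchone()
print(report_row)  # (1, 'Alice', 'Payroll')
```

Because each row is written exactly once at its final size, no record movement or fragmentation is introduced afterwards.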
3.2 Delete operation
Beginners may assume that deletion is simple and fast. In fact this is a misconception: a delete may require extensive disk scanning, it must be written to the database log, and it does not release disk space, so it wastes disk and leaves the remaining data fragmented, which is very damaging to subsequent query performance. There are two common countermeasures: first, periodically reorganize (reorg) tables that are deleted from frequently; second, avoid deletion.
3.2.1 Reorganization
A reorg rearranges the physical order of the table data and removes the free space left inside fragmented pages. Because a delete does not release disk space, a table becomes fragmented after deletions, which can degrade performance severely; the same thing happens after many updates. If statistics have been collected but performance has not improved noticeably, reorganizing the table data may help: the rows are physically rearranged according to a specified index and the free space in fragmented pages is removed, so the data can be accessed more quickly.
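The effect of a reorg can be demonstrated with SQLite's VACUUM as a stand-in: DB2 uses REORG TABLE and MySQL/InnoDB uses OPTIMIZE TABLE for the same purpose, but the mechanics shown here (free pages left behind by deletes, reclaimed by a rebuild) are analogous. Table and column names are made up for the demo.

```python
import sqlite3

# Sketch: mass deletion leaves free pages inside the file;
# a rebuild (VACUUM here, REORG/OPTIMIZE TABLE elsewhere) reclaims them.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
cur.executemany("INSERT INTO t VALUES (?, ?)",
                [(i, "x" * 500) for i in range(5000)])
conn.commit()

cur.execute("DELETE FROM t WHERE id >= 1000")  # delete a contiguous 80%
conn.commit()
free_before = cur.execute("PRAGMA freelist_count").fetchone()[0]

conn.execute("VACUUM")  # rebuild the database, removing free space
free_after = cur.execute("PRAGMA freelist_count").fetchone()[0]

print("free pages before reorg:", free_before)  # > 0
print("free pages after reorg:", free_after)    # 0
```

The freed pages before the rebuild are exactly the wasted, fragment-producing space the text describes; after the rebuild the data is densely packed again.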
3.2.2 Avoid deletions: the intermediate-table and formal-table pattern
The intermediate-table/formal-table pattern is often used when data needs complex processing. The data is processed in an intermediate table; rows that meet the condition are then moved to the formal table, while rows that do not are kept in the intermediate table. Figure 14 illustrates the transfer: after processing, the rows with flag = 1 in the intermediate table temp1 are inserted into the formal table, and those same flag = 1 rows are then deleted from temp1.
Figure 14. Transfer data from intermediate table to formal table
Because the flag field is not a clustered index, deleting from the intermediate table temp1 leaves a large amount of fragmentation on disk, as shown in figure 15. Not only does the fragmentation remain, but the space occupied by the deleted rows is not automatically released. The result is wasted disk space and a sharp decline in query performance.
Figure 15. Delete disk fragments after operation
We can avoid the delete entirely by emptying the table instead. In addition to the intermediate table temp1 and the formal table, add an auxiliary temporary table temp2. If only about 10% of the rows (those with flag = 0) need to stay in temp1, this optimization improves performance significantly. The steps are:
1. Insert the flag = 0 rows from temp1 into temp2.
2. Empty table temp1 (DB2 syntax):

ALTER TABLE temp1 ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE

3. Insert the rows from temp2 back into temp1.
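The three steps above can be sketched end to end. This SQLite demo uses the table names from the text; SQLite has no NOT LOGGED INITIALLY clause, so an unqualified DELETE (which SQLite optimizes as a table truncation) stands in for the fast table-emptying step.

```python
import sqlite3

# Sketch of the "copy out the keepers, empty the table, copy back" pattern.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE temp1 (id INTEGER, flag INTEGER);
CREATE TABLE temp2 (id INTEGER, flag INTEGER);
CREATE TABLE formal (id INTEGER, flag INTEGER);
INSERT INTO temp1 VALUES (1, 1), (2, 0), (3, 1), (4, 1);
""")

cur.execute("INSERT INTO formal SELECT * FROM temp1 WHERE flag = 1")  # move finished rows
cur.execute("INSERT INTO temp2 SELECT * FROM temp1 WHERE flag = 0")   # step 1: save the keepers
cur.execute("DELETE FROM temp1")                                      # step 2: empty temp1
cur.execute("INSERT INTO temp1 SELECT * FROM temp2")                  # step 3: copy back
conn.commit()

temp1_count = cur.execute("SELECT COUNT(*) FROM temp1").fetchone()[0]
formal_count = cur.execute("SELECT COUNT(*) FROM formal").fetchone()[0]
print(temp1_count, formal_count)  # 1 3
```

The win is that no selective delete over a non-clustered flag column ever runs, so no scattered fragmentation is created in temp1.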
3.3 How to make access more efficient
1. Connect to the database once, do as much work as possible, and disconnect only when finished.
2. Pack as many operations as possible into one SQL statement. Figuratively speaking: thousands of tiny statements looping over a cursor are very slow; a handful of statements over the same data are still slow; one statement that solves the whole problem is best.
3. Stay close to the DBMS core: prefer the database's built-in functions and minimize user-defined functions, because even a smart optimizer cannot see inside a custom function.
4. Do not join too many tables in one statement; a reasonable upper limit is five.
5. Group frequently updated columns together: when a row is updated, DB2 records the columns that changed, so placing frequently updated columns next to each other reduces the logging work. This is only a minor performance tip; it does not justify major application or schema changes.
This concludes our look at how MySQL delete and update operations affect performance. Pairing the theory with hands-on practice is the best way to make it stick, so try these techniques yourself.