What are the reasons for not using insert into select

2025-01-17 Update From: SLTechnology News&Howtos


This article looks at the reasons for not using insert into select, through a real incident in which the statement caused payment records to be lost in production.

The cause of the matter

The company handles a fairly large transaction volume. The database is MySQL, and the table grows by about one million rows per day. The company has not split the data across databases or tables (no sharding), so the only way to keep this table performing well is to migrate old data out of it.

A colleague, Li, was given the task and came up with two schemes.

Scheme 1: query the data through the program, insert it into the history table, and then delete it from the original table.

Scheme 2: use insert into select and let the database do all the I/O itself.

With the first scheme, loading everything at once made the system run out of memory (OOM) immediately, while doing it in batches meant too much I/O and a long run time. So he chose the second scheme, which showed no problems during testing. He happily shipped it, and then he was fired.

What happened? Let's review.

Start with the first scheme. Here is the pseudocode.

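// 1. Query the data to be migrated (every matching row is loaded into memory)
List list = selectData();
// 2. Insert the data into the history table
insertData(list);
// 3. Delete the migrated rows from the original table (ids: the primary keys of the rows in list)
deleteByIds(ids);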

The code makes the reason for the OOM obvious: all the data is loaded into memory at once, so it is no wonder memory blew up.

Now let's see what happened with the second scheme.
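The batched variant of the first scheme, mentioned earlier as the alternative to loading everything at once, can be sketched as follows. This is only an in-memory illustration: the two lists stand in for the original and history tables, and the names BatchMigration and migrateInBatches are hypothetical, not from the original article. It shows why batching bounds memory use (only one batch is held at a time) and why it costs more round trips.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the migration: the lists replace real tables.
class BatchMigration {

    // Moves all rows from source to history in fixed-size batches and
    // returns the number of batches (round trips) that were needed.
    static int migrateInBatches(List<Integer> source, List<Integer> history, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        while (!source.isEmpty()) {
            int end = Math.min(batchSize, source.size());
            // 1. Query one batch only, so memory use is bounded by batchSize.
            List<Integer> batch = new ArrayList<>(source.subList(0, end));
            // 2. Insert the batch into the history table.
            history.addAll(batch);
            // 3. Delete the same rows from the original table.
            source.subList(0, end).clear();
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            source.add(i);
        }
        List<Integer> history = new ArrayList<>();
        int batches = migrateInBatches(source, history, 3);
        // Prints: 4 batches, 10 rows migrated
        System.out.println(batches + " batches, " + history.size() + " rows migrated");
    }
}
```

The trade-off is the one described above: a smaller batch size lowers peak memory but raises the number of round trips, which is the extra I/O and run time that pushed Li toward the second scheme.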

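To keep the table fast while retaining the data still in use, the team agreed to keep 10 days of data, which means roughly 10 million (1kw) rows stay in the table. So the colleague added a time filter and ran insert into select directly with the condition dateTime < (ten days ago). It felt great: there was no need to page through the data, so there was no OOM, the code became much simpler, and network problems were reduced.

For testing, he even built 10 million rows of data as a simulation. The test environment showed no problems, of course, and the change passed.

Because this table is a payment transaction table, the migration was set up as a scheduled task that runs at 8 p.m.

Traffic in the evening is not very heavy, so no problems were expected. But the next day, when the finance team came in and started reconciliation, the funds did not match: many transaction records had never been stored. Investigation showed that after 8 p.m. inserts of payment records had started to fail one after another, and a lot of data was lost as a result.

The problem was eventually traced to the migration task. At first the cause was unclear, since nothing went wrong during the day. Then it occurred to the team that the evening task might be interfering. When the second scheduled run of the task was stopped, the failures did not occur.

Review

Where is the problem?

Why did things recover once the migration task was stopped? What exactly does this insert into select do? Let's look at the explain output of the statement.

The explain output makes it easy to see that the query goes straight to a full table scan. At this point the problem is not hard to guess at. With a full table scan on a table this large, doesn't the migration take a long time? If the migration takes an hour, doesn't that explain why the problem never showed up during the day? But is the full table scan really the root cause?

We can test it by reproducing the scene: run the migration and perform some operations on the table at the same time. The same problem appears. Now adjust the experiment with a bold hypothesis: if there is no full table scan, the problem should not occur. After changing the condition, the statement indeed no longer does a full table scan. Reproducing the scene once more, the problem is gone.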
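The migration statement and the explain check described above might look roughly like the sketch below. The table and column names (pay_flow, pay_flow_history, create_time) and the class MigrationSql are assumptions for illustration; the article does not give the real schema. The sketch only builds the SQL text. In MySQL's explain output, a type of ALL for the scanned table indicates a full table scan.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that builds the migration statement; names are illustrative only.
class MigrationSql {

    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Rows older than this point in time are migrated; newer rows are retained.
    static LocalDateTime cutoff(LocalDateTime now, int retainDays) {
        return now.minusDays(retainDays);
    }

    // Builds the INSERT INTO ... SELECT with the time filter.
    // Real code would bind the cutoff as a PreparedStatement parameter instead.
    static String insertIntoSelect(LocalDateTime cutoff) {
        return "INSERT INTO pay_flow_history SELECT * FROM pay_flow"
                + " WHERE create_time < '" + FMT.format(cutoff) + "'";
    }

    // Prefixing the statement with EXPLAIN shows the access plan without running it.
    static String explain(String statement) {
        return "EXPLAIN " + statement;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2025, 1, 17, 20, 0, 0);
        String sql = insertIntoSelect(cutoff(now, 10));
        System.out.println(sql);
        System.out.println(explain(sql));
    }
}
```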

Conclusion: the full table scan caused the incident.

That fixes the problem, but it does not yet explain why the inserts failed gradually, one after another.

Reason

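At the default transaction isolation level, for a statement of the form insert into a select ... from b, table a is locked at the table level, while the rows of table b are locked one by one as they are scanned. With a full table scan, every row of b is scanned, so rows outside the ten-day filter, including the recent ones that payments are still writing to, are locked as well. This also explains why the failures appeared one after another. Because the rows are locked one by one, and because most of the payment records are compound records, some records were already locked by the scan when a payment tried to write them. Those writes could not get the lock and either timed out or failed outright, while other writes got in before the scan reached them and succeeded.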
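The timeout behaviour described above can be illustrated with a loose analogy in plain Java. This is not MySQL: a ReentrantLock stands in for one row lock, the main thread plays the long-running migration, and a second thread plays a payment write that gives up after a bounded wait. The class LockWaitDemo and its method are hypothetical names.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Analogy only: a ReentrantLock stands in for a single row lock.
class LockWaitDemo {

    // Tries to take the row lock from a separate "payment" thread,
    // waiting at most timeoutMs, and reports whether the write got the lock.
    static boolean writerGetsLock(ReentrantLock rowLock, long timeoutMs) {
        boolean[] acquired = {false};
        Thread writer = new Thread(() -> {
            try {
                if (rowLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    acquired[0] = true;
                    rowLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            writer.join(); // join makes the writer's result visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        ReentrantLock rowLock = new ReentrantLock();

        rowLock.lock(); // the "migration" has scanned and locked this row
        System.out.println("during migration: " + writerGetsLock(rowLock, 50)); // false
        rowLock.unlock(); // the "migration" finishes and releases the row

        System.out.println("after migration: " + writerGetsLock(rowLock, 50)); // true
    }
}
```

A write that arrives while the row is held waits and then gives up, which corresponds to the timeouts seen in the payment service. A write that arrives before the scan reaches its row, or after the lock is released, succeeds.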

Why did testing not catch it?

The test used a production-scale volume of data, but it overlooked one thing: a test environment is still a test environment, and a realistic amount of data is not the same as a realistic business scenario. In this case, the test was missing the large number of inserts that arrive while the migration is running. That gap is what let the bug reach production.

Solution

Since avoiding the full table scan solves the problem, that is all we need to do. To avoid it, add an index on the columns used in the where clause, so that every select in the statement goes through an index.
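A minimal sketch of the fix, again with assumed names (pay_flow, create_time, and the helper class IndexFix are illustrative): add an index on the filter column, then confirm in the explain output that the access type is no longer ALL. Note that the optimizer may still choose a full scan when the range matches a large share of the table, so the explain check should be repeated against realistic data.

```java
// Hypothetical helper: builds the index DDL and interprets one EXPLAIN column.
class IndexFix {

    // Builds the statement that indexes the column used in the WHERE clause.
    static String addIndex(String table, String column) {
        return "ALTER TABLE " + table + " ADD INDEX idx_" + column + " (" + column + ")";
    }

    // In MySQL EXPLAIN output, the access type "ALL" means a full table scan.
    static boolean isFullTableScan(String explainType) {
        return "ALL".equalsIgnoreCase(explainType);
    }

    public static void main(String[] args) {
        System.out.println(addIndex("pay_flow", "create_time"));
        System.out.println(isFullTableScan("ALL"));   // true: still scanning every row
        System.out.println(isFullTableScan("range")); // false: the index range is used
    }
}
```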

Can insert into select still be used?

Yes, as long as the select goes through an index instead of scanning the whole table.

