Many readers are unsure how efficient select ... where id in (...) really is in PostgreSQL. This article walks through the question and some simple measurements, and I hope it helps you answer it for yourself.
I came across this question on Zhihu:

How can the efficiency of the MySQL query select * from table where id in (hundreds or thousands of ids) be improved?

An e-commerce site has a product attribute table with a few hundred thousand records, about 80 MB, and the only index is the primary key id. How do you make a query like the following faster?

select * from table where id in (hundreds or thousands of ids)

The ids are irregular and widely scattered.
I read through the answers and felt many of them were unreliable, but nobody offered any evidence, so I wrote a few queries on my own machine to test. I used PostgreSQL 9.4, but I expect MySQL to behave roughly the same. First, create a simple table with just three columns. Many answers under the original question said you need to look at the size of the table; in fact this problem has little to do with the size of the table and everything to do with the size of the index. Since the index here is on an int column, it depends only on the number of records.
Table "public.t9" Column | Type | Modifiers-+-+-c1 | integer | c2 | character | c3 | character | Indexes: "i1" UNIQUE, btree (C1) insert into T9 values (generate_series (1000, 500000), repeat ('axiom, 90), repeat (' baked)
After that, generate some random numbers; I use jot on the Mac, and on Linux you can use shuf.
for ((i = 0; i < 10000; i++)); do jot -r 1 1 500000 >> rand.file; done
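On Linux, where jot is usually not installed, shuf can produce an equivalent rand.file in a single call; a minimal sketch, assuming the same id range:

# pick 10000 distinct ids at random between 1 and 500000, one per line
shuf -i 1-500000 -n 10000 > rand.file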
Then build the query statements from rand.file:
select * from t9 where c1 in (494613, 575087, 363588, 527650, 251670, 1464, 426858, 202886, 254037, ..., 1);
Generate three sql files this way, with 100, 1000, and 10000 values in the in list respectively, then run them and look at the timings.
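The post doesn't show how those files were built; one possible way, as a sketch (assuming rand.file holds one id per line and the file names used below), is:

# wrap the first n random ids into an in list and save it as test_<n>.sql
for n in 100 1000 10000; do
  echo "select * from t9 where c1 in ($(head -n "$n" rand.file | paste -sd, -));" > "test_${n}.sql"
done

The test_create_*.sql files used further down can be produced the same way, emitting the drop/create/insert statements and the join query instead of a single in list.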
$ psql study -f test_100.sql -o /dev/null
LOG: duration: 2.879 ms
$ psql study -f test_1000.sql -o /dev/null
LOG: duration: 11.974 ms
$ psql study -f test_10000.sql -o /dev/null
LOG: duration: 355.689 ms
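The LOG: duration lines come from PostgreSQL's statement-duration logging and appear in the server log rather than in psql's output. If duration logging is not already enabled, one way to turn it on just for these runs (this particular setting needs superuser rights) is to pass it through PGOPTIONS, for example:

# log the duration of every statement executed by this psql session
PGOPTIONS='-c log_min_duration_statement=0' psql study -f test_100.sql -o /dev/null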
You can see that the time only grows noticeably once the in list reaches 10000 values, and even then the query finishes in just over 350 ms.
What if, as some answers suggested, we first create a temporary table, load the ids into it, and then use an in subquery, hoping the planner turns it into a join between the two tables? For simplicity, I write the two-table join directly.
drop table t_tmp;
create table t_tmp (id int);
insert into t_tmp (id) values (494613), (575087), (363588), (345980), ..., (1);
select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;
What about the time?
$ psql study -f test_create_10000.sql -o /dev/null
LOG: duration: 2.078 ms
LOG: duration: 1.233 ms
LOG: duration: 224.112 ms
LOG: duration: 322.108 ms
Even leaving out the drop and create, it still takes 500+ ms in total, and that is on an SSD, where log writes are fast. Why is it so slow? Look at the plan with explain: with this many ids, the planner goes straight to a merge join.
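The plan itself is not shown in the post; to check it yourself, you can run explain (analyze) on the join, for example:

# show the actual plan and timing; the post reports a merge join with 10000 ids
psql study -c "explain analyze select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;"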
What about the efficiency of 1000 rows of data?
$ psql study -f test_create_1000.sql -o exp.out
LOG: duration: 2.476 ms
LOG: duration: 0.967 ms
LOG: duration: 2.391 ms
LOG: duration: 8.780 ms
The data for 100 rows is as follows:
$ psql study -f test_create_100.sql -o /dev/null
LOG: duration: 2.020 ms
LOG: duration: 1.028 ms
LOG: duration: 1.074 ms
LOG: duration: 1.912 ms
You can see that with 100 or 1000 values, the create-table approach is not meaningfully better than simply writing all the values in the in list; explain shows a nested loop join. With more values (and in the original question the size of the in list is unpredictable), it only gets worse, and on top of that you pay for maintaining an extra table and issuing extra SQL statements, which no DBA will appreciate. So trust the database and just use the in list directly.