In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article introduces the relevant knowledge of "how to optimize the creation of an association table with sequoiasql-postgresql". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!
[user consultation]
Create an associated table with sequoiasql-postgresql and store 500W data. In the case of creating an index, it takes an average of 50 seconds to do a simple group by query with pg. Can it be further optimized?
[problem description]
1. Cluster deployment
The three hosts correspond to a partition group, and each host is deployed with coord, cata and data nodes.
two。 Table attribute
A 500w data associated table b_qt_swdj and an 800w data pg native table b_qt_swdj1, the two tables have the same structure, a total of 39 fields.
3.sample and statement
Execute a query against the associated table:
① SELECT sw_scjy_yb,count (se_scjy_yb) as count FROM b_qt_swdj group by sw_scjy_yb
② SELECT count (1) from b_qt_swdj
Execute the query against the original table:
③ SELECT sw_scjy_yb,count (se_scjy_yb) as count FROM b_qt_swdj1 group by sw_scjy_yb
④ SELECT count (1) from b_qt_swdj1
It takes 57 seconds to execute ① with db built-in sql. It took 33 seconds to execute the ②.
It takes 50 seconds to execute ① with pg. It took 44 seconds to execute the ②.
It takes 8 seconds to execute ③ with pg. It takes 3 seconds to execute ④.
4. Network bandwidth
When transmitting data, it is about 46MB/S.
[solution]
SELECT sw_scjy_yb,count (se_scjy_yb) as count FROM b_qt_swdj1 group by sw_scjy_yb; takes about the same time in pg as it takes to use sdb's built-in sql, which is about 50s. It has little to do with pg.
Judging from pg's access plan (see screenshot 3), most of the time is also spent on select.
Suggestion 1: we should consider partitioning the table on the sbb side to improve the concurrency of data extraction to improve performance.
Recommendation 2: for large tables, it is recommended to use SparkSQL as the sql engine.
This is the end of the content of "how to optimize creating an associated table with sequoiasql-postgresql". Thank you for your reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.