2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces Prestodb, a query engine for PB-level data analysis, in some detail. It is meant as a reference for interested readers, and I hope it will be helpful to you.
What is prestodb?
Prestodb is Facebook's open source SQL-on-Hadoop system, a high-performance query engine that Facebook engineers developed after growing fed up with the query speed of Hive. It is written in Java 8 and processes data as a pipeline of pages, which gives it efficient interactive query performance and lets it keep GC under control. It is also decoupled from the underlying data sources, so it can connect to many kinds of data sources and run queries across them.

At present, in China, companies such as JD.com, Meituan, Tongcheng and Didi use prestodb heavily. Abroad, in addition to Facebook, companies such as Uber use it heavily, while Teradata maintains an independent branch and uses it as the main backend for its query products. This article introduces prestodb, starting with the architecture and query principle of presto. Setting up presto is relatively simple, and you can follow the article on the official website to do it.
The overall architecture of prestodb
As shown in the figure above, prestodb is mainly composed of one coordinator and multiple workers. The coordinator node is responsible for interfacing with the client and receiving the requests (DDL and DML) that the client sends. After receiving a request, the coordinator processes it and finally returns the query result to the client. While processing the request, the coordinator performs lexical analysis, syntax analysis, semantic analysis, optimization and execution plan generation for the SQL statement. Finally, the scheduling module splits the work into sub-tasks and distributes them to the worker nodes. The worker nodes are the actual execution nodes, and they perform operations including aggregation, sorting, join and de-duplication. The overall execution process is shown in the following figure:
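The division of labor between coordinator and workers can be illustrated with a toy sketch. This is not presto code: the function names `worker_execute` and `coordinator_run` are hypothetical, threads stand in for worker nodes, and the parsing and planning stages are skipped entirely. It only shows the shape of "split the work, run sub-tasks on workers, combine the results for the client", using de-duplication as the example operation.

```python
from concurrent.futures import ThreadPoolExecutor


def worker_execute(rows):
    """Hypothetical worker sub-task: de-duplicate and sort one slice of rows."""
    return sorted(set(rows))


def coordinator_run(rows, num_workers=2):
    """Hypothetical coordinator: split the input, dispatch sub-tasks, merge results."""
    # Scheduling step: give each worker one slice of the input.
    splits = [rows[i::num_workers] for i in range(num_workers)]
    # Threads stand in for worker nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(worker_execute, splits))
    # Combine the workers' outputs into the result returned to the client.
    return sorted(set().union(*partials))


if __name__ == "__main__":
    print(coordinator_run([3, 1, 2, 3, 1, 5]))
```

For brevity the merge here happens inside the coordinator function; a real engine would typically schedule that final step as another task.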
Most of these processes will be described in detail later.
The main purpose of this article is to introduce and popularize the implementation principle of distributed SQL. The related articles I have read all introduce it from the top down. I feel that this is not conducive to getting started, and many people "back down" as soon as they see the execution plan. So when I introduce presto, I'm going to introduce it from the bottom up.
To put it bluntly, distributed SQL is still SQL. Since it is SQL, the typical query statements are group by, order by, join and so on. This article uses group by as the example; the implementation of order by and join will be introduced in subsequent articles.
Physical execution plan
The physical execution plan is the step closest to how we intuitively understand a query, so let's first take a look at the physical execution plan in presto. Suppose we have an orders table whose data is distributed across two nodes. The data fragments on node1 are:
The data fragments on node2 are as follows:
Suppose we have a grouped aggregate query:
SELECT sum(totalprice), orderpriority FROM orders WHERE custkey
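One common way to execute a grouped sum over data spread across two nodes is two-phase aggregation: each node sums its own rows per group, and a final step merges the partial sums. The sketch below illustrates that idea only and is not taken from presto's implementation. The rows in `node1` and `node2` are made-up sample data, not the article's fragments, the function names are hypothetical, and the filter on custkey is left out because its condition is not given in the text.

```python
from collections import defaultdict

# Hypothetical (orderpriority, totalprice) rows; not the article's real fragments.
node1 = [("1-URGENT", 100.0), ("2-HIGH", 50.0), ("1-URGENT", 25.0)]
node2 = [("2-HIGH", 10.0), ("3-MEDIUM", 40.0), ("1-URGENT", 5.0)]


def partial_sum(rows):
    """Phase 1, run on each node: sum totalprice per orderpriority locally."""
    acc = defaultdict(float)
    for priority, price in rows:
        acc[priority] += price
    return dict(acc)


def final_sum(partials):
    """Phase 2: merge the per-node partial sums into one total per group."""
    acc = defaultdict(float)
    for partial in partials:
        for priority, subtotal in partial.items():
            acc[priority] += subtotal
    return dict(acc)


result = final_sum([partial_sum(node1), partial_sum(node2)])

if __name__ == "__main__":
    for priority in sorted(result):
        print(priority, result[priority])
```

The point of the split is that only one small row per group leaves each node, rather than every matching order row.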