Updated 2025-01-19. Source: SLTechnology News&Howtos (shulou), Internet Technology
Shulou (Shulou.com), 06/01 report:
This article looks at the interactive computing engines Impala and Presto and the scenarios they are used in, a topic that many people find unclear. The summary below is meant to make it easier to understand.
Interactive computing engines Impala and Presto applied to ROLAP scenarios
They share the following characteristics:
1. Integration with the Hadoop ecosystem: both can connect to the Hive Metastore, process Hive tables, and work directly with data stored in HDFS and HBase.
2. Separation of compute and storage: each is only a query engine and provides no data storage service.
3. MPP architecture: both adopt the classic MPP (massively parallel processing) architecture, scale well, and can meet interactive query requirements on TB-level and even PB-level data.
4. Nested data storage: both support common columnar storage formats such as ORC and Parquet.
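The MPP idea in item 3 can be illustrated with a toy scatter/gather aggregation. This is only a minimal sketch, not how Impala or Presto are implemented: the threads stand in for cluster nodes, and the function names and sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_count(rows):
    """Runs on one 'node': count rows per key for its own partition only."""
    return Counter(key for key, _value in rows)


def mpp_group_count(partitions):
    """Scatter one partition to each worker, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_count, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)  # the merge step adds the per-node counts
    return dict(total)


# Hypothetical data: three partitions of (country, value) rows.
partitions = [
    [("cn", 1), ("us", 2)],
    [("cn", 3)],
    [("us", 4), ("de", 5), ("cn", 6)],
]
result = mpp_group_count(partitions)
print(result)
```

Each worker only touches its own partition, which is why adding nodes lets this style of engine scale to larger data sets.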
Impala: developed by Cloudera, it combines the advantages of traditional databases with those of the Hadoop big data system to build a high-performance query engine that supports SQL and multi-tenancy and offers good flexibility and scalability.
I. Characteristics:
1. Impala abandons MapReduce entirely, since MapReduce is poorly suited to SQL queries. It borrows ideas from MPP parallel databases and runs as a set of long-running service processes.
2. Execution is entirely in memory; intermediate results do not need to be written to disk, which saves a lot of I/O overhead.
3. It makes full use of local reads, placing data and computation on the same machine wherever possible.
4. It is implemented in C++ with many low-level optimizations, e.g. use of SSE instructions.
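The local-read idea in item 3 can be sketched as a small scheduling function. This is an illustrative assumption about how locality-aware assignment might look, not Impala's actual scheduler; the block IDs, host names, and the `assign_scans` helper are hypothetical.

```python
def assign_scans(block_replicas, nodes):
    """Assign each data block to an executor, preferring a host with a replica.

    block_replicas: dict mapping block id -> list of hosts holding a replica.
    nodes: list of executor hosts.
    Returns a dict mapping block id -> (chosen host, whether the read is local).
    """
    load = {node: 0 for node in nodes}
    plan = {}
    for block, replicas in sorted(block_replicas.items()):
        local = [host for host in replicas if host in load]
        candidates = local or nodes  # fall back to a remote read if needed
        # Pick the least-loaded candidate; the host name breaks ties.
        host = min(candidates, key=lambda h: (load[h], h))
        load[host] += 1
        plan[block] = (host, bool(local))
    return plan


plan = assign_scans(
    {"blk1": ["n1", "n2"], "blk2": ["n1", "n3"], "blk3": ["n9"]},
    ["n1", "n2", "n3"],
)
print(plan)
```

In this example blk3 has no replica on any executor, so it is the only block that ends up as a remote read.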
II. Basic structure:
1. Catalogd: the metadata management service.
2. Statestored: the state management service.
3. Impalad: takes on the dual roles of coordinator and executor at the same time.
III. Access method
Access is through JDBC/ODBC, with authentication through Kerberos or LDAP.
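As a rough illustration of the three access variants, the helper below builds JDBC connection strings. It is a sketch under stated assumptions: it assumes the Hive JDBC driver's URL conventions (`jdbc:hive2://`, `auth=noSasl`, `principal=`, `user=`/`password=`) and port 21050, none of which come from this article, so check the documentation of the driver you actually use. The host, principal, and credentials are hypothetical.

```python
def impala_jdbc_url(host, port=21050, database="", auth="none",
                    principal=None, user=None, password=None):
    """Build a JDBC URL string (assumed Hive JDBC driver conventions)."""
    base = f"jdbc:hive2://{host}:{port}/{database}"
    if auth == "none":
        return base + ";auth=noSasl"
    if auth == "kerberos":
        if not principal:
            raise ValueError("kerberos auth needs a service principal")
        return base + f";principal={principal}"
    if auth == "ldap":
        if user is None or password is None:
            raise ValueError("ldap auth needs user and password")
        return base + f";user={user};password={password}"
    raise ValueError(f"unknown auth mode: {auth}")


# Hypothetical host and Kerberos principal, for illustration only.
kerberos_url = impala_jdbc_url(
    "impala.example.com",
    auth="kerberos",
    principal="impala/impala.example.com@EXAMPLE.COM",
)
print(kerberos_url)
```

The resulting string would be passed to a JDBC client together with the driver; the helper itself opens no connection.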
Presto: open-sourced by Facebook, it can handle TB-level and even PB-level data. Because Presto integrates seamlessly with Hive, it has become a mainstream OLAP engine.
I. Basic structure:
Presto uses a master-slave architecture consisting of a Coordinator service, a Discovery Server service, and multiple Worker services.
1. Coordinator: receives query requests (SQL) from clients, performs lexical and syntax analysis, generates the logical and physical query plans, schedules tasks onto the workers for execution, and aggregates the results that the workers return. Multiple Coordinators can exist in a Presto cluster at the same time to avoid a single point of failure.
2. Discovery Server: the service discovery component. Each Worker registers with the Discovery Server when it starts and then periodically reports its status information to it.
3. Worker: the task executor.
Presto is a distributed query engine and does not provide data storage itself. For this reason it adopts a plug-in design that supports a variety of data sources, including Hive, HDFS, MySQL, Cassandra, HBase, and Redis.
II. Access method
Because of its plug-in architecture, Presto accesses external data sources through connectors. To distinguish the data of each source, it introduces a namespace layer, the catalog. The Hive, Cassandra, and MySQL sources mentioned above each exist as a catalog in Presto. Each catalog can contain multiple databases, and each database can contain multiple tables.
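The three-level namespace (catalog, database, table) can be sketched as a small name resolver. This is a minimal sketch with hypothetical catalog, database, and table names; the `resolve` helper is not part of Presto.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    catalog: str
    schema: str  # the "database" level described above
    table: str


def resolve(name, default_catalog=None, default_schema=None):
    """Expand a possibly partial table name into catalog.schema.table."""
    parts = name.split(".")
    if len(parts) == 3:
        return TableName(*parts)
    if len(parts) == 2 and default_catalog:
        return TableName(default_catalog, *parts)
    if len(parts) == 1 and default_catalog and default_schema:
        return TableName(default_catalog, default_schema, parts[0])
    raise ValueError(f"cannot resolve table name: {name!r}")


# Hypothetical query joining tables that live in two different catalogs.
query = (
    "SELECT u.name, count(*) "
    "FROM hive.web.page_views v "
    "JOIN mysql.crm.users u ON v.user_id = u.id "
    "GROUP BY u.name"
)
full = resolve("hive.web.page_views")
short = resolve("users", default_catalog="mysql", default_schema="crm")
print(full, short)
```

Fully qualified names are what make it possible for a single SQL statement to combine data from several sources, as the hypothetical query string shows.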
Impala and Presto differ from MOLAP-type OLAP query engines such as Druid and Kylin, which organize data multi-dimensionally.