Steps for Presto distributed installation to query Hive

2025-01-20 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article explains the steps to install Presto in distributed mode and use it to query Hive. The method is simple, fast, and practical; let's walk through it.

The versions used here are Hadoop 2.7.2, Hive 2.1.1, and Presto 0.197. Presto is a query engine with a master-slave architecture, so we use three machines to install it, as follows in List-1:

List-1

192.168.33.34 presto-coordinator
192.168.33.35 presto-slave1
192.168.33.36 presto-slave2
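The address list above can be turned into a hosts fragment from the shell. A minimal sketch (it writes a local file so it runs unprivileged; append it to /etc/hosts with sudo on every node):

```shell
# Generate a hosts fragment for the three Presto nodes (List-1).
# Append on each machine with: sudo tee -a /etc/hosts < hosts.presto
cat > hosts.presto <<'EOF'
192.168.33.34 presto-coordinator
192.168.33.35 presto-slave1
192.168.33.36 presto-slave2
EOF
```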

Figure 1: the three-node cluster topology

JDK is installed on each machine; the JDK version I use is 1.8.0_131 (steps omitted for brevity). Copy the Hadoop package from the Hadoop cluster to /opt and add Hadoop to PATH on every machine.
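The per-machine environment setup can be sketched as below; the JDK install path is an assumption (adjust to wherever 1.8.0_131 actually lands on your machines):

```shell
# Environment for every node; append these lines to ~/.bashrc or /etc/profile.
# JAVA_HOME below is a hypothetical path, HADOOP_HOME matches the /opt copy above.
export JAVA_HOME=${JAVA_HOME:-/usr/java/jdk1.8.0_131}
export HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}
export PATH="$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH"
```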

Presto-coordinator

On presto-coordinator, unzip the Presto installation package under /opt.
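A sketch of the unpacking step; the tarball name for 0.197 is an assumption, and the command is printed rather than executed so the sketch is safe to run anywhere:

```shell
# Unpack the Presto server tarball on presto-coordinator (name assumed for 0.197).
TARBALL=presto-server-0.197.tar.gz
echo "tar -xzf $TARBALL -C /opt"
# The node.data-dir used later (/opt/prestoserver/data) suggests the unpacked
# directory is (re)named /opt/prestoserver; create etc/ inside it for the
# configuration files that follow.
```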

1. config.properties: create a new config.properties file under etc with the following contents, List-2:

List-2

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=18080
query.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://192.168.33.34:18080

node-scheduler.include-coordinator controls whether the coordinator also acts as a worker and executes tasks itself; here we set it to false.

2. jvm.config: create a new jvm.config under etc with the following contents, List-3:

List-3

-server
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:ReservedCodeCacheSize=256M

3. log.properties: create a new log.properties under etc with the following contents, List-4:

List-4

com.facebook.presto=INFO

4. node.properties: create a new node.properties under etc with the following contents, List-5:

List-5

node.environment=production
node.id=node_master
node.data-dir=/opt/prestoserver/data
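Each of these etc files can be created with a shell heredoc. A sketch for node.properties (PRESTO_HOME defaults to a directory under the current path so the sketch runs unprivileged; on the real node it would be the unpacked server directory, e.g. /opt/prestoserver):

```shell
# Create etc/node.properties under the Presto install directory (path assumed).
PRESTO_HOME=${PRESTO_HOME:-$PWD/prestoserver}
mkdir -p "$PRESTO_HOME/etc"
cat > "$PRESTO_HOME/etc/node.properties" <<'EOF'
node.environment=production
node.id=node_master
node.data-dir=/opt/prestoserver/data
EOF
```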

5. Create a new catalog directory under etc and a new hive.properties under etc/catalog, as shown in List-6 below. 192.168.33.33:9083 is the Hive metastore service address.

List-6

connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.33.33:9083
hive.config.resources=/opt/hadoop/core-site.xml,/opt/hadoop/etc/hadoop/hdfs-site.xml

presto-slave1

As on presto-coordinator, create config.properties, jvm.config, log.properties, node.properties and catalog/hive.properties under etc. The differences: config.properties differs from the coordinator's, as shown in List-7 below, and node.properties differs as well, as shown in List-8 below.

List-7: the value of coordinator is false

coordinator=false
http-server.http.port=18080
query.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://192.168.33.34:18080
node-scheduler.include-coordinator=false

List-8: node.id needs to be modified

node.environment=production
node.id=node_node1
node.data-dir=/opt/prestoserver/data

presto-slave2

slave2 is configured the same as slave1, except that node.properties differs: change node.id to its own value, as shown in List-9 below.

List-9

node.environment=production
node.id=node_node2
node.data-dir=/opt/prestoserver/data

Start

Execute "bin/launcher run" on presto-coordinator; this prints the log to the console so we can debug. If you use "bin/launcher start" instead, it runs in the background and you will not see the log.

Execute "bin/launcher run" on presto-slave1

Execute "bin/launcher run" on presto-slave2
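The three start steps above can be driven from one shell via ssh. A dry-run sketch (hostnames from List-1; the launcher path /opt/prestoserver/bin/launcher is an assumption, and the commands are printed rather than executed):

```shell
# Print the start command for every node; pipe to sh to actually run them.
NODES="presto-coordinator presto-slave1 presto-slave2"
for node in $NODES; do
  # "launcher run" stays in the foreground and shows logs; "launcher start" daemonizes.
  echo ssh "$node" "/opt/prestoserver/bin/launcher start"
done
```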

After that, access 192.168.33.34:18080 in a browser.
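To verify the Hive catalog end to end, you can run a query through the Presto CLI. A hypothetical smoke test (presto-cli is downloaded separately as an executable jar; its path here is an assumption, and the query commands are shown as comments since they need a running cluster):

```shell
# Smoke-test the hive catalog with the Presto CLI (path assumed).
PRESTO_CLI=${PRESTO_CLI:-/opt/prestoserver/presto-cli}
SERVER=192.168.33.34:18080
# Interactive session:
#   "$PRESTO_CLI" --server "$SERVER" --catalog hive --schema default
# One-shot query:
#   "$PRESTO_CLI" --server "$SERVER" --catalog hive --schema default --execute 'SHOW TABLES;'
echo "CLI target: $SERVER"
```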

Presto consists of three kinds of components:

Coordinator: responsible for parsing SQL statements, generating execution plans, and distributing execution tasks to Worker nodes

Discovery Server: usually embedded in the Coordinator node; Workers register themselves with it

Workers: multiple nodes that actually execute query tasks and interact with HDFS to read data

Reasons for the low latency of the Presto query engine:

Memory-based parallel computing

Pipelined computing operation

Localized computing

Dynamic compilation execution plan

Hive handles both storage and computation, but Presto does not do storage.

Reference

https://blog.51cto.com/simplelife/1967220

https://www.liangzl.com/get-article-detail-40387.html

At this point, you should have a deeper understanding of the steps for a distributed Presto installation that queries Hive. Why not try it out in practice!
