0039 - How to use the Python Impyla client to connect to Hive and Impala

2025-04-08 Update From: SLTechnology News&Howtos

1. Purpose of this document

Following the previous chapter on how to install Anaconda and build a private Python package source in a CDH cluster, this chapter focuses on how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon of a CDH cluster and run SQL operations.

Content overview

1. Dependency package installation

2. Writing the code

3. Testing the code

Test environment

1. CM and CDH version 5.11.2

2. RedHat 7.2

Preconditions

1. The CDH cluster is running normally.

2. Anaconda is installed and its environment variables are configured.

3. The pip tool can install Python packages normally.

4. Python version 2.6+ or 3.3+

5. Non-secure cluster environment
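
The Python version requirement in the list above can be checked before anything is installed. The following is a minimal sketch that assumes only the 2.6+/3.3+ bounds stated there; the function name python_version_supported is hypothetical and not part of Impyla.

```python
import sys


def python_version_supported(version_info):
    """Return True for Python 2.6+ or 3.3+, the range listed in the preconditions."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 6
    if major == 3:
        return minor >= 3
    return False  # other major versions are not covered by the article


current = sys.version_info
print("Python %d.%d supported: %s"
      % (current[0], current[1], python_version_supported(current)))
```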

2. Impyla dependency package installation

Python packages that Impyla depends on:

six
bit_array
thrift (on Python 2.x) or thriftpy (on Python 3.x)
thrift_sasl
sasl
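
The only dependency that changes with the interpreter is the Thrift package. As a small sketch of that rule, the hypothetical helper impyla_dependencies below returns the package names from the list above for a given Python major version; it only builds a list and installs nothing.

```python
import sys


def impyla_dependencies(python_major):
    """List the packages named above for the given Python major version."""
    # thrift is used on Python 2.x, thriftpy on Python 3.x (per the list above)
    thrift_package = "thrift" if python_major == 2 else "thriftpy"
    return ["six", "bit_array", thrift_package, "thrift_sasl", "sasl"]


print(impyla_dependencies(sys.version_info[0]))
```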

1. First install the Python packages that Impyla depends on

[root@ip-172-31-22-86 ~]# pip install bit_array
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl

Note: thrift must be version 0.9.3. pip installs 0.10.0 by default, so if that version is already present, uninstall it with pip uninstall thrift and then install 0.9.3.

2. Install the Impyla package

pip installs impyla 0.14.0 by default. If that version is already installed, uninstall it and install version 0.13.8 instead.

[root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
  Downloading impyla-0.14.0.tar.gz (151kB)
    100% |██| 153kB 1.0MB/s
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
  Running setup.py bdist_wheel for impyla ... done
  Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
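
Because this section pins two versions (thrift 0.9.3 and impyla 0.13.8), it can help to confirm what is actually installed. The sketch below assumes Python 3.8 or later, where importlib.metadata is in the standard library; the names PINNED, installed_version and pin_mismatches are hypothetical helpers, not part of pip or Impyla.

```python
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+

# Versions the article pins; taken from the text above.
PINNED = {"thrift": "0.9.3", "impyla": "0.13.8"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pin_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (found, wanted)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (found, wanted)
    return mismatches


for package, (found, wanted) in sorted(pin_mismatches(PINNED).items()):
    print("%s: found %s, article uses %s" % (package, found, wanted))
```

On the Python 2.7 interpreter used in the test environment, running pip list and reading the versions by eye serves the same purpose.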

3. Write Python code

Connecting to Hive from Python (HiveTest.py)

from impala.dbapi import connect

# Port 10000 is the HiveServer2 default; adjust host and port for your cluster.
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000, auth_mechanism='PLAIN')
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
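
The script repeats the same execute, description, fetchall sequence for each statement. Since Impyla exposes the Python DB-API 2.0 interface, that sequence can be wrapped in one helper. The sketch below is illustrative only: run_query is a hypothetical name, and an in-memory SQLite database stands in for the cluster so that the example runs without a CDH environment.

```python
import sqlite3
from contextlib import closing


def run_query(conn, sql):
    """Run one statement on a DB-API 2.0 connection; return (column names, rows)."""
    with closing(conn.cursor()) as cursor:  # the cursor is closed even if execute() raises
        cursor.execute(sql)
        columns = [column[0] for column in cursor.description]
        return columns, cursor.fetchall()


# Stand-in for the cluster: an in-memory SQLite table shaped like the article's test table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE test (s1 TEXT, s2 TEXT)")
demo.executemany("INSERT INTO test VALUES (?, ?)",
                 [("name1", "age1"), ("name2", "age2")])
columns, rows = run_query(demo, "SELECT * FROM test LIMIT 10")
print(columns, rows)
demo.close()
```

Against the cluster, the same helper would be called with the conn object returned by impala.dbapi.connect, for example run_query(conn, 'show databases').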

Connecting to Impala from Python (ImpalaTest.py)

from impala.dbapi import connect

conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
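
The Hive and Impala scripts differ only in the arguments passed to connect. One way to keep those differences in a single place is sketched below. It rests on assumptions: port 21050 comes from ImpalaTest.py, port 10000 is the HiveServer2 default rather than a value confirmed by the article, and SERVICE_SETTINGS, connect_kwargs and the example host name are hypothetical.

```python
# Settings that differ between the two scripts.
# 21050 comes from ImpalaTest.py; 10000 is the HiveServer2 default (an assumption here).
SERVICE_SETTINGS = {
    "hive": {"port": 10000, "auth_mechanism": "PLAIN"},
    "impala": {"port": 21050},
}


def connect_kwargs(service, host):
    """Build keyword arguments for impala.dbapi.connect for 'hive' or 'impala'."""
    if service not in SERVICE_SETTINGS:
        raise ValueError("unknown service: %r" % (service,))
    kwargs = {"host": host}
    kwargs.update(SERVICE_SETTINGS[service])
    return kwargs


# Hypothetical host name, used only to show the resulting dictionary.
print(connect_kwargs("impala", "impala-daemon.example.internal"))
```

With Impyla installed, the result would be passed on as connect(**connect_kwargs('hive', host)).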

4. Test code

Run the Python scripts from the shell command line.

1. Test the connection to Hive

[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
('database_name', 'STRING', None, None)
('default',)
('test.s1', 'STRING', None, None), ('test.s2', 'STRING', None, None)
('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')
[root@ip-172-31-22-86 ec2-user]#
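
The output shows the schema and the rows as separate tuples. For further processing it is often easier to pair them up. A minimal sketch, using sample values copied from the Hive output above; rows_as_dicts is a hypothetical helper and relies only on the DB-API convention that the column name is the first field of each description entry.

```python
def rows_as_dicts(description, rows):
    """Pair each row with the column names from a DB-API cursor.description."""
    names = [column[0] for column in description]  # the name is the first field of each entry
    return [dict(zip(names, row)) for row in rows]


# Sample values copied from the Hive output above (first two rows only).
description = [("test.s1", "STRING", None, None), ("test.s2", "STRING", None, None)]
rows = [("name1", "age1"), ("name2", "age2")]
for record in rows_as_dicts(description, rows):
    print(record)
```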

2. Test the connection to Impala

[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
('name', 'STRING', None, None), ('comment', 'STRING', None, None)
('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')
('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)
('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')
[root@ip-172-31-22-86 ec2-user]#

5. Common problems

1. Error one

building 'sasl.saslwrapper' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/sasl
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install gcc
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++
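
This error means pip could not find a compiler while building the sasl extension. Whether the compilers are on PATH can be checked before running pip. The sketch below assumes Python 3.3 or later (for shutil.which) and that the C++ compiler installed by gcc-c++ is invoked as g++; missing_tools is a hypothetical helper.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from the given list that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools(["gcc", "g++"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("gcc and g++ are available")
```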

2. Error two

gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
 #include <sasl/sasl.h>
                       ^
compilation terminated.
error: command 'gcc' failed with exit status 1

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
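
The compiler fails here because the header sasl/sasl.h is missing. A quick way to confirm that the development package put the header in place is sketched below. The search directories are an assumption (/usr/include is where header packages commonly install on RHEL-family systems), and find_header is a hypothetical helper.

```python
import os

# Assumed search locations; adjust them if headers live elsewhere on your system.
DEFAULT_INCLUDE_DIRS = ["/usr/include", "/usr/local/include"]


def find_header(relative_path, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first existing path for a header, or None if it is absent."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


print("sasl/sasl.h:", find_header("sasl/sasl.h"))
```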


Original article. Reprints are welcome; please credit the source: reproduced from the Hadoop WeChat official account.
