Why use Hive

2026-02-10 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report

This article explains why you might use Hive. It covers what Hive is, how it is structured, how it compares with a traditional database, and how to install it.

What is Hive?

Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables, provides a simple SQL query capability, and translates SQL statements into MapReduce jobs to run them. It also lets developers who are familiar with MapReduce plug in custom mappers and reducers for complex analysis that the built-in mappers and reducers cannot handle.
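As a minimal sketch of the file-to-table mapping described above, the shell snippet below writes a small tab-separated file and a HiveQL script that declares a table over it. The table name page_views, its columns, and the scratch directory are hypothetical and chosen only for illustration. The Hive step runs only if a hive command is on the PATH.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory; nothing here comes from the original article.
workdir="$(mktemp -d)"

# A structured data file: one record per line, fields separated by tabs.
printf '1\t/index.html\n2\t/about.html\n1\t/index.html\n' > "$workdir/page_views.tsv"

# HiveQL that maps tab-delimited text onto a table, loads the file, and reads it back.
cat > "$workdir/page_views.hql" <<EOF
CREATE TABLE IF NOT EXISTS page_views (user_id INT, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '$workdir/page_views.tsv' OVERWRITE INTO TABLE page_views;
SELECT user_id, url FROM page_views LIMIT 10;
EOF

if command -v hive >/dev/null 2>&1; then
  # Run the whole script file through the Hive CLI.
  hive -f "$workdir/page_views.hql"
else
  echo "hive not found; HiveQL script written to $workdir/page_views.hql"
fi
```

The ROW FORMAT DELIMITED clause is what tells Hive how to split each line of the file into columns. The data itself stays in plain text files.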

Why use Hive

Hive is easy to learn. Simple MapReduce statistics can be implemented quickly with SQL-like statements, with no need to write dedicated MapReduce applications. This makes Hive well suited to statistical analysis in a data warehouse.
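To illustrate the point, here is a hedged sketch of a statistic that would otherwise need a hand-written MapReduce job: counting rows per key. It assumes a hypothetical table page_views(user_id INT, url STRING) already exists. The table and column names are not from the original article.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical table and columns, used only for illustration.
query="SELECT url, COUNT(*) AS hits FROM page_views GROUP BY url"

if command -v hive >/dev/null 2>&1; then
  # Hive compiles the statement into MapReduce work and runs it on the cluster.
  hive -e "$query"
else
  echo "hive not found; would run: $query"
fi
```

Prefixing the statement with EXPLAIN prints the execution plan instead of running the query, which is a convenient way to see the stages Hive generates.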

Hive system architecture

There are three main user interfaces: CLI, JDBC/ODBC, and WebUI.

CLI: the shell command line.

JDBC/ODBC: Hive's Java interface, used in much the same way as JDBC with a traditional database.

WebUI: access to Hive through a browser.
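The first two interfaces can be tried from a shell, as sketched below. The JDBC example uses Beeline, the JDBC-based command-line client shipped with Hive, and assumes a HiveServer2 instance listening on localhost at its default port 10000. The host, database, and helper function name are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Assumed connection details; adjust to the local HiveServer2 instance.
jdbc_url="jdbc:hive2://localhost:10000/default"
query="SHOW TABLES"

# Hypothetical helper: run a command if its executable is installed, otherwise print it.
run_if_available() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "not installed, would run: $*"
  fi
}

# CLI: execute one statement and exit.
run_if_available hive -e "$query"

# JDBC: connect through the Hive JDBC driver and execute the same statement.
run_if_available beeline -u "$jdbc_url" -e "$query"
```

A Java program would use the same jdbc:hive2 URL with the standard java.sql API, which is why the article compares it to JDBC access to a traditional database.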

Metastore (metadata for Hive)

Hive's metadata includes table names, the columns and partitions of each table and their properties, and other table attributes.

By default, an embedded Derby database serves as the metastore. In this mode only a single session can be connected at a time.

Comparison between Hive and traditional databases

                     Hive                                    RDBMS
Query language       HQL                                     SQL
Data storage         HDFS                                    Raw device or local FS
Execution            MapReduce                               Executor
Execution latency    High                                    Low
Data scale           Large                                   Small
Indexes              Bitmap indexes added as of version 0.8  Complex indexes supported

The calling relationship between Hive and Hadoop

Hive installation

1. Download the Hive package

2. Extract the Hive archive

3. Go to $HIVE_HOME/conf/ and create the configuration files from the shipped templates

A) cp hive-env.sh.template hive-env.sh

B) cp hive-default.xml.template hive-site.xml

4. Edit the hive-env.sh file created in step 3 (under $HIVE_HOME/conf) and add the following three lines

A) export JAVA_HOME=/usr/local/jdk1.7.0_45

B) export HIVE_HOME=/usr/local/hive-0.14.0

C) export HADOOP_HOME=/usr/local/hadoop-2.6.0

5. In pseudo-distributed mode you can launch the Hive console directly with the default Derby metastore, but in practice the metastore is usually switched to MySQL

Modify $HIVE_HOME/conf/hive-site.xml

Set the following properties (name: value):

javax.jdo.option.ConnectionURL: jdbc:mysql://192.168.1.100:3306/crxy_job?createDatabaseIfNotExist=true

javax.jdo.option.ConnectionDriverName: com.mysql.jdbc.Driver

javax.jdo.option.ConnectionUserName: root

javax.jdo.option.ConnectionPassword: admin
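In hive-site.xml each of these settings is a property element with a name and a value. The sketch below writes a minimal file containing only the four overrides, using the values from the article. It writes to a scratch directory rather than to $HIVE_HOME/conf, so the result is meant to be reviewed and merged by hand. The article itself edits the full copy of the template instead.

```shell
#!/usr/bin/env bash
set -eu

# Scratch location, so an existing hive-site.xml is never overwritten by this sketch.
out_file="$(mktemp -d)/hive-site.xml"

# Quoted heredoc delimiter: the XML is written exactly as shown, with no shell expansion.
cat > "$out_file" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.100:3306/crxy_job?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>admin</value>
  </property>
</configuration>
EOF

echo "wrote $out_file"
```

The createDatabaseIfNotExist=true parameter in the URL asks the MySQL driver to create the crxy_job database on first connection if it is missing.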

Copy the MySQL JDBC driver jar file into $HIVE_HOME/lib

Start Hive. You can now run SQL statements to create tables.
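Steps 3 and 4 of the installation can be sketched as a small shell function. The function name configure_hive is hypothetical. The paths in the final call are the ones used in the article, and the call is skipped when that directory does not exist.

```shell
#!/usr/bin/env bash
set -eu

# configure_hive HIVE_HOME JAVA_HOME HADOOP_HOME  (hypothetical helper)
# Creates the config files from their templates and records the three paths in hive-env.sh.
configure_hive() {
  local hive_home="$1" java_home="$2" hadoop_home="$3"
  local conf="$hive_home/conf"

  # Step 3: copy the shipped templates to the file names Hive reads.
  cp "$conf/hive-env.sh.template" "$conf/hive-env.sh"
  cp "$conf/hive-default.xml.template" "$conf/hive-site.xml"

  # Step 4: append the three environment variables to hive-env.sh.
  {
    echo "export JAVA_HOME=$java_home"
    echo "export HIVE_HOME=$hive_home"
    echo "export HADOOP_HOME=$hadoop_home"
  } >> "$conf/hive-env.sh"
}

# Paths from the article; adjust them to the local installation.
if [ -d /usr/local/hive-0.14.0/conf ]; then
  configure_hive /usr/local/hive-0.14.0 /usr/local/jdk1.7.0_45 /usr/local/hadoop-2.6.0
fi
```

Once the MySQL driver jar is in the lib directory, running $HIVE_HOME/bin/hive opens the console.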

