Introduction to the concept and system structure of Hive

2025-04-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains the concept and system structure of Hive. The explanation is kept simple so that it is easy to follow.

Hive concepts:

1. Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation, and loading (ETL), and a mechanism for storing, querying, and analyzing large-scale data held in Hadoop. Hive defines a simple SQL-like query language called QL, which lets users who are familiar with SQL query the data. The language also lets developers who are familiar with MapReduce plug in custom mappers and reducers to handle complex analysis that the built-in mappers and reducers cannot do.

2. Hive is a SQL parsing engine that translates SQL statements into MapReduce (M/R) jobs and then executes them in Hadoop.

3. A Hive table is actually a directory/file in HDFS, with one folder per table name. If the table is partitioned, each partition value is a subfolder, and this data can be used directly in an M/R job.
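
The directory layout described in point 3 can be sketched in a few lines of Python. This is only an illustration: the function name `partition_path` is hypothetical, and the warehouse root `/user/hive/warehouse` is a common default that is assumed here rather than taken from this article. The sketch also assumes a table in the default database, with partition subfolders named `column=value`.

```python
import posixpath

# Assumed warehouse root (a common default; the real value is set by
# hive.metastore.warehouse.dir in a given installation).
WAREHOUSE_ROOT = "/user/hive/warehouse"


def partition_path(table, partitions=None, root=WAREHOUSE_ROOT):
    """Build the HDFS directory for a table or one of its partitions.

    `partitions` is an ordered list of (column, value) pairs; each pair
    becomes one `column=value` subfolder under the table folder.
    """
    parts = [root, table]
    for column, value in partitions or []:
        parts.append(f"{column}={value}")
    return posixpath.join(*parts)


# An unpartitioned table is a single folder named after the table.
print(partition_path("logs"))
# A table partitioned by dt and country nests one subfolder per partition column.
print(partition_path("logs", [("dt", "2025-04-29"), ("country", "us")]))
```

Because each partition is just a subfolder, an M/R job can read one partition by pointing at that folder alone.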

The great thing about Hive is that:

1. It is based on MapReduce and supports SQL syntax

2. There are no format requirements for data uploaded to the data warehouse
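
Point 2 is often described as schema-on-read: files are loaded as they are, and a table's column definitions are applied only when rows are read. The Python sketch below illustrates the idea under stated assumptions. `SCHEMA` and `read_row` are hypothetical names, and the field delimiter is assumed to be Ctrl-A (`\x01`), the default separator for Hive text tables.

```python
# Hypothetical table definition: column names and the Python type each is read as.
SCHEMA = [("user_id", int), ("name", str), ("score", float)]

FIELD_DELIMITER = "\x01"  # assumed: default field separator of Hive text files


def read_row(line, schema=SCHEMA, delimiter=FIELD_DELIMITER):
    """Apply the schema to one raw line at read time.

    Fields that are missing or cannot be converted become None instead of
    raising, since the raw file was never validated when it was loaded.
    """
    fields = line.rstrip("\n").split(delimiter)
    row = {}
    for index, (column, cast) in enumerate(schema):
        try:
            row[column] = cast(fields[index])
        except (IndexError, ValueError):
            row[column] = None
    return row


# A well-formed line maps cleanly onto the schema.
print(read_row("42\x01alice\x013.5\n"))
# A malformed line still loads; the bad or missing fields read as None.
print(read_row("oops\x01bob\n"))
```

The design choice this illustrates is that loading is cheap (a file copy), while any mismatch between data and schema only shows up at query time.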

System Architecture of Hive

User interface, including CLI, JDBC/ODBC, and WebUI

Metadata storage, usually in a relational database such as MySQL or Derby

Interpreter, compiler, optimizer, executor

Hadoop: use HDFS for storage and MapReduce for calculation

There are three main user interfaces: CLI, JDBC/ODBC, and WebUI

CLI, the Shell command line

JDBC/ODBC is the programmatic interface of Hive; the JDBC driver is used from Java in much the same way as JDBC with a traditional database

WebUI accesses Hive through a browser

Hive stores metadata in a database (the metastore), typically MySQL or Derby. The metadata in Hive includes the name of each table, its columns and partitions and their attributes, the attributes of the table itself (for example, whether it is an external table), the directory where the table's data is located, and so on.

The interpreter, compiler, and optimizer take an HQL query statement through lexical analysis, syntax analysis, compilation, optimization, and query plan generation. The generated query plan is stored in HDFS and subsequently executed by MapReduce.

Hive's data is stored in HDFS, and most queries are executed by MapReduce (simple queries such as select * from table do not generate MapReduce tasks)
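
The metadata fields listed above (table name, columns, partitions, table attributes such as whether the table is external, and the data directory) can be pictured as a small record. The Python dataclass below is only a mental model: `TableMetadata` and its field names are hypothetical and do not reflect the actual metastore schema.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Illustrative record of what the metastore tracks for one table."""
    name: str
    columns: list            # (column name, type name) pairs
    location: str            # HDFS directory holding the table's data
    partition_columns: list = field(default_factory=list)
    external: bool = False   # True for an external table

    def describe(self):
        """Return a one-line, human-readable summary of the table."""
        kind = "external" if self.external else "managed"
        cols = ", ".join(f"{n} {t}" for n, t in self.columns)
        parts = ", ".join(n for n, _ in self.partition_columns) or "none"
        return f"{self.name} ({kind}): {cols}; partitioned by {parts}; at {self.location}"


# Example values are made up for illustration.
logs = TableMetadata(
    name="logs",
    columns=[("user_id", "int"), ("message", "string")],
    location="/user/hive/warehouse/logs",
    partition_columns=[("dt", "string")],
)
print(logs.describe())
```

Keeping this record in a relational database, separate from the data files in HDFS, is what lets the compiler resolve table and column names without scanning the data itself.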

Thank you for reading. That concludes this introduction to the concept and system structure of Hive. Specific usage still needs to be verified in practice.
