What are the high-frequency interview questions for Hive?


2025-04-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article covers the Hive questions that come up most often in interviews. The answers given here are short and practical. Let's go through them.

I. What is Hive, why do you use it, and how do you understand it?

Interviewers often open with these three questions in a row. Candidates who have not prepared tend to stumble through the answer and make a poor impression. Brother fungus's answer is posted below:

Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides a SQL-like query language (HQL). In essence, Hive converts SQL into MapReduce jobs and runs them.

Personal understanding: Hive stores the mapping between tables and files on HDFS. It is a logical data warehouse; what it actually operates on are the files on HDFS. HQL is an MR program written in SQL syntax.
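
To make the "MR program written in SQL syntax" idea concrete, here is a minimal Python sketch of how a query such as SELECT dept, COUNT(*) FROM emp GROUP BY dept could be broken into map, shuffle, and reduce phases. The table name emp, the column dept, the sample rows, and the function names are all assumptions made for illustration; this is not the plan Hive actually generates.

```python
from collections import defaultdict

# Hypothetical rows of an `emp` table; in Hive these would be lines of a file on HDFS.
ROWS = [
    {"name": "a", "dept": "sales"},
    {"name": "b", "dept": "eng"},
    {"name": "c", "dept": "sales"},
]

def map_phase(rows):
    """Emit one (group key, 1) pair per row, like a mapper for GROUP BY dept."""
    for row in rows:
        yield row["dept"], 1

def shuffle_phase(pairs):
    """Collect values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key, which is what COUNT(*) computes per group."""
    return {key: sum(values) for key, values in grouped.items()}

def group_by_count(rows):
    """Run the three phases in order on an in-memory list of rows."""
    return reduce_phase(shuffle_phase(map_phase(rows)))

print(group_by_count(ROWS))
```

In a real cluster the map and reduce phases run on different nodes and the shuffle moves data across the network; the sketch only shows the logical shape of the computation.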

II. Describe the architecture of Hive

Hive can be accessed through clients such as the CLI, JDBC, and ODBC. Hive also supports access through a web UI (WUI).

Hive's internal execution flow: the parser parses the SQL statement, the compiler compiles it into a MapReduce program, the optimizer optimizes that program, and the executor runs it and writes the results to HDFS.

Hive's metadata is stored in a database such as MySQL, SQL Server, PostgreSQL, Oracle, or Derby. The metadata includes table names, column names, partitions and their properties, table properties (including whether a table is external), the table's data directory, and so on.

Hive converts most HiveSQL statements into MapReduce jobs and submits them to Hadoop for execution. A few HiveSQL statements are not converted into MapReduce jobs; they read data directly from the DataNodes and output it sequentially.
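
The metadata items listed above can be pictured as one small record per table. The following Python sketch models such a record and shows how a partition maps to a directory under the table's data directory, following Hive's key=value naming for partition directories. The class TableMeta, its field names, and the sample path /warehouse/logs are illustrative assumptions, not the metastore's real schema.

```python
from dataclasses import dataclass, field

@dataclass
class TableMeta:
    """Illustrative stand-in for one table's entry in the metastore."""
    name: str
    columns: dict            # column name -> type
    partition_keys: list     # ordered partition column names
    location: str            # table data directory on HDFS
    external: bool = False   # whether this is an external table
    properties: dict = field(default_factory=dict)

    def partition_path(self, **values):
        """Build the directory for one partition, one key=value level per key."""
        parts = [f"{key}={values[key]}" for key in self.partition_keys]
        return "/".join([self.location.rstrip("/")] + parts)

logs = TableMeta(
    name="logs",
    columns={"uid": "string", "url": "string"},
    partition_keys=["dt"],
    location="/warehouse/logs",  # hypothetical path
    external=True,
)
print(logs.partition_path(dt="2025-04-17"))
```

The point of the sketch is that the metadata holds only descriptions and locations; the rows themselves stay in files under the table directory.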

III. Comparison between Hive and database

Hive and a traditional database are not really comparable: apart from a similar query language, they have nothing in common.

Data storage location

Hive stores its data in HDFS; a database stores its data on block devices or in the local file system.

Data update

Rewriting data is not recommended in Hive, whereas data in a database usually needs to be modified frequently.

Execution delay

Hive's execution latency is high, while a database's is low. This holds only when the data scale is small, however: once the data grows beyond the database's processing capacity, Hive's parallel computing shows a clear advantage.
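
The trade-off can be illustrated with a toy cost model: Hive pays a fixed job startup cost but scans data in parallel, while a single-node database starts at once but scans serially. Every number below (startup seconds, scan rates, node count) is a made-up assumption chosen only to show that a crossover exists; none of them is a measurement.

```python
def hive_seconds(data_gb, startup_s=30.0, nodes=20, gb_per_s_per_node=0.1):
    """Fixed startup cost plus a parallel scan spread across `nodes` workers."""
    return startup_s + data_gb / (nodes * gb_per_s_per_node)

def database_seconds(data_gb, gb_per_s=0.5):
    """No startup cost, but a serial scan on a single machine."""
    return data_gb / gb_per_s

# Compare the two models at a small, a medium, and a large data size.
for size in (1, 10, 1000):
    faster = "database" if database_seconds(size) < hive_seconds(size) else "Hive"
    print(f"{size} GB: {faster} is faster in this model")
```

With these assumed parameters the database wins on small inputs because the startup cost dominates, and Hive wins once the scan time outweighs it.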

Data scale

Hive supports computation over large-scale data; a database supports smaller data scales.

IV. Which Hive functions do you know and have used?

There are many possible answers to this.

For example, the common relation function =

