Why do Hive tasks execute slowly while data imports are very fast?


2025-04-21 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article explains why Hive task execution is slow while importing data into Hive is very fast. The explanation is short and easy to follow.

Read-time mode and write-time mode

Hive uses Hadoop to execute queries, so query execution is slow. Importing data into Hive with load data, however, is very fast, because Hive uses read-time mode (schema on read).

Read-time mode (schema on read): the type and format of the data are checked when the data is read.

Write-time mode (schema on write): the type, format, and other constraints of the data are checked when the data is written.

When data is saved to a Hive table, Hive uses read-time mode: it performs no validation on the write and simply places the file in the HDFS directory that corresponds to the table. The counterpart of read-time mode is write-time mode, which RDBMSs generally use. When data is written to a table, each record is checked for validity, and a record that fails the check is rejected with an error.

Because storing data in Hive is only a file copy, importing data is very fast. The data is interpreted against the table schema only when it is read or queried. If a field does not conform to the schema at that point, Hive returns NULL for it.
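The contrast between the two modes can be sketched in a few lines of Python. This is only an illustration of the idea, not how Hive is implemented: the two-column schema, the comma delimiter, and every function name below are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical two-column schema: (column name, converter).
SCHEMA = [("id", int), ("name", str)]


def parse_field(raw, convert):
    """Convert one raw field; return None (standing in for NULL) on a mismatch."""
    try:
        return convert(raw)
    except ValueError:
        return None


def load_schema_on_read(source_file, table_dir):
    """Read-time mode import: a plain file copy, no record is inspected."""
    shutil.copy(source_file, table_dir)


def query_schema_on_read(table_dir):
    """Apply the schema only now, while scanning the raw files."""
    rows = []
    for data_file in sorted(Path(table_dir).iterdir()):
        for line in data_file.read_text().splitlines():
            fields = line.split(",")
            # Missing trailing fields also become None.
            fields += [None] * (len(SCHEMA) - len(fields))
            rows.append(tuple(
                None if raw is None else parse_field(raw, convert)
                for raw, (_, convert) in zip(fields, SCHEMA)
            ))
    return rows


def insert_schema_on_write(table, line):
    """Write-time mode insert: validate the record first, reject it on failure."""
    fields = line.split(",")
    if len(fields) != len(SCHEMA):
        raise ValueError(f"wrong column count: {line!r}")
    table.append(tuple(convert(raw) for raw, (_, convert) in zip(fields, SCHEMA)))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "usr.data"
    source.write_text("1,alice\nabc,bob\n")  # second row has a bad id
    table_dir = Path(tmp) / "usr"
    table_dir.mkdir()
    load_schema_on_read(source, table_dir)   # succeeds despite the bad row
    rows = query_schema_on_read(table_dir)

print(rows)  # [(1, 'alice'), (None, 'bob')]

strict_table = []
insert_schema_on_write(strict_table, "1,alice")
try:
    insert_schema_on_write(strict_table, "abc,bob")
except ValueError as err:
    print("rejected:", err)
```

The import in the first half never looks at the data, which is why it costs no more than a copy. The bad value only shows up at query time, as None. The strict insert pays the validation cost on every record and refuses the bad one up front.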

The benefits of read-time mode

Hive's adoption of read-time mode brings the following benefits:

Adding data to a Hive table is very fast. For external data, the usual approach is to upload the file directly to an HDFS directory with a Hadoop command and have Hive read that directory.

The same data can be parsed under multiple schemas. The data stored for a Hive table is not tied to Hive itself, so other tools such as Pig can also process it.
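The second benefit can be illustrated with a small Python sketch in which the same raw lines are interpreted under two different schemas. The sample data, the schemas, and the helper name are hypothetical, and the parsing is a simplification rather than Hive's actual behavior.

```python
# The raw data is stored once and never rewritten.
RAW_LINES = ["1,alice,30", "2,bob,n/a"]


def read_with_schema(lines, schema):
    """Interpret comma-separated lines under a (name, converter) schema."""
    rows = []
    for line in lines:
        fields = line.split(",")
        row = {}
        for index, (name, convert) in enumerate(schema):
            try:
                row[name] = convert(fields[index])
            except (ValueError, IndexError):
                # A field that is missing or does not fit the type becomes None.
                row[name] = None
        rows.append(row)
    return rows


# Two hypothetical schemas over the same bytes.
typed_schema = [("id", int), ("name", str), ("age", int)]
text_schema = [("id", str), ("name", str), ("age", str)]

typed_rows = read_with_schema(RAW_LINES, typed_schema)
text_rows = read_with_schema(RAW_LINES, text_schema)

print(typed_rows[1])  # {'id': 2, 'name': 'bob', 'age': None}
print(text_rows[1])   # {'id': '2', 'name': 'bob', 'age': 'n/a'}
```

Because the schema is applied only while reading, switching schemas requires no change to the stored data, and any other reader is free to apply its own interpretation.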

Import data

hive> load data local inpath '/root/usr.data' into table usr;

Thank you for reading. That concludes the explanation of why Hive task execution is slow while importing data is very fast. How this applies to a specific setup still needs to be verified in practice.

