Hive common functions: Hive data import and export mode

2025-01-16 Update From: SLTechnology News&Howtos


As a data warehouse, Hive stores large amounts of user data. In everyday use it is inevitable that you will need to import external data into Hive or export data out of Hive. Today, let's look at the main ways of importing and exporting Hive data.

I. Hive data import modes

There are four main ways:

Import data from the local file system into a Hive table

Import data from HDFS into a Hive table

Query data from another table and insert it into a Hive table

Create a table and, at creation time, populate it with records queried from another table.

1. Import data from the local file system into the Hive table

Basic syntax:

LOAD DATA LOCAL INPATH 'local file path' INTO TABLE table_name;

First create a table in Hive, as follows:

hive> create table wyp (id int, name string, age int, tel string)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;
OK
Time taken: 2.832 seconds

This table is very simple: it has only four fields, whose meanings are self-explanatory. There is a file /home/wyp/wyp.txt in the local file system, with the following contents:

[wyp@master ~]$ cat wyp.txt

1 wyp 25 13188888888888

2 test 30 13888888888888

3 zs 34 899314121

The columns in wyp.txt are separated by \t. You can import the data in this file into the wyp table with the following statement:

hive> load data local inpath 'wyp.txt' into table wyp;
Copying data from file:/home/wyp/wyp.txt
Copying file: file:/home/wyp/wyp.txt
Loading data to table default.wyp
Table default.wyp stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 67]
OK
Time taken: 5.967 seconds

In this way, you can import the contents of wyp.txt into the wyp table and view them in the data directory of the wyp table, as shown in the following command:

hive> dfs -ls /user/hive/warehouse/wyp;
Found 1 items
-rw-r--r--   3 wyp supergroup         67 2014-02-19 18:23 /user/hive/warehouse/wyp/wyp.txt

It is important to note that:

Unlike the relational databases we are familiar with, Hive does not support supplying a set of record literals directly in an insert statement; that is, Hive does not support statements of the form INSERT INTO ... VALUES (...). (Support for this syntax was only added later, in Hive 0.14.)
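As an aside: on Hive 0.14 and later the VALUES form does work, and on older versions you can emulate it by selecting literals from an existing table. A minimal sketch using the wyp table from above (the row values here are made up for illustration):

```sql
-- Hive 0.14+ only: literal values in an INSERT
INSERT INTO TABLE wyp VALUES (4, 'ls', 28, '13666666666');

-- Workaround on older versions: select constant literals
-- through any existing non-empty table
INSERT INTO TABLE wyp
SELECT 4, 'ls', 28, '13666666666' FROM wyp LIMIT 1;
```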

2. Import data from HDFS into the Hive table

Basic syntax:

LOAD DATA INPATH 'HDFS file path' INTO TABLE table_name;

When importing data into a Hive table from the local file system, Hive actually first copies the data temporarily to a directory on HDFS (typically the uploading user's HDFS home directory, such as /home/wyp/), and then moves the data from that temporary directory into the data directory of the corresponding Hive table (note: moves, not copies!). Given that, Hive naturally also supports moving data directly from a directory on HDFS into a table's data directory. Suppose the file /home/wyp/add.txt contains the following:

[wyp@master /home/q/hadoop-2.2.0]$ bin/hadoop fs -cat /home/wyp/add.txt

5 wyp1 23 131212121212

6 wyp2 24 134535353535

7 wyp3 25 132453535353

8 wyp4 26 154243434355

The above is the content to be inserted. This file is stored in the /home/wyp directory on HDFS (unlike in method 1, where the file was on the local file system). We can import the contents of this file into the Hive table with the following command:

hive> load data inpath '/home/wyp/add.txt' into table wyp;
Loading data to table default.wyp
Table default.wyp stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 215]
OK
Time taken: 0.47 seconds
hive> select * from wyp;
OK
5       wyp1    23      131212121212
6       wyp2    24      134535353535
7       wyp3    25      132453535353
8       wyp4    26      154243434355
1       wyp     25      13188888888888
2       test    30      13888888888888
3       zs      34      899314121
Time taken: 0.096 seconds, Fetched: 7 row(s)

From the output above, we can see that the data was indeed imported into the wyp table. Note that there is no local keyword in load data inpath '/home/wyp/add.txt' into table wyp;. This is the difference from method 1.

3. Query the corresponding data from other tables and import them into the Hive table

Basic syntax:

INSERT INTO TABLE target_table [PARTITION (partition_field = value)] SELECT fields FROM source_table;

Suppose there is a test table in Hive, and its table creation statement is as follows:

hive> create table test (id int, name string, tel string)
    > partitioned by (age int)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;
OK
Time taken: 0.261 seconds

Overall it is similar to the create statement for the wyp table, except that the test table uses age as a partition field. A brief explanation of partitions:

Partitions: in Hive, each partition of a table corresponds to a subdirectory under the table's directory, and all of a partition's data is stored in that directory. For example, if the wyp table had two partition fields, dt and city, then the directory for the partition dt=20131218, city=BJ would be /user/hive/warehouse/wyp/dt=20131218/city=BJ, and all data belonging to this partition would be stored in that directory.
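You can see this layout for yourself by listing a table's partitions. A small sketch (the age=25 partition value matches the example that follows; the directory path assumes the default warehouse location):

```sql
-- List the partitions of a partitioned table
SHOW PARTITIONS test;
-- e.g. one line per partition, such as: age=25

-- Each partition maps to a subdirectory of the table directory,
-- e.g. /user/hive/warehouse/test/age=25/ , which you can confirm
-- from the Hive shell with:
-- dfs -ls /user/hive/warehouse/test;
```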

The following statement inserts the query results from the wyp table into the test table:

hive> insert into table test
    > partition (age='25')
    > select id, name, tel
    > from wyp;
#####################################################################
A bunch of MapReduce task information is output here, omitted
#####################################################################
Total MapReduce CPU Time Spent: 1 seconds 310 msec
OK
Time taken: 19.125 seconds
hive> select * from test;
OK
5       wyp1    131212121212    25
6       wyp2    134535353535    25
7       wyp3    132453535353    25
8       wyp4    154243434355    25
1       wyp     13188888888888  25
2       test    13888888888888  25
3       zs      899314121       25
Time taken: 0.126 seconds, Fetched: 7 row(s)

One more note: as mentioned earlier, the traditional form insert into table values (field1, field2, ...) is not supported by Hive (prior to version 0.14); INSERT ... SELECT is the idiomatic way to populate a table.

4. When creating a table, query the corresponding records from other tables and insert them into the created table

Basic syntax:

CREATE TABLE new_table_name
AS
SELECT comma-separated fields FROM source_table_name;

In practice, a query may return too many rows to display conveniently on the console. In that case it is very handy to store Hive's query output directly in a new table. This is called CTAS (CREATE TABLE ... AS SELECT), as follows:

hive> create table test4
    > as
    > select id, name, tel
    > from wyp;
hive> select * from test4;
OK
5       wyp1    131212121212
6       wyp2    134535353535
7       wyp3    132453535353
8       wyp4    154243434355
1       wyp     13188888888888
2       test    13888888888888
3       zs      899314121
Time taken: 0.089 seconds, Fetched: 7 row(s)

The data is inserted into the test4 table. The CTAS operation is atomic, so if the SELECT query fails for some reason, the new table will not be created at all.

II. Hive data export modes

These methods can be divided into three types depending on where they are exported:

Export to local file system

Export to HDFS

Export to another table in Hive.

1. Export to the local file system

Basic syntax:

INSERT OVERWRITE LOCAL DIRECTORY 'local file path'
SELECT fields FROM table_name;

Export the data from the above table to the local file system:

hive> insert overwrite local directory '/home/wyp/wyp'
    > select * from wyp;

Executing this HQL requires a MapReduce job to complete. After the statement runs, one or more files will be generated under the /home/wyp/wyp directory of the local file system; they are the output of the Reduce stage.
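One caveat worth knowing, not covered above: by default the exported files separate columns with Hive's internal delimiter \001 (Ctrl-A). From Hive 0.11 onward you can specify the row format in the export statement itself; a sketch using the wyp table:

```sql
-- Hive 0.11+ : export with an explicit tab delimiter instead of \001
INSERT OVERWRITE LOCAL DIRECTORY '/home/wyp/wyp'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM wyp;
```

On older Hive versions, a common alternative is to post-process the \001-delimited output with sed or awk.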

2. Export to HDFS

Basic syntax:

INSERT OVERWRITE DIRECTORY 'HDFS file path'
SELECT * FROM table_name;

Export the above table data to HDFS:

hive> insert overwrite directory '/home/wyp/hdfs'
    > select * from wyp;

The exported data will be saved under the /home/wyp/hdfs directory on HDFS. Note that compared with the HQL that exports to the local file system, this statement has one less keyword, local, and the data ends up at a path on HDFS rather than on the local file system.

3. Export to another table in Hive

This is the same as import method 3 above: query the data from one table and insert it into another Hive table.
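Concretely, using the tables from this article, exporting the wyp table's data into the partitioned test table is just an INSERT ... SELECT; a minimal sketch, with the OVERWRITE variant shown as well:

```sql
-- Append into a partition of another Hive table
INSERT INTO TABLE test PARTITION (age='25')
SELECT id, name, tel FROM wyp;

-- Or replace that partition's existing contents
INSERT OVERWRITE TABLE test PARTITION (age='25')
SELECT id, name, tel FROM wyp;
```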
