How to import and export Hive data to MySQL

2025-01-18 Update From: SLTechnology News&Howtos


This article covers how to import data into and export data out of Hive. It is meant as a practical reference; follow along with the examples below.

Hive's role: an ETL (data warehouse) tool

A tool that extracts, transforms, and loads data from a source to a destination, similar to Kettle.

DML

Bulk insert / bulk Import

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2...)]

Note: filepath can be an HDFS path or an S3 path, e.g. hdfs://namenode:9000/user/hive/project/data1

1. Import from a local file to a table

LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE test

2. Import from HDFS into a table

LOAD DATA INPATH '/home/test/add.txt' INTO TABLE test

3. Import the result of a query into a table

INSERT INTO TABLE test SELECT id, name, tel FROM test

4. Import query results into multiple tables (multi-insert)

FROM source_table src
INSERT INTO TABLE dest1_table SELECT src.* WHERE src.id < 100
INSERT INTO TABLE dest2_table SELECT src.* WHERE src.id < 100
INSERT INTO TABLE dest3_table SELECT src.* WHERE src.id < 100

5. Import when creating a table (CTAS)

CREATE TABLE test4 AS SELECT id, name, tel FROM test

Export data with a specified delimiter

INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/export_hive'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
SELECT * FROM test

Delete / empty

1. Delete the rows in table1 that do not satisfy a condition (by overwriting the table with the rows that do)

INSERT OVERWRITE TABLE table1
SELECT * FROM table1 WHERE XXXX

2. Empty a table

INSERT OVERWRITE TABLE t_table1
SELECT * FROM t_table1 WHERE 1 = 0

3. Truncate a table (note: external tables cannot be truncated)

TRUNCATE TABLE table_name

4. Delete the table's underlying HDFS data to empty the table (the table structure remains)

hdfs dfs -rm -r /user/hive/warehouse/test

Note: methods 1 and 2 both clean out data by overwriting the table.

Delete and update

Hive does not support transactions by default, so DELETE and UPDATE are not supported by default either. If you need them, transactional support must be configured and enabled in hive-site.xml.
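As a sketch of what that configuration can look like (property names are from Hive's ACID/transactions support; exact values depend on your deployment, and transactional tables additionally need to be bucketed, stored as ORC, and created with TBLPROPERTIES ('transactional'='true')):

```xml
<!-- hive-site.xml: minimal sketch of settings that enable ACID transactions -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<!-- background compaction of the delta files produced by updates/deletes -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```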

DDL

Databases / tables / indexes / views / partitions / buckets

Database

List / create / modify / delete / view information

1. List all databases

SHOW DATABASES

2. Create a database

CREATE DATABASE test

3. Delete a database

DROP DATABASE test

For safety, dropping a database that still contains data raises an error. In that case, add the CASCADE keyword to ignore the error and delete it anyway.

DROP DATABASE IF EXISTS test CASCADE

4. View database information

DESCRIBE DATABASE test

Table

List / create / modify / delete / view information

1. List all tables

All tables in the current database:

SHOW TABLES

All tables in a specified database:

SHOW TABLES IN db_name

Regular expressions are supported:

SHOW TABLES '.*s'

2. Create a table

CREATE TABLE test (
id INT,
a STRING
)
ROW FORMAT DELIMITED -- delimited row format
FIELDS TERMINATED BY ',' -- field delimiter
LINES TERMINATED BY '\n' -- line delimiter
STORED AS TEXTFILE; -- stored as plain text

Create a table whose rows are split into fields by a regular expression

ADD JAR ../build/contrib/hive_contrib.jar;

CREATE TABLE apachelog (
host STRING,
identity STRING,
user STRING,
time STRING,
request STRING,
status STRING,
size STRING,
referer STRING,
agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;

3. Modify

Add a new column

ALTER TABLE test ADD COLUMNS (new_col2 INT COMMENT 'a comment')

Change the name of the table

ALTER TABLE old_name RENAME TO new_name

4. Delete

DROP TABLE test

5. View information

Show column information:

DESC test

Show detailed table information (storage, partitions, etc.):

DESC FORMATTED test

Indexes (note: index support was removed in Hive 3.0)

Create an index

CREATE INDEX index_name
ON TABLE base_table_name (col_name,...)
AS 'index.handler.class.name'

For example: CREATE INDEX index1_index_test ON TABLE index_test (id) AS 'COMPACT' WITH DEFERRED REBUILD

Rebuild the index

ALTER INDEX index_name ON table_name [PARTITION (...)] REBUILD

For example: ALTER INDEX index1_index_test ON index_test REBUILD

Delete index

DROP INDEX index_name ON table_name

List indexes

SHOW INDEX ON index_test

View

CREATE VIEW [IF NOT EXISTS] view_name [(column_name [COMMENT column_comment],...)] [COMMENT view_comment] [TBLPROPERTIES (property_name = property_value,...)] AS SELECT

Note: Hive only supports logical views, not materialized views.

- If no column names are provided, the view's column names are generated automatically from the defining SELECT expression
- Changes to the underlying base table are not reflected in the view; queries against the view that become invalid will fail
- Views are read-only; LOAD / INSERT / ALTER cannot be used on them
- Delete a view: DROP VIEW view_name
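Hive's logical-view semantics (the view stores only its defining query, never data) behave much like views in other SQL engines; here is a minimal runnable sketch using Python's built-in sqlite3 as a stand-in for Hive (the view name is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, a TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [(1, "x"), (150, "y")])

# A logical view stores only its defining query, not the data itself.
conn.execute("CREATE VIEW test_small AS SELECT id, a FROM test WHERE id < 100")
rows = conn.execute("SELECT * FROM test_small").fetchall()
print(rows)  # [(1, 'x')]

# Dropping the view leaves the base table intact.
conn.execute("DROP VIEW test_small")
remaining = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
print(remaining)  # 2
```

The same read-only behavior holds in Hive: you query the view like a table, but LOAD/INSERT against it fail.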

Partitions (important)

List / create / modify / delete

1. List all partitions of a table

SHOW PARTITIONS test

2. Create a partitioned table

CREATE TABLE test (
id INT,
a STRING
)
PARTITIONED BY (b STRING, c INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE

3. Add a partition to an existing table

ALTER TABLE test ADD IF NOT EXISTS
PARTITION (year = 2017) LOCATION '/hiveuser/hive/warehouse/data_zh.db/data_zh/2017.txt'

4. Delete partition

ALTER TABLE test DROP IF EXISTS PARTITION (year = 2017)

5. Load data into a partitioned table

LOAD DATA INPATH '/data/2017.txt' INTO TABLE test PARTITION (year = 2017)

6. Import data from an unpartitioned table into a partitioned table

INSERT OVERWRITE TABLE part_table PARTITION (YEAR, MONTH) SELECT * FROM no_part_table

7. Dynamic partitioning settings

SET hive.exec.dynamic.partition=true
SET hive.exec.dynamic.partition.mode=nonstrict
# SET hive.enforce.bucketing=true

Once dynamic partitioning is enabled, the partition values can be taken from the query itself instead of being specified explicitly. Note that LOAD DATA always requires a full partition specification; dynamic partitions apply to INSERT ... SELECT statements, as in the example from step 6:

INSERT OVERWRITE TABLE part_table PARTITION (YEAR, MONTH) SELECT * FROM no_part_table

Buckets

CREATE TABLE bucketed_user (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS

Within each table or partition, Hive can further organize data into buckets, which are a finer-grained division of the data. Hive buckets on a column: it hashes the column value and takes the remainder modulo the number of buckets to decide which bucket a record is stored in.
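The bucket-assignment rule just described can be sketched in a few lines of Python (a simplification: Hive's actual hash function depends on the column type, but for an INT column the hash of a value is the value itself):

```python
def bucket_for(value: int, num_buckets: int) -> int:
    """Which bucket a record lands in: hash(column value) mod num_buckets.

    For Hive INT columns the hash of the value is the value itself; other
    column types use type-specific hash functions.
    """
    return value % num_buckets

# With the 4-bucket bucketed_user table above, ids cluster like this:
assignments = {uid: bucket_for(uid, 4) for uid in (0, 1, 5, 10)}
print(assignments)  # {0: 0, 1: 1, 5: 1, 10: 2}
```

Because rows with equal join-key values always land in the same bucket number, two tables bucketed the same way on the join column can be joined bucket-by-bucket, which is what enables the map-side joins described next.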

There are two reasons to organize tables (or partitions) into buckets:

(1) Higher query-processing efficiency. Buckets add extra structure to the table that Hive can exploit for certain queries. In particular, a join of two tables that are both bucketed on the join column can be implemented efficiently as a map-side join: only buckets holding the same column values need to be joined against each other, which greatly reduces the amount of data involved in the JOIN.

(2) More efficient sampling. When working with large data sets, it is very convenient during query development and debugging to be able to run the query against a small sample of the data.

Thank you for reading! That concludes "how to import and export Hive data". We hope the content above was helpful and taught you something new; if you found the article useful, please share it so more people can see it.
