
Hadoop Learning, Chapter 8: Sqoop installation and configuration


1. Introduction to Sqoop

Sqoop is a tool for transferring data between Hadoop (HDFS, Hive, HBase) and relational databases. Data from a relational database (such as MySQL, Oracle, or PostgreSQL) can be imported into HDFS, and HDFS data can be exported back into a relational database.

Sqoop is now a top-level Apache project; the current releases are Sqoop 1.4.4 and Sqoop2 1.99.3. This article uses the Sqoop 1.4.x line as an example and covers basic installation and configuration along with a few simple usage demonstrations.

The version is:

sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
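A minimal installation sketch, assuming the tarball is unpacked and moved to /usr/local/sqoop (the target directory is an assumption, chosen to match the /usr/local/sqoop path used later in this article):

# unpack the Sqoop release and move it to the install directory (path is an assumption)
tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/local/sqoop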

Environment variable configuration
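A sketch of the shell environment setup, assuming the install path above (add these lines to ~/.bashrc or /etc/profile and re-source the file):

# point SQOOP_HOME at the install directory and put its bin/ on the PATH (path is an assumption)
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin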

2. Sqoop parameter configuration

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
#export HIVE_HOME=
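The template above ships as conf/sqoop-env-template.sh; it is normally copied to conf/sqoop-env.sh and edited. A sketch of a filled-in sqoop-env.sh, assuming Hadoop, HBase and Hive are installed under /usr/local (all paths here are assumptions, not from the original article):

# conf/sqoop-env.sh -- uncommented entries pointing at the local installations (paths are assumptions)
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
export HBASE_HOME=/usr/local/hbase
export HIVE_HOME=/usr/local/hive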

3. Driver jar package

The tests below use MySQL as an example, so the MySQL JDBC driver jar must be copied into Sqoop's lib/ directory.
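For example (the connector jar name and version are assumptions; use whichever mysql-connector-java jar matches your MySQL server):

# copy the MySQL JDBC driver into Sqoop's lib directory (jar version is an assumption)
cp mysql-connector-java-5.1.32-bin.jar /usr/local/sqoop/lib/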

4. Test data in MySQL

CREATE TABLE `demo_blog` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `blog` varchar(100) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `demo_log` (
  `operator` varchar(16) NOT NULL,
  `log` varchar(100) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Insert test data:

insert into demo_blog (id, blog) values (1, "micmiu.com");
insert into demo_blog (id, blog) values (2, "ctosun.com");
insert into demo_blog (id, blog) values (3, "baby.micmiu.com");
insert into demo_log (operator, log) values ("micmiu", "create");
insert into demo_log (operator, log) values ("micmiu", "update");
insert into demo_log (operator, log) values ("michael", "edit");
insert into demo_log (operator, log) values ("michael", "delete");

II. Sqoop command operations

1. Basic Sqoop commands

(1) List the databases in MySQL

sqoop list-databases --connect jdbc:mysql://Master-Hadoop:3306 --username root --password rootroot

(2) List all the tables in the test database

sqoop list-tables --connect jdbc:mysql://Master-Hadoop:3306/test --username root --password rootroot

(3) Import from MySQL into HDFS

sqoop import --connect jdbc:mysql://Master-Hadoop:3306/test --username root --password rootroot --table demo_log --split-by operator --target-dir /usr/sqoop/other

If the import reports an error such as "Class demo_log not found", try the following fixes:

1. Copy the compiled files from the /tmp/sqoop-root/compile/ working directory into /usr/local/sqoop/lib/.

2. Comment out the following checks in the bin/configure-sqoop file:

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HBASE_HOME}" ]; then
#  echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
#  echo 'Please set $HBASE_HOME to the root of your HBase installation.'
#fi

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi

(4) Export from HDFS to MySQL

sqoop export --connect jdbc:mysql://Master-Hadoop:3306/test --username root --password rootroot --table dutao --export-dir /usr/other/part-2 --input-fields-terminated-by '\t' -m 1

(5) Import from MySQL into Hive

sqoop import --connect jdbc:mysql://Master-Hadoop:3306/test --username root --password rootroot --table demo_log --warehouse-dir /user/sqoop/demo_log --hive-import --create-hive-table -m 1

Import directly into a Hive table:

sqoop import --connect jdbc:postgresql://ip/db_name --username user_name --table table_name --hive-import -m 5

Internally this runs in three steps: 1. import the data into HDFS (the corresponding directory can be found on HDFS); 2. create a Hive table with the same name; 3. load the data from HDFS into the Hive table.
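As a quick sanity check, the intermediate directory on HDFS can be listed. This is a sketch only, assuming Sqoop's default behaviour of writing the intermediate files under the current user's HDFS home directory when no --target-dir or --warehouse-dir is given:

# list the intermediate import directory on HDFS (path layout is an assumption)
hadoop fs -ls /user/$(whoami)/table_name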

Create the Hive table from the PostgreSQL table definition:

sqoop create-hive-table --connect jdbc:postgresql://ip/db_name --username user_name --table table_name --hive-table hive_table_name (add --hive-partition-key partition_name if a partition is required)

Import into a Hive table that has already been created:

sqoop import --connect jdbc:postgresql://ip/db_name --username user_name --table table_name --hive-import -m 5 --hive-table hive_table_name (add --hive-partition-key partition_name --hive-partition-value partition_value if needed)

Import into a Hive table using a query:

sqoop import --connect jdbc:postgresql://ip/db_name --username user_name --query "select * from retail_tb_order where \$CONDITIONS" --hive-import -m 5 --hive-table hive_table_name (add --hive-partition-key partition_name --hive-partition-value partition_value if needed)

Note: the $CONDITIONS placeholder is mandatory in the query. If the query is wrapped in double quotes, $CONDITIONS must be escaped as \$CONDITIONS; if single quotes are used, no escaping is needed.
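For illustration, a sketch of the same --query argument in both quoting styles (table name taken from the example above):

# double quotes: the shell would expand $CONDITIONS, so it must be escaped
--query "select * from retail_tb_order where \$CONDITIONS"
# single quotes: the shell passes the text through literally, no escaping needed
--query 'select * from retail_tb_order where $CONDITIONS'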

(6) Export table data from Hive into a MySQL table

sqoop export --connect jdbc:mysql://Master-Hadoop:3306/test --username root --password rootroot --table demo_log --export-dir /user/hive/warehouse/demo_log/000000_1 --input-fields-terminated-by '\0001'

The demo_log table must already exist in MySQL before the export can run.

1. Import data from a relational database into files under a Hive table directory, using a query statement:

sqoop import --append --connect jdbc:mysql://192.168.20.118:3306/test --username dyh --password 000000 --query "select id,age,name from userinfos where \$CONDITIONS" -m 1 --target-dir /user/hive/warehouse/userinfos2 --fields-terminated-by ","

2. Import data from a relational database into a Hive table directory using the --columns and --where options:

sqoop import --append --connect jdbc:mysql://192.168.20.118:3306/test --username dyh --password 000000 --table userinfos --columns "id,age,name" --where "id > 3 and (age = 88 or age = 80)" -m 1 --target-dir /user/hive/warehouse/userinfos2 --fields-terminated-by ","

Note: --target-dir /user/hive/warehouse/userinfos2 can be replaced with --hive-import --hive-table userinfos2.
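For illustration, a sketch of the command from item 2 rewritten to load straight into Hive. The --append flag from the example above is omitted here, since Sqoop 1 does not accept --append together with --hive-import:

# same filtered import, but handed to Hive via --hive-import (sketch, not from the original article)
sqoop import --connect jdbc:mysql://192.168.20.118:3306/test --username dyh --password 000000 --table userinfos --columns "id,age,name" --where "id > 3 and (age = 88 or age = 80)" -m 1 --hive-import --hive-table userinfos2 --fields-terminated-by ","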

(7) Moving data between HDFS and Hive

1. Export from Hive to the local file system

hive> insert overwrite local directory '/home/wyp/wyp'
    > select * from wyp;

2. Load from the local file system into Hive

First create a table in Hive

hive> create table wyp
    > (id int, name string,
    > age int, tel string)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;
OK
Time taken: 2.832 seconds

There is a /home/wyp/wyp.txt file in the local file system:

[wyp@master]$ cat wyp.txt
1    wyp     25    13188888888888
2    test    30    13888888888888
3    zs      34    899314121

The columns in the wyp.txt file are separated by \t. The data in this file can be loaded into the wyp table with the following statement:

hive> load data local inpath 'wyp.txt' into table wyp;
Copying data from file:/home/wyp/wyp.txt
Copying file: file:/home/wyp/wyp.txt
Loading data to table default.wyp
Table default.wyp stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 67]
OK
Time taken: 5.967 seconds

3. Export from Hive to HDFS

hive> insert overwrite directory '/home/wyp/hdfs'
    > select * from wyp;
