I. Overview:
Sqoop is an open-source tool mainly used to transfer data between Hadoop (Hive, HDFS) and traditional relational databases (MySQL, Oracle, etc.). It can import data from a relational database (such as MySQL or Oracle) into HDFS, and also export data from HDFS back into a relational database. In essence, it translates import and export commands into MapReduce programs.
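For example, before running any imports you can sanity-check the JDBC connection with Sqoop's list-databases subcommand; this is only a sketch, and the host and username below simply reuse the example values used later in this article:
# Connectivity check; -P prompts for the password instead of putting it on the command line.
sqoop list-databases \
--connect jdbc:mysql://192.168.1.10:3306 \
--username root -P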
II. Installation and configuration:
1. Modify the configuration file sqoop-env.sh:
# Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/app/hadoop-2.4.1
# Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/app/hadoop-2.4.1
# Set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/app/hbase-0.96.2-hadoop2
# Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/app/hive-0.12.0-bin
# Set the path for where zookeeper config dir is
export ZOOCFGDIR=/home/hadoop/app/zookeeper-3.4.5/conf
2. Add sqoop to the environment variable (see the sketch below).
3. Copy the database connection driver to $SQOOP_HOME/lib.
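A minimal sketch of steps 2 and 3, assuming Sqoop was unpacked under /home/hadoop/app like the other components; the exact directory and the connector jar name/version are only examples:
# Step 2: add sqoop to the environment, e.g. append these lines to ~/.bashrc and run: source ~/.bashrc
export SQOOP_HOME=/home/hadoop/app/sqoop      # assumed install directory
export PATH=$PATH:$SQOOP_HOME/bin

# Step 3: copy the JDBC driver into Sqoop's lib directory (jar name and version are examples)
cp mysql-connector-java-5.1.32.jar $SQOOP_HOME/lib/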
III. Import and export:
1. Import the data from the database into HDFS:
(1) Specify the fields to import:
sqoop import
--connect jdbc:mysql://192.168.1.10:3306/itcast
--username root --password
--table trade_detail
--columns 'id,account,income,expenses'
(2) Specify the output path and the data separator:
sqoop import
--connect jdbc:mysql://192.168.1.10:3306/itcast
--username root --password
## table to import data from
--table trade_detail
## HDFS directory the data is imported into
--target-dir '/sqoop/td'
## separator between the imported data fields
--fields-terminated-by '\t'
(3) Specify the number of map tasks (-m):
sqoop import
--connect jdbc:mysql://192.168.1.10:3306/itcast
--username root --password
--table trade_detail
--target-dir '/sqoop/td1'
--fields-terminated-by '\t'
## number of map tasks used for the import
-m 2
(4) Add a where condition (note: the condition must be enclosed in quotation marks):
sqoop import
--connect jdbc:mysql://192.168.1.10:3306/itcast
--username root --password
--table trade_detail
--where 'id > 3'
--target-dir '/sqoop/td2'
(5) Add a query statement (use \ to wrap the statement across lines):
sqoop import
--connect jdbc:mysql://192.168.1.10:3306/itcast
--username root --password
--query 'SELECT * FROM trade_detail WHERE id > 2 AND $CONDITIONS'
--split-by trade_detail.id
--target-dir '/sqoop/td3'
Note: when using --query, pay attention to the WHERE clause: AND $CONDITIONS must be appended to it.
There is also a difference between single and double quotation marks: if the --query string is wrapped in double quotes, $CONDITIONS must be escaped as \$CONDITIONS.
If the number of map tasks is set to -m 1, --split-by ${tablename.column} is not needed; otherwise it must be added, as in the sketch below.
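For example, a minimal sketch of the double-quote variant combined with more than one map task; the target directory /sqoop/td3_dq is only an example path, and -P prompts for the password:
sqoop import \
--connect jdbc:mysql://192.168.1.10:3306/itcast \
--username root -P \
--query "SELECT * FROM trade_detail WHERE id > 2 AND \$CONDITIONS" \
--split-by trade_detail.id \
--target-dir '/sqoop/td3_dq' \
-m 2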
2. Export file data from HDFS into a database table:
sqoop export
--connect jdbc:mysql://192.168.8.120:3306/itcast
--username root --password
## HDFS directory containing the data to be exported
--export-dir '/td3'
## target relational table to export into
--table td_bak
-m 1
## field delimiter of the files being exported
--input-fields-terminated-by '\t'
3. Use Python to import and export tables:
import os

from_table = "rpt_daily"
to_table = "rpt_daily"

# Import the MySQL table into HDFS with a single mapper, deleting the target directory first
sqoop1 = "sqoop import --connect jdbc:mysql://172.30.200.219/bi_warehouse --username root " \
         "--password artisan --table " + from_table + " --fields-terminated-by '\\001' " \
         "--target-dir /db/as_main/modifier/lzf/" + from_table + " --delete-target-dir --num-mappers 1"
os.system(sqoop1)

# Export the generated HDFS file back into a MySQL table
sqoop2 = "sqoop export --connect \"jdbc:mysql://192.168.1.4/bi_warehouse?useUnicode=true&characterEncoding=utf-8\" " \
         "--username root --password root --table " + to_table + " " \
         "--export-dir /db/as_main/modifier/lzf/" + to_table + "/part-m-00000 --input-fields-terminated-by '\\001'"
os.system(sqoop2)
There are two ways to execute the script:
Method 1. On the Linux command line, run python to start an interactive interpreter, paste the code above, and press Enter.
Method 2. On the Linux command line, run python test2.py, i.e. have python execute the file (where test2.py is the file the above code was saved to), as sketched below.
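A minimal sketch of method 2 (test2.py is the file name from method 2; save it wherever is convenient):
# Save the script above as test2.py, then run it from the shell:
python test2.py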