
Sqoop Notes


[toc]

Sqoop Overview

Sqoop is a data transfer tool that can migrate external data into HDFS directories, Hive tables, or HBase tables.

Import principle: Sqoop obtains metadata (schema, table, fields, field types) from the traditional database and turns the import into a map-only MapReduce job. The job runs several map tasks, each map reads one slice of the data, so the data is copied in parallel.

Export principle: Sqoop obtains the schema and metadata of the table being exported and matches the fields; Hadoop then runs several map-only jobs at the same time to export the data in HDFS to the relational database.

Sqoop Installation

Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/sqoop/1.4.7/

Decompress after downloading:

    tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /home/uplooking/app

Rename:

    [uplooking@uplooking01 ~]$ mv app/sqoop-1.4.7.bin__hadoop-2.6.0/ app/sqoop

Configure SQOOP_HOME in the environment variables:

    export SQOOP_HOME=/home/uplooking/app/sqoop
    export PATH=$PATH:$SQOOP_HOME/bin

Configure $SQOOP_HOME/conf/sqoop-env.sh:

    export HADOOP_COMMON_HOME=/home/uplooking/app/hadoop
    export HADOOP_MAPRED_HOME=/home/uplooking/app/hadoop
    export HBASE_HOME=/home/uplooking/app/hbase
    export HIVE_HOME=/home/uplooking/app/hive
    # The following needs to be configured in 1.4.7, otherwise an error is reported when importing data into Hive
    export ZOOCFGDIR=/home/uplooking/app/zookeeper/conf

Notes:

1. Database driver: when running Sqoop, copy the relevant database driver jar into the $SQOOP_HOME/lib directory. For example, MySQL requires mysql-connector-java-5.1.32-bin.jar or later.

2. JDK version: JDK 1.7 or later is preferred.

3. Copy the Hive core jar (this is needed in 1.4.7, otherwise an error is reported when importing data into Hive): copy $HIVE_HOME/lib/hive-exec.jar into the $SQOOP_HOME/lib directory, otherwise Sqoop reports:

    18/03/15 15:50:54 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
    18/03/15 15:50:54 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
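Before running any imports it is worth confirming the installation and the database connectivity. A minimal sketch, not part of the original notes, assuming Sqoop's bin directory is on the PATH, the JDBC driver jar is already in $SQOOP_HOME/lib, and the MySQL instance at 192.168.43.116 used in the examples below is reachable with the root/root credentials:

```bash
# Print the installed Sqoop version (verifies SQOOP_HOME and PATH are set correctly).
sqoop version

# List the databases on the MySQL server used in the examples below; this only
# succeeds if the mysql-connector jar has been copied into $SQOOP_HOME/lib.
sqoop list-databases \
  --connect jdbc:mysql://192.168.43.116:3306 \
  --username root \
  --password root
```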

Data Import

Importing data from MySQL to HDFS

Data in the people table:

    +----+-------------+------+--------+
    | id | name        | age  | height |
    +----+-------------+------+--------+
    |  1 | Sweets      |   18 |    168 |
    |  2 | Xiao Dandan |   19 |    167 |
    |  3 | God         |   25 |    181 |
    |  4 | Colonel     |   38 |    158 |
    |  5 | Reporter    |   22 |    169 |
    +----+-------------+------+--------+

Import the data:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people

This imports the table people from the MySQL database test into the HDFS directory /user/<username>/people, where people is the name of the imported table.

This is the default directory into which Sqoop imports. If you want to import into a specific directory, add the --target-dir option:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people --target-dir /output/sqoop/people
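Not part of the original notes, but a quick way to inspect what the import wrote; a minimal sketch, assuming the HDFS client is on the PATH and the --target-dir above was used:

```bash
# Each map task writes one part-m-* file under the target directory.
hdfs dfs -ls /output/sqoop/people

# The rows are plain text, comma-delimited by default.
hdfs dfs -cat /output/sqoop/people/part-m-00000 | head
```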

By default Sqoop runs 4 map tasks. To match the needs of the business, you can change this by adding the -m option to the command:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people --target-dir /output/sqoop/people -m 2
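A related point not spelled out in the notes above: with more than one map task, Sqoop needs a column to split the input on. It uses the table's primary key by default; for a table without one you would have to name the column yourself with --split-by. A minimal sketch (the choice of id here is only illustrative):

```bash
sqoop import \
  --connect jdbc:mysql://192.168.43.116:3306/test \
  --username root --password root \
  --table people \
  --target-dir /output/sqoop/people \
  --split-by id \
  -m 2
```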

If the output directory already exists, the job fails with an error; use the --delete-target-dir option to import into that directory anyway:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people --target-dir /output/sqoop/people -m 2 --delete-target-dir

If you want to append new data to what is already there, just add the --append option, but note that --append and --delete-target-dir cannot be used at the same time:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people --target-dir /output/sqoop/people -m 2 --append

Conditional import:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --table people --target-dir hdfs://ns1/input/sqoop/people --append -m 1 --where "age < 20 and height > 167"

Import through a SQL query:

    #!/bin/bash
    SQOOP_HOME=/home/uplooking/app/sqoop
    sqoop import \
      --connect jdbc:mysql://192.168.43.116:3306/test \
      --username root \
      --password root \
      --target-dir hdfs://ns1/input/sqoop/people \
      --query "select id, name, age, height from people where age < 30 and height > ... and \$CONDITIONS" \
      --append -m 1

Importing from MySQL to Hive:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username 'root' --password 'root' --table people --hive-import -m 1

Overwrite the data (only the data is overwritten, not the table structure):

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username 'root' --password 'root' --table people --hive-import -m 1 --hive-overwrite

Specify the name of the Hive table to create:

    sqoop import --connect jdbc:mysql://192.168.43.116:3306/test --username 'root' --password 'root' --table people --hive-import -m 1 --hive-table "hpeople" --hive-overwrite

Import all tables into Hive:

    sqoop import-all-tables --connect jdbc:mysql://192.168.43.116:3306/test --username root --password root --hive-import --fields-terminated-by "\001" --lines-terminated-by "\n"

The row and column separators are specified here because, by default, data imported from MySQL into Hive is delimited by commas, and data exported from Hive (HDFS, to be exact) to MySQL is also parsed as comma-delimited, which needs particular attention.

Importing from MySQL to HBase:

    sqoop import \
      --connect jdbc:mysql://192.168.43.116:3306/test \
      --username 'root' \
      --password 'root' \
      --table people \
      --hbase-create-table \
      --hbase-row-key id \
      --hbase-table hpeople \
      --column-family cf

If no row key is specified, the primary key is used as the key by default; composite primary keys cannot be handled for now, so it is best to create the relevant table structure in HBase first.
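Following the note above about creating the table structure in HBase first, a minimal sketch of doing so from the command line, assuming the hbase shell is available on this node (the table and column-family names match the hpeople/cf values used in the import command):

```bash
# Create the target table with the cf column family before running the Sqoop import.
echo "create 'hpeople', 'cf'" | hbase shell
```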

View the data of the table in HBase:

    hbase(main):002:0> scan 'hpeople'
    ROW    COLUMN+CELL
     1     column=cf:age, timestamp=1521846328316, value=18
     1     column=cf:height, timestamp=1521846328316, value=168.0
     1     column=cf:name, timestamp=1521846328316, value=\xE5\xB0\x8F\xE7\x94\x9C\xE7\x94\x9C
     2     column=cf:age, timestamp=1521846328096, value=19
     2     column=cf:height, timestamp=1521846328096, value=167.0
     2     column=cf:name, timestamp=1521846328096, value=\xE5\xB0\x8F\xE4\xB8\xB9\xE4\xB8\xB9
     3     column=cf:age, timestamp=1521846329182, value=25
     3     column=cf:height, timestamp=1521846329182, value=181.0
     3     column=cf:name, timestamp=1521846329182, value=\xE5\xA4\xA7\xE7\xA5\x9E
     4     column=cf:age, timestamp=1521846328454, value=38
     4     column=cf:height, timestamp=1521846328454, value=158.0
     4     column=cf:name, timestamp=1521846328454, value=\xE5\x9B\xA2\xE9\x95\xBF
     5     column=cf:age, timestamp=1521846330135, value=22
     5     column=cf:height, timestamp=1521846330135, value=169.0
     5     column=cf:name, timestamp=1521846330135, value=\xE8\xAE\xB0\xE8\x80\x85
    5 row(s) in 0.1620 seconds

Data Export

Data is exported to MySQL with a comma as the default delimiter.

When exporting from HDFS to MySQL, the fields need to correspond:

    sqoop export \
      --connect jdbc:mysql://192.168.43.116:3306/test \
      --username root --password root \
      --table people \
      --export-dir hdfs://ns1/input/sqoop/people

Handling garbled Chinese characters:

    sqoop export \
      --connect "jdbc:mysql://192.168.43.116:3306/test?useUnicode=true&characterEncoding=utf-8" \
      --username root \
      --password root \
      --table people \
      --export-dir hdfs://ns1/input/sqoop/people

Update a record if it already exists, insert it if it does not (this refers to a record, not the table itself):

    sqoop export \
      --connect "jdbc:mysql://192.168.43.116:3306/test?useUnicode=true&characterEncoding=utf-8" \
      --username root \
      --password root \
      --table people \
      --export-dir /export \
      -m 1 \
      --update-key id \
      --update-mode allowinsert

Export from Hive to MySQL: similar to the import, use the --input-fields-terminated-by parameter so that the data on HDFS is parsed correctly before being written to the database:

    sqoop export \
      --connect jdbc:mysql://192.168.43.116:3306/test \
      --username root \
      --password root \
      --table people \
      --export-dir /user/hive/warehouse/hpeople \
      --input-fields-terminated-by '\001'
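A quick sanity check after an export, not from the original notes; it assumes the mysql client is installed on this host and that the server at 192.168.43.116 accepts remote logins with these credentials:

```bash
# Count the rows that arrived in the target table.
mysql -h 192.168.43.116 -u root -proot -e "SELECT COUNT(*) FROM test.people;"
```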

Export from HBase to MySQL

There is currently no way to export directly from HBase. Instead, export the data to HDFS first (through the integration of Hive and HBase), and then export from HDFS to MySQL, as sketched below.
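A minimal sketch of that two-step route, under the assumptions that Hive on this cluster ships a compatible HBase storage handler, that INSERT OVERWRITE DIRECTORY uses Hive's default \001 field delimiter, and that the hpeople_hbase table name and the dump directory are illustrative names, not from the original notes:

```bash
# Step 1: map a Hive external table onto the existing HBase table and dump it to HDFS.
hive -e "
CREATE EXTERNAL TABLE hpeople_hbase(id STRING, name STRING, age STRING, height STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:name,cf:age,cf:height')
TBLPROPERTIES ('hbase.table.name' = 'hpeople');
INSERT OVERWRITE DIRECTORY '/output/sqoop/hpeople_export'
SELECT id, name, age, height FROM hpeople_hbase;
"

# Step 2: export the dumped files to MySQL, matching the column order of the
# people table and telling Sqoop how the fields are delimited.
sqoop export \
  --connect jdbc:mysql://192.168.43.116:3306/test \
  --username root --password root \
  --table people \
  --export-dir /output/sqoop/hpeople_export \
  --input-fields-terminated-by '\001'
```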
