This article explains how to install and configure Sqoop on a Linux system. It is written for beginners and walks through the installation, the configuration files, and Sqoop's basic import and export operations.
1. Introduction to Sqoop
As its name (SQL-to-Hadoop) suggests, Sqoop is a tool for transferring data between relational databases and Hadoop in both directions. You can import data from a relational database (such as MySQL or Oracle) into Hadoop (HDFS, Hive, or HBase), or export data from Hadoop (HDFS, Hive, or HBase) back into a relational database (such as MySQL or Oracle).
2. Sqoop architecture
After receiving a shell command or Java API call from the client, Sqoop's task translator converts the command into a corresponding MapReduce job, which then copies the data between the relational database and Hadoop.
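As a concrete illustration of this translation, the single shell command below is turned by Sqoop into a map-only MapReduce job, with --num-mappers controlling how many parallel map tasks read from the table (this assumes the table has a primary key Sqoop can use to split the rows). The host name matches the one used later in this article, but the database name testdb, table name staff, and password are placeholders, not values from this environment:
sqoop import \
  --connect jdbc:mysql://bigdata01:3306/testdb \
  --username root \
  --password 123456 \
  --table staff \
  --target-dir /sqoop/staff \
  --num-mappers 4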
Sqoop-1.4.7 installation and configuration process
(1) Environment prerequisites:
Hadoop
Relational database (MySQL/Oracle)
HBase
Hive
ZooKeeper
(2) Extract the sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz installation package to the target directory:
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C <target directory>
(3) For convenience in the following steps, rename the Sqoop folder:
mv sqoop-1.4.7.bin__hadoop-2.6.0/ sqoop-1.4.7
(4) Modify the configuration file: enter the sqoop-1.4.7/conf directory and rename the template configuration file:
mv sqoop-env-template.sh sqoop-env.sh
Edit the sqoop-env.sh settings (if the environment variables are already configured, you can find each installation path with echo $XXXX_HOME):
vi sqoop-env.sh
# Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=<Hadoop installation path>
# Set path to where hadoop-*-core.jar is available
# export HADOOP_MAPRED_HOME=<Hadoop installation path>
# Set the path to where bin/hbase is available
# export HBASE_HOME=<HBase installation path>
# Set the path to where bin/hive is available
# export HIVE_HOME=<Hive installation path>
# Set the path for where zookeeper config dir is
# export ZOOCFGDIR=<ZooKeeper configuration folder path>
Uncomment and fill in the entries for the components you actually use. To link Hive to Sqoop, also copy hive-site.xml into Sqoop's conf directory:
cp /XXX/hive/conf/hive-site.xml /XXX/sqoop-1.4.7/conf/
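For reference, a filled-in sqoop-env.sh might look like the following; the /opt/... installation paths are assumptions for illustration and should be replaced with the actual locations on your machine:
export HADOOP_COMMON_HOME=/opt/hadoop-2.7.7
export HADOOP_MAPRED_HOME=/opt/hadoop-2.7.7
export HBASE_HOME=/opt/hbase-1.3.1
export HIVE_HOME=/opt/hive-1.2.2
export ZOOCFGDIR=/opt/zookeeper-3.4.10/conf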
(5) Configure environment variables. Edit the configuration file:
vi /etc/profile
Add the following:
export SQOOP_HOME=<sqoop installation path>
export PATH=$PATH:$SQOOP_HOME/bin
Then reload the environment variables:
source /etc/profile
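For example, assuming Sqoop was extracted and renamed to /opt/sqoop-1.4.7 (an illustrative path, not one given in this article), the added lines would be:
export SQOOP_HOME=/opt/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
After running source /etc/profile, the command "which sqoop" should print /opt/sqoop-1.4.7/bin/sqoop.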
(6) Verify the installation by checking the version number:
sqoop version
(7) Add JDBC drivers: copy the MySQL JDBC driver JAR into sqoop/lib.
If Oracle is used, also copy the Oracle JDBC driver JAR into sqoop/lib.
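A minimal sketch of this step, assuming the driver JARs have already been downloaded to the current directory (the file names and versions are illustrative, and /XXX stands for the parent directory of the Sqoop installation as elsewhere in this article):
cp mysql-connector-java-5.1.47.jar /XXX/sqoop-1.4.7/lib/
cp ojdbc6.jar /XXX/sqoop-1.4.7/lib/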
3. Sqoop operation
(1) Common commands (for the full parameter reference, see the Sqoop official website -> Documentation -> Sqoop User Guide):
import: imports data from a relational database into the cluster
export: exports data from the cluster to a relational database
create-hive-table: creates a Hive table
import-all-tables: imports all tables of a relational database into the cluster
list-databases: lists all databases
list-tables: lists all tables in a database
merge: merges data
codegen: generates a JavaBean from a table's data and packages it into a JAR
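All of these are invoked as subcommands of the sqoop binary. For example, listing the tables of one database (using the same bigdata01 host as the rest of this article; the database name and password are placeholders) would look like:
sqoop list-tables \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password <password>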
(2) import -- Sqoop's import operation. Function: MySQL/Oracle -> HDFS/Hive
Modify MySQL access so that Sqoop can connect from any host.
View the current permissions:
use mysql;
select User, Host, Password from user;
Modify the permissions so the account is accessible from all hosts:
update user set host='%' where host='localhost';
delete from user where Host='127.0.0.1';
delete from user where Host='bigdata01';
delete from user where Host='::1';
flush privileges;
Operation commands:
Upload a local MySQL table to HDFS:
Import command:
sqoop import \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password XXXXXX \
  --table <table name> \
  --target-dir /YYYYYYY \
  --num-mappers 1 \
  --fields-terminated-by "\t"
Here --connect points at the MySQL database, --table is the table uploaded to HDFS, --target-dir is the destination folder on HDFS, --num-mappers specifies how many map tasks run, and --fields-terminated-by specifies the field delimiter.
Check the upload result on HDFS from the Linux shell:
hdfs dfs -cat /XXXXXXX/part-m-00000
Use a query to filter the data (--query replaces --table; the where clause must end with "and $CONDITIONS", which Sqoop uses to split the rows among the mappers):
sqoop import \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password XXXXXX \
  --target-dir /YYYYYYY \
  --num-mappers 1 \
  --fields-terminated-by "\t" \
  --query 'select * from <table name> where <condition> and $CONDITIONS'
Filter fields directly:
sqoop import \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password XXXXXX \
  --table <table name> \
  --target-dir /YYYYYYY \
  --num-mappers 1 \
  --columns <field names>
Upload a local MySQL table to Hive:
Preparatory work (a minimal sketch of these two steps follows at the end of this subsection):
Enable the Hive service.
Create the corresponding target table in Hive.
Import command:
sqoop import \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password <password> \
  --table <table name> \
  --num-mappers 1 \
  --hive-import \
  --fields-terminated-by "\t" \
  --hive-overwrite \
  --hive-table <hive database name>.<table name>
You can then see the imported data in the specified table in Hive.
Possible error 1:
FAILED: SemanticException [Error 10072]: Database does not exist: XXXXXXXX
Reason: Sqoop is not linked to Hive.
Solution:
cp /XXX/hive/conf/hive-site.xml /XXX/sqoop-1.4.7/conf/
Possible error 2:
ERROR tool.ImportTool: Import failed: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://bigdata01:9000/XXXXXXXXXX already exists
Reason: a path with the same name already exists on HDFS.
Solution:
Specify a new path, or delete the existing file on HDFS.
Possible error 3:
ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
Reason: the Hive environment variables are missing, so the Hive dependency cannot be found.
Solution: add the Hive dependency to the Hadoop environment. Edit the configuration file:
vi /etc/profile
Add the following:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
Then reload the environment variables:
source /etc/profile
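As referenced above, here is a minimal sketch of the preparatory work for the Hive import: starting the Hive metastore service and creating the target table. The database name testdb, table name staff, and column definitions are assumptions for illustration only, and the delimiter must match the one used in the import command:
# start the Hive metastore service in the background (assumes hive is on PATH)
nohup hive --service metastore > metastore.log 2>&1 &
# create the target database and table with a tab field delimiter
hive -e "CREATE DATABASE IF NOT EXISTS testdb;
         CREATE TABLE IF NOT EXISTS testdb.staff (id INT, name STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"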
(3) export -- Sqoop's export operation. Function: HDFS/Hive -> MySQL/Oracle
Operation commands:
Export a Hive table to the local MySQL database.
Export command:
sqoop export \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password XXXXXX \
  --table <table name> \
  --export-dir /user/hive/warehouse/YYYYYYY \
  --num-mappers 1 \
  --input-fields-terminated-by "\t"
Here --connect points at the destination MySQL database, --table is the destination MySQL table, --export-dir is the Hive table's folder on HDFS, --num-mappers specifies how many map tasks run, and --input-fields-terminated-by specifies the field delimiter of the input files.
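One point worth noting (not spelled out above): sqoop export does not create the destination table, so it must already exist in MySQL with a compatible schema. A hedged sketch, with testdb.staff as an assumed example:
mysql -u root -p -e "CREATE TABLE IF NOT EXISTS testdb.staff (id INT, name VARCHAR(255));"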
(4) List all databases. Operation command:
sqoop list-databases --connect jdbc:mysql://bigdata01:3306/ --username root --password <password>
(5) Obtain a database table's data and generate a JavaBean. Operation command:
sqoop codegen \
  --connect jdbc:mysql://bigdata01:3306/<database name> \
  --username root \
  --password <password> \
  --table <table name> \
  --bindir <Linux local path> \
  --class-name <class name> \
  --fields-terminated-by "\t"
Here --bindir specifies the local path where the generated JAR package is placed and --class-name specifies the name of the generated Java class.
(6) Merge data under different directories on HDFS. Operation command:
sqoop merge \
  --new-data <HDFS path of the new table> \
  --onto <HDFS path of the old table> \
  --target-dir /YYYYYYY \
  --jar-file <local JAR package path> \
  --class-name XXXXX \
  --merge-key id
Here --target-dir is the merged output path on HDFS, --jar-file is the locally stored JAR package (as produced by codegen), --class-name is the class inside that JAR, and --merge-key is the column the merge is based on.
Note: the merge operation replaces the old table with the new one: when an id conflicts, the row from the new table replaces the row from the old table; rows that do not conflict are kept from both tables.
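A small worked example of this rule (the ids and names are made up for illustration): if the old table contains (1, Tom) and (2, Jerry), and the new table contains (1, Tim) and (3, Bob), then after merging on id the result is (1, Tim), (2, Jerry), (3, Bob): id 1 is taken from the new table, id 2 survives from the old table, and id 3 is added from the new table.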
Thank you for reading. You should now have a clear picture of how to install Sqoop on a Linux system; the quickest way to consolidate it is to try the steps in practice.