Introduction of sqoop and deployment and installation

Updated: 2025-02-28. Source: SLTechnology News&Howtos (shulou)

Shulou (Shulou.com), 06/03

1. Introduction to Sqoop

(1) Introduction

  Sqoop is an Apache tool for transferring data between Hadoop and relational database servers.

  Import data: import data from relational databases such as MySQL and Oracle into Hadoop storage systems such as HDFS, Hive, and HBase.

  Export data: export data from the Hadoop file system to relational databases.
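As a rough illustration of the two directions, the sketch below wraps one import and one export command in shell functions. The host hadoop01:3306, the database testdb, the user root, the tables emp and emp_copy, and the HDFS path /user/hadoop/emp are all placeholders assumed for this example, not values from the article.

```shell
#!/usr/bin/env bash
# Sketch only: every connection detail below is a placeholder.

# Import the MySQL table "emp" into an HDFS directory.
import_example() {
  sqoop import \
    --connect jdbc:mysql://hadoop01:3306/testdb \
    --username root \
    -P \
    --table emp \
    --target-dir /user/hadoop/emp \
    --num-mappers 1
}

# Export the files in that HDFS directory into the MySQL table "emp_copy".
# The target table is assumed to exist already.
export_example() {
  sqoop export \
    --connect jdbc:mysql://hadoop01:3306/testdb \
    --username root \
    -P \
    --table emp_copy \
    --export-dir /user/hadoop/emp \
    --input-fields-terminated-by ','
}
```

Here -P prompts for the password on the console instead of placing it on the command line.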

(2) Working mechanism

  Import and export commands are translated into MapReduce programs. These are map-only jobs: no reduce task is required. The translation mainly consists of customizing the InputFormat and OutputFormat.
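Because the job is map-only, parallelism comes from giving each map task its own slice of the table. The sketch below shows one way such slices could be computed from the minimum and maximum of a numeric column. The function name split_ranges and the column name id are hypothetical, and Sqoop's own splitter may choose boundaries differently; this only illustrates the idea.

```shell
#!/usr/bin/env bash
# split_ranges COLUMN MIN MAX MAPPERS
# Prints one WHERE condition per map task, covering MIN..MAX without overlap.
# Illustrative only: not Sqoop's actual splitting code.
split_ranges() {
  local column=$1 min=$2 max=$3 mappers=$4
  local total=$((max - min + 1))
  local base=$((total / mappers))
  local extra=$((total % mappers))
  local start=$min i size end
  for ((i = 0; i < mappers; i++)); do
    # The first "extra" tasks take one additional value each.
    size=$((base + (i < extra ? 1 : 0)))
    [ "$size" -eq 0 ] && continue
    end=$((start + size - 1))
    echo "$column >= $start AND $column <= $end"
    start=$((end + 1))
  done
}

split_ranges id 1 10 4
```

With four map tasks over the values 1 to 10, each task reads two or three consecutive values, and none of them needs a reduce step to combine results.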

(3) Import and export principle of sqoop

Data import:

  Sqoop runs an import as a MapReduce job. Typically, the rows of a table in a relational database are written to HDFS.

Explanation:

 - Sqoop obtains the metadata of the source database through JDBC, such as the column names and data types of the table to be imported.

 - These database data types are mapped to Java data types. Based on this information, Sqoop generates a class named after the table; the class handles serialization and holds one row of the table.

 - Sqoop starts the MapReduce job.

 - During input, the job reads the contents of the table through JDBC and deserializes each row into the class generated by Sqoop.

 - Finally, the records are written to HDFS, again using the generated class, this time to serialize them.
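The type-mapping step above can be sketched as a small lookup. The table below is an illustrative subset based on the usual JDBC-to-Java correspondences, not a copy of Sqoop's source, and the function names sql_to_java_type and print_fields are hypothetical. Check the mapping used by your Sqoop version before relying on it.

```shell
#!/usr/bin/env bash
# sql_to_java_type SQL_TYPE
# Prints the Java type assumed for a SQL type (illustrative subset).
sql_to_java_type() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    INTEGER|INT|SMALLINT|TINYINT) echo "Integer" ;;
    BIGINT)                       echo "Long" ;;
    CHAR|VARCHAR|LONGVARCHAR)     echo "String" ;;
    FLOAT|DOUBLE)                 echo "Double" ;;
    REAL)                         echo "Float" ;;
    DECIMAL|NUMERIC)              echo "java.math.BigDecimal" ;;
    BIT|BOOLEAN)                  echo "Boolean" ;;
    DATE)                         echo "java.sql.Date" ;;
    TIME)                         echo "java.sql.Time" ;;
    TIMESTAMP)                    echo "java.sql.Timestamp" ;;
    *)                            echo "unknown"; return 1 ;;
  esac
}

# print_fields reads "column_name SQL_TYPE" pairs from stdin and prints the
# field declarations a generated row class might contain.
print_fields() {
  local name type java
  while read -r name type; do
    [ -z "$name" ] && continue
    java=$(sql_to_java_type "$type") || java="Object"
    echo "  private $java $name;"
  done
}

printf 'id INT\nname VARCHAR\nsalary DECIMAL\n' | print_fields
```

Each column of the table becomes one field of the generated class, which is why the class can both hold a row read over JDBC and write it out to HDFS.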

Data export:

Explanation:

 - First, Sqoop accesses the relational database through JDBC to obtain the metadata of the table to be exported to.

 - Based on this metadata, Sqoop generates a Java class that carries the data in transit; this class must be serializable.

 - Sqoop starts the MapReduce job.

 - The job uses the generated Java class to read data from HDFS in parallel.

 - Each map task generates a batch of INSERT statements from the metadata of the target table and the data it has read. Multiple map tasks then insert data into the MySQL database in parallel.

Summary: Data is read from HDFS in parallel and written to the database in parallel. Read throughput depends on the performance of HDFS, while write throughput depends on the performance of MySQL.
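The batching done by each map task can be sketched as follows. This is a simplified illustration: the function name build_batch_inserts and the emp table are hypothetical, every value is quoted as a string, and the records are assumed to be comma-separated with no commas inside values. Sqoop itself works through JDBC rather than by assembling SQL text like this.

```shell
#!/usr/bin/env bash
# build_batch_inserts TABLE COLUMNS BATCH_SIZE
# Reads comma-separated records from stdin and prints one multi-row INSERT
# statement per batch of BATCH_SIZE records.
build_batch_inserts() {
  local table=$1 columns=$2 batch_size=$3
  local q="'" line row values="" count=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    line=${line//$q/$q$q}              # double any embedded single quotes
    row="($q${line//,/$q,$q}$q)"       # 1,alice -> ('1','alice')
    if [ -z "$values" ]; then values=$row; else values="$values, $row"; fi
    count=$((count + 1))
    if [ "$count" -eq "$batch_size" ]; then
      printf 'INSERT INTO %s (%s) VALUES %s;\n' "$table" "$columns" "$values"
      values=""
      count=0
    fi
  done
  # Flush the last, possibly smaller, batch.
  if [ -n "$values" ]; then
    printf 'INSERT INTO %s (%s) VALUES %s;\n' "$table" "$columns" "$values"
  fi
}

printf '1,alice\n2,bob\n3,carol\n' | build_batch_inserts emp "id,name" 2
```

Grouping several rows into one statement reduces the number of round trips to the database, which matters because the write side is limited by MySQL.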

2. Installation of sqoop

Prerequisites: Java and Hadoop environments are already available.

Installation package download address: ftp.wayne.edu/apache/sqoop/1.4.6/

This guide installs sqoop-1.4.6.bin_hadoop-2.0.4-alpha.tar.gz (a Sqoop 1.x release).

Specific installation:

Decompress:
[hadoop@hadoop01 ~]$ tar -zxvf sqoop-1.4.6.bin_hadoop-2.0.4-alpha.tar.gz -C /application

(The paths below assume the extracted directory has been renamed to /application/sqoop-1.4.6.)

Modify the configuration file sqoop-env.sh:
[hadoop@hadoop01 ~]$ cd /application/sqoop-1.4.6/conf/
[hadoop@hadoop01 ~]$ mv sqoop-env-template.sh sqoop-env.sh
[hadoop@hadoop01 ~]$ vim sqoop-env.sh

export HADOOP_COMMON_HOME=/application/hadoop-2.7.6
export HADOOP_MAPRED_HOME=/application/hadoop-2.7.6
export HIVE_HOME=/application/apache-hive-2.3.2-bin
export ZOOCFGDIR=/application/zookeeper-3.4.10/conf

Note: HADOOP_COMMON_HOME and HADOOP_MAPRED_HOME point to the same directory here because we installed the open-source version of Hadoop, which places both in one directory. In commercial versions of Hadoop, the two are installed in different directories.

Copy the MySQL driver package (mysql-connector-java-5.1.40-bin.jar) into the sqoop-1.4.6/lib directory.

Configure the environment variables:
[hadoop@hadoop01 ~]$ vim /etc/profile
export SQOOP_HOME=/application/sqoop-1.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin:$SQOOP_HOME/bin
[hadoop@hadoop01 ~]$ source /etc/profile

Verify the installation:
[hadoop@hadoop01 ~]$ sqoop version

If the Sqoop version information is printed, the installation was successful.
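Before running sqoop version, the pieces set up in the steps above can be checked with a short script. This is a minimal sketch: the function name check_sqoop_install is hypothetical, and it only tests that the files named in this guide are present, not that they are configured correctly.

```shell
#!/usr/bin/env bash
# check_sqoop_install SQOOP_HOME
# Reports which parts of the installation described above are missing.
# Returns 0 if everything is in place, 1 otherwise.
check_sqoop_install() {
  local home=$1 status=0
  # Configuration file renamed from sqoop-env-template.sh
  [ -f "$home/conf/sqoop-env.sh" ] || { echo "missing: conf/sqoop-env.sh"; status=1; }
  # MySQL JDBC driver copied into lib/
  ls "$home"/lib/mysql-connector-java-*.jar > /dev/null 2>&1 \
    || { echo "missing: MySQL JDBC driver in lib/"; status=1; }
  # Launcher script that should be on PATH via SQOOP_HOME/bin
  [ -x "$home/bin/sqoop" ] || { echo "missing: bin/sqoop"; status=1; }
  return "$status"
}
```

A typical use would be check_sqoop_install "$SQOOP_HOME" && sqoop version, so that the version check runs only when the basic layout looks right.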
