After the hadoop environment is built, the next step is to integrate sqoop so that, together with hadoop and the mysql-connector-java driver, it can extract data from MySQL and transfer it to hdfs.
1. Extract sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz into /usr/local/ and create a /usr/local/sqoop soft link.
mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz /usr/local/
cd /usr/local/
tar -xvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
ln -s /usr/local/sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/local/sqoop
2. Change the owner and group of /usr/local/sqoop and /usr/local/sqoop-1.4.6.bin__hadoop-2.0.4-alpha to hadoop so that the hadoop user can use them:
chown -R hadoop:hadoop /usr/local/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
chown -R hadoop:hadoop /usr/local/sqoop
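To confirm the link and ownership are in place, an optional check (not part of the original steps) can be run:
ls -ld /usr/local/sqoop /usr/local/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
Both entries should be owned by hadoop:hadoop, and /usr/local/sqoop should point to the versioned directory.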
3. Configure the SQOOP_HOME environment variable by adding or modifying the following entries in /etc/profile:
export SQOOP_HOME=/usr/local/sqoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SQOOP_HOME/bin:$PATH
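Since /etc/profile is only read at login, the change can be applied to the current shell and verified with the usual commands (a common step, not shown in the original):
source /etc/profile
sqoop version
If the PATH is correct, sqoop version should report the installed release, 1.4.6 here.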
4. Configure sqoop so that it integrates with hadoop:
1) Go to the $SQOOP_HOME/conf directory and copy the sqoop environment configuration template sqoop-env-template.sh to sqoop-env.sh in the same directory:
cd $SQOOP_HOME/conf
cp sqoop-env-template.sh sqoop-env.sh
2) In sqoop-env.sh, set the HADOOP_COMMON_HOME and HADOOP_MAPRED_HOME variables to the corresponding hadoop paths:
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop/share/hadoop/mapreduce
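For reference, the 1.4.6 template also ships optional variables such as HBASE_HOME, HIVE_HOME and ZOOCFGDIR (names quoted from memory of the template, so treat them as an assumption); they are only needed for HBase, Hive or Zookeeper integration and can stay commented out for this HDFS-only setup.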
3) sqoop needs to connect to mysql and run mapreduce jobs to complete the data extraction, so it needs the mysql-connector and mapreduce library files. Copy the mysql-connector-java package and all of the jar packages under $HADOOP_HOME/share/hadoop/mapreduce/ to the $SQOOP_HOME/lib directory:
cp $HADOOP_HOME/share/hadoop/mapreduce/*.jar $SQOOP_HOME/lib/
cp ~/mysql-connector-java-5.1.32-bin.jar $SQOOP_HOME/lib/
chown -R hadoop:hadoop $SQOOP_HOME/lib/
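An optional check (not part of the original steps) that the connector jar landed in the right place:
ls $SQOOP_HOME/lib/ | grep -i mysql-connector
This should list mysql-connector-java-5.1.32-bin.jar; if it is missing, the import commands below will typically fail because sqoop cannot load the MySQL JDBC driver class.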
5. You can now use the sqoop script, located in the $SQOOP_HOME/bin directory, for data extraction:
# Test whether the database can be connected to
sqoop list-databases --connect jdbc:mysql://localhost:3306/actionLog \
--username root \
-P
(If the database names are returned, sqoop can connect to the mysql database.)
# Extract data from the MySQL database to hdfs
sqoop import --connect jdbc:mysql://hadoop-test-nn:3306/actionLog \
--username root -P \
--table log \
--columns "logger_id,time" \
--where 'action = "login"' \
--target-dir /test/loginInfo
Option descriptions:
--username    database user name
-P            prompt interactively for the database password (hidden input)
--table       name of the table to export
--columns     which columns of the table to export
--where       filter the exported records with a where condition, as in a sql statement
--target-dir  where the exported data is stored on hdfs; the value is a path on hdfs, not an absolute path on the local file system
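The same extraction can also be written as a free-form query instead of --table/--columns/--where, which is convenient when the filter gets more complex. The sketch below relies on sqoop's --query support, where the literal $CONDITIONS placeholder is required and --split-by (or -m 1) must be supplied; the output path /test/loginInfoQuery is a hypothetical example:
sqoop import --connect jdbc:mysql://hadoop-test-nn:3306/actionLog \
--username root -P \
--query 'select logger_id, time from log where action = "login" and $CONDITIONS' \
--split-by logger_id \
--target-dir /test/loginInfoQuery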
The sqoop import command above extracts data from the log table in the actionLog database on mysql. The table structure is as follows:
mysql> desc log;
+-----------+--------------+------+-----+---------+----------------+
| Field     | Type         | Null | Key | Default | Extra          |
+-----------+--------------+------+-----+---------+----------------+
| log_id    | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| name      | varchar      | YES  |     | NULL    |                |
| action    | varchar(255) | YES  |     | NULL    |                |
| logger_id | varchar      | YES  |     | NULL    |                |
| time      | varchar(255) | YES  |     | NULL    |                |
+-----------+--------------+------+-----+---------+----------------+
Because the export columns were specified as logger_id and time, the data exported to hdfs looks like this:
[hadoop@hadoop-test-nn lib]$ hdfs dfs -ls /test/loginInfo
Found 1 items
-rw-r--r--   2 hadoop supergroup     211825 2017-08-02 16:04 /test/loginInfo/userLoginInfo.txt
[hadoop@hadoop-test-nn lib]$ hdfs dfs -cat /test/loginInfo/userLoginInfo.txt
wanger,2017-07-27 14:21
zhangsan,2017-07-27 14:37
james,2017-07-27 15:27
...
(Note: the text content under /test/loginInfo has been merged here for readability. In actual use, multiple files named in the part-** format are generated under this directory, and their contents follow the same format.)
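As a sanity check after the import finishes, the number of exported lines can be compared with the matching row count in MySQL (an illustrative check, not part of the original procedure):
# rows matching the --where filter in MySQL
mysql -u root -p -e 'select count(*) from actionLog.log where action = "login";'
# lines written by sqoop to the target directory on hdfs
hdfs dfs -cat /test/loginInfo/part-* | wc -l
The two numbers should match if the import completed without errors.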
The data has now been successfully extracted and stored on hdfs as text, and you can write a mapreduce program to analyze it.
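As a minimal first look at the data (an illustrative command-line sketch, not the mapreduce program itself), the exported text can be summarized directly, for example counting login records per logger_id:
hdfs dfs -cat /test/loginInfo/part-* | awk -F, '{count[$1]++} END {for (u in count) print u, count[u]}'
A full mapreduce job would implement the same per-key counting in its map and reduce steps, but scale to much larger inputs.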