This article introduces the installation and configuration of Sqoop 1.x. Many people have questions about this process, so the steps below are organized into a simple, reproducible walkthrough: install Sqoop, wire it up to Hadoop and MySQL, and work through the basic import workflows. Follow along and try each step yourself.
First, install Hadoop
Hadoop: covered in an earlier post
Sqoop 2.x: http://my.oschina.net/u/204498/blog/518941
Second, install Sqoop 1.x
1. Select the version corresponding to your Hadoop
[hadoop@hftclclw0001 ~]$ pwd
/home/hadoop
[hadoop@hftclclw0001 ~]$ wget
[hadoop@hftclclw0001 ~]$ tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@hftclclw0001 ~]$ cd sqoop-1.4.6.bin__hadoop-2.0.4-alpha/conf
[hadoop@hftclclw0001 conf]$ ls -al
total 44
drwx------ 2 hadoop root 4096 Nov 25 04:32 .
drwx------ 9 hadoop root 4096 Nov 25 04:20 ..
-rw------- 1 hadoop root  818 Apr 27  2015 .gitignore
-rw------- 1 hadoop root 3895 Apr 27  2015 oraoop-site-template.xml
-rw------- 1 hadoop root 1404 Apr 27  2015 sqoop-env-template.cmd
-rwx------ 1 hadoop root 1345 Apr 27  2015 sqoop-env-template.sh
-rw------- 1 hadoop root 5531 Apr 27  2015 sqoop-site-template.xml
-rw------- 1 hadoop root 5531 Apr 27  2015 sqoop-site.xml
[hadoop@hftclclw0001 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hftclclw0001 conf]$ vim sqoop-env.sh
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.7.1

# Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce

# Set the path to where bin/hbase is available => only needed when using HBase, so it stays commented out
#export HBASE_HOME=/home/hadoop/hbase-1.0.1.1

# Set the path to where bin/hive is available => only needed when using Hive
#export HIVE_HOME=/home/hadoop/apache-hive-1.2.1-bin

# Set the path for where zookeeper config dir is
#export ZOOCFGDIR=

export JAVA_HOME=/usr/java/latest   # => a full JDK must be installed; the previously installed JRE alone will cause problems later
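With the environment configured, a quick sanity check is to ask Sqoop for its version. This is a minimal sketch; the exact output and the warnings about unset optional homes (HBase, HCatalog, etc.) will vary with your setup and are harmless here:

[hadoop@hftclclw0001 conf]$ cd ~/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop version
...
Sqoop 1.4.6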
2. Add the corresponding JDBC driver; I use MySQL, so the MySQL Connector/J jar goes into Sqoop's lib directory
[hadoop@hftclclw0001 lib]$ pwd
/home/hadoop/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib
[hadoop@hftclclw0001 lib]$ ls -al | grep mysql
-rw------- 1 hadoop root 848401 Nov 3 06:41 mysql-connector-java-5.1.25-bin.jar
Third, Sqoop 1.x syntax
1. Install MySQL (configure the appropriate repo first)
[root@hftclclw0001 opt]# yum install mysql-server mysql mysql-client
2. Start the service, verify it is listening, and set a password for the root user
[root@hftclclw0001 opt]# service mysqld start
[root@hftclclw0001 opt]# netstat -apn | grep 3306
tcp    0    0 0.0.0.0:3306    0.0.0.0:*    LISTEN    24540/mysqld
[root@hftclclw0001 opt]# mysql -u root -p
Enter password:
mysql>
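The password step itself is not shown above; one common way to set it is mysqladmin. This is a sketch assuming a fresh install whose root password is still empty, and 'root-pass' is a placeholder:

[root@hftclclw0001 opt]# mysqladmin -u root password 'root-pass'   # 'root-pass' is a placeholder; choose your own
[root@hftclclw0001 opt]# mysql -u root -p                          # log in again to confirm the new password works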
3. Prepare test data
I use the MySQL schema from the Apache Sqoop Cookbook:
https://github.com/jarcec/Apache-Sqoop-Cookbook
Use the SQL files from the GitHub repo above to create the sqoop user, create the sqoop database, add the corresponding tables, and grant the sqoop user the appropriate privileges.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sqoop              |
+--------------------+
4 rows in set (0.00 sec)

mysql> use sqoop;
mysql> show tables;
+-----------------+
| Tables_in_sqoop |
+-----------------+
| cities          |
| countries       |
| normcities      |
| staging_cities  |
| visits          |
+-----------------+
5 rows in set (0.00 sec)
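If you prefer to set things up by hand instead of running the repo's scripts, the equivalent steps are roughly the following. This is a sketch assuming the user and password are both 'sqoop'; the Cookbook scripts may grant narrower privileges:

mysql> CREATE DATABASE sqoop;
mysql> CREATE USER 'sqoop'@'%' IDENTIFIED BY 'sqoop';
mysql> GRANT ALL PRIVILEGES ON sqoop.* TO 'sqoop'@'%';
mysql> FLUSH PRIVILEGES;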
Chapter 2: Importing Data (the chapter numbering below follows the Apache Sqoop Cookbook)
Sqoop list-tables:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop list-tables \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop
...
cities
countries
normcities
staging_cities
visits

=> these are the tables created in MySQL above.

Sqoop import: full-table import (transferring an entire table):

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --table cities
...

=> A MapReduce job is launched that reads the table over JDBC and writes files to HDFS (by default into a directory named after the table, under the current user's HDFS home directory). Three records produced three files:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -ls /user/hadoop/cities
-rw-r--r-- 3 hadoop supergroup  0 2015-11-25 05:29 /user/hadoop/cities/_SUCCESS
-rw-r--r-- 3 hadoop supergroup 16 2015-11-25 05:29 /user/hadoop/cities/part-m-00000
-rw-r--r-- 3 hadoop supergroup 22 2015-11-25 05:29 /user/hadoop/cities/part-m-00001
-rw-r--r-- 3 hadoop supergroup 16 2015-11-25 05:29 /user/hadoop/cities/part-m-00002
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -cat cities/part-m-00000
1,USA,Palo Alto

Sqoop import: specifying a target directory. --target-dir sets the output path, which must not already exist; it is meant for single-table imports:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --table cities \
> --target-dir /tmp/cities
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -ls /tmp/cities
-rw-r--r-- 3 hadoop supergroup  0 2015-11-25 05:29 /tmp/cities/_SUCCESS
-rw-r--r-- 3 hadoop supergroup 16 2015-11-25 05:29 /tmp/cities/part-m-00000
-rw-r--r-- 3 hadoop supergroup 22 2015-11-25 05:29 /tmp/cities/part-m-00001
-rw-r--r-- 3 hadoop supergroup 16 2015-11-25 05:29 /tmp/cities/part-m-00002
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -cat /tmp/cities/part-m-00000
1,USA,Palo Alto

When importing multiple tables, use --warehouse-dir instead: Sqoop creates a subdirectory named after each table under the given directory.

Sqoop import with a WHERE condition, i.e. importing only a subset of data:

mysql> select * from sqoop.cities;
+----+----------------+-----------+
| id | country        | city      |
+----+----------------+-----------+
|  1 | USA            | Palo Alto |
|  2 | Czech Republic | Brno      |
|  3 | USA            | Sunnyvale |
+----+----------------+-----------+
3 rows in set (0.00 sec)

mysql> select * from sqoop.cities where country = 'USA';
+----+---------+-----------+
| id | country | city      |
+----+---------+-----------+
|  1 | USA     | Palo Alto |
|  3 | USA     | Sunnyvale |
+----+---------+-----------+
2 rows in set (0.00 sec)

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -rmr /tmp/cities
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --table cities \
> --where "country = 'USA'" \
> --target-dir /tmp/cities

Sqoop import: protecting your password. Passing --password on the command line exposes it; instead use -P to be prompted interactively, or --password-file to read it from a file (see the sketch after this section):

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --table cities \
> --where "country = 'USA'" \
> --target-dir /tmp/cities \
> -P                                    => prompt for the password on the command line

Alternatively:
> --password-file my-sqoop-password     => read the password from the specified file

Sqoop import: using a file format other than CSV. By default Sqoop writes comma-delimited text files (as in the output above); binary formats are available via:
--as-sequencefile
--as-avrodatafile

Sqoop import: compressing imported data:
--compress
--compression-codec org.apache.hadoop.io.compress.BZip2Codec   => choose the compression codec

Sqoop import: speeding up transfers. The default InputFormat reads rows over JDBC, which is relatively inefficient; --direct delegates the transfer to the database's native tools instead, such as MySQL's mysqldump.
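As promised above, here is a sketch of the --password-file variant. The file must not contain a trailing newline, and Sqoop resolves the path through the Hadoop filesystem layer, so the usual pattern is a file on HDFS with restrictive permissions. The paths and password here are illustrative:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ echo -n "sqoop" > my-sqoop-password   # -n: a trailing newline would break authentication
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -put my-sqoop-password /user/hadoop/
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -chmod 400 /user/hadoop/my-sqoop-password
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password-file /user/hadoop/my-sqoop-password \
> --table cities \
> --target-dir /tmp/cities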
Chapter 3: Incremental Import
mysql> select * from sqoop.visits;
+----+----------+---------------------+
| id | city     | last_update_date    |
+----+----------+---------------------+
|  1 | Freemont | 1983-05-22 01:01:01 |
|  2 | Jicin    | 1987-02-02 02:02:02 |
+----+----------+---------------------+
2 rows in set (0.00 sec)

Importing only new data: the table has an integer primary key, id, and we want to import only the rows with id > 1.
--check-column => the column to examine
--last-value   => the value that column had at the end of the previous import; this run picks up everything after it

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --table visits \
> --target-dir /tmp/visits \
> --incremental append \   => the incremental mode is append
> --check-column id \      => append mode needs a monotonically increasing key column
> --last-value 1           => import rows with id > 1

Note the log printed at the end of the run: it tells you that the next incremental import should use --last-value 2 (the last record imported this time), and that for recurring incremental imports you are better off saving the invocation with sqoop job --create (see below):

15/11/25 06:05:28 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
15/11/25 06:05:28 INFO tool.ImportTool:  --incremental append
15/11/25 06:05:28 INFO tool.ImportTool:   --check-column id
15/11/25 06:05:28 INFO tool.ImportTool:   --last-value 2
15/11/25 06:05:28 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -ls /tmp/visits
-rw-r--r-- 3 hadoop supergroup 30 2015-11-25 06:05 /tmp/visits/part-m-00000
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ hadoop dfs -cat /tmp/visits/part-m-00000
2,Jicin,1987-02-02 02:02:02.0

Incrementally importing mutable data: append mode only catches new rows; for rows that are updated in place, Sqoop offers lastmodified mode, as shown in the sketch below.
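A sketch of lastmodified mode, using the visits table's last_update_date column. Assumptions: the timestamp given is the previous run's last-value, and --merge-key deduplicates re-imported rows by primary key; verify the exact behavior against your Sqoop version before relying on it:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --table visits \
> --target-dir /tmp/visits \
> --incremental lastmodified \        => pick up rows whose check column is newer than last-value
> --check-column last_update_date \   => a timestamp column that is updated on every change
> --last-value "1987-02-02 02:02:02" \
> --merge-key id                      => merge updated rows into the existing data by primary key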
Sqoop Job:
http://shiyanjun.cn/archives/621.html
We use Sqoop 1.x to synchronize data between an RDBMS and Hadoop/Hive. With --incremental append mode, someone has to keep track of --last-value: without a saved job, every synchronization run requires parsing the new last-value out of the previous run's log and resetting the parameter by hand to keep the sync correct. A saved Sqoop job records this state automatically.
[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop job \
> --create visits-sync-job \   => create a saved job with id "visits-sync-job"
> -- \
> import \
> --connect jdbc:mysql://10.224.243.124:3306/sqoop \
> --username sqoop \
> --password sqoop \
> --table visits \
> --incremental append \
> --check-column id \
> --last-value 1

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop job --list
15/11/25 06:40:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Available jobs:
  visits-sync-job

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop job --show visits-sync-job
15/11/25 06:40:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
Job: visits-sync-job
Tool: import
...
incremental.last.value = 1
...

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop job --exec visits-sync-job
Enter password:

After the job has executed, the saved state is updated, as --show confirms:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop job --show visits-sync-job
15/11/25 06:44:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
Enter password:
...
incremental.last.value = 2   => the new last-value, which the next execution will read automatically
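Every job command above prompts with "Enter password:". If you accept the trade-off of storing the password in the job metastore, Sqoop 1.x exposes a property for this in conf/sqoop-site.xml (it appears, commented out, in sqoop-site-template.xml); a sketch:

<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
  <description>If true, saved jobs store the database password, so executing them no longer prompts for it.</description>
</property>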
Chapter 4: Free-Form Query Import
Sqoop import: importing data from two tables.

mysql> select * from sqoop.cities;
+----+----------------+-----------+
| id | country        | city      |
+----+----------------+-----------+
|  1 | USA            | Palo Alto |
|  2 | Czech Republic | Brno      |
|  3 | USA            | Sunnyvale |
+----+----------------+-----------+
3 rows in set (0.00 sec)

mysql> select * from sqoop.countries;
+------------+----------------+
| country_id | country        |
+------------+----------------+
|          1 | USA            |
|          2 | Czech Republic |
+------------+----------------+
2 rows in set (0.00 sec)

A free-form query import can join these two tables in a single command; see the sketch below.
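The original stops before the import command itself, so here is a sketch of a free-form query joining the two tables. Assumptions: the shared country column is the join key, and the target directory is illustrative. With --query, the literal $CONDITIONS placeholder is mandatory so Sqoop can partition the query, and --split-by is needed for a parallel import:

[hadoop@hftclclw0001 sqoop-1.4.6.bin__hadoop-2.0.4-alpha]$ ./bin/sqoop import \
> --connect jdbc:mysql://{ip}:{port}/sqoop \
> --username sqoop \
> --password sqoop \
> --query 'SELECT cities.id, countries.country_id, cities.city FROM cities JOIN countries USING (country) WHERE $CONDITIONS' \
> --split-by cities.id \
> --target-dir /tmp/cities_countries

This concludes the study of the installation and configuration of Sqoop 1.x; theory pairs best with practice, so go and try these steps yourself.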