This article explains how to set up a Sqoop 1.4.7 environment and how to use it to import MySQL data into HDFS and Hive, and to export HDFS data back to MySQL.
Sqoop documentation: http://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#_prerequisites
Note: field delimiters must be specified when creating Hive tables and importing data; otherwise errors occur later when the data is exported.
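For example, a minimal sketch of a Hive table created with an explicit field delimiter (the table name and columns are placeholders; the same fix appears in the error notes at the end of this article):
create table t1 (a int, b int) row format delimited fields terminated by '\t';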
1. Download and install
[root@node1 ~]# wget http://mirrors.shu.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
[root@node1 ~]# tar xvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /opt/
[root@node1 ~]# cd /opt/
[root@node1 opt]# mv sqoop-1.4.7.bin__hadoop-2.6.0/ sqoop-1.4.7
[root@node1 opt]# vim /etc/profile
export SQOOP_HOME=/opt/sqoop-1.4.7
export HADOOP_HOME=/opt/hadoop-2.8.5
export HADOOP_CLASSPATH=/opt/hive-2.3.4/lib/*
export HCAT_HOME=/opt/sqoop-1.4.7/testdata/hcatalog
export ACCUMULO_HOME=/opt/sqoop-1.4.7/src/java/org/apache/sqoop/accumulo
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$SQOOP_HOME/bin
[root@node1 opt]# source /etc/profile
[root@node1 opt]# sqoop help              # general help
[root@node1 opt]# sqoop import --help     # parameter help for import
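To verify the installation (optional; the exact output depends on the build), Sqoop's version command can be run:
[root@node1 opt]# sqoop version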
2. Modify the yarn configuration file
[root@node1 ~]# vim /opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
</property>
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
</property>
[root@node1 ~]# scp /opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml node2:/opt/hadoop-2.8.5/etc/hadoop/    # copy the configuration file to each node
yarn-site.xml                                 100% 1414   804.3KB/s   00:00
[root@node1 ~]# scp /opt/hive-2.3.4/conf/hive-site.xml /opt/sqoop-1.4.7/conf/    # hive's configuration file must also be placed under sqoop, because sqoop calls hive
[root@node1 ~]# stop-all.sh
[root@node1 ~]# start-all.sh
3. Import MySQL data into HDFS
Parameter explanation:
--append                 append the imported data to an existing dataset in HDFS
--as-textfile            store the imported data as plain text files
--columns                specify which columns to import
--delete-target-dir      delete the target directory if it already exists, then import
--fetch-size             number of rows to read from the database at a time
-m                       number of map tasks (parallelism)
-e                       import the results of a query statement (select); see the free-form query sketch at the end of this section
--table                  name of the table to import
--target-dir <dir>       target HDFS directory
--warehouse-dir <dir>    parent HDFS directory for the import (a subdirectory named after the table is created under it)
--where <clause>         WHERE clause used to filter the imported rows
-z                       compress the imported data
--direct                 use the database's native fast path (mysqldump for MySQL) instead of plain JDBC
[root@node1 ~]# sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TBL_PRIVS --target-dir /user/sqoop --direct -m 1 --fields-terminated-by '\t'
[root@node1 ~]# hdfs dfs -ls /user/sqoop    # view the import directory
Found 2 items
-rw-r--r--   3 root supergroup          0 2019-03-19 12:43 /user/sqoop/_SUCCESS
-rw-r--r--   3 root supergroup        176 2019-03-19 12:43 /user/sqoop/part-m-00000
[root@node1 ~]# hdfs dfs -cat /user/sqoop/part-m-00000    # view the imported data
6,1552878877,1,root,USER,root,USER,INSERT,6
7,1552878877,1,root,USER,root,USER,SELECT,6
8,1552878877,1,root,USER,root,USER,UPDATE,6
9,1552878877,1,root,USER,root,USER,DELETE,6
[root@node1 ~]#
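For reference, a minimal sketch of a free-form query import using the -e/--query option listed above. Sqoop requires the literal $CONDITIONS token in the WHERE clause and an explicit --target-dir for query-based imports; the query, column list, and output directory here are illustrative and not taken from the original run:
[root@node1 ~]# sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system -e 'select TBL_GRANT_ID, PRINCIPAL_NAME, TBL_PRIV from TBL_PRIVS where $CONDITIONS' --target-dir /user/sqoop_query -m 1 --fields-terminated-by '\t'
With a single map task (-m 1) no --split-by column is needed; for parallel query imports one must be supplied.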
4. Import MySQL data into Hive
Parameter explanation:
--hive-home <dir>            specify the Hive installation directory
--hive-import                import the data into Hive
--hive-database              import into the specified Hive database
--hive-overwrite             overwrite existing data in the Hive table
--create-hive-table          create the table in Hive (the job fails if the table already exists)
--hive-table <table-name>    specify the Hive table name
--hive-partition-value <v>   value of the Hive partition to import into; see the partitioned-import sketch at the end of this section
[root@node1 ~]# sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TBL_PRIVS --target-dir /user/tmp --hive-import --hive-table tt -m 1 --create-hive-table --delete-target-dir --direct --fields-terminated-by '\t'
[root@node1 conf]# hive
Logging initialized using configuration in jar:file:/opt/hive-2.3.4/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. Spark, tez) or using Hive 1.x releases.
hive> show tables;
OK
tt
Time taken: 11.464 seconds, Fetched: 1 row(s)
hive> select * from tt;
OK
6 1552878877 1 root USER root USER INSERT 6
7 1552878877 1 root USER root USER SELECT 6
8 1552878877 1 root USER root USER UPDATE 6
9 1552878877 1 root USER root USER DELETE 6
Time taken: 3.978 seconds, Fetched: 4 row(s)
hive>
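As a sketch of the --hive-partition-value option listed above (Sqoop pairs it with --hive-partition-key; the partition key dt, its value, and the table name tt_part are assumptions for illustration, not from the original run):
[root@node1 ~]# sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TBL_PRIVS --hive-import --hive-table tt_part -m 1 --create-hive-table --hive-partition-key dt --hive-partition-value '2019-03-19' --fields-terminated-by '\t'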
5. Import MySQL data into a specified Hive database
[root@node1 ~]# sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TABLE_PARAMS --hive-import --hive-table tt1 -m 1 --create-hive-table --hive-database tong --direct --fields-terminated-by '\t'
[root@node1 conf]# hive
Logging initialized using configuration in jar:file:/opt/hive-2.3.4/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. Spark, tez) or using Hive 1.x releases.
hive> use tong;
OK
Time taken: 14.34 seconds
hive> show tables;
OK
tt1
Time taken: 0.374 seconds, Fetched: 1 row(s)
hive> select * from tt1;
OK
6 numFiles 1
6 numRows 0
6 rawDataSize 0
6 totalSize 8
6 transient_lastDdlTime 1552878901
11 comment Imported by sqoop on 2019-03-19 15:36:21
11 numFiles 1
11 numRows 0
11 rawDataSize 0
11 totalSize 176
11 transient_lastDdlTime 1552981011
16 comment Imported by sqoop on 2019-03-19 16:04:22
16 numFiles 1
16 numRows 0
16 rawDataSize 0
16 totalSize 239
16 transient_lastDdlTime 1552982688
Time taken: 3.004 seconds, Fetched: 17 row(s)
hive>
6. Export HDFS data to MySQL
[root@node1 ~]# hdfs dfs -cat /user/tmp/part-m-00000
1 2
3 4
5 6
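sqoop export requires the target table to already exist in MySQL; a minimal sketch of the DDL, with column names a and b matching the query result shown further below (the column types are assumptions):
MySQL [tong]> create table T1 (a int, b int);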
[root@node1 ~]# sqoop export --connect jdbc:mysql://172.16.9.100/tong --username tong --password system --export-dir /user/tmp/part-m-00000 --table T1 --direct --fields-terminated-by '\t'
[root@node1 ~]# mysql -u root -psystem
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 1006876
Server version: 5.6.35 MySQL Community Server (GPL)
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> use tong
MySQL [tong]> select * from T1;
+------+------+
| a    | b    |
+------+------+
|    3 |    4 |
|    5 |    6 |
|    1 |    2 |
+------+------+
3 rows in set (0.00 sec)
MySQL [tong]>
Error message: (the job hangs at "Running job" and does not proceed)
19-03-19 11:20:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552965562217_0001
19-03-19 11:20:10 INFO impl.YarnClientImpl: Submitted application application_1552965562217_0001
19-03-19 11:20:10 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552965562217_0001/
19-03-19 11:20:10 INFO mapreduce.Job: Running job: job_1552965562217_0001
Solution:
[root@node1 ~]# vim /opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml    # limit memory and CPU resources, synchronize the configuration file to the other nodes, and restart the hadoop services
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
</property>
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
</property>
[root@node1 ~]#
Error message: (when importing MySQL data into Hive)
19-03-19 14:34:25 INFO hive.HiveImport: Loading uploaded data into Hive
19-03-19 14:34:25 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
19-03-19 14:34:25 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
Solution:
[root@node1 ~]# vim /etc/profile    # add the Hive lib classpath variable
export HADOOP_CLASSPATH=/opt/hive-2.3.4/lib/*
[root@node1 ~]# source /etc/profile
Error message: (due to jackson package conflict between sqoop and hive)
19-03-19 15:32:11 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
19-03-19 15:32:11 INFO ql.Driver: Executing command (queryId=root_20190319153153_63feddd9-a2c8-4217-97d4-23dd9840a54b): CREATE TABLE `tt` (`TBL_GRANT_ID` BIGINT, `CREATE_TIME` INT, `GRANT_OPTION` INT, `GRANTOR` STRING, `GRANTOR_TYPE` STRING, `PRINCIPAL_NAME` STRING, `PRINCIPAL_TYPE` STRING, `TBL_PRIV` STRING, `TBL_ID` BIGINT) COMMENT 'Imported by sqoop on 2019-03-19 15:31:49' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE
19-03-19 15:32:11 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
19-03-19 15:32:12 ERROR exec.DDLTask: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.readerFor(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/ObjectReader;
	at org.apache.hadoop.hive.common.StatsSetupConst$ColumnStatsAccurate.<clinit>(StatsSetupConst.java:165)
	at org.apache.hadoop.hive.common.StatsSetupConst.parseStatsAcc(StatsSetupConst.java:297)
	at org.apache.hadoop.hive.common.StatsSetupConst.setBasicStatsState(StatsSetupConst.java:230)
	at org.apache.hadoop.hive.common.StatsSetupConst.setBasicStatsStateForCreateTable(StatsSetupConst.java:292)
Solution:
[root@node1 ~]# mv /opt/sqoop-1.4.7/lib/jackson-* /home/
[root@node1 ~]# cp -a /opt/hive-2.3.4/lib/jackson-* /opt/sqoop-1.4.7/lib/
Error message:
19-03-19 18:38:40 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
19-03-19 18:38:40 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
19-03-19 18:38:40 ERROR tool.ImportTool: Import failed: java.io.IOException: Hive CliDriver exited with status=1
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:355)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
Solution:
create table t1 (a int, b int) row format delimited fields terminated by '\t';    -- delimiters must be specified when creating the table
sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TBL_PRIVS --target-dir /user/sqoop --direct -m 1 --fields-terminated-by '\t'
This concludes the walkthrough of building a Sqoop 1.4.7 environment and importing and exporting MySQL data to and from Hive; trying the commands yourself is the best way to consolidate them.