
Hadoop hive sqoop zookeeper hb


6. Problems and solutions

1. Problem description:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... Using builtin-java classes where applicable

Cause of the problem: the native library bundled with the default distribution is 32-bit, so it cannot be loaded on a 64-bit platform.

Solution: recompile the native library for 64-bit. Note that compilation errors occur when building with JDK 1.8.

# yum install cmake lzo-devel zlib-devel gcc gcc-c++ autoconf automake libtool ncurses-devel openssl-devel

Install maven

# wget http://mirror.cc.columbia.edu/pub/software/apache/maven/maven-3/3.2.3/binaries/apache-maven-3.2.3-bin.tar.gz

# tar zxf apache-maven-3.2.3-bin.tar.gz -C /usr/local

# cd /usr/local

# ln -s apache-maven-3.2.3 maven

# vim /etc/profile

export MAVEN_HOME=/usr/local/maven

export PATH=${MAVEN_HOME}/bin:${PATH}

# source /etc/profile

Install ant

# wget http://apache.dataguru.cn//ant/binaries/apache-ant-1.9.4-bin.tar.gz

# tar zxf apache-ant-1.9.4-bin.tar.gz -C /usr/local

# vim /etc/profile

export ANT_HOME=/usr/local/apache-ant-1.9.4

export PATH=$PATH:$ANT_HOME/bin

# source /etc/profile

Install findbugs

# wget http://prdownloads.sourceforge.net/findbugs/findbugs-2.0.3.tar.gz?download

# tar zxf findbugs-2.0.3.tar.gz -C /usr/local

# vim /etc/profile

export FINDBUGS_HOME=/usr/local/findbugs-2.0.3

export PATH=$PATH:$FINDBUGS_HOME/bin

Install protobuf

# wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz

# tar zxf protobuf-2.5.0.tar.gz

# cd protobuf-2.5.0

# ./configure && make && make install
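Before moving on to the hadoop build, a quick sanity check (not part of the original notes) that the freshly installed tools are on the PATH; the hadoop 2.5 native build expects protoc to report exactly 2.5.0:

# mvn -version

# ant -version

# protoc --version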

Download the source package

# wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.5.0/hadoop-2.5.0-src.tar.gz

# tar zxf hadoop-2.5.0-src.tar.gz

# cd hadoop-2.5.0-src

# mvn clean install -DskipTests

# mvn package -Pdist,native -DskipTests -Dtar

Replace the old lib library

# mv /data/hadoop-2.5.0/lib/native /data/hadoop-2.5.0/lib/native_old

# cp -r /data/hadoop-2.5.0-src/hadoop-dist/target/hadoop-2.5.0/lib/native \

/data/hadoop-2.5.0/lib/native

# bin/hdfs getconf -namenodes
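To confirm the rebuilt native library is actually being picked up, hadoop's built-in check can be run (assuming HADOOP_HOME points at /data/hadoop-2.5.0 and the hadoop command is on the PATH):

# hadoop checknative -a

The hadoop and zlib entries should now report true instead of triggering the warning above.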

Reference:

http://www.tuicool.com/articles/zaY7Rz

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html#Supported_Platforms

2. Problem description:

WARN hdfs.DFSClient: DataStreamer Exception appears, and afterwards running

sbin/stop-dfs.sh => namenode1: no datanode to stop

or hadoop dfsadmin -report cannot query file system information for the cluster.

Cause of the problem: when the file system is reformatted, the new namespaceID generated by the namenode no longer matches the namespaceID held by the datanodes.

Solution: before formatting the namenode, first delete everything under the data directory configured by dfs.data.dir.
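For illustration only, a minimal sketch of the recovery steps, assuming dfs.data.dir points to /data/hadoop-2.5.0/dfs/data (substitute the value from your own hdfs-site.xml, and clear it on every datanode):

# sbin/stop-dfs.sh

# rm -rf /data/hadoop-2.5.0/dfs/data/*

# bin/hadoop namenode -format

# sbin/start-dfs.sh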

3. Problem description:

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in ...

The cause of the problem: each namenode format creates a new namespaceID. The directory configured by dfs.data.dir still contains the ID created by the previous format, which no longer matches the ID in the directory configured by dfs.name.dir: formatting the namenode empties the namenode's data but not the datanodes' data, so startup fails. Clear the directory configured by dfs.data.dir before each format.

Solution: format hdfs with bin/hadoop namenode -format (after first clearing the dfs.data.dir directories, as described above).

MapReduce learning blog: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2504205.html

4. Problem description:

[root@namenode1 hadoop]# hadoop fs -put README.txt /

15-01-04 21:50:49 WARN hdfs.DFSClient: DataStreamer Exception

org.apache.hadoop.ipc.RemoteException (java.io.IOException): File /README.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 6 datanode(s) running and no node(s) are excluded in this operation.

The cause of the problem is that the following hdfs-site.xml configuration is incorrect (the parameters below need to be adjusted to the actual environment):

<property>
  <name>dfs.block.size</name>
  <value>268435456</value>
  <description>The default block size for new files.</description>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>10240</value>
  <description>An Hadoop HDFS datanode has an upper bound on the number of files that it will serve at any one time.</description>
</property>
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>32212254720</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>
</property>

Solution: modify the above configuration, and then restart.
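A minimal sketch of the restart, assuming the stock scripts shipped under the hadoop installation are used:

# sbin/stop-dfs.sh

# sbin/start-dfs.sh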

5. Problem description: hive fails to start.

Cause of the problem: conflicting slf4j bindings.

Solution:

# mv /var/data/hive-1.40/lib/hive-jdbc-0.14.0-standalone.jar /opt/

If hive still fails to start, check the following:

1. Look at the hive-site.xml configuration; several configuration values contain "system:java.io.tmpdir".

2. Create a new folder /var/data/hive/iotmp.

3. Change the value of every configuration item containing "system:java.io.tmpdir" to the path above (see the sketch below).
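For illustration, a hive-site.xml fragment after the change; hive.exec.local.scratchdir is just one of the entries whose default value references system:java.io.tmpdir, so apply the same edit to every such item you find:

<property>
  <name>hive.exec.local.scratchdir</name>
  <!-- default was ${system:java.io.tmpdir}/${system:user.name}; point it at the new local folder -->
  <value>/var/data/hive/iotmp</value>
</property>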

Start hive, successful!

6. Problem description

Hadoop: Error launching job: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=1536, maxMemory=1024

Cause of the problem: the MapReduce application master requests 1536 MB by default, which is larger than the configured YARN maximum allocation (1024 MB). The relevant mapred-site.xml settings:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx410m</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx410m</value>
</property>

The maxMemory value (1024) comes from yarn.scheduler.maximum-allocation-mb in yarn-site.xml, and the requested 1536 is the default value of the yarn.app.mapreduce.am.resource.mb parameter in mapred-site.xml; make sure yarn.scheduler.maximum-allocation-mb is larger than the application master's request.

Solution:

Raise yarn.scheduler.maximum-allocation-mb to 2048 to expand the available memory.
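A sketch of the corresponding change in yarn-site.xml (2048 is the value chosen above; size it to your nodes):

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- was 1024, below the 1536 MB that the MapReduce AM requests by default -->
  <value>2048</value>
</property>

Restart YARN afterwards (for example sbin/stop-yarn.sh followed by sbin/start-yarn.sh) so the scheduler picks up the new limit.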

7. Problem description:

Hadoop: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

Cause of the problem: the version of sqoop does not match the version of hadoop

Workaround: recompile sqoop as follows:

How to compile sqoop

Step one:

Additionally, building the documentation requires these tools:

* asciidoc

* make

* python 2.5 +

* xmlto

* tar

* gzip

# yum -y install git

# yum -y install asciidoc

# yum -y install make

# yum -y install xmlto

# yum -y install tar

# yum -y install gzip

Step 2:

Download the relevant software packages:

# wget http://dist.codehaus.org/jetty/jetty-6.1.26/jetty-6.1.26.zip

# wget http://mirrors.cnnic.cn/apache/sqoop/1.4.5/sqoop-1.4.5.tar.gz

# mv jetty-6.1.26.zip /root/.m2/repository/org/mortbay/jetty/jetty/6.1.26/

Step 3:

Extract and modify related files:

# tar -zxvf sqoop-1.4.5.tar.gz; cd sqoop-1.4.5

Modify build.xml; the changed content is as follows:

Change lines 550 and 568 from: debug="${javac.debug}">

to: debug="${javac.debug}" includeantruntime="on">

Modify src/test/org/apache/sqoop/TestExportUsingProcedure.java:

Change line 244 from: sql.append(StringUtils.repeat("?",

to: sql.append(StringUtils.repeat

After the above configuration is modified, execute: ant package

If the compilation is successful, it will prompt: BUILD SUCCESSFUL

Step 4: package the sqoop installation package we need

After the compilation is successful, sqoop-1.4.5.bin__hadoop-2.5.0 is generated by default in the sqoop-1.4.5/build directory

# tar -zcf sqoop-1.4.5.bin__hadoop-2.5.0.tar.gz sqoop-1.4.5.bin__hadoop-2.5.0
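After unpacking this rebuilt tarball over the existing sqoop installation (the install path is site-specific and not given here), a quick sanity check is to confirm the new build is the one being run:

# sqoop version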

Done! Reference: http://www.aboutyun.com/thread-8462-1-1.html

8. Problem description:

Execute the command:

# sqoop export --connect jdbc:mysql://10.40.214.9:3306/emails \

--username hive --password hive --table izhenxin \

--export-dir /user/hive/warehouse/maillog.db/izhenxin_total

...

Caused by: java.lang.RuntimeException: Can't parse input data: '@ QQ.com'

at izhenxin.__loadFromFields(izhenxin.java:378)

at izhenxin.parse(izhenxin.java:306)

at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

... 10 more

Caused by: java.lang.NumberFormatException: For input string: "@ QQ.com"

...

15-01-19 23:15:21 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 46.0078 seconds (0 bytes/sec)

15-01-19 23:15:21 INFO mapreduce.ExportJobBase: Exported 0 records.

15-01-19 23:15:21 ERROR tool.ExportTool: Error during export: Export job failed!

The cause of the problem: the full path of the export file was not specified. The full path should be:

# hadoop fs -ls /user/hive/warehouse/maillog.db/izhenxin_total/

Found 1 items

-rw-r--r-- 2 root supergroup 2450 2015-01-19 23:50 /user/hive/warehouse/maillog.db/izhenxin_total/000000_0
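Before rerunning the export, it can also help to peek at the file itself to confirm which field terminator hive actually wrote (a tab here, hence the option used below):

# hadoop fs -cat /user/hive/warehouse/maillog.db/izhenxin_total/000000_0 | head -5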

Solution:

# sqoop export --connect jdbc:mysql://10.40.214.9:3306/emails --username hive --password hive --table izhenxin --export-dir /user/hive/warehouse/maillog.db/izhenxin_total/000000_0 --input-fields-terminated-by '\t'

This still reports an error:

mysql> create table izhenxin (id int(10) unsigned NOT NULL AUTO_INCREMENT, mail_domain varchar(32) DEFAULT NULL, sent_number int, bounced_number int, deffered_number int, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='sent mail';  ## original table

## Solution: drop the table above, then create the following table to match the structure of the hive table:

mysql> create table izhenxin (mail_domain varchar(32) DEFAULT NULL, sent_number int, bounced_number int, deffered_number int) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='sent mail';

## Final output:

15-01-20 00:05:51 INFO mapreduce.ExportJobBase: Transferred 6.9736 KB in 26.4035 seconds (270.4564 bytes/sec)

15-01-20 00:05:51 INFO mapreduce.ExportJobBase: Exported 132 records.

mysql> select count(1) from izhenxin;

+----------+
| count(1) |
+----------+
|      132 |
+----------+

1 row in set (0.00 sec)

Got it!

9. Problem description:

15-01-27 10:48:56 INFO mapreduce.Job: Task Id: attempt_1420738964879_0244_m_000003_0, Status: FAILED

AttemptID:attempt_1420738964879_0244_m_000003_0 Timed out after 600 secs

15-01-27 10:48:57 INFO mapreduce.Job: map 75% reduce 0%

15-01-27 10:49:08 INFO mapreduce.Job: map 100% reduce 0%

15-01-27 10:59:26 INFO mapreduce.Job: Task Id: attempt_1420738964879_0244_m_000003_1, Status: FAILED

AttemptID:attempt_1420738964879_0244_m_000003_1 Timed out after 600 secs

15-01-27 10:59:27 INFO mapreduce.Job: map 75% reduce 0%

15-01-27 10:59:38 INFO mapreduce.Job: map 100% reduce 0%

15-01-27 11:09:55 INFO mapreduce.Job: Task Id: attempt_1420738964879_0244_m_000003_2, Status: FAILED

AttemptID:attempt_1420738964879_0244_m_000003_2 Timed out after 600 secs

The cause of the problem: the task execution timed out (600 seconds).

Solution:

# vim mapred-site.xml

<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>

Method 2:

Configuration conf = new Configuration();

long milliSeconds = 1000 * 60 * 60;   // one hour

conf.setLong("mapred.task.timeout", milliSeconds);
