Hadoop 2.6.0 Learning Notes (2): Running MapReduce through Eclipse

Welcome to Lu Chunli's work notes. Learning is a belief; let time test the strength of persistence.

System: Windows 7, 64-bit

Eclipse (JEE version): Luna Release (4.4.0)

Hadoop: 2.6.0

Hadoop plug-in: hadoop-eclipse-plugin-2.2.0.jar

0. Preface

An earlier work note, "Hadoop 2.6 cluster", covered setting up the hadoop cluster environment. Normally a mapreduce job is packaged as a jar and submitted to the hadoop cluster for execution, but for convenient testing we now want an Eclipse environment that can run mapreduce directly.

1. Plug-in installation

Copy hadoop-eclipse-plugin-2.2.0.jar into the plugins directory of the Eclipse installation.

2. Environment configuration

After copying hadoop-eclipse-plugin-2.2.0.jar into the Eclipse plugins directory, restart Eclipse.

A.) Find the mapreduce plug-in

B.) Create a new hadoop location

C.) Configure General

Parameter description:

Location name: a custom name.

Map/Reduce (V2) Master: the cluster's JobTracker configuration, identical to mapreduce.jobtracker.address in mapred-site.xml.

DFS Master: consistent with fs.defaultFS in core-site.xml; configure it with the Active NameNode's address. Configuring it as "cluster" makes the plug-in resolve "cluster" as a hostname, and resolution fails.

User name: set to hadoop, the user I use in the hadoop cluster.
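For reference, the matching entries in the cluster's configuration files would look roughly like the following sketch; the host name nnhost and the ports are placeholders, not values taken from this cluster:

<!-- core-site.xml: DFS Master must match this value -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://nnhost:9000</value>
</property>

<!-- mapred-site.xml: Map/Reduce (V2) Master must match this value -->
<property>
    <name>mapreduce.jobtracker.address</name>
    <value>nnhost:9001</value>
</property>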

Description:

The specific functions of many of the settings under Advanced Parameters are unclear to me, so they are left unchanged here.

D.) Verify the configuration

If the configuration is correct, the directory tree on HDFS is visible in the DFS Locations view:
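The connection can also be checked from code. Below is a minimal sketch (not from the original notes) that lists the HDFS root directory with the FileSystem API; the class name HdfsCheck and the hdfs://nnhost:9000 URI are placeholders for your own values:

// Hypothetical helper: verifies the DFS Master setting by listing the HDFS root.
package com.invic.mapreduce.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder URI: use the same value as fs.defaultFS in core-site.xml.
        conf.set("fs.defaultFS", "hdfs://nnhost:9000");
        FileSystem fs = FileSystem.get(conf);
        // Each entry printed here should also appear under DFS Locations in Eclipse.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}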

3. Run wordcount

The hadoop plug-in for Eclipse is now integrated, so let's run wordcount, the classic starter program for mapreduce.

A.) Create a new MapReduce Project

First, extract the hadoop distribution locally, so that the jar packages hadoop depends on are added automatically when you create the MapReduce project.

B.) Write the program

// MyMapper.java
package com.invic.mapreduce.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final Log LOG = LogFactory.getLog(MyMapper.class);

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        LOG.info("= mapper=");
        LOG.info("key: " + key + "\tvalue: " + value);
        IntWritable one = new IntWritable(1);
        Text word = new Text();
        // Split the line into tokens and emit (word, 1) for each token.
        StringTokenizer token = new StringTokenizer(value.toString());
        while (token.hasMoreTokens()) {
            word.set(token.nextToken());
            LOG.info(word.toString());
            context.write(word, one);
        }
    }
}

// MyReducer.java
package com.invic.mapreduce.wordcount;

import java.io.IOException;
import java.util.Iterator;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * @author lucl
 */
public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private static final Log LOG = LogFactory.getLog(MyReducer.class);

    @Override
    public void reduce(Text key, Iterable<IntWritable> value, Context context)
            throws IOException, InterruptedException {
        LOG.info("= reducer=");
        LOG.info("key: " + key + "\tvalue: " + value);
        // Sum the counts emitted for this word.
        int result = 0;
        for (Iterator<IntWritable> it = value.iterator(); it.hasNext();) {
            IntWritable val = it.next();
            LOG.info("\t\t: " + val.get());
            result += val.get();
        }
        LOG.info("total key: " + key + "\tresult: " + result);
        context.write(key, new IntWritable(result));
    }
}

// WordCounterTool.java
package com.invic.mapreduce.wordcount;

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * @author lucl
 */
public class WordCounterTool extends Configured implements Tool {
    private static final Log LOG = LogFactory.getLog(WordCounterTool.class);

    public static void main(String[] args) throws IOException,
            ClassNotFoundException, InterruptedException {
        // The system property must be set here, otherwise a winutils.exe error occurs.
        System.setProperty("hadoop.home.dir", "E:\\hadoop-2.6.0\\hadoop-2.6.0");
        try {
            int exit = ToolRunner.run(new WordCounterTool(), args);
            LOG.info("result: " + exit);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            LOG.info("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCounterTool.class);
        job.setMapperClass(MyMapper.class);
        // MyReducer doubles as the combiner; see the note below.
        job.setCombinerClass(MyReducer.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // All arguments except the last are input paths; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }
}
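A note on one design choice in WordCounterTool: MyReducer is registered both as the combiner and as the reducer. This is safe for wordcount because summing counts is associative and commutative, so partial sums computed on the map side give the same final result while shuffling less data to the reducers.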

C.) Run the MapReduce program

Select WordCounterTool, right-click Run Configurations, configure the input parameters (the HDFS input path(s) followed by the output path, passed as program arguments), and click the "Run" button.

Content of file1.txt in the data directory:

hello world
hello markhuang
hello hadoop

Content of file2.txt in the data directory:

hadoop ok
hadoop fail
hadoop 2.3

D.) The program reports an error

15/07/19 22:17:31 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-Administrator/mapred/staging/Administrator907501946/.staging/job_local907501946_0001
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at com.invic.mapreduce.wordcount.WordCounterTool.run(WordCounterTool.java:60)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.invic.mapreduce.wordcount.WordCounterTool.main(WordCounterTool.java:31)

Description:

Download the hadoop.dll file matching hadoop version 2.6 from the Internet and place it in the C:\Windows\System32 directory.

E.) Run again

Select WordCounterTool, right-click Run As -->

Run On Hadoop; after a moment, the program executes successfully.

F.) View the output result
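Working through the two sample files above by hand: "hello" appears three times in file1.txt, "hadoop" once in file1.txt and three times in file2.txt, and every other word once. So the output file on HDFS (typically named part-r-00000) should contain these counts, sorted by key:

2.3	1
fail	1
hadoop	4
hello	3
markhuang	1
ok	1
world	1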

Summary: the plug-in configuration was successful.
