Reference: http://hadoop.apache.org/docs/r2.7.6/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
Create a new Maven project in Eclipse.
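If you prefer the command line, the same project skeleton can be generated with Maven's quickstart archetype (standard Maven usage, not from the original post): mvn archetype:generate -DgroupId=hadoop_mapreduce -DartifactId=WordCount -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false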
pom.xml content:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>hadoop_mapreduce</groupId>
  <artifactId>WordCount</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>WordCount</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.8.0</version>
    </dependency>
    <dependency>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
      <version>1.8</version>
      <scope>system</scope>
      <systemPath>C:\Program Files\Java\jdk1.8.0_151\lib\tools.jar</systemPath>
    </dependency>
  </dependencies>
</project>
Note: only the hadoop-client dependency is needed. If HBase-related packages are introduced, package conflicts are likely and the job will throw exceptions at runtime.
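A quick way to check for such conflicts is mvn dependency:tree, which lists every transitive dependency the build pulls in (a general Maven tip, not from the original post).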
WordCount class code
package hadoop_mapreduce.WordCount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (word, 1) for every token in the input line.
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        // Sum all counts for a word and emit (word, total).
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        /*
         * Scratch code used while experimenting with the Writable types:
         * IntWritable intwritable = new IntWritable(1);
         * Text text = new Text("abc");
         * System.out.println(text.toString());
         * System.out.println(text.getLength());
         * System.out.println(intwritable.get());
         * System.out.println(intwritable);
         * StringTokenizer itr = new StringTokenizer("www baidu com");
         * while (itr.hasMoreTokens()) {
         *     System.out.println(itr.nextToken());
         * }
         */
        // String path = WordCount.class.getResource("/").toString();
        // System.out.println("path = " + path);
        System.out.println("Connection end");
        // System.setProperty("hadoop.home.dir", "file://192.168.50.107/home/hadoop-user/hadoop-2.8.0");
        // String stringInput = "hdfs://192.168.50.107:8020/input/a.txt";
        // String stringOutput = "hdfs://192.168.50.107:8020/output/b.txt";
        Configuration conf = new Configuration();
        // conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        // conf.addResource("classpath:core-site.xml");
        // conf.addResource("classpath:hdfs-site.xml");
        // conf.addResource("classpath:mapred-site.xml");
        // conf.set("HADOOP_HOME", "/home/hadoop-user/hadoop-2.8.0");
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class); // missing in the original listing; required so IntSumReducer actually runs as the reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // FileInputFormat.addInputPath(job, new Path(stringInput));
        // FileOutputFormat.setOutputPath(job, new Path(stringOutput));
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
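To make the data flow concrete, here is a minimal standalone sketch (plain Java, an illustration added for this write-up, not part of the original code): for the line "hello world hello" the mapper emits (hello,1), (world,1), (hello,1), and IntSumReducer sums them to hello=2 and world=1, which is what this non-Hadoop version computes directly:

import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {
    public static void main(String[] args) {
        String line = "hello world hello"; // stand-in for one line of HDFS input
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer itr = new StringTokenizer(line); // same tokenization as TokenizerMapper
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum); // same summing as IntSumReducer
        }
        // e.g. hello 2, world 1 (iteration order not guaranteed)
        counts.forEach((w, n) -> System.out.println(w + "\t" + n));
    }
}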
(Figure: location of the Hadoop connection configuration files, core-site.xml / hdfs-site.xml / mapred-site.xml, in the project.)
Running the job from Eclipse reports an error: HADOOP_HOME and hadoop.home.dir are unset.
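A common local workaround (a sketch under assumptions, echoing the commented-out System.setProperty call in the code above; the path is a placeholder, and on Windows it must contain bin\winutils.exe) is to set hadoop.home.dir at the very top of main, before the Configuration is created:

// Placeholder path to an unpacked Hadoop 2.8.0 distribution; adjust to your machine.
System.setProperty("hadoop.home.dir", "C:\\hadoop-2.8.0");
Configuration conf = new Configuration();

In this post, though, the job is simply packaged and run on the Linux cluster instead.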
Compile, package, and copy the jar to the Linux system:
mvn clean
mvn compile
mvn package
I put the packaged WordCount-0.0.1-SNAPSHOT.jar into the /home/hadoop-user/work directory.
On Linux, run:
hadoop jar WordCount-0.0.1-SNAPSHOT.jar hadoop_mapreduce.WordCount.WordCount hdfs://192.168.50.107:8020/input hdfs://192.168.50.107:8020/output
Note: if the fully qualified class name (hadoop_mapreduce.WordCount.WordCount) is omitted here, the run fails with an error that the WordCount class cannot be found. Put the files to be analyzed into the input directory on HDFS. Do not create the output directory yourself; the job creates it and writes the final results there.
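For reference, the staging steps use standard HDFS shell commands (not spelled out in the original post): hdfs dfs -mkdir -p /input creates the input directory, hdfs dfs -put a.txt /input uploads a file to analyze, and after the job finishes hdfs dfs -cat /output/part-r-00000 prints the word counts (part-r-00000 is the default name of a reducer's output file).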