This article walks through sample MapReduce code in Hadoop: a word-count mapper, a reducer, a driver class that submits the job, and the log output from an actual run on a cluster. It is quite detailed and should serve as a useful reference; if you are interested, read it to the end!
package cn.itheima.bigdata.hadoop.mr.wordcount;

import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // get the contents of one line of the file
        String line = value.toString();
        // split the line's contents into an array of words
        String[] words = StringUtils.split(line, " ");
        // traverse the words and emit each one with a count of 1
        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }
}
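To sanity-check the split-and-emit logic outside the cluster, here is a minimal standalone sketch (the class name and sample line are made up for illustration); it prints the same (word, 1) pairs the mapper would write to its context:

import org.apache.commons.lang.StringUtils;

// Hypothetical demo, not part of the job: mimic the mapper's logic on one line.
public class MapperLogicDemo {
    public static void main(String[] args) {
        String line = "hello world hello"; // made-up input line
        for (String word : StringUtils.split(line, " ")) {
            System.out.println(word + "\t1"); // the real mapper emits (word, 1)
        }
    }
}

For the sample line this prints hello, world, hello, each paired with 1; the two hello pairs are later grouped under a single key before reaching the reducer.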
package cn.itheima.bigdata.hadoop.mr.wordcount;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    // key: hello, values: {1,1,1,1,1...}
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // define an accumulation counter
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }
        // output the key-value pair
        context.write(key, new LongWritable(count));
    }
}
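Because this reducer simply sums longs, the same class can optionally double as a combiner, pre-aggregating counts on the map side to shrink shuffle traffic. This is not part of the original job; if you want to try it, one extra line in the driver class that follows is enough:

// optional, an assumption beyond the original code: reuse the reducer as a combiner
wcjob.setCombinerClass(WordCountReducer.class);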
package cn.itheima.bigdata.hadoop.mr.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Describes a job (which mapper class to use, which reducer class to use,
 * where the input files live, where the output results go, etc.)
 * and then submits the job to the hadoop cluster.
 * @author duanhaitao@itcast.cn
 */
// cn.itheima.bigdata.hadoop.mr.wordcount.WordCountRunner
public class WordCountRunner {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // set the jar package used by the job (must be set before the Job
        // is created, because Job.getInstance copies the configuration)
        conf.set("mapreduce.job.jar", "wcount.jar");
        Job wcjob = Job.getInstance(conf);
        // set the jar package in which the job's classes are located
        wcjob.setJarByClass(WordCountRunner.class);
        // which mapper class the job uses
        wcjob.setMapperClass(WordCountMapper.class);
        // which reducer class the job uses
        wcjob.setReducerClass(WordCountReducer.class);
        // the kv data types output by the job's mapper class
        wcjob.setMapOutputKeyClass(Text.class);
        wcjob.setMapOutputValueClass(LongWritable.class);
        // the kv data types output by the job's reducer class
        wcjob.setOutputKeyClass(Text.class);
        wcjob.setOutputValueClass(LongWritable.class);
        // specify the path where the raw data to be processed is stored
        FileInputFormat.setInputPaths(wcjob, "hdfs://192.168.88.155:9000/wc/srcdata");
        // specify the path to which the results are written
        FileOutputFormat.setOutputPath(wcjob, new Path("hdfs://192.168.88.155:9000/wc/output"));
        boolean res = wcjob.waitForCompletion(true);
        System.exit(res ? 0 : 1);
    }
}
Pack the three classes into mr.jar and copy it to the hadoop server.
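For example (the bin/ directory and words.txt file below are assumptions for illustration), packaging the compiled classes and staging some input on HDFS might look like this:

# package the compiled classes from bin/ into mr.jar (bin/ is an assumed build output dir)
jar -cvf mr.jar -C bin/ .
# create the input directory the driver expects and upload a sample file (words.txt is made up)
hadoop fs -mkdir -p /wc/srcdata
hadoop fs -put words.txt /wc/srcdata

Then launch the job with the runner as the main class: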
[root@hadoop02 ~]# hadoop jar /root/Desktop/mr.jar cn.itheima.bigdata.hadoop.mr.wordcount.WordCountRunner
Java HotSpot(TM) Client VM warning: You have loaded library /home/hadoop/hadoop-2.6.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
15/12/05 06:07:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/05 06:07:07 INFO client.RMProxy: Connecting to ResourceManager at hadoop02/192.168.88.155:8032
15/12/05 06:07:08 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/12/05 06:07:09 INFO input.FileInputFormat: Total input paths to process : 1
15/12/05 06:07:09 INFO mapreduce.JobSubmitter: number of splits:1
15/12/05 06:07:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449322432664_0001
15/12/05 06:07:10 INFO impl.YarnClientImpl: Submitted application application_1449322432664_0001
15/12/05 06:07:10 INFO mapreduce.Job: The url to track the job: http://hadoop02:8088/proxy/application_1449322432664_0001/
15/12/05 06:07:10 INFO mapreduce.Job: Running job: job_1449322432664_0001
15/12/05 06:07:22 INFO mapreduce.Job: Job job_1449322432664_0001 running in uber mode : false
15/12/05 06:07:22 INFO mapreduce.Job:  map 0% reduce 0%
15/12/05 06:07:32 INFO mapreduce.Job:  map 100% reduce 0%
15/12/05 06:07:39 INFO mapreduce.Job:  map 100% reduce 100%
15/12/05 06:07:40 INFO mapreduce.Job: Job job_1449322432664_0001 completed successfully
15/12/05 06:07:41 INFO mapreduce.Job: Counters: 49
15-12-05 06:07:41 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=635
FILE: Number of bytes written=212441
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=338
HDFS: Number of bytes written=223
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7463
Total time spent by all reduces in occupied slots (ms)=4688
Total time spent by all map tasks (ms)=7463
Total time spent by all reduce tasks (ms)=4688
Total vcore-seconds taken by all map tasks=7463
Total vcore-seconds taken by all reduce tasks=4688
Total megabyte-seconds taken by all map tasks=7642112
Total megabyte-seconds taken by all reduce tasks=4800512
Map-Reduce Framework
Map input records=10
Map output records=41
Map output bytes=547
Map output materialized bytes=635
Input split bytes=114
Combine input records=0
Combine output records=0
Reduce input groups=30
Reduce shuffle bytes=635
Reduce input records=41
Reduce output records=30
Spilled Records=82
Shuffled Maps = 1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=211
CPU time spent (ms)=1350
Physical memory (bytes) snapshot=221917184
Virtual memory (bytes) snapshot=722092032
Total committed heap usage (bytes)=137039872
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=224
File Output Format Counters
Bytes Written=223
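With the job finished, the counts sit in the output directory as part files; with a single reducer the default file name is part-r-00000, so you can inspect the result like this:

hadoop fs -cat /wc/output/part-r-00000

Incidentally, the JobSubmitter warning near the top of the log is harmless here: implementing org.apache.hadoop.util.Tool and launching through ToolRunner would silence it and give you generic command-line option parsing for free.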
That covers the full "sample code of mapreduce in hadoop" example. Thank you for reading, and I hope the content is a helpful reference.