
How to run a Java source program on Hadoop in pseudo-distributed mode


Many newcomers are not clear about how to run a Java source program on Hadoop in pseudo-distributed mode. To help solve this problem, the following walks through the process in detail; anyone who needs it can follow along, and I hope you gain something from it.

After writing the source code, first compile it against the Hadoop jars (create the org directory first if it does not exist):

javac -classpath /usr/local/hadoop/hadoop-core-1.2.1.jar:/usr/local/hadoop/lib/commons-cli-1.2.jar -d org count.java

This generates three class files in the org directory: count.class, count$Map.class and count$Reduce.class. Next, package the three class files:

jar -cvf count.jar -C org/ .

This produces count.jar under the Hadoop root directory. Then create a folder on the distributed file system and put the data to be analyzed into it:

bin/hadoop fs -mkdir input
bin/hadoop fs -put ~/Downloads/Gowalla_totalCheckins.txt input

(~/Downloads/Gowalla_totalCheckins.txt is where my copy of the file lives.) Through localhost:50070 you can confirm that the txt data is now under input. Now run the program:

bin/hadoop jar count.jar count input output

When the job finishes, you will find that an output folder has been generated containing three files; the results are saved in part-r-00000.
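By the way, if the job ever fails with a ClassNotFoundException, a quick sanity check is to list what actually went into the jar with the standard jar tool (the expected entries follow from the compile step above):

jar -tf count.jar

It should list exactly the three class files named above. The records in Gowalla_totalCheckins.txt that we uploaded look like this: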

196514  2020-07-24T13:45:06Z  53.3648119     -2.2723465833  145064
196514  2020-07-24T13:44:58Z  53.360511233   -2.276369017   1275991
196514  2020-07-24T13:44:46Z  53.3653895945  -2.2754087046  376497
196514  2020-07-24T13:44:38Z  53.3663709833  -2.2700764333  98503
196514  2020-07-24T13:44:26Z  53.3674087524  -2.2783813477  1043431
196514  2020-07-24T13:44:08Z  53.3675663377  -2.278631763   881734
196514  2020-07-24T13:43:18Z  53.3679640626  -2.2792943689  207763
196514  2020-07-24T13:41:10Z  53.364905      -2.270824      1042822

The first column is the user id, the second is the login (check-in) time, the third is the latitude, the fourth is the longitude, and the fifth is the location id. The program analyzes each record's login time and counts how many logins fall into each time period.
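To make the bucketing concrete, here is a minimal standalone sketch of the formula used in the map function below, applied to the timestamp of the first sample record (the class name BucketDemo is just for illustration and is not part of the job):

public class BucketDemo {
    public static void main(String[] args) {
        // hh:mm:ss taken from the sample record 2020-07-24T13:45:06Z
        int hour = 13, minute = 45, second = 6;
        int secondsSinceMidnight = hour * 60 * 60 + minute * 60 + second; // 49506
        int k = secondsSinceMidnight / (3600 * 4); // integer division into 4-hour windows
        System.out.println(k); // prints 3, i.e. the 12:00-16:00 bucket
    }
}

So every check-in lands in one of six buckets (0 through 5), and the job simply counts how many records fall into each.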

Source code:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class count {

    public static class Map extends Mapper<Object, Text, IntWritable, IntWritable> {
        // implement the map function: extract hh:mm:ss from the timestamp
        // (the second column) and emit (4-hour bucket, 1)
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            int i = 0;
            int hour = 0, minute = 0, second = 0;
            while (itr.hasMoreTokens()) {
                String token = itr.nextToken();
                i++;
                if (i == 2) { // the timestamp, e.g. 2020-07-24T13:45:06Z
                    int indexOfT = token.indexOf('T');
                    int indexOfZ = token.indexOf('Z', indexOfT + 1);
                    String substr = token.substring(indexOfT + 1, indexOfZ); // hh:mm:ss
                    int blank1 = substr.indexOf(':');
                    int blank2 = substr.indexOf(':', blank1 + 1);
                    hour = Integer.parseInt(substr.substring(0, blank1), 10);
                    minute = Integer.parseInt(substr.substring(blank1 + 1, blank2), 10);
                    second = Integer.parseInt(substr.substring(blank2 + 1), 10);
                }
            }
            int k = (hour * 60 * 60 + minute * 60 + second) / (3600 * 4);
            context.write(new IntWritable(k), new IntWritable(1));
        }
    }

    public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
        // implement the reduce function: sum the ones emitted for each bucket
        public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: count <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "count");
        job.setJarByClass(count.class);
        // set the Map and Reduce processing classes
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class); // summing is associative, so Reduce doubles as a combiner
        job.setReducerClass(Reduce.class);
        // set the output types
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
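Two practical notes for re-running the job, assuming the same input and output paths as above: Hadoop refuses to start a job whose output directory already exists, and the results can also be read without the web UI:

bin/hadoop fs -rmr output                  # remove the old results first; Hadoop will not overwrite them
bin/hadoop fs -cat output/part-r-00000     # print the per-bucket counts to the terminal

With six 4-hour buckets, expect at most six lines in part-r-00000, each a bucket number followed by a tab and the count of logins in that period.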
