I. Overview
1. The Hadoop cluster used in this experiment runs in pseudo-distributed mode, and the related Eclipse configuration has already been completed.
2. Software versions: hadoop-2.7.3.tar.gz and apache-maven-3.5.0.rar.
II. Using Eclipse to connect to the Hadoop cluster for development
1. Configure Hadoop on the development host
① Unzip hadoop-2.7.3.tar.gz on the local host
② Replace the bin folder in the unzipped directory with the bin folder from the Windows build of Hadoop
③ Configure the Hadoop environment variables on Windows (a sketch follows)
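As a rough illustration (the install path is an assumption; use the directory you actually unzipped to), the Windows environment variables typically look like:

HADOOP_HOME=D:\hadoop-2.7.3
Path=%Path%;%HADOOP_HOME%\bin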
2. Configure the Hadoop cluster information in Eclipse
① Add the Hadoop installation path to Eclipse
② Configure the Hadoop cluster access information (a sketch follows)
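For example, the connection settings in the Eclipse Map/Reduce Locations view might look like the following. The ResourceManager port 8032 matches the address that appears in the job logs later in this article; the DFS port 9000 is an assumption for a default fs.defaultFS, so substitute your cluster's actual value:

Map/Reduce(V2) Master:  Host 192.168.100.141, Port 8032
DFS Master:             Host 192.168.100.141, Port 9000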
3. Disable permission checking on the Hadoop cluster
In hdfs-site.xml:

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
4. Create a file to test connection permissions
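A minimal sketch of such a test, assuming the NameNode listens at hdfs://192.168.100.141:9000 (the port is an assumption; use your cluster's fs.defaultFS) and the HDFS user is hadoop:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsConnectionTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The NameNode URI and user name are assumptions; substitute your own values.
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.100.141:9000"), conf, "hadoop");
        Path testFile = new Path("/user/hadoop/eclipse-test.txt");
        // Create (and overwrite if present) a small file to verify write access.
        try (FSDataOutputStream out = fs.create(testFile, true)) {
            out.writeUTF("written from eclipse");
        }
        System.out.println("file exists: " + fs.exists(testFile));
        fs.close();
    }
}

If this runs without an AccessControlException, Eclipse can write to the cluster.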
5. Install Maven
① Unzip Maven on the development host
② Add the Maven path to Eclipse
6. Create a new Maven project
7. Modify the Maven configuration file (pom.xml)
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.7.3</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
</dependencies>
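For context, a minimal pom.xml wrapping these dependencies might look like this (the groupId and artifactId are placeholders, not taken from the original project):

<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>wordcount</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <!-- the two dependencies shown above -->
    </dependencies>
</project>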
8. Create a new class for testing (WordCount)
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: emits (word, 1) for every token in the input.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts for each word. Because addition is associative,
    // the same class also serves as the combiner below.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // All arguments except the last are input paths; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
9. Configure WordCount
① Place log4j.properties in the same source folder as the WordCount class
② Set the run arguments of WordCount (input and output paths), for example:
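The program arguments in the Eclipse Run Configurations dialog might look like the following (the port 9000 is an assumption for a default fs.defaultFS; the paths match the ones used when running on the cluster below):

hdfs://192.168.100.141:9000/user/hadoop/input hdfs://192.168.100.141:9000/user/hadoop/output/out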
10. Run the test
III. Exporting and submitting the jar package
1. Export WordCount as a jar package
2. Upload the exported jar package to the Hadoop cluster
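One way to do the upload (an assumption; any transfer method works), given that the cluster host is named hadoop as in the prompts below and the jar was exported locally as wc.jar:

scp wc.jar hadoop@hadoop:~/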
[hadoop@hadoop ~]$ ls
wc.jar
3. Run the job
[hadoop@hadoop ~]$ hadoop jar wc.jar WordCount /user/hadoop/input/* /user/hadoop/output/out
17/09/06 22:36:56 INFO client.RMProxy: Connecting to ResourceManager at hadoop/192.168.100.141:8032
17/09/06 22:36:57 INFO input.FileInputFormat: Total input paths to process : 1
17/09/06 22:36:58 INFO mapreduce.JobSubmitter: number of splits:1
17/09/06 22:36:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504744740212_0001
17/09/06 22:36:59 INFO impl.YarnClientImpl: Submitted application application_1504744740212_0001
17/09/06 22:36:59 INFO mapreduce.Job: The url to track the job: http://hadoop:8088/proxy/application_1504744740212_0001/
17/09/06 22:36:59 INFO mapreduce.Job: Running job: job_1504744740212_0001
17/09/06 22:37:36 INFO mapreduce.Job: Job job_1504744740212_0001 running in uber mode : false
17/09/06 22:37:36 INFO mapreduce.Job: map 0% reduce 0%
17/09/06 22:38:26 INFO mapreduce.Job: map 100% reduce 0%
17/09/06 22:38:42 INFO mapreduce.Job: map 100% reduce 100%
17/09/06 22:38:46 INFO mapreduce.Job: Job job_1504744740212_0001 completed successfully
4. View the running results
[hadoop@hadoop ~]$ hdfs dfs -cat /user/hadoop/output/out/part-r-00000
"AS	1
"GCC	1
"License");	1
&	1
'Aalto	1
'Apache	4
'Bouncy	1
...