How to use Eclipse to run and debug MapReduce programs online on a remote Hadoop cluster

2025-04-04 Update. From: SLTechnology News&Howtos, shulou (Shulou.com)

This article explains how to use Eclipse on a Windows machine to run and debug MapReduce programs online on a remote Hadoop cluster. The method is simple, fast and practical.

Prerequisites:

1. The Hadoop distribution I use is hadoop-2.3.0-cdh6.1.0.tar.

2. The following is the content of my core Hadoop configuration files:

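core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/yinkaipeng/tmp</value>
        <description>Abase for other temporary directories.</description>
      </property>
      <property>
        <name>hadoop.proxyuser.hduser.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.hduser.groups</name>
        <value>*</value>
      </property>
    </configuration>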

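hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/data/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/data/dfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
    </configuration>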

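mapred-site.xml

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
      </property>
    </configuration>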

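yarn-site.xml

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
      </property>
    </configuration>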
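
Before connecting from Eclipse, it can help to confirm which name/value pairs a configuration file really contains. The following is a minimal sketch that uses only the JDK's XML parser, assuming the files follow the usual Hadoop layout of property elements with name and value children. SiteXmlReader is a hypothetical helper written for this article, not a Hadoop class.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): reads <property> name/value pairs
// from a Hadoop-style configuration document with the JDK's own XML parser.
class SiteXmlReader {

    // Returns the properties in document order; fails on malformed XML.
    static Map<String, String> parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed configuration file", e);
        }
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = text(property, "name");
            String value = text(property, "value");
            if (name != null && value != null) {
                props.put(name, value);
            }
        }
        return props;
    }

    // Trimmed text of the first descendant element with the given tag, or null.
    private static String text(Element parent, String tag) {
        NodeList children = parent.getElementsByTagName(tag);
        return children.getLength() == 0 ? null : children.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String coreSite =
                "<configuration>"
              + "<property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>"
              + "<property><name>io.file.buffer.size</name><value>131072</value></property>"
              + "</configuration>";
        System.out.println(parse(coreSite));
    }
}
```

The values read this way (for example fs.defaultFS and yarn.resourcemanager.address) are the ones the Eclipse plugin and the client program need to match.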

If you do not configure ZooKeeper in Hadoop 2.0, just add your DataNode hosts to the slaves file.

Once the Hadoop cluster is configured, the next step is to connect to it from Eclipse.

I use the hadoop-eclipse-plugin-2.2.0 plugin, downloaded from the Internet.

Let's get to work.

Start the Hadoop cluster.

Copy hadoop-eclipse-plugin-2.2.0 to the plugins directory of Eclipse and start Eclipse.
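
A quick way to rule out network problems before configuring the plugin is to check that the cluster ports are reachable from the Windows machine. The sketch below is an illustration only: ClusterPortCheck is a hypothetical helper, and the host name and ports are the ones from the configuration files above (fs.defaultFS and the ResourceManager addresses).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks that a TCP port on the cluster accepts connections.
class ClusterPortCheck {

    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host name could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host name and ports taken from the configuration files above.
        String host = "master";
        int[] ports = {9000, 8032, 8030};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable: "
                    + isReachable(host, port, 2000));
        }
    }
}
```

If the host name master cannot be resolved on Windows, the check returns false for every port. In that case the name probably needs to be added to the Windows hosts file first.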

Connecting at this point will not work yet. Because we are on Windows, the following steps are also needed:

Change the current Windows user name to the user name that starts Hadoop on the cluster.

Point Eclipse at the Hadoop directory, and copy the bin directory of hadoop-common-2.2.0-bin-master into the Eclipse workspace.

Note: the Hadoop directory above is the extracted Hadoop distribution that I copied from the Linux system.
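
The user name matters because the cluster checks file and job permissions against the name the client presents. As a hedged sketch, the helper below shows the lookup order that, as far as I know, Hadoop clients use with simple (non-Kerberos) authentication: a HADOOP_USER_NAME environment variable or system property first, then the operating-system user. ClusterUserName is a hypothetical class, and that lookup order is an assumption to verify against your Hadoop version.

```java
// Hypothetical helper: picks the user name a client would present to the cluster.
// Assumption: with simple authentication, a HADOOP_USER_NAME environment variable
// or system property takes precedence over the operating-system user name.
class ClusterUserName {

    static String resolve(String envValue, String propertyValue, String osUser) {
        if (envValue != null && !envValue.isEmpty()) {
            return envValue;
        }
        if (propertyValue != null && !propertyValue.isEmpty()) {
            return propertyValue;
        }
        return osUser;
    }

    // Applies the same order to the current JVM.
    static String current() {
        return resolve(System.getenv("HADOOP_USER_NAME"),
                System.getProperty("HADOOP_USER_NAME"),
                System.getProperty("user.name"));
    }

    public static void main(String[] args) {
        System.out.println("User name presented to the cluster: " + current());
    }
}
```

If the printed name differs from the user that starts Hadoop on the cluster, either rename the Windows user as described above or, under the stated assumption, set HADOOP_USER_NAME before the job is submitted.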

At this point HDFS operations work without problems, but running a MapReduce job still reports an error. To fix it, perform the following two steps:

1. Add the org.apache.hadoop.io.nativeio package from the Hadoop source code to the project and make the required modifications to it.

2. In the main function of the MapReduce program, set the location of the local Hadoop directory as an environment variable.
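
Step 2 can be sketched as follows. This is an illustration under stated assumptions, not the author's exact code: it assumes the setting in question is the hadoop.home.dir system property, which Hadoop's Windows support uses to locate bin\winutils.exe. The class name LocalHadoopHome and the path D:\hadoop are placeholders.

```java
import java.io.File;

// Hypothetical driver fragment: class name and path are placeholders.
class LocalHadoopHome {

    // Sets the hadoop.home.dir system property and reports whether
    // bin/winutils.exe exists under that directory.
    static boolean configure(String hadoopHome) {
        System.setProperty("hadoop.home.dir", hadoopHome);
        return new File(new File(hadoopHome, "bin"), "winutils.exe").isFile();
    }

    public static void main(String[] args) {
        // Assumed location of the extracted Hadoop directory on the Windows machine.
        boolean found = configure("D:\\hadoop");
        System.out.println("hadoop.home.dir=" + System.getProperty("hadoop.home.dir")
                + ", winutils.exe found: " + found);
        // The job configuration and submission would follow from here.
    }
}
```

The property has to be set before any Hadoop class is used, which is why it belongs at the very top of the main function.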

That's it. You can now run and debug MapReduce jobs on the Hadoop cluster from Eclipse on Windows.
