2025-02-25 Update From: SLTechnology News & Howtos
This section describes two ways to debug the Hadoop source code: remote debugging with Eclipse and printing debug logs. Both methods work in pseudo-distributed and fully distributed modes; this section focuses on debugging Hadoop in pseudo-distributed mode.
(1) Remote debugging with Eclipse
Taking ResourceManager as an example, remote debugging with Eclipse can be carried out in three steps.
Step 1: start Hadoop in debug mode.
Run the following shell commands in the Hadoop installation directory:
export YARN_RESOURCEMANAGER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y"
sbin/start-all.sh
After running the script, you will see that the Shell command line terminal displays the following message:
Listening for transport dt_socket at address: 8788
This indicates that ResourceManager is suspended and listening on port 8788, waiting for a debugger to attach.
Step 2: set breakpoints.
In the Java project "hadoop-2.0" created earlier, locate the ResourceManager-related code and set breakpoints wherever you are interested.
Step 3: debug the Hadoop program in Eclipse.
In the Eclipse menu bar, select "Run" → "Debug Configurations" → "Remote Java Application", fill in a name for the debug configuration, the host where ResourceManager runs, and the listening port (8788), then select the Hadoop source code project to enter debug mode.
During debugging, the information output by ResourceManager is stored in the yarn-XXX-resourcemanager-localhost.log file in the log directory (XXX is the current user name). You can follow the logs printed during debugging with the following command:
tail -f logs/yarn-XXX-resourcemanager-localhost.log
(2) Printing Hadoop debug logs
Hadoop uses Apache log4j as its basic logging library. log4j has five log levels: DEBUG, INFO, WARN, ERROR and FATAL, ordered by severity as DEBUG < INFO < WARN < ERROR < FATAL. The output rule is: only messages at a level not lower than the configured level are output. For example, if the level is set to INFO, then INFO, WARN, ERROR and FATAL messages are output, while DEBUG messages, being lower than INFO, are suppressed.
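This ordering rule can be sketched as a small self-contained model (illustrative only, not the log4j API itself):

```java
public class LogLevelRule {
    // log4j 1.x severity ordering, lowest to highest, as described above.
    enum Lvl { DEBUG, INFO, WARN, ERROR, FATAL }

    // A message is emitted only when its level is not lower than the configured threshold.
    static boolean emitted(Lvl threshold, Lvl message) {
        return message.ordinal() >= threshold.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(emitted(Lvl.INFO, Lvl.DEBUG)); // false: DEBUG < INFO
        System.out.println(emitted(Lvl.INFO, Lvl.WARN));  // true: WARN >= INFO
    }
}
```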
Most Java files in the Hadoop source contain DEBUG-level log statements, but the default log level is INFO. To see more detailed runtime information, you can enable DEBUG logging in the following ways.
Method 1: use the Hadoop shell command.
The daemonlog command in the Hadoop script can view and modify the log level of a class. For example, you can view the log level of the NodeManager class with the following command:
bin/hadoop daemonlog -getlevel ${nodemanager-host}:8042 \
org.apache.hadoop.yarn.server.nodemanager.NodeManager
The log level of the NodeManager class can be changed to DEBUG with the following command:
bin/hadoop daemonlog -setlevel ${nodemanager-host}:8042 \
org.apache.hadoop.yarn.server.nodemanager.NodeManager DEBUG
Here ${nodemanager-host} is the host running the NodeManager service, and 8042 is the NodeManager HTTP port.
Method 2: through the Web interface.
Users can also view and modify the log level of a class through the Web interface. For example, the log level of the NodeManager class can be viewed and modified at the following URL:
http://${nodemanager-host}:8042/logLevel
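This page also accepts the target class and level as query parameters, so a level change can be scripted. A minimal sketch that only builds the request URL (the `log` and `level` parameter names are an assumption based on Hadoop's logLevel servlet, and the host is hypothetical):

```java
public class LogLevelUrl {
    // Builds the URL that would set a class's log level via the /logLevel page.
    static String setLevelUrl(String host, int port, String clazz, String level) {
        return "http://" + host + ":" + port + "/logLevel?log=" + clazz + "&level=" + level;
    }

    public static void main(String[] args) {
        System.out.println(setLevelUrl("localhost", 8042,
                "org.apache.hadoop.yarn.server.nodemanager.NodeManager", "DEBUG"));
    }
}
```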
Method 3: modify the log4j.properties file.
The two methods above change the log level only temporarily; it is reset when Hadoop restarts. To change the log level permanently, add the following line to the log4j.properties file in the target node's configuration directory:
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.NodeManager=DEBUG
In addition, to debug a particular Java file it is sometimes useful to route that file's logs to a separate file, which can be done by adding the following to log4j.properties:
# route NodeManager's DEBUG logs to a custom appender named TTOUT
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.NodeManager=DEBUG,TTOUT
# set TTOUT to output to a file
log4j.appender.TTOUT=org.apache.log4j.FileAppender
# set the file path
log4j.appender.TTOUT.File=${hadoop.log.dir}/NodeManager.log
# set the layout of the file
log4j.appender.TTOUT.layout=org.apache.log4j.PatternLayout
# set the output format
log4j.appender.TTOUT.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
These configuration options write DEBUG logs from NodeManager.java to the NodeManager.log file in the log directory.
To track how a variable's value changes while reading the source code, readers may need to add some DEBUG log statements themselves. Most classes in the Hadoop source define a log object that can print logs at all levels. For example, NodeManager defines the object LOG with the following code:
public static final Log LOG = LogFactory.getLog(NodeManager.class);
You can use the LOG object to print debug logs. For example, add the following line at the beginning of the main method of NodeManager:
LOG.debug("Start to launch NodeManager...");
Then recompile the Hadoop source code, change the log level of org.apache.hadoop.yarn.server.nodemanager.NodeManager to DEBUG, and restart Hadoop to see this debug message.
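Hadoop itself logs through commons-logging over log4j, which require extra jars; the same level-gated pattern can be sketched with the JDK's built-in logging API, where FINE plays the role of DEBUG (a stdlib stand-in for illustration, not the API Hadoop uses):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugLogDemo {
    // Logger named after the Hadoop class, purely for illustration.
    static final Logger LOG =
            Logger.getLogger("org.apache.hadoop.yarn.server.nodemanager.NodeManager");

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);
        // At INFO, FINE (the DEBUG analogue) is filtered out by the logger.
        System.out.println(LOG.isLoggable(Level.FINE)); // false

        LOG.setLevel(Level.FINE);
        System.out.println(LOG.isLoggable(Level.FINE)); // true
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Start to launch NodeManager...");
        }
    }
}
```

Guarding the call with isLoggable mirrors the common isDebugEnabled idiom, which avoids building the message string when DEBUG output is disabled.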