
Example Analysis of performance Metrics and logs in hadoop

2025-01-27 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

The editor shares with you this example analysis of performance metrics and logs in Hadoop. Since most people know little about the topic, this article is offered for your reference; I hope you learn a lot from reading it. Let's dive in!

Hadoop metric types

The metrics of Hadoop daemons can be divided into different groups according to the context to which they belong, as follows:

JVM metrics: these metrics are generated by the JVMs running in the cluster and include JVM heap size and garbage-collection-related metrics, such as current heap memory usage (MemHeapUsed) and total GC count (GcCount).

RPC metrics: metrics in the rpc context include hostnames and ports, as well as metrics such as bytes sent (SentBytes), currently open connections (NumOpenConnections), and the number of authentication failures.

DFS metrics: the dfs context includes metrics related to the NameNode, the HDFS file system, DataNodes, and JournalNodes. DFS metrics can tell you, for example, whether there are a large number of file creation and deletion operations in the cluster.
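These metric groups can be inspected directly over HTTP, since each Hadoop daemon exposes its metrics through a JMX JSON servlet. A minimal sketch, assuming a NameNode host named namenode listening on the default Hadoop 3 web port 9870 (Hadoop 2 uses 50070), and an RPC server on port 8020; the hostname and ports are assumptions for your cluster:

```shell
# Query the NameNode's JMX servlet for the JVM metrics group;
# heap usage and GC counters appear in the returned JSON.
curl -s 'http://namenode:9870/jmx?qry=Hadoop:service=NameNode,name=JvmMetrics'

# RPC metrics (SentBytes, NumOpenConnections, ...) for the RPC server on port 8020:
curl -s 'http://namenode:9870/jmx?qry=Hadoop:service=NameNode,name=RpcActivityForPort8020'
```

The same servlet path (/jmx) works on the DataNode, ResourceManager, and NodeManager web ports, so one pattern covers all the metric contexts above.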

Log messages for Hadoop

You can access Hadoop log messages for Spark and other jobs by browsing individual log files or through Hadoop's built-in web interfaces. Most of the time it is better to access the logs through a web interface, because it saves time and lets you quickly find the cause of performance problems or job failures.

Hadoop generates two main types of logs:

It generates logs for daemons such as the NameNode and DataNode. Daemon logs are mainly used by administrators, because they help troubleshoot unexpected failures of key Hadoop services such as the DataNode and NameNode.

Hadoop also generates logs for each application running in the cluster. Application logs can be used by developers to understand the reasons for job failures and performance degradation.

You can view Hadoop logs in a variety of ways:

The Hadoop web UIs, especially the ResourceManager web UI, avoid the trouble of locating log storage paths and opening log files by hand. You can also view logs through the JobHistory web UI.

Check the log information directly from the log file

If log aggregation is enabled, application logs are aggregated into HDFS storage.
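Log aggregation is switched on in yarn-site.xml; a minimal sketch using the standard YARN configuration key (aggregation is off by default):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```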

Check through the yarn command:

The yarn application command manages the following tasks:

List the applications running in the cluster

Kill a running application

Get the status of a running application

View the Yarn application

yarn application -list retrieves a list of all jobs, regardless of their status. Jobs can be in the following states: NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, and KILLED (ALL matches any of them). Specify the -appStates option to filter by state:

yarn application -list -appStates RUNNING

Check the status of the application

yarn application -status <applicationId>

Kill a running job

yarn application -kill <applicationId>

Check node status

yarn node -list -all lists all nodes in the cluster and their status.
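Put together, a typical session with these commands might look like the following; the application id is hypothetical, and the output depends on your cluster:

```shell
# List only the applications currently running
yarn application -list -appStates RUNNING

# Check the status of one of them (hypothetical application id)
yarn application -status application_1526100291229_0005

# Kill it if it is misbehaving
yarn application -kill application_1526100291229_0005

# List every node in the cluster and its state
yarn node -list -all
```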

Get Job Log

The syntax of the yarn logs command:

yarn logs -applicationId <applicationId>

You can only get the logs of jobs that have finished running.
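For example, to pull the aggregated logs of a finished job, and to narrow them to a single container (both ids below are hypothetical):

```shell
# Fetch all aggregated logs for the application
yarn logs -applicationId application_1526100291229_0005

# Restrict the output to one container of that application
# (older Hadoop releases may also require -nodeAddress)
yarn logs -applicationId application_1526100291229_0005 \
  -containerId container_1526100291229_0005_01_000001
```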

Location where Hadoop stores logs

1. HDFS: this is the location where Hadoop creates a staging directory to store job execution files, such as the job.xml that contains the Hadoop parameters for running the job.

2. NodeManager local directory: this is a directory created on the local file system where Hadoop stores the shell scripts produced by the NodeManager service to execute the ApplicationMaster container. You can use the yarn.nodemanager.local-dirs parameter in the yarn-site.xml file to specify the NodeManager local directory location.

This parameter provides a list of directories where the NodeManager stores its local files. The actual application files live under ${yarn.nodemanager.local-dirs}/usercache/<user>/...; each NodeManager keeps a local application cache under its local directory.

3. NodeManager log directory: this is a local directory on Linux where the NodeManager stores the actual log files of applications that users are running. All containers that execute jobs on this node's NodeManager store their application logs in this directory. Use the yarn.nodemanager.log-dirs parameter to specify the location of the NodeManager log directory.

There is no need to worry that the nm-local-dirs directory will fill up: the job files in the appcache subdirectory are deleted automatically when a job completes. Some jobs do contain large files, however, and for debugging the configuration property yarn.nodemanager.delete.debug-delay-sec specifies how long to keep an application's local directories after the application finishes; once that time expires, the NodeManager's DeletionService deletes the application's local file directory structure.
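The directory locations and the debug retention delay discussed above are all set in yarn-site.xml. A sketch with illustrative paths (the paths and the 600-second delay are assumptions; the property names are standard YARN keys):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/yarn/local</value>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/data/yarn/logs</value>
</property>
<property>
  <!-- keep application directories 10 minutes after completion, for debugging -->
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>600</value>
</property>
```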

Where Hadoop stores aggregated logs

When log aggregation is enabled, the NodeManager concatenates all container logs into a single file and saves it in HDFS. You can use the yarn.nodemanager.remote-app-log-dir parameter to configure where Hadoop stores aggregated logs in HDFS; it is commonly set to /tmp/hadoop/logs/.
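A sketch of the corresponding yarn-site.xml entry, using the location mentioned above:

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/hadoop/logs</value>
</property>
```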

There are three ways to get application logs:

1. Obtain from hdfs

2. Through the Hadoop web UI: click the ApplicationMaster link for an unfinished application, then click logs under the tab.

3. After the job is completed, view them from the JobHistoryServer UI.
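For option 1, the aggregated logs can be listed and copied out of HDFS directly. The paths and application id below are illustrative, following the aggregation directory configured above and the usual {remote-app-log-dir}/{user}/logs/{applicationId} layout:

```shell
# List the aggregated log files for one application
hdfs dfs -ls /tmp/hadoop/logs/<user>/logs/application_1526100291229_0005

# Copy them to the local file system for inspection
hdfs dfs -get /tmp/hadoop/logs/<user>/logs/application_1526100291229_0005 ./applogs
```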

The above is the full content of the article "Example Analysis of Performance Metrics and Logs in Hadoop". Thank you for reading! I hope the shared content has helped you; if you want to learn more, welcome to follow the Internet Technology channel!


