Simple usage of automatic HDFS data replication mechanism

2025-03-31 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces simple usage of an automated mechanism for copying data out of HDFS. It reviews the broad options for exporting data from HDFS and then shows a straightforward way to automate the export with the HDFS File Slurper.

In the first half of this chapter, we examined two mechanisms for moving semi-structured and binary data into HDFS: the open source HDFS File Slurper project, and Oozie, which can trigger a data ingress workflow. The challenge of using the local file system for ingress (and egress) is that map and reduce tasks running on the cluster do not have access to the file system on a specific server. There are three broad options for moving data from HDFS to a file system:

Host a proxy tier on a server, such as a web server, and write to it from MapReduce.

Write to the local file system in MapReduce and then, as a post-processing step, trigger a script on the remote server to move the data.

Run a process on the remote server to pull data directly from HDFS.

The third option is the preferred method because it is the simplest and most efficient, so it is the focus of this section. We will look at how to use the HDFS File Slurper to automatically move files out of HDFS to the local file system.

Automatic mechanism for exporting files from HDFS

Suppose you have files written by MapReduce in HDFS, and you want to automatically extract them to the local file system. No Hadoop tool supports this type of functionality, so you must look at other methods.

Problem

Automatically move files from HDFS to the local file system.

Solution

The HDFS File Slurper can be used to copy files from HDFS to the local file system.

Discussion

The goal here is to use the HDFS File Slurper project (https://github.com/alexholmes/hdfs-file-slurper) to assist with the automation. We covered the HDFS File Slurper in detail in the previous article, so please read that coverage before continuing with this technique.

The HDFS File Slurper also supports moving data from HDFS to local directories. All we need to do is flip the source and destination directories in the Slurper configuration file, so that the source directory is in HDFS and the destination directory is on the local file system.

Note that not only the source directory but also the work, complete, and error directories are in HDFS. This is because files need to be moved atomically between these directories without incurring the expensive overhead of copying them across file systems.
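
The reasoning behind keeping the bookkeeping directories on the source side can be sketched in a few lines of Python. This is a conceptual illustration only, not the Slurper's actual code: the function name `export_file` and its parameters are hypothetical, and local directories stand in for HDFS so the sketch can run anywhere. The point it shows is that every bookkeeping step is a rename within one file system, and the only byte-for-byte copy is the single transfer to the destination.

```python
import shutil
from pathlib import Path


def export_file(src_file: Path, work_dir: Path, complete_dir: Path,
                error_dir: Path, dest_dir: Path) -> Path:
    """Move one file through a source -> work -> complete/error lifecycle.

    Hypothetical helper for illustration. src_file, work_dir, complete_dir
    and error_dir are assumed to live on the same file system, so each
    rename between them is a cheap metadata operation, not a data copy.
    """
    # Claim the file by renaming it into the work directory.
    work_file = work_dir / src_file.name
    src_file.rename(work_file)
    try:
        # The only step that actually copies bytes across file systems.
        exported = dest_dir / src_file.name
        shutil.copyfile(work_file, exported)
    except OSError:
        # On failure, park the file in the error directory for inspection.
        work_file.rename(error_dir / src_file.name)
        raise
    # On success, record completion with another same-file-system rename.
    work_file.rename(complete_dir / src_file.name)
    return exported
```

If the work, complete, and error directories were on the destination file system instead, each of those bookkeeping moves would turn into a full copy of the file.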

Summary

At this point, you may want to know how to trigger the Slurper to copy a directory that you have just written with a MapReduce job. When a MapReduce job completes successfully, it creates a file called _SUCCESS in the job output directory. This seems like the perfect trigger for starting an egress process that copies the content to the local file system. Oozie does have a mechanism that can trigger a workflow when it detects these Hadoop "success" files, but the challenge is that any work performed by Oozie is performed in MapReduce, so it cannot be used to perform the transfer directly. Instead, you can write your own script that polls HDFS for completed directories and then triggers a file copy process. That file copy process can be the Slurper or, if the source files need to remain intact, a simple hadoop fs -get command.
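
A polling script of the kind described above might look like the following minimal sketch. It assumes the `hadoop` command is on the PATH of the machine running the script, and it uses `hadoop fs -get` so the source files stay in HDFS. The function names, the `run` and `sleep` parameters, and the polling interval are all hypothetical choices made for illustration; the command runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Optional, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(args: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(args), check=False).returncode


def job_succeeded(hdfs_dir: str, run: Runner = run_command) -> bool:
    # `hadoop fs -test -e <path>` exits with 0 when the path exists.
    marker = f"{hdfs_dir.rstrip('/')}/_SUCCESS"
    return run(["hadoop", "fs", "-test", "-e", marker]) == 0


def copy_to_local(hdfs_dir: str, local_dir: str, run: Runner = run_command) -> bool:
    # `hadoop fs -get` copies to the local file system and leaves HDFS untouched.
    return run(["hadoop", "fs", "-get", hdfs_dir, local_dir]) == 0


def poll_and_copy(hdfs_dirs: Iterable[str], local_dir: str,
                  run: Runner = run_command,
                  interval_seconds: float = 60.0,
                  max_polls: Optional[int] = None,
                  sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll job output directories and copy each one once _SUCCESS appears."""
    pending = list(hdfs_dirs)
    copied: List[str] = []
    polls = 0
    while pending and (max_polls is None or polls < max_polls):
        for hdfs_dir in list(pending):
            if job_succeeded(hdfs_dir, run) and copy_to_local(hdfs_dir, local_dir, run):
                pending.remove(hdfs_dir)
                copied.append(hdfs_dir)
        polls += 1
        # Wait before the next round only if there is still work to do.
        if pending and (max_polls is None or polls < max_polls):
            sleep(interval_seconds)
    return copied
```

For example, `poll_and_copy(["/output/job1"], "/data/export")` would check the hypothetical directory `/output/job1` once a minute and copy it to `/data/export` as soon as the job has written its _SUCCESS file. A directory whose copy fails stays pending and is retried on the next poll.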

That concludes this look at a simple, automated mechanism for copying data out of HDFS. Combining the theory with hands-on practice is the best way to learn it, so go and try it.
