

How to Read Data from a Hadoop URL


This article introduces how to read data from a Hadoop URL. Many people run into trouble with this in real-world work, so let the editor walk you through how to handle these situations. I hope you will read carefully and get something out of it!

One of the easiest ways to read a file from the Hadoop file system is to use a java.net.URL object to open a data stream and read from it. The general form is as follows:

InputStream in = null;
try {
    in = new URL("hdfs://host/path").openStream();
    // process in
} finally {
    IOUtils.closeStream(in);
}

A little extra work is needed to get Java to recognize Hadoop's URL scheme: we call the setURLStreamHandlerFactory method on URL with an instance of FsUrlStreamHandlerFactory. This method can be called only once per Java virtual machine, so it is usually executed in a static block. This limitation means that if some other part of the program (perhaps a third-party component outside your control) sets a URLStreamHandlerFactory, we can no longer read data from Hadoop this way. Another approach is discussed in the next section.
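To make the once-per-JVM limitation concrete, here is a minimal sketch (not from the original article; the class name HdfsUrlSetup is hypothetical). It relies on the fact that URL.setURLStreamHandlerFactory throws a java.lang.Error if a factory has already been installed, so a program can at least detect the situation:

import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;

public class HdfsUrlSetup {
    static {
        try {
            // Succeeds only if no other factory has been set in this JVM
            URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
        } catch (Error e) {
            // Another factory was installed first; hdfs:// URLs opened
            // through java.net.URL will not be recognized in this process
        }
    }
}

If the factory cannot be installed, the alternative approach mentioned for the next section is the usual fallback.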

Example 3-1 shows a program that displays a file from the Hadoop file system on standard output, similar to the Unix cat command.

Example 3-1. Displaying a file from the Hadoop file system on standard output using a URLStreamHandler

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {

    static {
        // Register Hadoop's handler for hdfs:// URLs; may run only once per JVM
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream in = null;
        try {
            in = new URL(args[0]).openStream();
            // Copy the stream to standard output, 4096 bytes at a time;
            // false means copyBytes does not close the streams itself
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}

We use the handy IOUtils class that comes with Hadoop to close the stream in the finally clause and to copy bytes between the input stream and the output stream (System.out in this case). The last two parameters of the copyBytes method are the size of the buffer used for copying and whether to close the streams when the copy finishes. Here we close the input stream ourselves, and System.out does not need to be closed.
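As a usage sketch (not from the original article): assuming the compiled URLCat class is on the Hadoop classpath, and with a placeholder host and file path, the program could be run with the hadoop command, which can execute an arbitrary class name:

% export HADOOP_CLASSPATH=.
% hadoop URLCat hdfs://localhost/path/to/file.txt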

This is the end of "How to Read Data from a Hadoop URL". Thank you for reading. If you want to learn more about the industry, you can follow the site; the editor will keep publishing practical, high-quality articles for you!
