
How to connect Java and Python API interfaces in HDFS

2025-01-16 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

Today I will talk about how to work with HDFS through its Java and Python APIs. Many readers may not be familiar with this, so I have summarized the main steps below. I hope you find the article useful.

We now move on to operating HDFS through the Java and Python APIs. Scala may be covered later.

Before covering the Java API, a word about the IDE I use: IntelliJ IDEA Community Edition 2020.3 (x64).

Java API

We start by creating a Maven project. As for the Maven configuration, in IDEA I configure the Maven download source to be the Alibaba Cloud (Aliyun) mirror.

The mirror is set in the corresponding settings.xml, in my case D:\apache-maven-3.8.1-bin\apache-maven-3.8.1\conf\settings.xml.

Let's create a Maven project and add the common dependencies.

Add the hadoop-client dependency, preferably in the same version as the Hadoop installation (3.1.4 here), and add the JUnit dependency for unit tests.

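    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.1.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>3.1.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.4</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
        </dependency>
    </dependencies>

HDFS file upload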

We only write a test class here: create a new Java file, main.java.

Without an HDFS configuration, FileSystem starts out as the local file system, so it needs to be initialized as an HDFS file system.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.junit.Test;

    import java.net.URI;

    public class main {
        @Test
        public void testPut() throws Exception {
            // There are several ways to obtain a FileSystem; only one is shown here
            // (building it from a URI is the most common).
            Configuration configuration = new Configuration();
            // "hadoop" is the user account on the Hadoop cluster.
            // 9000 is the NameNode RPC port used by this cluster (see fs.defaultFS).
            FileSystem fileSystem = FileSystem.get(
                    new URI("hdfs://192.168.147.128:9000"), configuration, "hadoop");
            // Upload f:/stopword.txt to /user/stopword.txt
            fileSystem.copyFromLocalFile(new Path("f:/stopword.txt"), new Path("/user/stopword.txt"));
            fileSystem.close();
        }
    }

In HDFS, /user/stopword.txt now holds the machine-learning stop-word list I just uploaded.

HDFS file download

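Since FileSystem has to be initialized for every test, I am lazy and let a JUnit @Before method do it before each test (the code below still shows the initialization inline).

The API for downloading an HDFS file is copyToLocalFile. Its first argument (delSrc) says whether to delete the source file in HDFS, and its last argument (useRawLocalFileSystem) says whether to write through the raw local file system, which skips the local .crc checksum file. The specific code is as follows.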

    @Test
    public void testDownload() throws Exception {
        Configuration configuration = new Configuration();
        FileSystem fileSystem = FileSystem.get(
                new URI("hdfs://192.168.147.128:9000"), configuration, "hadoop");
        // Keep the HDFS source (false) and use the raw local file system (true)
        fileSystem.copyToLocalFile(false, new Path("/user/stopword.txt"), new Path("stop.txt"), true);
        fileSystem.close();
        System.out.println("over");
    }

Python API

This section mainly introduces the hdfs library.

Install the hdfs library with the command pip install hdfs. Before using it, run hadoop fs -chmod -R 777 / to grant read, write, and execute permissions to all users on the root directory and everything under it.
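To make the 777 mode concrete, here is a small sketch that uses only Python's standard library and does not talk to HDFS. The helper name describe_mode is hypothetical; it simply renders an octal mode the way a directory listing would show it.

```python
import stat


def describe_mode(octal_mode: str, is_dir: bool = True) -> str:
    """Render an octal permission string such as '777' as an ls-style string."""
    bits = int(octal_mode, 8)  # '777' -> 0o777
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | bits)


if __name__ == "__main__":
    print(describe_mode("777"))                # drwxrwxrwx
    print(describe_mode("755"))                # drwxr-xr-x
    print(describe_mode("644", is_dir=False))  # -rw-r--r--
```

With 777, the owner, the group, and everyone else all get read, write, and execute access, which is why the Python client can then create and delete entries regardless of which user it connects as.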

    >>> from hdfs.client import Client
    >>> # Hadoop 2.x uses port 50070; Hadoop 3.x uses port 9870
    >>> client = Client('http://192.168.147.128:9870')
    >>> client.list('/')  # list the HDFS root directory
    ['hadoop-3.1.4.tar.gz']
    >>> client.makedirs('/test')
    >>> client.list('/')
    ['hadoop-3.1.4.tar.gz', 'test']
    >>> client.delete("/test")
    True
    >>> client.download('/hadoop-3.1.4.tar.gz', 'C:\\Users\\YIUYE\\Desktop')
    'C:\\Users\\YIUYE\\Desktop\\hadoop-3.1.4.tar.gz'
    >>> client.upload('/', 'C:\\Users\\YIUYE\\Desktop\\demo.txt')
    '/demo.txt'
    >>> client.list('/')
    ['demo.txt', 'hadoop-3.1.4.tar.gz']
    >>> # demo.txt contains: Hello\nhdfs
    >>> with client.read("/demo.txt") as reader:
    ...     print(reader.read())
    ...
    b'Hello\r\nhdfs\r\n'
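Two details from the session above can be sketched in plain Python. The web port depends on the Hadoop major version, and reader.read() returns bytes, here with the Windows line endings (\r\n) of the uploaded file. The helper names webhdfs_url and decode_lines are hypothetical and not part of the hdfs library, and the ports are the default NameNode HTTP ports, so a cluster that overrides them will need different values.

```python
from typing import List


def webhdfs_url(host: str, hadoop_major: int) -> str:
    """Build the NameNode web address from the default HTTP port for 2.x or 3.x."""
    default_ports = {2: 50070, 3: 9870}
    if hadoop_major not in default_ports:
        raise ValueError(f"unsupported Hadoop major version: {hadoop_major}")
    return f"http://{host}:{default_ports[hadoop_major]}"


def decode_lines(raw: bytes, encoding: str = "utf-8") -> List[str]:
    """Decode bytes read from HDFS and split them into lines without line endings."""
    return raw.decode(encoding).splitlines()


if __name__ == "__main__":
    print(webhdfs_url("192.168.147.128", 3))
    print(decode_lines(b"Hello\r\nhdfs\r\n"))
```

The URL returned by webhdfs_url is what would be passed to Client(...), and decode_lines would be applied to the result of reader.read().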

Compared with the Java API, connecting through the Python API is really simple.
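For comparison with the Java testPut test, here is a minimal sketch of the same upload step in Python. The function put_and_check and the FakeClient class are hypothetical names introduced only for illustration. The function assumes a client object that offers upload(hdfs_path, local_path) and list(hdfs_path) in the way hdfs.client.Client does, and FakeClient is an in-memory stand-in so that the sketch runs without a cluster.

```python
import posixpath


def put_and_check(client, local_path: str, hdfs_dir: str = "/") -> str:
    """Upload local_path into hdfs_dir and confirm it shows up in the listing.

    `client` is expected to behave like hdfs.client.Client, i.e. to provide
    upload(hdfs_path, local_path) and list(hdfs_path).
    """
    remote_path = client.upload(hdfs_dir, local_path)
    name = posixpath.basename(remote_path)
    if name not in client.list(hdfs_dir):
        raise RuntimeError(f"{name} not found in {hdfs_dir} after upload")
    return remote_path


class FakeClient:
    """In-memory stand-in used only to exercise put_and_check without a cluster."""

    def __init__(self):
        self.files = []

    def upload(self, hdfs_path, local_path):
        # Keep only the file name; handle both Windows and POSIX separators.
        name = local_path.replace("\\", "/").rsplit("/", 1)[-1]
        self.files.append(name)
        return posixpath.join(hdfs_path, name)

    def list(self, hdfs_path):
        return sorted(self.files)


if __name__ == "__main__":
    print(put_and_check(FakeClient(), "C:\\Users\\YIUYE\\Desktop\\demo.txt"))
```

Against a real cluster, the FakeClient instance would be replaced by the Client('http://192.168.147.128:9870') object from the session above.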

