
Merging small files into HDFS with the Java API (notes)

Updated: 2025-03-31 | Source: SLTechnology News&Howtos (Shulou.com), 06/03

Note: create the relevant local directories and files yourself before running the code.
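The listing in this article reads a local root directory that holds one subdirectory per date (for example 2019-10-31), each containing small .txt files. As a minimal sketch of preparing such input, the following builds that layout under a temporary directory. The class name `SampleInputLayout`, the method `createSampleInput`, and the file names and contents are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: builds a small input tree like <root>/2019-10-31/part-N.txt.
class SampleInputLayout {

    // Creates one date directory under root and fills it with small .txt files.
    static Path createSampleInput(Path root, String date, int fileCount) throws IOException {
        Path dateDir = Files.createDirectories(root.resolve(date));
        for (int i = 0; i < fileCount; i++) {
            Path file = dateDir.resolve("part-" + i + ".txt");
            Files.write(file, ("record " + i + "\n").getBytes(StandardCharsets.UTF_8));
        }
        return dateDir;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary directory stands in for your real input root.
        Path root = Files.createTempDirectory("merge-input");
        Path dateDir = createSampleInput(root, "2019-10-31", 3);
        System.out.println("Created sample input in " + dateDir);
    }
}
```

Point the local input path in the main listing at the root directory you create this way (or at your own data directory).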

package com.hadoop.hdfs;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.io.IOUtils;

/**
 * Merge small files into HDFS.
 */
public class MergeSmallFilesToHDFS {

    private static FileSystem fs = null;
    private static FileSystem local = null;

    public static void main(String[] args) throws IOException, URISyntaxException {
        list();
    }

    /**
     * Merges the data sets and uploads them to HDFS.
     *
     * @throws IOException
     * @throws URISyntaxException
     */
    public static void list() throws IOException, URISyntaxException {
        // Read the Hadoop file system configuration
        Configuration conf = new Configuration();
        // File system access URI. Note: change hdfs://master:9000 to your own HDFS address.
        URI uri = new URI("hdfs://master:9000");
        // Create the FileSystem object for HDFS
        fs = FileSystem.get(uri, conf);
        // Get the local file system
        local = FileSystem.getLocal(conf);
        // Filter out svn entries in the directory. Note: change the path E://Hadoop/73/ to your own path.
        FileStatus[] dirstatus = local.globStatus(new Path("E://Hadoop/73/*"), new RegexExcludePathFilter("^.*svn$"));
        // Get the paths of all entries in the 73 directory
        Path[] dirs = FileUtil.stat2Paths(dirstatus);
        FSDataOutputStream out = null;
        FSDataInputStream in = null;
        for (Path dir : dirs) {
            // Directory name, e.g. 2019-10-31, with the dashes removed
            String fileName = dir.getName().replace("-", ""); // file name
            // Accept only the .txt files in the date directory
            FileStatus[] localStatus = local.globStatus(new Path(dir + "/*"), new RegexAcceptPathFilter("^.*txt$"));
            // Get the paths of all accepted files in the date directory
            Path[] listedPaths = FileUtil.stat2Paths(localStatus);
            // Output path. Note: change hdfs://master:9000/20191031/ to your own HDFS directory address.
            Path block = new Path("hdfs://master:9000/20191031/" + fileName + ".txt");
            System.out.println("Merged file name: " + fileName + ".txt");
            // Open the output stream
            out = fs.create(block);
            for (Path p : listedPaths) {
                in = local.open(p); // open the input stream
                IOUtils.copyBytes(in, out, 4096, false); // copy data; false keeps both streams open
                // Close the input stream
                in.close();
            }
            if (out != null) {
                // Close the output stream
                out.close();
            }
        }
    }

    /**
     * Excludes paths that match the regex.
     */
    public static class RegexExcludePathFilter implements PathFilter {

        private final String regex;

        public RegexExcludePathFilter(String regex) {
            this.regex = regex;
        }

        @Override
        public boolean accept(Path path) {
            // Reject the path when the whole path string matches the regex
            boolean flag = path.toString().matches(regex);
            return !flag;
        }
    }

    /**
     * Accepts paths that match the regex.
     */
    public static class RegexAcceptPathFilter implements PathFilter {

        private final String regex;

        public RegexAcceptPathFilter(String regex) {
            this.regex = regex;
        }

        @Override
        public boolean accept(Path path) {
            // Accept the path only when the whole path string matches the regex
            boolean flag = path.toString().matches(regex);
            return flag;
        }
    }

}
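
Both filters rely on `String.matches`, which succeeds only when the regex matches the entire path string, so a pattern such as `^.*txt$` needs the leading `.*` to cover everything before the suffix. The sketch below reproduces the filter decisions and the merged-file naming with plain strings instead of Hadoop `Path` objects. The class `PathRegexDemo`, its method names, and the sample path strings are hypothetical and only illustrate the behavior.

```java
// Hypothetical stand-in for the two PathFilter classes, using plain strings.
class PathRegexDemo {

    // Mirrors RegexAcceptPathFilter: keep the path only if the whole string matches.
    static boolean keptByAcceptFilter(String regex, String path) {
        return path.matches(regex);
    }

    // Mirrors RegexExcludePathFilter: keep the path only if it does NOT match.
    static boolean keptByExcludeFilter(String regex, String path) {
        return !path.matches(regex);
    }

    // Mirrors the merged-file naming: date directory name with the dashes removed.
    static String mergedName(String dateDirName) {
        return dateDirName.replace("-", "") + ".txt";
    }

    public static void main(String[] args) {
        // Made-up path strings, similar in shape to what Path.toString() might return.
        String txtFile = "file:/E:/Hadoop/73/2019-10-31/a.txt";
        String svnDir = "file:/E:/Hadoop/73/.svn";

        System.out.println(keptByAcceptFilter("^.*txt$", txtFile));  // true
        System.out.println(keptByAcceptFilter("txt", txtFile));      // false: no full match
        System.out.println(keptByExcludeFilter("^.*svn$", svnDir));  // false: filtered out
        System.out.println(mergedName("2019-10-31"));                // 20191031.txt
    }
}
```

Because the match is against the full path string, a bare pattern like `txt` accepts nothing; that is the reason the filters are given `^.*svn$` and `^.*txt$` style patterns.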
