2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Please create the related files and directories yourself before running the code!
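The code in this article expects a root directory that contains date-named subdirectories (for example 2019-10-31), each holding .txt files. The following is a minimal sketch for generating such a layout for testing. It assumes a temporary directory rather than E://Hadoop/73/, and the class name SampleInputLayout, the method createUnderTemp, and the file names part-N.txt are hypothetical, not part of the original article. Note that it uses java.nio.file.Path, not Hadoop's Path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: builds <tempRoot>/<date>/part-N.txt test files.
class SampleInputLayout {

    /** Creates one subdirectory per date, each with filesPerDate small .txt files. */
    static Path createUnderTemp(String[] dates, int filesPerDate) {
        try {
            Path root = Files.createTempDirectory("merge-demo");
            for (String date : dates) {
                Path dateDir = Files.createDirectories(root.resolve(date));
                for (int i = 0; i < filesPerDate; i++) {
                    byte[] content = (date + " record " + i + "\n").getBytes(StandardCharsets.UTF_8);
                    Files.write(dateDir.resolve("part-" + i + ".txt"), content);
                }
            }
            return root;
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = createUnderTemp(new String[] {"2019-10-30", "2019-10-31"}, 3);
        System.out.println("Sample input created under " + root);
    }
}
```

To use the generated directory with the merge code, point the local root path in list() at the printed location instead of E://Hadoop/73/.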
package com.hadoop.hdfs;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.io.IOUtils;

/**
 * Merges small files into HDFS.
 */
public class MergeSmallFilesToHDFS {
    private static FileSystem fs = null;
    private static FileSystem local = null;

    public static void main(String[] args) throws IOException, URISyntaxException {
        list();
    }

    /**
     * Merges the data sets and uploads them to HDFS.
     *
     * @throws IOException
     * @throws URISyntaxException
     */
    public static void list() throws IOException, URISyntaxException {
        // Read the configuration of the Hadoop file system
        Configuration conf = new Configuration();
        // File system access URI. Note: change hdfs://master:9000 to your own HDFS address.
        URI uri = new URI("hdfs://master:9000");
        // Create the FileSystem object
        fs = FileSystem.get(uri, conf);
        // Obtain the local file system
        local = FileSystem.getLocal(conf);
        // Filter out the svn files in the directory. Note: change E://Hadoop/73/ to your own path.
        FileStatus[] dirstatus = local.globStatus(new Path("E://Hadoop/73/*"), new RegexExcludePathFilter("^.*svn$"));
        // Get all paths in the 73 directory
        Path[] dirs = FileUtil.stat2Paths(dirstatus);
        FSDataOutputStream out = null;
        FSDataInputStream in = null;
        for (Path dir : dirs) {
            // For example, 2019-10-31
            String fileName = dir.getName().replace("-", ""); // file name without dashes
            // Accept only the .txt files in the date directory
            FileStatus[] localStatus = local.globStatus(new Path(dir + "/*"), new RegexAcceptPathFilter("^.*txt$"));
            // Get all the files in the date directory
            Path[] listedPaths = FileUtil.stat2Paths(localStatus);
            // Output path. Note: change hdfs://master:9000/20191031/ to your own HDFS directory.
            Path block = new Path("hdfs://master:9000/20191031/" + fileName + ".txt");
            System.out.println("Merged file name: " + fileName + ".txt");
            // Open the output stream
            out = fs.create(block);
            for (Path p : listedPaths) {
                in = local.open(p); // open the input stream
                IOUtils.copyBytes(in, out, 4096, false); // copy the data, leaving both streams open
                // Close the input stream
                in.close();
            }
            if (out != null) {
                // Close the output stream
                out.close();
            }
        }
    }

    /**
     * Excludes paths that match the regex.
     */
    public static class RegexExcludePathFilter implements PathFilter {
        private final String regex;

        public RegexExcludePathFilter(String regex) {
            this.regex = regex;
        }

        @Override
        public boolean accept(Path path) {
            boolean flag = path.toString().matches(regex);
            return !flag;
        }
    }

    /**
     * Accepts only paths that match the regex.
     */
    public static class RegexAcceptPathFilter implements PathFilter {
        private final String regex;

        public RegexAcceptPathFilter(String regex) {
            this.regex = regex;
        }

        @Override
        public boolean accept(Path path) {
            boolean flag = path.toString().matches(regex);
            return flag;
        }
    }
}
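
Both filters rely on String.matches, which succeeds only when the whole path string matches the pattern, so a pattern needs a leading .* to match on a file extension. The sketch below illustrates this with plain strings and no Hadoop dependency. It assumes the intended patterns are ^.*txt$ and ^.*svn$; the class name RegexFilterDemo and the sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of the matching rule used by the two PathFilter classes.
class RegexFilterDemo {

    /** Same rule as RegexAcceptPathFilter: keep a path only if the whole string matches. */
    static boolean accepts(String regex, String path) {
        return path.matches(regex);
    }

    /** Same rule as RegexExcludePathFilter: keep a path only if it does NOT match. */
    static boolean excludes(String regex, String path) {
        return !path.matches(regex);
    }

    public static void main(String[] args) {
        // Sample paths are made up for illustration.
        List<String> paths = Arrays.asList(
                "file:/E:/Hadoop/73/2019-10-31/a.txt",
                "file:/E:/Hadoop/73/2019-10-31/b.log",
                "file:/E:/Hadoop/73/.svn");
        for (String p : paths) {
            System.out.println(p
                    + " | accepted as txt: " + accepts("^.*txt$", p)
                    + " | kept by svn exclusion: " + excludes("^.*svn$", p));
        }
        // Without ".*" the pattern must equal the entire path, so this is false.
        System.out.println("bare 'txt' pattern: " + accepts("txt", paths.get(0)));
    }
}
```

Because the match is against path.toString(), which includes the scheme and parent directories, a directory whose full path happens to end in txt or svn would also be matched by these patterns.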