The original code of the setup function (excerpted from "hadoop practice"):
/**
 * Called once at the start of the task.
 */
protected void setup(Context context) throws IOException, InterruptedException {}
As the comment says, the setup function is called once when a Task starts.
A MapReduce job is organized into MapTasks and ReduceTasks. Each Task uses the Map class or the Reduce class as the body of its processing logic, takes an input split as its input, and is destroyed once its own split has been processed.
From this you can see that setup is called exactly once after the Task starts and before any data is processed, whereas the overridden map and reduce functions are called once for every key in the input split.
The setup function can therefore be treated as Task-level initialization: work that would otherwise be repeated on every call to map or reduce can be moved into setup, such as looking up the "name" in the Exercise_2 assignment given by the teacher (see the sketch below).
It is important to note, however, that setup is only global with respect to its own Task; it is not a global operation over the entire job.
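A minimal sketch of this pattern, assuming the new MapReduce API; the class name NameFilterMapper and the "name" configuration key are illustrative, not taken from the original exercise:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class NameFilterMapper extends Mapper<LongWritable, Text, Text, Text> {

    private String name; // loaded once per Task, not once per record

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Runs once when the Task starts, before any call to map().
        name = context.getConfiguration().get("name", "");
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // map() runs once per input record and reuses the value loaded in setup().
        if (value.toString().contains(name)) {
            context.write(new Text(name), value);
        }
    }
}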
You can first use the HDFS API to copy a local file into /user/hadoop/test in HDFS:
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

// Upload a local file to HDFS
public static void upload(String src, String dst) throws FileNotFoundException, IOException {
    InputStream in = new BufferedInputStream(new FileInputStream(src));
    // Get the configuration object
    Configuration conf = new Configuration();
    // File system
    FileSystem fs = FileSystem.get(URI.create(dst), conf);
    // Output stream; the Progressable callback fires periodically as data is written
    OutputStream out = fs.create(new Path(dst), new Progressable() {
        public void progress() {
            System.out.println("Uploaded another buffer-sized chunk of the file!");
        }
    });
    // Connect the two streams into a channel that copies data from the input to the output stream
    IOUtils.copyBytes(in, out, 4096, true);
}
To upload, just call this function. For example:

upload("/home/jack/test/test.txt", "/user/hadoop/test/test");

The first argument is the file in the local directory; the second is the file in HDFS. Note that both must be "path + file name"; the file name cannot be omitted.
To make a parameter visible to every Task, set it on the Configuration before the job is submitted:

Configuration conf = new Configuration();
conf.setStrings("job_parms", "aaabbc"); // this is the key line
Job job = new Job(conf, "load analysis");
job.setJarByClass(LoadAnalysis.class);
job.setMapperClass(LoadMapper.class);
job.setReducerClass(LoadIntoHbaseReduce.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
The Task side then reads the parameter back in setup:

@Override
protected void setup(Context context)
        throws IOException, InterruptedException {
    try {
        // Obtain the parameter from the global configuration
        Configuration conf = context.getConfiguration();
        String parmStr = conf.get("job_parms"); // this retrieves the value set in the driver
        ... // elided in the original; the omitted code evidently can throw SQLException
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
Global files: Hadoop provides a distributed cache for shipping global files to every node, ensuring that all nodes can access them; the class is DistributedCache.
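A minimal sketch of that mechanism using the classic DistributedCache API (the file path and class name here are illustrative; newer Hadoop releases deprecate DistributedCache in favor of Job.addCacheFile):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Driver side: register an HDFS file with the cache before submitting the job.
// (Illustrative path; the file must already exist in HDFS.)
// DistributedCache.addCacheFile(new URI("/user/hadoop/test/global.txt"), conf);

public class CacheMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Every node receives a local copy of each cached file; read it once per Task.
        Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        if (cached != null && cached.length > 0) {
            BufferedReader reader = new BufferedReader(new FileReader(cached[0].toString()));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    // ... use the global file's contents, e.g. load a lookup table
                }
            } finally {
                reader.close();
            }
        }
    }
}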