How to implement multi-file output with the new MapReduce API

2025-04-04 Update From: SLTechnology News&Howtos

This article explains how to implement multi-file output in MapReduce with the new API. Many people run into this problem in real jobs, so the notes below walk through how to handle it.

1. Consider the call MultipleOutputs.addNamedOutput(job, "errorlog", TextOutputFormat.class, Text.class, NullWritable.class); in the code. In practice, the second parameter (the output name) is not the right tool for splitting output into separate locations. Look at the code below:

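private MultipleOutputs<NullWritable, Text> multipleOutputs = null; // must be initialized before reduce() runs (not shown here)

@Override
protected void reduce(IntWritable key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    for (Text val : values) {
        // Four-argument write: the last argument is a base output path built from the key.
        multipleOutputs.write("KeySplit", NullWritable.get(), val, key.toString() + "/");
        // Three-argument write: no base output path is given.
        multipleOutputs.write("AllData", NullWritable.get(), val);
    }
}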

The write method has many overloads. The one used previously takes three parameters, but that overload writes all of the reduce output into a single folder.

In that case, if MultipleOutputs.addNamedOutput() has been called several times with different names as its second argument, the result looks like this:

-rw-r--r-- 2 hadoop supergroup 10569073 2014-06-06 11:50 /test/aa/fileRequest-m-00063.lzo
-rw-r--r-- 2 hadoop supergroup 10512656 2014-06-06 11:50 /test/aa/fileRequest-m-00064.lzo
-rw-r--r-- 2 hadoop supergroup    68780 2014-06-06 11:51 /test/aa/firstIntoTime-m-00000.lzo
-rw-r--r-- 2 hadoop supergroup    67901 2014-06-06 11:51 /test/aa/firstIntoTime-m-00001.lzo

That is, the files of every named output end up side by side in the same directory, and the job also produces many useless empty files.

The write method also has a four-parameter overload, whose last parameter is an output path. It lets the data produced by reduce go to different folders according to different logic. Take the statement multipleOutputs.write("KeySplit", NullWritable.get(), val, key.toString() + "/") from the first code listing: the last argument uses the key as a folder name, so records with the same key are written to the same folder. The trailing "/" marks the value as a directory. The path is relative, and it is resolved not against the project's working directory but against the output directory passed to hadoop jar, for example: hadoop jar test.jar com.TestJob /input /output
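The difference between the two overloads can be sketched in plain Java. This is only a model, not Hadoop code: the class OutputNameSketch and its methods are hypothetical, and the file name pattern (base path, then "-m-" or "-r-", then a five-digit partition number and an optional extension) is an assumption inferred from names such as fileRequest-m-00063.lzo in the listing above.

```java
import java.util.Locale;

public class OutputNameSketch {
    // Assumed pattern, inferred from names such as fileRequest-m-00063.lzo:
    // <baseOutputPath>-<m|r>-<partition padded to 5 digits><extension>
    static String fileName(String baseOutputPath, char taskType, int partition, String extension) {
        return String.format(Locale.ROOT, "%s-%c-%05d%s", baseOutputPath, taskType, partition, extension);
    }

    // Three-argument write, modelled as using the named output itself as the base path,
    // so every file for that name sits directly in the job output directory.
    static String threeArgFile(String outputDir, String namedOutput, char taskType, int partition, String ext) {
        return outputDir + "/" + fileName(namedOutput, taskType, partition, ext);
    }

    // Four-argument write: the caller supplies the base path; a trailing "/" turns it into a subfolder.
    static String fourArgFile(String outputDir, String baseOutputPath, char taskType, int partition, String ext) {
        return outputDir + "/" + fileName(baseOutputPath, taskType, partition, ext);
    }

    public static void main(String[] args) {
        System.out.println(threeArgFile("/test/aa", "fileRequest", 'm', 63, ".lzo"));
        System.out.println(fourArgFile("/output", "1/", 'r', 0, ""));
    }
}
```

Under these assumptions the first call reproduces a flat name like the ones in the listing, while the second places the file inside a folder named after the key.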

Suppose the data looks like this:

1 Limei

1 Xiaohui

2 Xiao Hong

3 Dahua

Then the output consists of three folders:

/output/1

/output/2

/output/3

The folder /output/1 contains one file, whose content is:

1 Limei

1 Xiaohui
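The routing in this example can be simulated in plain Java without a cluster. The sketch below is not Hadoop code: KeyFolderSketch and route are hypothetical names, and the simulation assumes each record is a line of the form "key name" and that the folder is simply the output directory followed by the key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyFolderSketch {
    // Groups "key name" records by the folder that key + "/" would select under outputDir.
    static Map<String, List<String>> route(String outputDir, List<String> records) {
        Map<String, List<String>> folders = new LinkedHashMap<>();
        for (String record : records) {
            String key = record.split(" ", 2)[0];   // text before the first space
            String folder = outputDir + "/" + key;  // e.g. /output/1
            folders.computeIfAbsent(folder, f -> new ArrayList<>()).add(record);
        }
        return folders;
    }

    public static void main(String[] args) {
        List<String> records = List.of("1 Limei", "1 Xiaohui", "2 Xiao Hong", "3 Dahua");
        route("/output", records)
                .forEach((folder, lines) -> System.out.println(folder + " -> " + lines));
    }
}
```

Running it on the four sample records yields three folders, with both records for key 1 grouped under /output/1.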

The write method has other overloads that have not been studied yet, and the role of its first parameter has not been examined in depth either. A more detailed summary of multi-file output will follow when time permits.

Note: the call discussed in point 1 is this line from the job configuration:

MultipleOutputs.addNamedOutput(job, "errorlog", TextOutputFormat.class, Text.class, NullWritable.class);

Every name passed to write ("KeySplit" and "AllData" in the reducer code) must be registered in the same way.
