
14. MapReduce--OutputFormat and RecordWriter abstract classes

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

I. Basic principles

Each reduce task writes the key-value (KV) pairs it emits to an output file. In what format are those pairs written? That is determined by two abstract classes: OutputFormat and RecordWriter.

1. OutputFormat

    public abstract class OutputFormat<K, V> {
        public OutputFormat() {}

        // creates the RecordWriter that writes this task's KV pairs
        public abstract RecordWriter<K, V> getRecordWriter(TaskAttemptContext var1) throws IOException, InterruptedException;

        // validates the job's output settings before the job runs
        public abstract void checkOutputSpecs(JobContext var1) throws IOException, InterruptedException;

        // returns the committer that finalizes the task's output
        public abstract OutputCommitter getOutputCommitter(TaskAttemptContext var1) throws IOException, InterruptedException;
    }

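In practice, the main job of an OutputFormat is to create RecordWriter objects through getRecordWriter. The other two methods are supporting: checkOutputSpecs validates the output settings, and getOutputCommitter supplies the committer that finalizes the output.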

2. RecordWriter

    public abstract class RecordWriter<K, V> {
        public RecordWriter() {}

        // writes one KV pair to the output stream
        public abstract void write(K var1, V var2) throws IOException, InterruptedException;

        // closes the stream
        public abstract void close(TaskAttemptContext var1) throws IOException, InterruptedException;
    }

The key method is write, which writes one KV pair to the output file.
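The division of labor between the two classes can be sketched without Hadoop on the classpath. The following is a minimal illustration, not the Hadoop implementation: SketchOutputFormat, SketchRecordWriter, EqualsSignOutputFormat and emitAll are hypothetical stand-ins, and an in-memory list takes the place of the output file. It shows the sequence the framework follows: ask the format for a writer once, call write once per KV pair, then call close.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OutputFormatSketch {
    // Mimics the framework side: one writer per task, one write() per pair, one close().
    static <K, V> List<String> emitAll(SketchOutputFormat<K, V> format, Map<K, V> pairs) {
        List<String> sink = new ArrayList<>();   // stand-in for the output file
        SketchRecordWriter<K, V> writer = format.getRecordWriter(sink);
        try {
            for (Map.Entry<K, V> e : pairs.entrySet()) {
                writer.write(e.getKey(), e.getValue());
            }
        } finally {
            writer.close();
        }
        return sink;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hadoop", 3);
        counts.put("spark", 1);
        System.out.println(emitAll(new EqualsSignOutputFormat<String, Integer>(), counts));
    }
}

// Simplified stand-in for RecordWriter (assumption: no TaskAttemptContext, no checked exceptions).
abstract class SketchRecordWriter<K, V> {
    abstract void write(K key, V value);
    abstract void close();
}

// Simplified stand-in for OutputFormat: its main job is to hand out a writer.
abstract class SketchOutputFormat<K, V> {
    abstract SketchRecordWriter<K, V> getRecordWriter(List<String> sink);
}

// Hypothetical format that renders each pair as "key=value".
class EqualsSignOutputFormat<K, V> extends SketchOutputFormat<K, V> {
    @Override
    SketchRecordWriter<K, V> getRecordWriter(final List<String> sink) {
        return new SketchRecordWriter<K, V>() {
            @Override
            void write(K key, V value) {
                sink.add(key + "=" + value);
            }

            @Override
            void close() {
                sink.add("<closed>");   // marks that close() ran after all writes
            }
        };
    }
}
```

Running main prints [hadoop=3, spark=1, <closed>]. Changing the output format only means swapping the object passed to emitAll; the loop that produces the pairs stays the same, which is the point of separating the two abstractions.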

II. Commonly used OutputFormat implementation classes

1. TextOutputFormat

TextOutputFormat extends FileOutputFormat, and the RecordWriter it returns is TextOutputFormat.LineRecordWriter. It writes each KV pair as one line of text. The separator between key and value is configurable and defaults to a tab ("\t").
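The line-per-record behavior can be approximated in plain Java. This is a hedged sketch rather than the LineRecordWriter source: LineWriterSketch and render are hypothetical names, a StringBuilder replaces the output stream, and a Java null stands in for a missing key or value (Hadoop also treats NullWritable this way, as far as I know). In a real job the separator is normally supplied through the configuration key mapreduce.output.textoutputformat.separator instead of a constructor argument.

```java
final class LineWriterSketch {
    private final StringBuilder out;
    private final String separator;

    LineWriterSketch(StringBuilder out, String separator) {
        this.out = out;
        this.separator = separator;
    }

    // Writes one record per line; the separator appears only when both sides are present.
    void write(Object key, Object value) {
        boolean nullKey = key == null;
        boolean nullValue = value == null;
        if (nullKey && nullValue) {
            return;                       // nothing to write, not even an empty line
        }
        if (!nullKey) {
            out.append(key);
        }
        if (!nullKey && !nullValue) {
            out.append(separator);
        }
        if (!nullValue) {
            out.append(value);
        }
        out.append('\n');
    }

    // Convenience helper: renders all pairs with the given separator.
    static String render(String separator, Object[][] pairs) {
        StringBuilder buffer = new StringBuilder();
        LineWriterSketch writer = new LineWriterSketch(buffer, separator);
        for (Object[] pair : pairs) {
            writer.write(pair[0], pair[1]);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Object[][] pairs = { {"hadoop", 3}, {"spark", null}, {null, 7} };
        System.out.print(render("\t", pairs));   // default-style tab separator
        System.out.print(render(",", pairs));    // custom separator
    }
}
```

With the tab separator the first record is written as hadoop, a tab, then 3; the records with a missing key or value are written as spark and 7 on their own lines with no separator.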

2. SequenceFileOutputFormat

SequenceFileOutputFormat also extends FileOutputFormat. The RecordWriter it returns is an anonymous inner class that appends each KV pair directly to the output file, without adding line breaks between records (a newline appears only if the original data contains one).

SequenceFileOutputFormat writes its output as a sequence file (SequenceFile). If the output will serve as the input of a subsequent MapReduce job, this is a good choice because the format is compact and easy to compress.
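Why a binary record format needs no line separators can be shown with a small length-prefixed encoding. This sketch is deliberately not the SequenceFile on-disk layout, which also carries a header, sync markers and optional compression; BinaryRecordSketch, encode and decode are hypothetical names chosen for illustration. Each field is stored as a 4-byte length followed by its bytes, so newlines and tabs inside keys or values survive a round trip unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BinaryRecordSketch {
    // Appends each pair as [keyLength][keyBytes][valueLength][valueBytes].
    static byte[] encode(List<String[]> pairs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String[] pair : pairs) {
                for (String field : pair) {
                    byte[] raw = field.getBytes(StandardCharsets.UTF_8);
                    out.writeInt(raw.length);   // length prefix replaces any delimiter
                    out.write(raw);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the pairs back by following the length prefixes.
    static List<String[]> decode(byte[] data) {
        List<String[]> pairs = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                String[] pair = new String[2];
                for (int i = 0; i < 2; i++) {
                    byte[] raw = new byte[in.readInt()];
                    in.readFully(raw);
                    pair[i] = new String(raw, StandardCharsets.UTF_8);
                }
                pairs.add(pair);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"line1\nline2", "value with\ttab"});
        pairs.add(new String[] {"k2", "v2"});
        List<String[]> roundTrip = decode(encode(pairs));
        System.out.println(roundTrip.size() + " records, first key has newline: "
                + roundTrip.get(0)[0].contains("\n"));
    }
}
```

A text format would have to escape or forbid the separator characters; here a downstream reader recovers exactly the keys and values that were written, which is one reason a binary format suits output that feeds another job.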
