In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/03 Report--
This article focuses on "how to achieve external sorting based on multipath merging". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Let's let the editor take you to learn "how to achieve external sorting based on multipath merging".
Import com.google.common.collect.Lists;import java.io.*;import java.util.*;/** * sorts the data that is much larger than memory, first dividing the file into memory and processing multiple small files separately, * sorting these small files in memory, and then merging these files into a final ordered large file * * / public class ExternalSort {public static int BUFFER_SIZE = 1024 * 4 * 1000 / / buffered read public static int LITTLE_FILE_SIZE = 10; / / number of records per file public static File IN_FILE = new File ("data/f1.txt"); / / Files to be sorted public static File OUT_FILE = new File ("data/concat") / / output the merged ordered file / * * @ param args * @ throws IOException * / public static void main (String [] args) throws IOException {/ / sort long start = System.currentTimeMillis (); new ExternalSort (). MSort (IN_FILE); long end = System.currentTimeMillis (); System.out.println ((end-start) / 1000 + "s") } / * Multiplex merge * * @ param file * @ throws IOException * / public void mSort (File file) throws IOException {List files = split (file); / / split into small files and loaded into memory sort multMerge (files); / / merge} / * Splits the original file into a number of sub files. * / public List split (File file) throws IOException {List files = Lists.newArrayList (); BufferedReader din = new BufferedReader (new FileReader (file), BUFFER_SIZE); int [] buffer = new int [LITTLE _ FILE_SIZE]; boolean fileCompleted = false; while (! fileCompleted) {int len = buffer.length; for (int I = 0; I)
< buffer.length && !fileCompleted; i++) { try { buffer[i] = Integer.parseInt(din.readLine()); } catch (NumberFormatException | IOException e) { fileCompleted = true; len = i; } } //将buffer数据排序写入文件 if (len >0) {Arrays.sort (buffer, 0, len); File f = new File ("data/set" + new Random (). NextInt ()); BufferedWriter dout = new BufferedWriter (new FileWriter (f)); for (int I = 0; I
< len; i++) { dout.write(buffer[i]+"\n"); } dout.close(); files.add(f); } } din.close(); return files; } /** * 多路归并 * * @param files * @throws IOException */ public static void multMerge(List files) throws IOException { //构建输出缓冲流 BufferedWriter out = new BufferedWriter(new FileWriter(OUT_FILE), BUFFER_SIZE); //构建输入缓冲流 List inList = Lists.newArrayList(); int size = files.size(); for (int i = 0; i < size; i++) { inList.add(i,new BufferedReader(new FileReader(files.get(i)), BUFFER_SIZE)); } //构建一个数组,存放从每个输入流里取出来的数据 int[] ext = new int[size]; int left = size; //定义剩余文件数 for (int i = 0; i < size; i++) { try { ext[i] = Integer.parseInt(inList.get(i).readLine()); } catch (Exception e) { ext[i] = -1; left--; inList.get(i).close(); } } //将ext数组中最小值写入文件 while (left >0) {int ind = getMinIndex (ext); out.write (ext[ ind] + "\ n"); / / then pull the next data try {ext [ind] = Integer.parseInt (inList.get (ind). ReadLine ());} catch (Exception e) {ext [ind] =-1 Left--; inList.get (ind). Close ();}} out.close ();} / / find the smallest public static int getMinIndex in the data (int [] ext) {/ / find the first element that is not-1 int min; int inx = 0 While (true) {if (ext [inx]! =-1) {min=ext [inx]; break;} else {inx++;}} for (int I = 1; I < ext.length If (ext [I] < min & & ext [I]! =-1) {min = ext [I]; inx = I;}} return inx;}} so far, I believe you have a deeper understanding of "how to implement external sorting based on multipath merging". You might as well do it in practice! Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.