2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
I. Overview of sorting
1. During the shuffle phase of MapReduce, sorting is performed three times:
Map spill phase: each spill is sorted by partition and then by key, using quicksort
Map spill-file merge: the spill files are merge-sorted, partition by partition, into one large output file
Reduce input phase: the data files fetched from different map tasks for the same partition are merge-sorted
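The three sorts above can be sketched without Hadoop. The following is a minimal simulation, not Hadoop's implementation: the names ShuffleSortSketch, Rec, spill and merge are hypothetical, and the JDK's List.sort stands in for the quicksort that the spill phase uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ShuffleSortSketch {
    // Hypothetical stand-in for one map output record; not a Hadoop type.
    static final class Rec {
        final int partition;
        final String key;
        Rec(int partition, String key) { this.partition = partition; this.key = key; }
    }

    // Order used when spilling: partition first, then key.
    static final Comparator<Rec> SPILL_ORDER = (a, b) -> {
        int byPartition = Integer.compare(a.partition, b.partition);
        return byPartition != 0 ? byPartition : a.key.compareTo(b.key);
    };

    // Sort one in-memory buffer before it is written out as a spill file.
    // (List.sort is used here for brevity; it is not a quicksort.)
    static List<Rec> spill(List<Rec> buffer) {
        List<Rec> sorted = new ArrayList<>(buffer);
        sorted.sort(SPILL_ORDER);
        return sorted;
    }

    // Two-way merge of runs that are already sorted, as in the merge phases.
    static List<Rec> merge(List<Rec> a, List<Rec> b) {
        List<Rec> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            if (SPILL_ORDER.compare(a.get(i), b.get(j)) <= 0) {
                out.add(a.get(i++));
            } else {
                out.add(b.get(j++));
            }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }
}
```

Because every run is already sorted, the merge only ever compares the heads of the runs, which is why the later passes can be merge sorts rather than full re-sorts.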
2. Throughout MapReduce, output KV pairs are sorted by key by default. The in-memory sort at spill time uses quicksort; the later passes are merge sorts.
The sorting of map output is the sorting done in the spill and merge steps above.
As for reduce output, MapReduce does not sort again after reduce has run. The reduce method is called on the groups in sorted key order, so when each call writes out its input key, the output is already ordered by key.
All of the sorting above is based on the key of each KV pair. So when a custom class is used as the key, it must implement the WritableComparable interface, that is, provide the compareTo() method that the sort uses for comparison.
The rules of comparison are as follows:
    public int compareTo(T other)
    Return 1 when this > other, -1 when this < other, and 0 when they are equal: ascending order.
    Swap the 1 and the -1: descending order.
II. Definition of secondary sorting
When sorting by key, if the key is a compound object, that is, an object with several member attributes, then comparing two keys can involve comparing several of those attributes. When the compareTo() method compares on two conditions, this is called secondary sorting.
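A two-condition compareTo() might look like the sketch below. It is only an illustration: the price field and the choice of "id ascending, then price descending" are assumptions, and the class implements plain Comparable so that it compiles without Hadoop. In a real job it would implement org.apache.hadoop.io.WritableComparable<OrderBean>, whose write and readFields methods have the same shape as the ones shown.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Sketch of a composite key (Hadoop-free stand-in for a WritableComparable).
class OrderBean implements Comparable<OrderBean> {
    private int id;        // first sort condition
    private double price;  // second sort condition (assumed field)

    OrderBean() {}         // no-arg constructor, needed when keys are deserialized

    OrderBean(int id, double price) {
        this.id = id;
        this.price = price;
    }

    int getID() { return id; }
    double getPrice() { return price; }

    @Override
    public int compareTo(OrderBean other) {
        // Condition 1: ascending by id.
        int byId = Integer.compare(id, other.id);
        if (byId != 0) {
            return byId;
        }
        // Condition 2: descending by price (arguments swapped).
        return Double.compare(other.price, price);
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeDouble(price);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        price = in.readDouble();
    }
}
```

With this ordering, records that share an id end up next to each other, with the highest price first.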
III. Definition of auxiliary sorting
Auxiliary sorting, also called grouping sorting, is the comparison rule applied in the grouping step before reduce. Grouping has to decide whether the keys of two KV pairs are the same: pairs whose keys compare as equal are placed in the same group, and pairs whose keys differ are placed in different groups, so a key comparison method is involved. Put simply, it defines when two keys count as equal. This rule can be customized by supplying your own implementation class for grouping.
How to use it:
1. Define a custom grouping class that extends WritableComparator
2. In its constructor, call the parent class constructor so that instances of the key class are created
3. Override the compare method of the parent class
Example:
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    public class OrderGroupCompartor extends WritableComparator {
        protected OrderGroupCompartor() {
            // true: have the parent class create OrderBean instances for comparison
            super(OrderBean.class, true);
        }

        /**
         * Groups keys by the ID in the OrderBean object.
         * Keys with the same ID are treated as the same group,
         * and reduce is called only once per group.
         *
         * @param a comparison object 1
         * @param b comparison object 2
         * @return 1 if a's ID is larger, -1 if it is smaller, 0 if the IDs are equal
         */
        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            OrderBean aOrderBean = (OrderBean) a;
            OrderBean bOrderBean = (OrderBean) b;
            if (aOrderBean.getID() > bOrderBean.getID()) {
                return 1;
            } else if (aOrderBean.getID() < bOrderBean.getID()) {
                return -1;
            } else {
                return 0;
            }
        }
    }
Note that the key of a group is the key of the first KV pair that enters the group. For example:
Suppose there are two KV pairs, No. 1 and No. 2, whose keys are each composed of an id and an item name. If grouping is done by the id in the key and both pairs have the same id, the two pairs belong to the same group, even though their keys are not actually equal. If KV No. 1 enters the group first, the key of No. 1 is used as the key of the group. If KV No. 2 enters first, then by the same rule the key of No. 2 becomes the key of the group. This behavior is something we can make good use of.
So which pair enters the group first? Simply the one that comes first in the already sorted order. That order is the result of the merge sort on the reduce side, and it is determined by the compareTo method of the key class, that is, by the ordinary sort.
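The interaction between the ordinary sort and the grouping comparison can be simulated without Hadoop. The sketch below is an illustration under assumptions, not Hadoop's code: GroupingSketch, Key, SORT_ORDER, GROUP_ORDER and groupKeys are hypothetical names, and the ordinary sort is assumed to be by id and then by item name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GroupingSketch {
    // Hypothetical composite key: an id plus an item name.
    static final class Key {
        final int id;
        final String item;
        Key(int id, String item) { this.id = id; this.item = item; }
    }

    // Ordinary sort, standing in for the key's compareTo: id, then item name.
    static final Comparator<Key> SORT_ORDER = (a, b) -> {
        int byId = Integer.compare(a.id, b.id);
        return byId != 0 ? byId : a.item.compareTo(b.item);
    };

    // Grouping comparison: keys with the same id count as equal.
    static final Comparator<Key> GROUP_ORDER = (a, b) -> Integer.compare(a.id, b.id);

    // Returns the key that represents each group: the first key of every run
    // of group-equal keys in the sorted input.
    static List<Key> groupKeys(List<Key> keys) {
        List<Key> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        List<Key> firsts = new ArrayList<>();
        Key current = null;
        for (Key k : sorted) {
            if (current == null || GROUP_ORDER.compare(current, k) != 0) {
                firsts.add(k);   // k is the first key to enter a new group
                current = k;
            }
        }
        return firsts;
    }
}
```

Changing SORT_ORDER changes which key represents each group, while GROUP_ORDER only decides where one group ends and the next begins.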
After writing the custom grouping class, specify it on the job:
    job.setGroupingComparatorClass(OrderGroupCompartor.class);
IV. Sort examples
For ordinary sorting, see "MapReduce- Statistics of Mobile number Traffic".
For secondary sorting and auxiliary sorting, see "MapReduce-- gets the most expensive goods".
© 2024 shulou.com SLNews company. All rights reserved.