How to implement and sort top-K with numpy

2025-02-27 Update From: SLTechnology News&Howtos (Shulou.com)

This article shows how to implement a top-K operation with numpy along an axis of an array, including how to return the K selected items in sorted order.

Top-K selection is used in many algorithms. In Python, computing libraries such as numpy rely on heavily optimized low-level code, so array computations are much faster than an equivalent Python for-loop. However, numpy does not provide a top-K function directly.

The pytorch library provides a topk function, which selects the largest (or smallest) K of the N items along a given dimension of a high-dimensional array, in sorted order, and returns both the values and their indices. Oddly enough, the lighter numpy library has no direct equivalent. Numpy only provides partition and argpartition, which move the K smallest items into the first K positions (applied to the negated array, the K largest). Take argpartition as an example, where the smallest three items are moved into the first three positions:

>>> x = np.array([3, 5, 6, 4, 2, 7, 1])
>>> x[np.argpartition(x, 3)]
array([2, 1, 3, 4, 5, 7, 6])

Note that argpartition only performs a partial sort: in the example above, the first three items are separated from the rest, but neither part is sorted internally. We usually want the top-K items themselves to be in order (the rest need not be). An argpartition-based top-K method is therefore described below.
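To make the idea concrete for the one-dimensional array above, here is a minimal sketch that partitions first and then sorts only the K selected items. The variable names (part, topk_idx) are illustrative and not from the original article.

```python
import numpy as np

x = np.array([3, 5, 6, 4, 2, 7, 1])
k = 3

# Indices of the k smallest items; their internal order is not guaranteed.
part = np.argpartition(x, k)[:k]

# Sort only those k candidates by value, which is cheap when k is small.
topk_idx = part[np.argsort(x[part])]

print(topk_idx)     # [6 4 0]
print(x[topk_idx])  # [1 2 3]
```

Only the K candidates go through a real sort; the remaining items are never ordered.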

A naive method
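The example in this section calls naive_arg_topK, but its definition does not appear in the text. The following is a minimal sketch of what such a function might look like, assuming it returns the indices of the K smallest items along an axis in ascending order of value. The body is an assumption, not the author's code.

```python
import numpy as np

def naive_arg_topK(matrix, K, axis=0):
    """Indices of the K smallest items along `axis`, ordered by value.

    Assumed implementation: argsort everything, then keep the first K.
    """
    full_sort = np.argsort(matrix, axis=axis)  # sorts all items, not just K
    return full_sort.take(np.arange(K), axis=axis)

# The 5 x 6 matrix used in this article's example.
dists = np.array([[17, 28,  1, 24, 23,  8],
                  [ 9, 21,  3, 22,  4,  5],
                  [19, 12, 26, 11, 13, 27],
                  [10, 15, 18, 14,  7, 16],
                  [ 0, 25, 29,  2,  6, 20]])

top2_axis0 = naive_arg_topK(dists, 2, axis=0)  # shape (2, 6)
top2_axis1 = naive_arg_topK(dists, 2, axis=1)  # shape (5, 2)
print(top2_axis0)
print(top2_axis1)
```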

The easiest way, of course, is to sort everything and then take the first K items. The disadvantage is that the items outside the top K are sorted as well, which is wasted work when K is much smaller than N.

>>> dists = np.random.permutation(np.arange(30)).reshape(5, 6)
>>> dists
array([[17, 28,  1, 24, 23,  8],
       [ 9, 21,  3, 22,  4,  5],
       [19, 12, 26, 11, 13, 27],
       [10, 15, 18, 14,  7, 16],
       [ 0, 25, 29,  2,  6, 20]])
>>> naive_arg_topK(dists, 2, axis=0)
array([[4, 2, 0, 4, 1, 1],
       [1, 3, 1, 2, 4, 0]])
>>> naive_arg_topK(dists, 2, axis=1)
array([[2, 5],
       [2, 4],
       [3, 1],
       [4, 0],
       [0, 3]])

Based on partition
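The article's examples call partition_arg_topK without showing its body. Below is a minimal sketch of one way to write it, assuming the function should return the indices of the K smallest items along an axis, ordered by value. The implementation is an assumption built on np.argpartition and np.take_along_axis.

```python
import numpy as np

def partition_arg_topK(matrix, K, axis=0):
    """Indices of the K smallest items along `axis`, ordered by value.

    Assumed implementation: select K candidates with argpartition,
    then sort only those candidates.
    """
    # With kth = K - 1, the K smallest items occupy the first K slots.
    part = np.argpartition(matrix, K - 1, axis=axis)
    topk = part.take(np.arange(K), axis=axis)
    # Order the K candidates by their values.
    vals = np.take_along_axis(matrix, topk, axis=axis)
    order = np.argsort(vals, axis=axis)
    return np.take_along_axis(topk, order, axis=axis)

# The 5 x 6 matrix used in this article's example.
dists = np.array([[17, 28,  1, 24, 23,  8],
                  [ 9, 21,  3, 22,  4,  5],
                  [19, 12, 26, 11, 13, 27],
                  [10, 15, 18, 14,  7, 16],
                  [ 0, 25, 29,  2,  6, 20]])

part_axis0 = partition_arg_topK(dists, 2, axis=0)  # shape (2, 6)
part_axis1 = partition_arg_topK(dists, 2, axis=1)  # shape (5, 2)
print(part_axis0)
print(part_axis1)
```

Because all values in dists are distinct, the result does not depend on how argpartition happens to order the candidates internally.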

With the np.argpartition function, the complexity may drop to O(n log K) or better (numpy's default introselect selection is linear in n, and only the K selected items are then sorted), and in many cases K is much smaller than n.

>>> dists = np.random.permutation(np.arange(30)).reshape(5, 6)
>>> dists
array([[17, 28,  1, 24, 23,  8],
       [ 9, 21,  3, 22,  4,  5],
       [19, 12, 26, 11, 13, 27],
       [10, 15, 18, 14,  7, 16],
       [ 0, 25, 29,  2,  6, 20]])
>>> partition_arg_topK(dists, 2, axis=0)
array([[4, 2, 0, 4, 1, 1],
       [1, 3, 1, 2, 4, 0]])
>>> partition_arg_topK(dists, 2, axis=1)
array([[2, 5],
       [2, 4],
       [3, 1],
       [4, 0],
       [0, 3]])

Large data testing

Top-K was computed on a matrix of shape (5000, 100000) with both methods; the measured times are:

K       partition (s)   naive (s)
10      8.884           22.604
100     9.012           22.458
1000    8.904           22.506
5000    11.305          22.844
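A comparison like the one in the table can be reproduced at a much smaller scale with a sketch such as the following. The helper names (topk_by_full_sort, topk_by_partition), the matrix size, and the random data are all assumptions chosen to keep the run short; absolute timings will differ from the table and depend on the machine.

```python
import time
import numpy as np

def topk_by_full_sort(a, k):
    # Sort every row completely, then keep the first k values.
    return np.sort(a, axis=1)[:, :k]

def topk_by_partition(a, k):
    # Select the k smallest per row, then sort only those k values.
    part = np.partition(a, k - 1, axis=1)[:, :k]
    return np.sort(part, axis=1)

rng = np.random.default_rng(0)
data = rng.random((100, 10000))  # far smaller than (5000, 100000)

results = {}
for k in (10, 100, 1000):
    t0 = time.perf_counter()
    p = topk_by_partition(data, k)
    t1 = time.perf_counter()
    n = topk_by_full_sort(data, k)
    t2 = time.perf_counter()
    assert np.array_equal(p, n)  # both methods must agree on the values
    results[k] = (t1 - t0, t2 - t1)
    print(f"K={k:5d}  partition: {t1 - t0:.4f}s  full sort: {t2 - t1:.4f}s")
```

The sketch compares values rather than indices so that the check stays valid even if two items happen to be equal.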

Supplement: solving the top-K problem with heap sort in Python

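# Sift-down adjustment for a min-heap (small-top heap).
# NOTE: the loop body of sift and the start of top_k were lost in the source;
# they are reconstructed here as a standard min-heap sift-down (an assumption).
def sift(li, low, high):
    tmp = li[low]
    i = low
    j = 2 * i + 1
    while j <= high:
        # pick the smaller of the two children
        if j + 1 <= high and li[j + 1] < li[j]:
            j += 1
        if tmp > li[j]:
            li[i] = li[j]
            i = j
            j = 2 * i + 1
        else:
            break
    li[i] = tmp

def top_k(li, k):
    heap = li[0:k]
    # build a min-heap from the first k items
    for i in range(k // 2 - 1, -1, -1):
        sift(heap, i, k - 1)
    # replace the heap top whenever a larger item is found
    for i in range(k, len(li)):
        if li[i] > heap[0]:
            heap[0] = li[i]
            sift(heap, 0, k - 1)
    # output: pop items one by one so the result is in descending order
    for i in range(k - 1, -1, -1):
        heap[0], heap[i] = heap[i], heap[0]
        sift(heap, 0, i - 1)
    return heap

li = [0, 8, 6, 2, 4, 9, 1, 4, 6]
print(top_k(li, 3))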
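For comparison, Python's standard library heapq module provides the same kind of heap-based selection. The following is a minimal sketch using the list from the example above; it swaps in heapq.nlargest for the hand-written heap and is not part of the original article.

```python
import heapq

li = [0, 8, 6, 2, 4, 9, 1, 4, 6]

# nlargest returns the k largest items in descending order.
largest3 = heapq.nlargest(3, li)
print(largest3)  # [9, 8, 6]
```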
