2025-02-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces a set of classic Python techniques for writing faster code. The content is detailed and easy to follow, and each technique is simple to apply, so it should serve as a useful reference.
How to measure the execution time of a program
Measuring the execution time of a Python program accurately looks simple but is actually complicated, because execution time is affected by many factors, such as the operating system, the Python version, and the hardware (CPU performance, memory read and write speed). These factors are fixed when the same version of the language runs on the same computer, but the time the program spends sleeping still varies, and other programs running on the computer can interfere with the experiment. Strictly speaking, then, the experiment is not repeatable.
The two representative timing libraries I know of are time and timeit.
The time library has three functions that can be used for timing (in seconds): time(), perf_counter() and process_time(). Adding the suffix _ns returns the time in nanoseconds (available since Python 3.7). There used to be a clock() function as well, but it was deprecated in Python 3.3 and removed in Python 3.8. The three functions differ as follows:
time() is relatively imprecise and is affected by changes to the system clock; it is suitable for representing dates and times or for timing large programs.
perf_counter() is suitable for timing smaller programs and includes the time spent in sleep().
process_time() is suitable for timing smaller programs and does not include the time spent in sleep().
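The difference between the last two functions is easiest to see around a sleep() call. The following is a minimal sketch, not part of the original experiments; the helper name measure and the 0.05 second sleep are assumptions chosen for illustration.

```python
import time

def measure(clock, seconds=0.05):
    """Return how much time `clock` reports across a sleep() call."""
    start = clock()
    time.sleep(seconds)
    return clock() - start

wall = measure(time.perf_counter)   # includes the time spent sleeping
cpu = measure(time.process_time)    # counts only CPU time used by this process

print(f"perf_counter: {wall:.4f}s")
print(f"process_time: {cpu:.4f}s")
```

On a typical machine the first figure is close to the sleep duration and the second is close to zero, because a sleeping process uses almost no CPU time.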
Compared with the time library, timeit has two advantages:
timeit chooses the best timer for your operating system and Python version.
timeit temporarily disables garbage collection while timing.
Parameters of timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None):
stmt='pass': the statement or function to be timed.
setup='pass': code that runs once before stmt is executed. It is typically used to import modules or declare the variables that stmt needs.
timer=<default timer>: the timer function, which defaults to time.perf_counter().
number=1000000: the number of times the timed statement is executed. The default is one million.
globals=None: the namespace in which the code is executed.
All the timings in this article use the timeit method with the default number of executions, one million.
Why a million executions? Because the test programs are very short; without that many repetitions the difference between methods would not be visible at all.
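As a rough sketch of how such a measurement can be set up (the article does not show its own harness, so the function name with_loop and the reduced number=100_000 are assumptions made to keep the example quick), timeit.timeit() accepts either a callable or a string statement:

```python
import timeit

oldlist = ['life', 'is', 'short', 'i', 'choose', 'python']

def with_loop():
    newlist = []
    for word in oldlist:
        newlist.append(word.upper())
    return newlist

# A callable can be passed directly as stmt.
t_loop = timeit.timeit(with_loop, number=100_000)

# A string statement needs a namespace so that names such as oldlist resolve.
t_map = timeit.timeit('list(map(str.upper, oldlist))',
                      globals={'oldlist': oldlist},
                      number=100_000)

print(f"loop: {t_loop:.4f}s, map: {t_map:.4f}s")
```

The absolute numbers depend on the machine and the Python version, so only the ratio between the two measurements is meaningful.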
1. Use map() for function mapping
Exp1: convert the lowercase letters in a list of strings to uppercase.
The test array is oldlist = ['life', 'is', 'short', 'i', 'choose', 'python'].
Method one
newlist = []
for word in oldlist:
    newlist.append(word.upper())
Method two
list(map(str.upper, oldlist))
Method one takes 0.5267724000000005s and method two takes 0.41462569999999843s, a performance improvement of 21.29%.
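The percentage quoted here appears to be the share of method one's running time that method two saves; that reading is an inference from the numbers, not something the article states. A small sketch (the helper name improvement is hypothetical) checks both the equivalence of the two methods and the arithmetic:

```python
def improvement(t_old, t_new):
    """Percentage of the original running time saved by the faster method."""
    return (t_old - t_new) / t_old * 100

oldlist = ['life', 'is', 'short', 'i', 'choose', 'python']

loop_result = []
for word in oldlist:
    loop_result.append(word.upper())
map_result = list(map(str.upper, oldlist))

print(loop_result == map_result)   # True: both methods build the same list
print(f"{improvement(0.5267724000000005, 0.41462569999999843):.2f}%")   # 21.29%
```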
2. Use set() to find the intersection
Exp2: find the intersection of two lists.
Test arrays: a = [1, 2, 3, 4, 5], b = [2, 4, 6, 8, 10].
Method one
overlaps = []
for x in a:
    for y in b:
        if x == y:
            overlaps.append(x)
Method two
list(set(a) & set(b))
Method one takes 0.95072640000006s and method two takes 0.6148200999999993s, a performance improvement of 35.33%.
Set operator syntax: |, & and - denote union, intersection and difference, respectively.
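The three operators can be tried on a pair of small sample lists (chosen here for illustration). Sets are unordered, so the results are sorted before printing:

```python
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
set_a, set_b = set(a), set(b)

union = set_a | set_b          # elements in either set
intersection = set_a & set_b   # elements in both sets
difference = set_a - set_b     # elements in a but not in b

print(sorted(union))           # [1, 2, 3, 4, 5, 6, 8, 10]
print(sorted(intersection))    # [2, 4]
print(sorted(difference))      # [1, 3, 5]
```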
3. Sort using sort() or sorted()
There are many ways to sort a sequence, but writing our own sorting algorithm costs more than it gains, because the built-in sort() and sorted() are good enough and their key parameter makes them very flexible. The difference between the two is that sort() is defined only on list and sorts in place, while sorted() is a built-in function that accepts any iterable and returns a new list.
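To illustrate the key parameter, here is a short sketch using sample (name, grade, age) records; the field layout and the sort orders are assumptions chosen for the example. A tuple key sorts by several fields at once, and negating a numeric field reverses the order for that field only.

```python
students = [('john', 'A', 15), ('john', 'A', 14),
            ('jane', 'B', 12), ('dave', 'B', 10)]

# sorted() accepts any iterable and returns a new list.
by_age = sorted(students, key=lambda s: s[2])

# Grade ascending, then name ascending, then age descending.
by_grade_name_age = sorted(students, key=lambda s: (s[1], s[0], -s[2]))

# list.sort() sorts in place and returns None.
ages = [s[2] for s in students]
ages.sort(reverse=True)

print(by_age[0])
print(by_grade_name_age)
print(ages)
```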
Exp3: sort the same list using quick sort and the sort() method, respectively.
Test array: lists = [2, 1, 4, 4, 3, 0].
Method one
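def quick_sort(lists, i, j):
    if i >= j:
        return lists
    pivot = lists[i]
    low = i
    high = j
    while i < j:
        # move j left past elements that are not smaller than the pivot
        while i < j and lists[j] >= pivot:
            j -= 1
        lists[i] = lists[j]
        # move i right past elements that are not larger than the pivot
        while i < j and lists[i] <= pivot:
            i += 1
        lists[j] = lists[i]
    lists[j] = pivot
    quick_sort(lists, low, i - 1)
    quick_sort(lists, i + 1, high)
    return lists
[...]
    [...] b[2] else 1  # grade and name are both equal: sort by age in descending order
students = [('john', 'A', 15), ('john', 'A', 14), ('jane', 'B', 12), ('dave', 'B', 10)]
sorted(students, key=functools.cmp_to_key(cmp))
4. Use collections.Counter() for counting
Exp4: count the number of times each character appears in a string.
Test array: sentence = 'life is short, i choose python'.
Method one
counts = {}
for char in sentence:
    counts[char] = counts.get(char, 0) + 1
Method two
from collections import Counter
Counter(sentence)
Method one takes 2.8105250000000055s and method two takes 1.6317423000000062s, a performance improvement of 41.94%.
5. Use list comprehensions
List comprehensions are short and concise. In small code snippets they may not make much difference, but in large projects they can save some time.
Exp5: square the odd numbers in a list and discard the even numbers.
Test array: oldlist = range(10).
Method one
newlist = []
for x in oldlist:
    if x % 2 == 1:
        newlist.append(x**2)
Method two
[x**2 for x in oldlist if x % 2 == 1]
Method one takes 1.5342976000000021s and method two takes 1.4181957999999923s, a performance improvement of 7.57%.
6. Use join() to concatenate strings
Most people are used to concatenating strings with +, but this is very inefficient, because each + operation creates a new string and copies the old one. A better approach is to concatenate with join(). For other string operations, also prefer built-in methods such as isalpha(), isdigit(), startswith() and endswith().
Exp6: concatenate the elements of a list of strings.
Test array: oldlist = ['life', 'is', 'short', 'i', 'choose', 'python'].
Method one
sentence = ""
for word in oldlist:
    sentence += word
Method two
"".join(oldlist)
Method one takes 0.27489080000000854s and method two takes 0.08166570000000206s, a performance improvement of 70.29%.
Another convenient feature of join() is that it lets you specify the separator. For example:
oldlist = ['life', 'is', 'short', 'i', 'choose', 'python']
sentence = "//".join(oldlist)
print(sentence)
life//is//short//i//choose//python
7. Use x, y = y, x to swap variables
Exp7: swap the values of x and y.
Test data: x, y = 100, 200.
Method one
temp = x
x = y
y = temp
Method two
x, y = y, x
Method one takes 0.027853900000010867s and method two takes 0.02398730000000171s, a performance improvement of 13.88%.
8. Use while 1 instead of while True
When the exact number of iterations is not known, the usual approach is an infinite loop with while True that checks the termination condition inside the loop body. There is nothing wrong with this, but while 1 is often said to run faster than while True. That was the case in Python 2, where True was an ordinary name that had to be looked up on every iteration. In Python 3, True is a keyword, so the two forms compile to the same loop, which is consistent with the negligible difference measured below.
Exp8: loop 100 times with while 1 and while True, respectively.
Method one
i = 0
while True:
    i += 1
    if i > 100:
        break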
Method two
i = 0
while 1:
    i += 1
    if i > 100:
        break
Method 1 takes 3.679268300000004s, method 2 takes 3.607847499999991s, and the performance is improved by 1.94%.
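Whether while 1 and while True really differ depends on the interpreter. One way to check on your own Python version, sketched here with hypothetical function names, is to compare the bytecode of the two loops with the dis module:

```python
import dis

def loop_true():
    i = 0
    while True:
        i += 1
        if i > 100:
            break
    return i

def loop_one():
    i = 0
    while 1:
        i += 1
        if i > 100:
            break
    return i

# Identical bytecode means the interpreter does the same work for both loops.
same = loop_true.__code__.co_code == loop_one.__code__.co_code
print("same bytecode:", same)

# Print the instructions that are actually executed for one of the loops.
dis.dis(loop_one)
```

If the bytecode is identical, any measured difference between the two methods is timing noise rather than a real speedup.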
9. Use decorator caching
Caching stores results so that they can be retrieved quickly instead of being recomputed. Python supports caching through decorators, which keep a cache in memory to speed up the program. Here we use the lru_cache decorator to cache a Fibonacci function. A recursive fibonacci function performs many repeated calculations, such as fibonacci(1) and fibonacci(2); with lru_cache each repeated calculation is performed only once, which greatly improves the efficiency of the program.
Exp9: compute a Fibonacci number.
Test data: fibonacci(7).
Method one
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)
Method two
import functools
@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)
Method one takes 3.955014900000009s and method two takes 0.05077979999998661s, a performance improvement of 98.72%.
Note:
The cache is keyed on the arguments: when a function decorated with lru_cache is called again with the same arguments, its body is not executed again.
All arguments must be hashable; for example, a list cannot be passed to a function decorated with lru_cache.
import functools
@functools.lru_cache(maxsize=100)
def demo(a, b):
    print('I was executed')
    return a + b
if __name__ == '__main__':
    demo(1, 2)
    demo(1, 2)
I was executed (demo(1, 2) was called twice, but the message is printed only once)
from functools import lru_cache
@lru_cache(maxsize=100)
def list_sum(nums: list):
    return sum(nums)
if __name__ == '__main__':
    list_sum([1, 2, 3, 4, 5])
TypeError: unhashable type: 'list'
functools.lru_cache(maxsize=128, typed=False) has two optional parameters:
maxsize is the maximum number of results kept in the cache. Once it is exceeded, the least recently used result is discarded to make room for the new one. It is conventionally set to a power of 2.
If typed is True, arguments of different types are cached separately; for example, f(3) and f(3.0) are treated as distinct calls.
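The effect of both parameters can be observed through cache_info(), which every lru_cache-decorated function provides. A minimal sketch, with the function square chosen only for illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=128, typed=True)
def square(x):
    return x * x

square(3)      # miss: computed and stored
square(3)      # hit: returned from the cache
square(3.0)    # miss: typed=True keeps int 3 and float 3.0 apart

info = square.cache_info()
print(info)    # CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)
```

With typed=False the third call would be a cache hit, because 3 and 3.0 compare equal and hash to the same value.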
10. Reduce the use of the dot operator (.)
The dot operator (.) is used to access an attribute or method of an object. It makes the program perform a dictionary lookup through __getattribute__() and __getattr__(), which adds unnecessary overhead. It is especially important to reduce the use of the dot operator inside loops by moving the lookup out of the loop.
This suggests importing names with from ... import ... where possible, rather than reaching a method through the dot operator each time it is needed. In fact, this applies not only to the dot operator: many other unnecessary operations should also be moved outside the loop.
Exp10: convert the lowercase letters in a list of strings to uppercase.
The test array is oldlist = ['life', 'is', 'short', 'i', 'choose', 'python'].
Method one
newlist = []
for word in oldlist:
    newlist.append(str.upper(word))
Method two
newlist = []
upper = str.upper
for word in oldlist:
    newlist.append(upper(word))
Method one takes 0.7235491999999795s and method two takes 0.5475435999999831s, a performance improvement of 24.33%.
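The same idea extends to any attribute lookup inside a loop, including newlist.append itself. The sketch below is an illustration rather than one of the article's measured experiments; it combines from ... import ... with a hoisted method lookup:

```python
from math import sqrt          # bind the function once instead of writing math.sqrt

values = list(range(10))

roots = []
append = roots.append          # look the bound method up once, outside the loop
for v in values:
    append(sqrt(v))            # no dot operator inside the loop body

print(roots[4], roots[9])      # 2.0 3.0
```

The gain is usually small per iteration, so this kind of hoisting is worth its loss in readability mainly in hot loops.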
11. Use a for loop instead of a while loop
When we know exactly how many times to loop, a for loop is a better choice than a while loop.
Exp11: loop 100 times with for and while, respectively.
Method one
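i = 0
while i < 100:
    i += 1
Method two
for _ in range(100):
    pass
Method one takes 3.894683299999997s and method two takes 1.0198077999999953s, a performance improvement of 73.82%.
12. Use Numba.jit to speed up calculations
Numba can compile Python functions to machine code, which greatly increases execution speed and can even approach the speed of C or FORTRAN. It works together with NumPy and can significantly improve efficiency in for loops or when there is a lot of computation.
Exp12: compute the sum of the integers from 1 to 100.
Method one
def my_sum(n):
    x = 0
    for i in range(1, n + 1):
        x += i
    return x
Method two
from numba import jit
@jit(nopython=True)
def numba_sum(n):
    x = 0
    for i in range(1, n + 1):
        x += i
    return x
Method one takes 3.7199997000000167s and method two takes 0.23769430000001535s, a performance improvement of 93.61%.
13. Use NumPy to vectorize arrays
Vectorization is a powerful feature of NumPy that expresses operations as occurring on whole arrays rather than on individual elements. Replacing explicit loops with array expressions in this way is commonly called vectorization.
Looping over an array or any other data structure in Python involves a lot of overhead. Vectorized operations in NumPy delegate the inner loop to highly optimized C and Fortran functions, which makes Python code faster.
Exp13: multiply two sequences of the same length element by element.
Test arrays: a = [1, 2, 3, 4, 5], b = [2, 4, 6, 8, 10].
Method one
[a[i] * b[i] for i in range(len(a))]
Method two
import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4, 6, 8, 10])
a * b
Method one takes 0.6706845000000214s and method two takes 0.3070132000000001s, a performance improvement of 54.22%.
14. Use in to check list membership
To check whether a list contains a given member, the in keyword is usually faster.
Exp14: check whether a list contains a given member.
Test array: lists = ['life', 'is', 'short', 'i', 'choose', 'python'].
Method one
def check_member(target, lists):
    for member in lists:
        if member == target:
            return True
    return False
Method two
if target in lists:
    pass
Method one takes 0.16038449999999216s and method two takes 0.04139250000000061s, a performance improvement of 74.19%.
15. Use the itertools library for iteration
itertools is a module for working with iterators. Its functions fall into three main categories: infinite iterators, finite iterators, and combinatoric iterators.
Exp15: return all permutations of a list.
Test array: ["Alice", "Bob", "Carol"].
Method one
def permutations(lst):
    if len(lst) == 1 or len(lst) == 0:
        return [lst]
    result = []
    for i in lst:
        temp_lst = lst[:]
        temp_lst.remove(i)
        temp = permutations(temp_lst)
        for j in temp:
            j.insert(0, i)
            result.append(j)
    return result
Method two
import itertools
itertools.permutations(["Alice", "Bob", "Carol"])
Method one takes 3.867292899999484s and method two takes 0.3875405000007959s, a performance improvement of 89.98%. Note that itertools.permutations() returns a lazy iterator, so method two as written measures creating the iterator rather than producing every permutation.
Conclusion
Based on the test data above, I drew the chart of experimental results below, which shows the performance differences between the methods more intuitively.
As can be seen from the chart, most of the techniques bring a considerable performance increase, but a few bring only a small one (for example, techniques 5, 7 and 8, with technique 8 showing almost no difference between the two methods).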
To sum up, I think it comes down to the following two principles:
1. Try to use built-in library functions
Built-in library functions are written by professional developers and have been tested extensively, and many of them are implemented in C under the hood. These functions are therefore generally very efficient (sort(), join() and so on), and methods you write yourself will struggle to beat them. It is better to save the effort and not reinvent the wheel, especially since the wheel you build may well be worse. So if a function already exists in the library, use it directly.
2. Try to use excellent third-party libraries
There are many excellent third-party libraries whose internals may be implemented in C or Fortran. Using such libraries costs little, and the improvements they bring, as with the NumPy and Numba examples earlier, are striking. There are many more tools of this kind, such as Cython and PyPy; the ones mentioned here are only a starting point.
In fact, there are many other ways to speed up Python code, such as avoiding global variables, using the latest Python version, using appropriate data structures, and taking advantage of the short-circuit (lazy) evaluation of if conditions. These methods have to be practiced before they are really understood, but the most fundamental thing is to keep our enthusiasm for programming and our pursuit of best practices. That is the inexhaustible source of motivation that lets us keep surpassing ourselves and reaching new heights.
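Two of the ideas listed above, lazy evaluation of conditions and choosing an appropriate data structure, can be sketched briefly. The function names and data here are hypothetical and serve only as an illustration:

```python
calls = []

def cheap_check(x):
    calls.append('cheap')
    return x > 0

def expensive_check(x):
    calls.append('expensive')
    return sum(range(x)) % 2 == 0

# `and` stops at the first false operand, so the expensive check never runs.
result = cheap_check(-5) and expensive_check(-5)
print(result, calls)           # False ['cheap']

# A set offers average constant-time membership tests; a list is scanned linearly.
words = ['life', 'is', 'short', 'i', 'choose', 'python']
word_set = set(words)
print('python' in word_set)    # True
```

Putting the cheapest or most selective test first in a compound condition lets short-circuit evaluation skip the rest of the work in the common case.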