Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What are the methods for python to find the longest common substring algorithm

2025-02-23 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces "python to find the longest common substring algorithm what methods", in daily operation, I believe many people in python to find the longest common substring algorithm what problems there are doubts, Xiaobian consulted all kinds of information, sorted out simple and easy to use operation methods, hope to answer "python to find the longest common substring algorithm what methods" doubts help! Next, please follow the small series to learn together!

Algorithm:

Find the longest common substring of two strings!

Analysis topic:

There may be multiple longest common substrings of the same length in two strings, so the return should be a list.

Double loop through two substrings, find duplicate substrings and add them to the list of longest substrings, and record the length of the longest substring. Longer substrings encountered in the loop are added to the list of longest substrings, and the length of the longest substring is modified.

def find_repeat1(a, b):longest = [] #longest string repository max = 0 #longest string length for x in range(len(b)): # x is a pointer to string b y = 0 # y is a pointer to string a while y

< len(a):long = '' # 子串申明z = 0 # 子串长度# 判断若a子串指针和b子串指针都不越界且它们指向的字符相同则进入循环while y+z < len(a) and x+z < len(b) and a[y+z] == b[x+z]: long += a[y + z] # 子串累加z += 1 # 子串长度累加if z >

= max: #substring length exceeds longest.append(long) #add this substring to repository max = z #longest substring length correction y = y + z if z > 0 else y + 1 #correct pointer to string b longest = set(longest) #deduplicate return [i for i in longest if len(i) == max] #return all substrings that match longest length

There are 17 lines of concise wind code.

Utility Code: def find_repeat2(a, b):longest = [] #holds repeated substrings lena = len(a)lenb = len(b)max = 0 #longest length for x in range(lenb): # x is pointer to string b y = 0 # y is pointer to string a for y in range(lena):if a[y] == b[x]:#Determine whether characters are the same long = '' #Temporary substring z = 0 #Count substring length keya = y #Record the position where a starts repeating keyb = x #Record the position where b starts repeating while a[keya] == b[keyb]:long += a[keya] #If there is a repetition, substring accumulation z += 1 #Substring length accumulation keya += 1 # a start repeating position pointer +1keyb += 1 # b start repeating position pointer +1if keya >= lena or keyb >= lenb: #If a or b pointer is out of bounds break #Exit loop if z >= max: #Temporary substring length is greater than max = z #Maximum length correction longest.append(long) #Add temporary substring to bin longest = set(longest) #Deduplicate return [i for i in longest if len(i) == max] #Return all substrings that match the longest length

The utility wind code has 25 lines.

Code comparison:

The concise wind code is one-third less than the practical wind code, which is a big gap. Now let's compare their performance. To test performance, I pass in two extra long strings for comparison:

a = 'abdomen' * 300b = 'dfomenshfabdomh' * 300t1 = time.time()print(find_repeat1(a, b))t2 = time.time()print(t2 - t1)print('=' * 20)t1 = time.time()print(find_repeat2(a, b))t2 = time.time()print(t2 - t1)print(17 / 25)print(t3 / t4)print('=' * 20)out:['abdom']5.348941802978516====================['abdom']1.5207555294036865====================0.683.4231525397933256

The concise code length is only 68% of the practical code length, and the concise code time is 3.4 times that of the practical code!!!

The core algorithms of the two codes are exactly the same, but the performance is so much worse!!!

At this point, the study of "Python's method of finding the longest common substring algorithm" is over, hoping to solve everyone's doubts. Theory and practice can better match to help everyone learn, go and try it! If you want to continue learning more relevant knowledge, please continue to pay attention to the website, Xiaobian will continue to strive to bring more practical articles for everyone!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report