Clever uses of regular expressions in Python

Updated 2025-01-21. From: SLTechnology News&Howtos (shulou), Development

Shulou (Shulou.com), 06/02

What are some clever ways to use regular expressions in Python? This article analyzes the question in detail and works through the corresponding solutions, in the hope of helping readers who face this problem find a simpler and easier approach.

Preface

Regular expressions are about discovering patterns in strings and expressing them with "abstract" symbols. Take the number sequence 2, 5, 10, 17, 26, 37: to compute the seventh value, you must first find the rule behind the sequence, then describe that rule with the expression n^2 + 1, and only then obtain the seventh value, 50. For strings that need to be matched, finding the rule is likewise the first step. This article uses regular expressions to perform three tasks on strings: querying, replacing, and splitting.
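The sequence analogy can be checked with a few lines of Python. This is only an illustration of the arithmetic; the helper name term is hypothetical and does not come from the article.

```python
def term(n: int) -> int:
    """Return the n-th term (1-based) of the sequence described by n^2 + 1."""
    return n ** 2 + 1


# The first six terms reproduce the sequence quoted in the text.
first_six = [term(n) for n in range(1, 7)]
seventh = term(7)

print(first_six)  # [2, 5, 10, 17, 26, 37]
print(seventh)    # 50
```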

Common regular symbols

Before matching any strings, it helps to know the common regular expression symbols:

[Table: common regular expression symbols]

Readers who are comfortable with these symbols should be able to handle most string processing with ease. As mentioned earlier, this article performs string queries, replacements, and splits with regular expressions; each of these requires importing the re module and using the functions described below.
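As a rough illustration, the sketch below exercises a handful of standard symbols from Python's re syntax. The selection and the sample sentence are assumptions made here for demonstration; they are not taken from the article's table.

```python
import re

# Hypothetical sample text, chosen only to exercise the symbols below.
text = "Order A12 shipped 2018-01-04, order b7 pending."

# \d matches a digit; + means "one or more of the preceding item".
digits = re.findall(r"\d+", text)

# [A-Z] is a character class; \w matches a word character; * means "zero or more".
capitalized = re.findall(r"\b[A-Z]\w*", text)

# {n} repeats the preceding item exactly n times.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# . matches any single character except a newline.
code = re.findall(r"A..", text)

# ^ anchors the match at the start of the string, $ at the end.
starts = re.findall(r"^Order", text)
ends = re.findall(r"pending\.$", text)

print(digits)       # ['12', '2018', '01', '04', '7']
print(capitalized)  # ['Order', 'A12']
print(dates)        # ['2018-01-04']
print(code)         # ['A12']
print(starts, ends) # ['Order'] ['pending.']
```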

String matching queries

The findall function in the re module scans the specified string, collects every non-overlapping substring that matches the pattern, and returns the results as a list. The parameters of this function have the following meanings:

findall(pattern, string, flags=0)

pattern: Specifies the regular expression to match.

string: Specifies the string to be processed.

flags: Specifies the matching mode. Common values are re.I, re.M, re.S, and re.X. re.I makes the regular expression case-insensitive; re.M enables multi-line matching, so that ^ and $ also match at the start and end of each line; re.S makes the dot (.) match any character, including newlines; re.X allows the regular expression to be written more verbosely, for example spread over multiple lines, with whitespace ignored and comments added.
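The effect of each flag is easiest to see side by side. The following is a minimal sketch using a made-up multi-line string; the variable names and sample text are assumptions for illustration only.

```python
import re

# Hypothetical four-line sample text.
text = "Python is fun.\npython is popular.\nPYTHON\nrocks."

# Without flags, matching is case-sensitive.
sensitive = re.findall(r"python", text)
# re.I ignores case.
insensitive = re.findall(r"python", text, flags=re.I)

# Without re.M, ^ matches only at the very start of the string.
first_word = re.findall(r"^\w+", text)
# With re.M, ^ also matches at the start of every line.
line_starts = re.findall(r"^\w+", text, flags=re.M)

# By default the dot does not match a newline; re.S lets it.
no_dotall = re.findall(r"PYTHON.rocks", text)
dotall = re.findall(r"PYTHON.rocks", text, flags=re.S)

# re.X ignores whitespace in the pattern and allows # comments.
date_pattern = r"""
    (\d{4})  # year
    -        # literal hyphen
    (\d{2})  # month
    -
    (\d{2})  # day
"""
dates = re.findall(date_pattern, "ymd:'2018-01-04'", flags=re.X)

print(sensitive)    # ['python']
print(insensitive)  # ['Python', 'python', 'PYTHON']
print(first_word)   # ['Python']
print(line_starts)  # ['Python', 'python', 'PYTHON', 'rocks']
print(no_dotall)    # []
print(dotall)       # ['PYTHON\nrocks']
print(dates)        # [('2018', '01', '04')]
```

Flags can be combined with the | operator, for example flags=re.I | re.M.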

String matching and replacement

The sub function in the re module performs replacement. It is similar to the replace method of strings, except that the content to replace is selected by a regular expression: every match is replaced with repl. The parameters of this function have the following meanings:

sub(pattern, repl, string, count=0, flags=0)

pattern: Same as pattern in findall function.

repl: Specifies the new value to replace with.

string: Same as string in findall function.

count: Specifies the maximum number of substitutions. The default, 0, replaces every match.

flags: Same as flags in findall function.
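A short sketch of sub under these parameters follows. The sample sentence is invented for illustration, and the \1-style backreferences in repl are standard re.sub behavior that the article itself does not cover.

```python
import re

# Hypothetical sample text.
text = "Call 555-1234 or 555-9876 before 2025-01-21."

# Replace every digit (count defaults to 0, meaning "all matches").
masked_all = re.sub(r"\d", "#", text)

# Replace only the first three matches.
masked_first3 = re.sub(r"\d", "#", text, count=3)

# repl may refer to captured groups: reorder YYYY-MM-DD as DD/MM/YYYY.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)

print(masked_all)     # Call ###-#### or ###-#### before ####-##-##.
print(masked_first3)  # Call ###-1234 or 555-9876 before 2025-01-21.
print(swapped)        # Call 555-1234 or 555-9876 before 21/01/2025.
```

Passing count as a keyword argument keeps the call unambiguous, since count and flags are both integers.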

String matching and splitting

The split function in the re module separates strings according to specified regular expressions, similar to the split method for strings. The specific parameters of this function have the following meanings:

split(pattern, string, maxsplit=0, flags=0)

pattern: Same as pattern in the findall function.

string: Same as string in the findall function.

maxsplit: Specifies the maximum number of splits. The default, 0, splits at every match.

flags: Same as flags in the findall function.
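The difference from the string split method shows up as soon as there is more than one kind of delimiter. Below is a minimal sketch with an invented record string; the names are illustrative assumptions.

```python
import re

# Hypothetical record that mixes commas, semicolons, and runs of spaces.
record = "apple, banana;cherry  date"

# A character class plus + treats any run of delimiters as one separator.
parts = re.split(r"[,;\s]+", record)

# maxsplit limits the number of splits; the remainder stays in the last item.
limited = re.split(r"[,;\s]+", record, maxsplit=2)

# str.split only accepts one fixed separator.
plain = record.split(",")

print(parts)    # ['apple', 'banana', 'cherry', 'date']
print(limited)  # ['apple', 'banana', 'cherry  date']
print(plain)    # ['apple', ' banana;cherry  date']
```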

Practical case

Once the functions and parameters above are familiar, working through examples helps to reinforce them. The following code illustrates all three functions:

# Import the re module for regular expressions
import re

# Extract all weather states in string8
string8 = "{ymd:'2018-01-01', tianqi:'sunny', aqiInfo:'light pollution'},{ymd:'2018-01-02', tianqi:'cloudy ~ light drizzle', aqiInfo:'excellent'},{ymd:'2018-01-03', tianqi:'light rain ~ moderate rain', aqiInfo:'excellent'},{ymd:'2018-01-04', tianqi:'moderate rain ~ light drizzle', aqiInfo:'excellent'}"
# The parentheses form a group, so only the text between the quotes is returned
print(re.findall("tianqi:'(.*?)'", string8))

# Extract every word in string9 that contains the letter o, ignoring case
string9 = 'Together, we discovered that a free market only thrives when there are rules to ensure competition and fair play, Our celebration of initiative and enterprise'
print(re.findall(r'\w*o\w*', string9, flags=re.I))

# Delete punctuation marks and digits from string10
# (the a-zA-Z range is left out here: on this English rendering of the sentence it would delete every word)
string10 = 'It is reported that the four steam condensation tanks shipped this time belong to the nuclear secondary pressure equipment of the International Thermonuclear Experimental Reactor (ITER) project, and have successively completed pressure test, vacuum test, helium leak test, jack test, lug load test, stacking test and other acceptance tests.'
print(re.sub('[,.()0-9]', '', string10))

# Split string11 into its individual fields
string11 = '2 rooms 2 halls | 101.62 flat | Low Zone/Floor 7 | facing south \n Shanghai Future - Pudong - Jinyang - built in 2005'
split = re.split(r'[-|\n]', string11)
print(split)
# Clean up the split result by stripping leading and trailing whitespace
split_strip = [i.strip() for i in split]
print(split_strip)

out:
['sunny', 'cloudy ~ light drizzle', 'light rain ~ moderate rain', 'moderate rain ~ light drizzle']
['Together', 'discovered', 'only', 'to', 'competition', 'Our', 'celebration', 'of']
It is reported that the four steam condensation tanks shipped this time belong to the nuclear secondary pressure equipment of the International Thermonuclear Experimental Reactor ITER project and have successively completed pressure test vacuum test helium leak test jack test lug load test stacking test and other acceptance tests
['2 rooms 2 halls ', ' 101.62 flat ', ' Low Zone/Floor 7 ', ' facing south ', ' Shanghai Future ', ' Pudong ', ' Jinyang ', ' built in 2005']
['2 rooms 2 halls', '101.62 flat', 'Low Zone/Floor 7', 'facing south', 'Shanghai Future', 'Pudong', 'Jinyang', 'built in 2005']

As shown above, the first example uses the regular expression "tianqi:'(.*?)'" to extract the target data. Without the parentheses, findall would return whole matches such as "tianqi:'sunny'" and "tianqi:'cloudy ~ light drizzle'", so parentheses are added to form a group, and only the content of the group is returned.

The second example does not put the regular expression in parentheses. Wrapping the whole expression in parentheses would return the same result, because the single group would then cover the entire match. In general, findall returns a list of the substrings that satisfy the pattern, and if the pattern contains a group, only the content matched inside the parentheses is returned.
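How groups change what findall returns can be sketched as follows. The input is a shortened, hypothetical sample modeled on the weather string; the tuple and non-capturing cases are standard re behavior added here for completeness.

```python
import re

# Shortened, hypothetical sample in the style of the weather data.
text = "tianqi:'sunny', tianqi:'cloudy'"

# No group: each item is the whole match.
no_group = re.findall(r"tianqi:'.*?'", text)

# One group: each item is only the group's content.
one_group = re.findall(r"tianqi:'(.*?)'", text)

# One group around the whole pattern: same result as no group.
whole_group = re.findall(r"(tianqi:'.*?')", text)

# Several groups: each item is a tuple with one entry per group.
two_groups = re.findall(r"(\w+):'(.*?)'", text)

# A non-capturing group (?:...) groups without changing the return value.
non_capturing = re.findall(r"(?:tianqi):'.*?'", text)

print(no_group)       # ["tianqi:'sunny'", "tianqi:'cloudy'"]
print(one_group)      # ['sunny', 'cloudy']
print(whole_group)    # ["tianqi:'sunny'", "tianqi:'cloudy'"]
print(two_groups)     # [('tianqi', 'sunny'), ('tianqi', 'cloudy')]
print(non_capturing)  # ["tianqi:'sunny'", "tianqi:'cloudy'"]
```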

The third example uses the sub function to replace every matched character with an empty string, which has the effect of deleting it.

The fourth example splits a string. Splitting on the regular expression alone leaves whitespace in the returned elements; for instance, '2 rooms 2 halls ' ends with a space. To remove the leading and trailing whitespace of each element in the list, a list comprehension is combined with the strip method of strings.
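As an alternative sketch (not the article's method), the surrounding whitespace can be absorbed by the split pattern itself, which gives the same result as splitting and then stripping. The shortened listing string below is a hypothetical sample in the style of the example data.

```python
import re

# Shortened, hypothetical sample in the style of the housing listing.
listing = "2 rooms 2 halls | 101.62 flat | Low Zone/Floor 7"

# Two steps: split on the delimiters, then strip each element.
raw = re.split(r"[-|]", listing)
stripped = [part.strip() for part in raw]

# One step: let \s* consume the whitespace around each delimiter.
one_step = re.split(r"\s*[-|]\s*", listing)

print(raw)       # ['2 rooms 2 halls ', ' 101.62 flat ', ' Low Zone/Floor 7']
print(stripped)  # ['2 rooms 2 halls', '101.62 flat', 'Low Zone/Floor 7']
print(one_step)  # ['2 rooms 2 halls', '101.62 flat', 'Low Zone/Floor 7']
```

The one-step pattern does not trim whitespace at the very start or end of the whole string, so the strip approach remains the safer choice when the input may be padded.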

That concludes this look at clever uses of regular expressions in Python. Hopefully the content above is of some help.
