How to use Python regular expressions

2025-04-04 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article introduces how to use Python regular expressions through a series of short, worked examples.

Regular expressions are a powerful tool for dealing with strings. As a concept, regular expressions are not unique to Python. However, there are some small differences in the actual use of regular expressions in Python.

A regular expression is a special sequence of characters that helps you easily check whether a string matches a pattern.

Python has included the re module since version 1.5; it provides Perl-style regular expression patterns.

The re module gives Python full regular expression support.

The compile function generates a regular expression object from a pattern string and optional flag arguments. The object has a set of methods for regular expression matching and replacement.

The re module also provides module-level functions that do the same job as these methods; they take a pattern string as their first argument.
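As a minimal sketch of that equivalence (the sample string and variable names are illustrative only), the method on a compiled object and the module-level function return the same match:

```python
import re

s = 'i love python very much'

# Compile once, then call the method on the pattern object.
compiled = re.compile('python')
m1 = compiled.search(s)

# Module-level function: the pattern string is the first argument.
m2 = re.search('python', s)

print(m1.span() == m2.span())  # True
```

Compiling is mostly a convenience when the same pattern is reused many times.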

1. Find the first matching string

import re
s = 'i love python very much'
pat = 'python'
r = re.search(pat, s)
print(r.span())  # (7, 13)
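A short sketch of how the span relates to the matched text, and of what happens when nothing matches (the variable names and the second pattern are illustrative):

```python
import re

s = 'i love python very much'
r = re.search('python', s)

# The span is a (start, end) pair that slices the matched text out of s.
start, end = r.span()
print(s[start:end])  # python
print(r.group())     # python

# re.search returns None when the pattern is absent, so check before using it.
missing = re.search('java', s)
print(missing is None)  # True
```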

2. Find every occurrence of 1

import re
s = '…'  # sample text: Class 1, Grade 3, Qingzhou Middle School, Weifang City, Shandong Province
pat = '1'
r = re.finditer(pat, s)
for i in r:
    print(i)  # one match object per occurrence of '1'
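To make the finditer behaviour concrete, here is a small sketch on a hypothetical English sample string (not the article's original text), collecting the position of each match:

```python
import re

# Hypothetical sample text, chosen only because it contains '1' twice.
s = 'Class 1, Grade 3, No. 1 Middle School'

# finditer yields match objects lazily; span() gives each match's position.
positions = [m.span() for m in re.finditer('1', s)]
print(positions)  # [(6, 7), (22, 23)]
```

Unlike findall, which returns only the matched strings, finditer keeps the position information for every match.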

3. \d matches digits [0-9]

import re
s = 'a total of 20 lines of code, running time 13.59s'
pat = r'\d+'  # + matches a digit (\d is the generic digit character) one or more times
r = re.findall(pat, s)
print(r)  # ['20', '13', '59']

We want to keep 13.59 whole rather than split in two; see example 4.

4. ? matches the preceding character 0 or 1 times

import re
s = 'a total of 20 lines of code, running time 13.59s'
pat = r'\d+\.?\d+'  # ? matches the decimal point (\.) 0 or 1 times
r = re.findall(pat, s)
print(r)  # ['20', '13.59']
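One caveat worth sketching: a pattern of the form \d+\.?\d+ needs at least two digits, so it skips single-digit numbers. An optional group is one possible alternative. The sample string and variable names below are made up for illustration.

```python
import re

s = 'ran 3 tests in 13.59s over 20 lines'

# Requires two or more digits overall, so the lone 3 is missed.
strict = re.findall(r'\d+\.?\d+', s)
print(strict)  # ['13.59', '20']

# Digits, optionally followed by a decimal point and more digits.
loose = re.findall(r'\d+(?:\.\d+)?', s)
print(loose)   # ['3', '13.59', '20']
```

The (?:...) form groups without capturing, so findall still returns whole matches.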

5. ^ matches the beginning of the string

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'^[emrt]'  # look for a string that begins with e, m, r, or t
r = re.findall(pat, s)
print(r)  # []

Because the string begins with the character `T`, which is not in the set [emrt], the result is empty.

6. re.I ignores case

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'^[emrt]'  # look for a string that begins with e, m, r, or t
r = re.compile(pat, re.I).search(s)
print(r)  # a match object for 'T', span (0, 1): the start of the string is in the match set
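The flag can also be passed straight to the module-level function, and re.I is a short name for re.IGNORECASE. A minimal sketch, with a shortened sample string:

```python
import re

s = 'This module provides regular expression matching operations'
pat = r'^[emrt]'

# Without the flag, the uppercase 'T' does not match the lowercase set.
case_sensitive = re.search(pat, s)
print(case_sensitive)        # None

# Same effect as re.compile(pat, re.I).search(s).
m = re.search(pat, s, re.I)
print(m.group(), m.span())   # T (0, 1)

print(re.I == re.IGNORECASE)  # True
```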

7. Extract words with a regular expression

This version is inaccurate; see example 9.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s[a-zA-Z]+'
r = re.findall(pat, s)
print(r)  # [' module', ' provides', ' regular', ' expression', ' matching', ' operations', ' similar', ' to', ' those', ' found', ' in', ' Perl']

8. Capture only the words, without the spaces

Use () to capture. This version is still inaccurate; see example 9.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s([a-zA-Z]+)'
r = re.findall(pat, s)
print(r)  # ['module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']

9. Include the first word

Example 8 misses the first word. Use ?, which matches the preceding character 0 or 1 times. Note that ? also marks a quantifier as non-greedy, so use it with care.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s?([a-zA-Z]+)'
r = re.findall(pat, s)
print(r)  # ['This', 'module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']
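Since the words in this sentence contain only letters, a sketch without any whitespace handling gives the same list. \b\w+\b is a common alternative when digits and underscores should also count as word characters.

```python
import re

s = 'This module provides regular expression matching operations similar to those found in Perl'

# Runs of letters are already separated by the spaces, so no \s is needed.
words = re.findall(r'[a-zA-Z]+', s)
print(words[:3])   # ['This', 'module', 'provides']
print(len(words))  # 13

# Same result for this sentence, using word boundaries.
same = re.findall(r'\b\w+\b', s) == words
print(same)        # True
```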

10. Use the split function to split words directly

Extracting words as above is not concise; it is only for demonstration. The easiest way to split a sentence into words is the split function.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s+'
r = re.split(pat, s)
print(r)  # ['This', 'module', 'provides', 'regular', 'expression', 'matching', 'operations', 'similar', 'to', 'those', 'found', 'in', 'Perl']
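For plain whitespace, the built-in str.split with no arguments behaves much the same and needs no regular expression. A small comparison, using a made-up string with irregular spacing:

```python
import re

s = 'This  module\tprovides regular   expression'

by_regex = re.split(r'\s+', s)
by_str = s.split()  # splits on any run of whitespace

print(by_regex)            # ['This', 'module', 'provides', 'regular', 'expression']
print(by_regex == by_str)  # True
```

The two differ when the string starts or ends with whitespace: re.split then yields empty strings at the edges, while str.split() drops them.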

11. Extract words that begin with m or t, ignoring case

The result below is not what we want. Why? Because \s? is optional, a match can begin in the middle of a word, which is how fragments such as 'tions' and 'milar' get in.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s?([mt][a-zA-Z]*)'  # find words beginning with m or t
r = re.findall(pat, s)
print(r)  # ['module', 'matching', 'tions', 'milar', 'to', 'those']

12. Use ^ to find the word at the beginning of the string

Combining 11 and 12 gives all the words that start with m or t.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'^([mt][a-zA-Z]*)\s'  # find a leading word beginning with m or t
r = re.compile(pat, re.I).findall(s)
print(r)  # ['This']

13. Split first, then find the words that meet the requirement

Use match to test whether each word matches at its start.

import re
s = 'This module provides regular expression matching operations similar to those found in Perl'
pat = r'\s+'
r = re.split(pat, s)
res = [i for i in r if re.match(r'[mMtT]', i)]
print(res)  # ['This', 'module', 'matching', 'to', 'those']
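As an alternative sketch (not the approach used above), the word-boundary assertion \b lets a single findall call do the same job without splitting first:

```python
import re

s = 'This module provides regular expression matching operations similar to those found in Perl'

# \b anchors the match at the start of a word, so fragments such as 'tions' are excluded.
words = re.findall(r'\b[mt][a-zA-Z]*', s, re.I)
print(words)  # ['This', 'module', 'matching', 'to', 'those']
```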

14. Greedy matching

Match as many characters as possible.

import re
content = '<h>ddedadsad</h><div>graph</div>bb<div>math</div>cc'
pat = re.compile(r"<div>(.*)</div>")  # greedy mode
m = pat.findall(content)
print(m)  # ['graph</div>bb<div>math']

15. Non-greedy matching

Compared with 14, there is only one more question mark, and the result is completely different.

import re
content = '<h>ddedadsad</h><div>graph</div>bb<div>math</div>cc'
pat = re.compile(r"<div>(.*?)</div>")  # non-greedy mode
m = pat.findall(content)
print(m)  # ['graph', 'math']

Comparing with example 14 shows the difference between greedy and non-greedy matching: the non-greedy version stops at the first point where the pattern can match.
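The same contrast can be seen side by side on a simpler, made-up string that uses square brackets as delimiters:

```python
import re

content = 'aa[graph]bb[math]cc'

# Greedy: .* runs to the last ']' it can reach.
greedy = re.findall(r'\[(.*)\]', content)

# Non-greedy: .*? stops at the first ']' that lets the pattern match.
lazy = re.findall(r'\[(.*?)\]', content)

print(greedy)  # ['graph]bb[math']
print(lazy)    # ['graph', 'math']
```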

16. Split on multiple separators

Use the split function.

import re
content = 'graph math,english;chemistry'  # text with several kinds of separator
pat = re.compile(r"[\s\,\;]+")  # one or more separator characters
m = pat.split(content)
print(m)  # ['graph', 'math', 'english', 'chemistry']
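One detail worth a sketch: when the text starts or ends with a separator, re.split returns empty strings at the edges. The sample string below is hypothetical. Inside a character class, the comma and semicolon do not need escaping.

```python
import re

content = ' graph math, english;'

parts = re.split(r'[\s,;]+', content)
print(parts)    # ['', 'graph', 'math', 'english', '']

# Drop the empty strings produced by leading and trailing separators.
cleaned = [p for p in parts if p]
print(cleaned)  # ['graph', 'math', 'english']
```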

17. Replace the matching substring

The sub function replaces each matching substring.

import re
content = "hello 12345, hello 456321"
pat = re.compile(r'\d+')  # the part to replace
m = pat.sub("666", content)
print(m)  # hello 666, hello 666
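Two further options of sub, sketched on the same input: count limits how many matches are replaced, and the replacement may be a function that receives each match object. The lambda here is only an illustration.

```python
import re

content = 'hello 12345, hello 456321'
pat = re.compile(r'\d+')

# count=1 replaces only the first match.
first_only = pat.sub('666', content, count=1)
print(first_only)  # hello 666, hello 456321

# A function receives each match object and returns the replacement text.
lengths = pat.sub(lambda m: str(len(m.group())), content)
print(lengths)     # hello 5, hello 6
```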

18. Scrape the title of the Baidu home page

import re
from urllib import request

# crawl the content of the Baidu home page
data = request.urlopen("http://www.baidu.com/").read().decode()

# analyze the page and decide on the regular expression
pat = r'<title>(.*?)</title>'
result = re.search(pat, data)
print(result)  # a match object covering the <title> element
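The extraction step can be tried without a network connection. This sketch uses a hypothetical HTML snippet in place of the downloaded page and pulls the captured group out of the match:

```python
import re

# Hypothetical HTML snippet standing in for a downloaded page.
data = '<html><head><title>Example Page</title></head><body></body></html>'

result = re.search(r'<title>(.*?)</title>', data)

# group(1) is the text captured by the parentheses; guard against no match.
title = result.group(1) if result else None
print(title)  # Example Page
```

A regular expression is fine for a quick one-off like this; for anything beyond that, an HTML parser is usually the more robust choice.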

The following sections summarize the key points.

19. Summary of commonly used metacharacters

. matches any character (except a newline, by default)

^ matches the start of the string

$ matches the end of the string

* repeats the preceding atom 0 or more times

? repeats the preceding atom 0 or 1 times

+ repeats the preceding atom 1 or more times

{n} the preceding atom appears exactly n times

{n,} the preceding atom appears at least n times

{n,m} the preceding atom appears between n and m times

() grouping; captures the part to be output
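The quantifiers above can be checked with a small helper. The function name matches is hypothetical, and re.fullmatch is used so that the whole string must fit the pattern:

```python
import re

def matches(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r'a{2}', 'aa'))      # True
print(matches(r'a{2}', 'aaa'))     # False
print(matches(r'a{2,}', 'aaaa'))   # True
print(matches(r'a{2,3}', 'aaaa'))  # False
print(matches(r'ab*', 'a'))        # True  (* allows zero repetitions)
print(matches(r'ab+', 'a'))        # False (+ needs at least one)
print(matches(r'ab?', 'abb'))      # False (? allows at most one)
```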

20. Summary of common character classes

\s matches a whitespace character

\w matches any letter, digit, or underscore

\W the opposite of \w: matches any character that is not a letter, digit, or underscore

\d matches a decimal digit

\D matches any character that is not a decimal digit

[0-9] matches a digit from 0 to 9

[a-z] matches a lowercase letter

[A-Z] matches an uppercase letter
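A quick sketch of these classes on one made-up string (the string and variable names are illustrative only):

```python
import re

s = 'id_7 = 42 USD'

digits = re.findall(r'\d+', s)
word_runs = re.findall(r'\w+', s)
non_word = re.findall(r'\W', s)
upper = re.findall(r'[A-Z]+', s)
lower = re.findall(r'[a-z]+', s)
digits_only = re.sub(r'\D', '', s)  # delete every non-digit character
fields = re.split(r'\s+', s)

print(digits)       # ['7', '42']
print(word_runs)    # ['id_7', '42', 'USD']
print(non_word)     # [' ', '=', ' ', ' ']
print(upper)        # ['USD']
print(lower)        # ['id']
print(digits_only)  # 742
print(fields)       # ['id_7', '=', '42', 'USD']
```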
