How to use regular expressions in Python

2025-07-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/01 report

This article shows how to use regular expressions in Python. The content is kept simple and clear, and the examples below walk through the main features one at a time.

I. Regular expressions: metacharacters

The re module gives Python full regular expression support.

1. Quantifiers

    # Extract words made of uppercase and lowercase letters
    import re
    a = 'Excel 12345Word23456PPT12Lr'
    r = re.findall('[a-zA-Z]{3,5}', a)  # runs of 3 to 5 letters
    print(r)  # ['Excel', 'Word', 'PPT']

    # Greedy and non-greedy matching (Python is greedy by default)
    # Greedy:     '[a-zA-Z]{3,5}'
    # Non-greedy: '[a-zA-Z]{3,5}?' or '[a-zA-Z]{3}'
    # The second form is recommended here: it avoids the trailing ?,
    # which is easy to confuse with the ? quantifier shown below.

    # * matches 0 or more times: the character before * appears 0 or unlimited times
    import re
    a = 'exce0excell3excel3'
    r = re.findall('excel*', a)
    print(r)  # ['exce', 'excell', 'excel']  any number of trailing l's can match
    r = re.findall('excel.*', a)
    print(r)  # ['excell3excel3']

    # + matches 1 or more times: the character before + appears at least once
    import re
    a = 'exce0excell3excel3'
    r = re.findall('excel+', a)
    print(r)  # ['excell', 'excel']

    # ? matches 0 times or once; it is often used to drop a repeated character
    import re
    a = 'exce0excell3excel3'
    r = re.findall('excel?', a)
    print(r)  # ['exce', 'excel', 'excel']

2. Character matching

    import re
    line = 'xyz,xcz.xfc.xdz,xaz,xez,xec'
    r = re.findall('x[de]z', line)   # starts with x, ends with z, with d or e in between
    print(r)  # ['xdz', 'xez']
    r = re.findall('x[^de]z', line)  # starts with x, ends with z, with neither d nor e in between
    print(r)  # ['xyz', 'xcz', 'xaz']

    # \w matches Chinese and English letters, digits and underscores;
    # it does not match special characters
    import re
    a = 'Excel 12345Word\n23456_PPT12lr'
    r = re.findall(r'\w', a)
    print(r)
    # ['E', 'x', 'c', 'e', 'l', '1', '2', '3', '4', '5', 'W', 'o', 'r', 'd',
    #  '2', '3', '4', '5', '6', '_', 'P', 'P', 'T', '1', '2', 'l', 'r']

    # \W matches the special characters, such as the space, \n and \t
    import re
    a = 'Excel 12345Word\n23456_PPT12lr'
    r = re.findall(r'\W', a)
    print(r)  # [' ', '\n']

3. Boundary matching
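Boundary anchors pin a pattern to a position in the string rather than to a character. The following is a minimal sketch of that idea, not part of the original article: the sample strings in `candidates` are hypothetical, and it compares the same digit pattern with and without the ^ and $ anchors.

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
candidates = ['13811115888', '1381111588812345', 'tel:13811115888', '1234567']

# Without anchors, \d{8,11} is satisfied by any run of 8 to 11 digits
# found anywhere inside the string.
unanchored = [s for s in candidates if re.search(r'\d{8,11}', s)]

# With ^ and $, the whole string must consist of 8 to 11 digits.
anchored = [s for s in candidates if re.search(r'^\d{8,11}$', s)]

print(unanchored)
print(anchored)
```

Only the anchored version rejects strings that merely contain a valid-looking number, which is usually what input validation needs.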

    # Require the whole phone number to be 8 to 11 digits
    import re
    tel = '13811115888'
    r = re.findall(r'^\d{8,11}$', tel)
    print(r)  # ['13811115888']

4. Groups

    # Parentheses make abc a group, and {2} says how many times the group repeats,
    # so the pattern matches abcabc. findall returns the captured group, not the whole match.
    import re
    a = 'abcabcabcxyzabcabcxyzabc'
    r = re.findall('(abc){2}', a)
    print(r)  # ['abc', 'abc']
    r = re.findall('(abc){3}', a)
    print(r)  # ['abc']

5. Match mode parameters (flags)
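Flags change how a pattern is interpreted and are passed after the string argument. As a hedged sketch, assuming the hypothetical three-line `text` below, this shows how re.M changes what ^ and $ match and how flags are combined with |.

```python
import re

text = 'first line\nsecond line\nthird line'  # hypothetical sample text

# Default behaviour: ^ matches only at the very start of the string.
default_starts = re.findall(r'^\w+', text)

# re.M (re.MULTILINE): ^ also matches right after each newline.
multiline_starts = re.findall(r'^\w+', text, re.M)

# Flags combine with |. Here re.I lets the upper-case pattern match
# lower-case text, and re.M lets $ match at the end of every line.
line_ends = re.findall(r'LINE$', text, re.M | re.I)

print(default_starts)
print(multiline_starts)
print(line_ends)
```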

    # The third parameter of findall: re.I ignores case
    import re
    a = 'abcFBIabcCIAabc'
    r = re.findall('fbi', a, re.I)
    print(r)  # ['FBI']

    # Several flags are combined with |
    import re
    a = 'abcFBI\nabcCIAabc'
    r = re.findall('fbi.{1}', a, re.I | re.S)  # match fbi, then any one character, including \n
    print(r)  # ['FBI\n']

II. Methods

re.findall

Finds every substring in the string that matches the pattern and returns the results as a list.

If there is no match, an empty list is returned.
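What the returned list contains depends on the capturing groups in the pattern. A small sketch of the three cases plus the empty result, using a hypothetical sample string:

```python
import re

line = 'x=1, y=22, z=333'  # hypothetical sample string

# No capturing group: the list holds the full matches.
numbers = re.findall(r'\d+', line)

# One capturing group: the list holds only what the group captured.
names = re.findall(r'(\w)=\d+', line)

# Several groups: the list holds one tuple per match.
pairs = re.findall(r'(\w)=(\d+)', line)

# No match at all: an empty list, not None.
missing = re.findall(r'[A-Z]', line)

print(numbers, names, pairs, missing)
```

Because the "no match" result is an empty list, the return value can be looped over directly without a None check.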

    # Signatures:
    #   re.findall(pattern, string, flags=0)
    #   pattern.findall(string[, pos[, endpos]])
    import re
    line = "111aaabbb222 snore Oreo"
    r = re.findall('[0-9]', line)
    print(r)  # ['1', '1', '1', '2', '2', '2']

re.match

re.match tries to match a pattern at the beginning of the string.

If the pattern does not match at the starting position, match() returns None.
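Because a failed match yields None, calling .group() on the result without checking raises AttributeError. The sketch below shows one way to guard against that; `leading_word` is a hypothetical helper written for this illustration, not something from the re module.

```python
import re

def leading_word(text):
    """Return the run of letters at the start of text, or None if there is none.

    Hypothetical helper, for illustration only.
    """
    m = re.match(r'[A-Za-z]+', text)
    # re.match returns None on failure, so test the result before using it.
    return m.group() if m else None

print(leading_word('www.xxxx.com'))  # the pattern matches at position 0
print(leading_word('.com'))          # nothing matches at the start
```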

    # Signature: re.match(pattern, string, flags=0)
    import re
    print(re.match('www', 'www.xxxx.com'))         # a match object for 'www'
    print(re.match('www', 'www.xxxx.com').span())  # (0, 3)
    print(re.match('com', 'www.xxxx.com'))         # None

group: reading values from the match object

    import re
    a = 'life is short,i use python,i love python'
    r = re.search('life(.*)python(.*)python', a)
    print(r.group(0))        # the complete match: life is short,i use python,i love python
    print(r.group(1))        # the first group: ' is short,i use '
    print(r.group(2))        # the second group: ',i love '
    print(r.group(0, 1, 2))  # the three values above as one tuple
    print(r.groups())        # group(1) and group(2): (' is short,i use ', ',i love ')

    import re
    # .* matches any run of characters except the newline \n
    # (.*?) is the non-greedy form: it captures as few characters as possible
    # re.M is multiline mode and affects ^ and $
    # re.I makes matching case-insensitive
    line = "Cats are smarter than dogs"
    matchObj1 = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)
    matchObj2 = re.match(r'(.*) smarter (.*?) .*', line, re.M | re.I)
    matchObj3 = re.match(r'(.*) than (.*)', line, re.M | re.I)
    print(matchObj1)
    print(matchObj2)
    print(matchObj3)
    for name, m in (("matchObj1", matchObj1), ("matchObj2", matchObj2), ("matchObj3", matchObj3)):
        if m:
            print(f"{name}.group():", m.group())
            print(f"{name}.group(1):", m.group(1))
            print(f"{name}.group(2):", m.group(2))
        else:
            print("No match!")
    # matchObj1.group(): Cats are smarter than dogs
    # matchObj1.group(1): Cats
    # matchObj1.group(2): smarter
    # matchObj2.group(): Cats are smarter than dogs
    # matchObj2.group(1): Cats are
    # matchObj2.group(2): than
    # matchObj3.group(): Cats are smarter than dogs
    # matchObj3.group(1): Cats are smarter
    # matchObj3.group(2): dogs

    import re
    # A dot matches a single character
    # A star means the item before it appears 0 or more times
    # Dot star therefore means any character, 0 or more times
    text = "a b a b"
    matchObj1 = re.match(r'a(.*)b', text, re.M | re.I)   # greedy
    matchObj2 = re.match(r'a(.*?)b', text, re.M | re.I)  # non-greedy
    print("matchObj1.group():", matchObj1.group())
    print("matchObj2.group():", matchObj2.group())
    # matchObj1.group(): a b a b
    # matchObj2.group(): a b

re.search

Scans the entire string and returns the first successful match.
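A short sketch of "first successful match", assuming a hypothetical sample sentence in which the word occurs twice: re.search reports only the first occurrence, re.findall reports all of them, and re.match fails because the word is not at the start.

```python
import re

line = 'cats are smarter than dogs, and dogs are loyal'  # hypothetical sample

m = re.search(r'dogs', line)
first_span = m.span()                  # (start, end) of the first occurrence only
all_hits = re.findall(r'dogs', line)   # every occurrence
at_start = re.match(r'dogs', line)     # None: the string does not begin with 'dogs'

print(first_span, all_hits, at_start)
```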

    # Signature: re.search(pattern, string, flags=0)
    import re
    line = "cats are smarter than dogs"
    matchObj = re.match(r'dogs', line, re.M | re.I)
    matchObj1 = re.search(r'dogs', line, re.M | re.I)
    matchObj2 = re.match(r'(.*) dogs', line, re.M | re.I)
    if matchObj:
        print("match --> matchObj.group():", matchObj.group())
    else:
        print("No match!")
    if matchObj1:
        print("match --> matchObj1.group():", matchObj1.group())
    else:
        print("No match!")
    if matchObj2:
        print("match --> matchObj2.group():", matchObj2.group())
    else:
        print("No match!")
    # No match!
    # match --> matchObj1.group(): dogs
    # match --> matchObj2.group(): cats are smarter than dogs

re.compile

re.compile converts a regular expression into a pattern object.

This makes matching more efficient when a pattern is reused: after compiling once, you do not have to convert the expression again each time you use it.
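A minimal sketch of compiling once and reusing the result. The pattern names `phone` and `digits` and the sample data are assumptions made for this example.

```python
import re

# Compile once, then reuse the pattern object as often as needed.
phone = re.compile(r'^\d{8,11}$')

samples = ['13811115888', '12345', '138-1111-5888']  # hypothetical inputs
valid = [s for s in samples if phone.search(s)]

# A pattern object offers the same methods as the module-level functions.
digits = re.compile('[0-9]')
found = digits.findall('111aaabbb222')

# The optional pos argument starts the scan at the given index.
found_tail = digits.findall('111aaabbb222', 9)

print(valid, found, found_tail)
```

The re module also caches recently used patterns internally, so for a script with only a few patterns the speed gain is likely to be small; the clearer benefit is a named, reusable pattern object.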

III. Search and replace

re.sub replaces parts of a string.

    # Signature: re.sub(pattern, replacement, string, count=0)
    # Replace FBI with BBQ
    import re
    a = 'abcFBIabcCIAabc'
    r = re.sub('FBI', 'BBQ', a)
    print(r)  # abcBBQabcCIAabc

    # Replace FBI with BBQ, but only the first occurrence: count=1.
    # The default count is 0, which replaces every occurrence.
    import re
    a = 'abcFBIabcFBIaFBICIAabc'
    r = re.sub('FBI', 'BBQ', a, count=1)
    print(r)  # abcBBQabcFBIaFBICIAabc

    # A function can be passed as the replacement argument of sub;
    # the function then decides what each match is replaced with.
    # For example, replace FBI with $FBI$
    import re
    a = 'abcFBIabcFBIaFBICIAabc'

    def wrap_in_dollars(match):
        matched_text = match.group()  # group() returns the text of this match, here FBI
        return '$' + matched_text + '$'

    r = re.sub('FBI', wrap_in_dollars, a)
    print(r)  # abc$FBI$abc$FBI$a$FBI$CIAabc

That is all for this article on how to use regular expressions in Python. Thank you for reading, and we hope the examples help.
