How to use regular expressions to match and find text patterns

2025-04-24 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

How do you use regular expressions to match and find text patterns? This article works through the problem in detail, with examples, in the hope of giving readers who face it a simple and workable approach.

1. Problem

We want to match or search for text that follows a specific pattern.

2. Solution

If you only need to match simple literal text, the basic string methods are usually enough: str.find(), str.endswith(), str.startswith(), and similar functions.

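Example:

    text = 'mark, handsome, 18183 handsome, mark'

    # Exact comparison against the whole string
    print(text == 'mark')
    # Check the start and the end of the string
    print(text.startswith('mark'))
    print(text.endswith('mark'))
    # Index of the first occurrence of the substring
    print(text.find('handsome'))

Results:

False

True

True

6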
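One detail of these string methods is easy to miss: str.find() does not raise an error when the substring is absent, it returns -1. The sketch below illustrates this with the same sample string. The calls to rfind() and the in operator are additions for illustration and are not part of the original example.

```python
text = 'mark, handsome, 18183 handsome, mark'

# find() returns the index of the first occurrence, or -1 if absent
first = text.find('handsome')
missing = text.find('ugly')

# rfind() searches from the right; "in" is enough for a yes/no test
last = text.rfind('handsome')
has_mark = 'mark' in text

print(first, missing, last, has_mark)
```

Since 0 is a valid index, test the result of find() against -1 instead of relying on its truth value.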

For more complex matches, you need regular expressions and the re module. To illustrate the basic flow of using regular expressions, suppose we want to match dates written in numeric form, such as "11/27/2018". For example:

    import re

    text1 = '11/27/2018'
    text2 = 'Nov 27, 2018'

    # \d+ matches one or more digits
    if re.match(r'\d+/\d+/\d+', text1):
        print('matches the pattern: number/number/number')
    else:
        print('does not match the pattern: number/number/number')

    if re.match(r'\d+/\d+/\d+', text2):
        print('matches the pattern: number/number/number')
    else:
        print('does not match the pattern: number/number/number')

Running result:

matches the pattern: number/number/number

does not match the pattern: number/number/number

If you plan to match against the same pattern many times, it is usual to precompile the regular expression into a pattern object first.

For example:

    import re

    text1 = '11/27/2018'
    text2 = 'Nov 27, 2018'

    # Compile once, then reuse the pattern object
    datepat = re.compile(r'\d+/\d+/\d+')

    if datepat.match(text1):
        print('matches the pattern: number/number/number')
    else:
        print('does not match the pattern: number/number/number')

    if datepat.match(text2):
        print('matches the pattern: number/number/number')
    else:
        print('does not match the pattern: number/number/number')

Results:

matches the pattern: number/number/number

does not match the pattern: number/number/number

The match() method always tries to find a match at the beginning of a string. If you want to find every match anywhere in the text, use the findall() method instead, for example:

    import re

    text = 'today is 11/27/2018, and yesterday was 11/26/2018'
    datepat = re.compile(r'\d+/\d+/\d+')
    # findall() returns a list of every non-overlapping match
    print(datepat.findall(text))

Running result:

['11/27/2018', '11/26/2018']

When defining regular expressions, we often turn parts of the pattern into capture groups by wrapping them in parentheses. This usually simplifies later processing of the matched text, because the contents of each group can be extracted separately. The findall() method searches the entire text and returns all matches as a list; when the pattern contains several groups, each list item is a tuple of the group contents. If you would rather process the matches one at a time, use the finditer() method, which yields match objects.

For example:

    import re

    # Add capture groups for month, day and year
    datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
    m = datepat.match('11/27/2018')
    print(m.group(0))   # the whole match
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
    print(m.groups())
    month, day, year = m.groups()
    print(month)
    print(day)
    print(year)
    print('*' * 20)

    text = 'today is 11/27/2018, and yesterday was 11/26/2018'
    # With groups, findall() returns a list of tuples
    for month, day, year in datepat.findall(text):
        print('{}-{}-{}'.format(year, month, day))
    print('*' * 20)

    # finditer() yields match objects one at a time
    for m in datepat.finditer(text):
        print(m.groups())

Results:

11/27/2018

11

27

2018

('11', '27', '2018')

11

27

2018

********************

2018-11-27

2018-11-26

********************

('11', '27', '2018')

('11', '26', '2018')
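Because finditer() yields match objects rather than plain tuples, each result also carries its position in the text. The following is a minimal sketch using the same date pattern and a sample sentence like the one above. The variable name found is only illustrative.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
text = 'today is 11/27/2018, and yesterday was 11/26/2018'

# Each item from finditer() is a match object, so the start and end
# offsets of the match are available alongside the matched text
found = [(m.group(0), m.start(), m.end()) for m in datepat.finditer(text)]
for matched, start, end in found:
    print(matched, start, end)
```

This is one reason to prefer finditer() over findall() when you need to know where a match occurred, not just what it contained.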

3. Analysis

This article has covered the basic use of the re module for matching and searching text: first compile the pattern with re.compile(), then match and search with methods such as match(), findall(), and finditer().
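The compile-then-search flow can be wrapped in a small helper, sketched below under the assumption that dates appear as month/day/year. The names DATE_PAT and reformat_dates are hypothetical and do not come from the re module or from the examples above.

```python
import re

# Compiled once at import time and reused on every call
DATE_PAT = re.compile(r'(\d+)/(\d+)/(\d+)')


def reformat_dates(text):
    """Return every month/day/year date found in text as 'year-month-day'."""
    return ['{}-{}-{}'.format(year, month, day)
            for month, day, year in DATE_PAT.findall(text)]


print(reformat_dates('today is 11/27/2018, and yesterday was 11/26/2018'))
```

Keeping the compiled pattern at module level means the pattern string is written and compiled in only one place, however often the helper is called.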

We usually write the pattern as a raw string, for example:

    r'(\d+)/(\d+)/(\d+)'

In a raw string, backslashes are kept as they are instead of being treated as escape characters, which is very convenient for regular expressions. Otherwise, we have to write a double backslash to represent each single '\', for example:

    '(\\d+)/(\\d+)/(\\d+)'
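A quick check, offered as an illustration, that the raw form and the double-backslash form of this pattern are the same string and therefore behave the same. The variable names raw and escaped are only for this sketch.

```python
import re

raw = r'(\d+)/(\d+)/(\d+)'
escaped = '(\\d+)/(\\d+)/(\\d+)'

# Both literals produce exactly the same characters
print(raw == escaped)
print(len(r'\d'), len('\\d'))

# so they match in exactly the same way
same_groups = (re.match(raw, '11/27/2018').groups()
               == re.match(escaped, '11/27/2018').groups())
print(same_groups)
```

The raw form is simply easier to read and harder to get wrong, especially once a pattern contains many backslashes.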

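Note that the match() method only checks the beginning of the string, so it can report a match you did not intend, for example:

    import re

    # Pattern with capture groups
    datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
    # The trailing 'xxxx' is ignored, so this still matches
    m = datepat.match('11/27/2018xxxx')
    print(m)

Results (the exact representation depends on the Python version; this is what Python 3.7 and later print):

<re.Match object; span=(0, 10), match='11/27/2018'>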

If you want an exact match, add the end-of-string anchor $ to the pattern:

    import re

    # Pattern with capture groups, anchored at the end with $
    datepat = re.compile(r'(\d+)/(\d+)/(\d+)$')
    m1 = datepat.match('11/27/2018xxxx')
    m2 = datepat.match('11/27/2018')
    print(m1)
    print(m2)

Results (match object shown as printed by Python 3.7 and later):

None

<re.Match object; span=(0, 10), match='11/27/2018'>
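As an alternative that the example above does not use, pattern objects also provide fullmatch() (Python 3.4 and later), which succeeds only when the entire string matches. The sketch below also shows one caveat of the $ approach: $ matches just before a trailing newline as well. The variable names are illustrative.

```python
import re

datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

# fullmatch() succeeds only if the whole string matches; no '$' needed
ok = datepat.fullmatch('11/27/2018')
trailing = datepat.fullmatch('11/27/2018xxxx')
print(ok is not None, trailing is None)

# Caveat: '$' also matches just before a final newline,
# so the anchored pattern still accepts this input
with_dollar = re.compile(r'(\d+)/(\d+)/(\d+)$')
newline_case = with_dollar.match('11/27/2018\n')
print(newline_case is not None)

# fullmatch() rejects the same input because '\n' is left over
strict_case = datepat.fullmatch('11/27/2018\n')
print(strict_case is None)
```

Whether the trailing newline should count as a match depends on the input being validated, so choose between $ and fullmatch() accordingly.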

That covers how to use regular expressions to match and find text patterns. I hope the content above is of some help to you.
