
Example Analysis of regular expression + Python re Module

2025-01-19 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/02

This article walks through regular expressions and the Python re module using worked examples. It is meant as a reference, and I hope you learn something useful from it.

Regular expressions, often abbreviated to REs, regexes, or regex patterns in code, are essentially a small, highly specialized programming language. You can use them to test whether text matches a pattern, search for content, replace content, split strings, and so on.

Introduction to the re module

The re module in Python provides an interface to the regular expression engine. It lets us compile regular expressions into pattern objects and then use those objects for pattern matching, searching, string splitting, substring substitution, and other operations. The module offers both module-level functions and classes that wrap these operations.
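As a rough sketch of the two styles described above (a compiled pattern object versus the module-level functions), the following may help. The sample string and variable names are assumptions chosen only for illustration.

```python
import re

# Compile the pattern once into a pattern object, then reuse it.
digits = re.compile(r"\d+")

text = "cyx123456cyxx"  # illustrative sample string

found = digits.search(text)       # pattern matching / search
parts = digits.split(text)        # string splitting
replaced = digits.sub("#", text)  # substring substitution

# Module-level functions take the pattern as their first argument instead.
same_parts = re.split(r"\d+", text)

print(found.group())  # 123456
print(parts)          # ['cyx', 'cyxx']
print(replaced)       # cyx#cyxx
print(same_parts)     # ['cyx', 'cyxx']
```

Both styles give the same results; compiling mainly pays off when one pattern is reused many times.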

Some small rules for regular expressions

① metacharacter

② quantifier

③ greedy and non-greedy matching

Greedy: always match as much as possible within the range the quantifier allows.

Lazy (non-greedy): always match as little as possible within the range the quantifier allows.

.*?x matches any characters, any number of times, and stops at the first x it encounters.

.+?x matches any characters at least once and then stops at the first x that follows.
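The difference between greedy and lazy quantifiers is easiest to see side by side. This is a minimal sketch; the sample strings are made up for illustration.

```python
import re

text = "axbxcx"  # illustrative sample string

greedy = re.search(r".*x", text).group()  # takes as much as possible
lazy = re.search(r".*?x", text).group()   # stops at the first x

# With a string that starts with x, *? may match zero characters,
# while +? must consume at least one character before the x.
lazy_star = re.search(r".*?x", "xax").group()
lazy_plus = re.search(r".+?x", "xax").group()

print(greedy)     # axbxcx
print(lazy)       # ax
print(lazy_star)  # x
print(lazy_plus)  # xax
```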

④ escape character problem

. has a special meaning; \. cancels that special meaning.

There are two ways to cancel the special meaning of a metacharacter:

Precede the metacharacter with \

For some characters, put the metacharacter inside a character class:

[.()+?*]
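Both ways of cancelling a metacharacter's special meaning can be checked with a short experiment. This is a sketch only; the input strings are illustrative assumptions.

```python
import re

# An unescaped . matches any character (except a newline).
unescaped = re.findall(r".", "a.b")

# A backslash makes . match only a literal dot.
escaped = re.findall(r"\.", "a.b")

# Inside a character class these metacharacters are taken literally.
in_class = re.findall(r"[.()+?*]", "1+(2*3)?.")

print(unescaped)  # ['a', '.', 'b']
print(escaped)    # ['.']
print(in_class)   # ['+', '(', '*', ')', '?', '.']
```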

Python --> re module

findall

If the pattern contains a group, findall gives the group priority: only the group's contents are returned.

To cancel this priority, use a non-capturing group: (?:pattern)
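A small sketch of the group-priority behavior of findall, and of how a non-capturing group turns it off. The file names in the sample string are hypothetical.

```python
import re

text = "img1.png img2.jpg"  # hypothetical sample string

# With a capturing group, findall returns only the group's contents.
with_group = re.findall(r"\w+\.(png|jpg)", text)

# With a non-capturing group (?:...), findall returns the whole match.
without_priority = re.findall(r"\w+\.(?:png|jpg)", text)

print(with_group)        # ['png', 'jpg']
print(without_priority)  # ['img1.png', 'img2.jpg']
```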

search

Returns only the first match.

Call .group() on the result to get the matched text.

By default, group() returns the complete match.

group(n) returns the contents of the nth group.

    import re

    # search still matches against the complete pattern and returns the first match,
    # but passing an argument to group() gets the content of a specific group.
    ret = re.search(r'9(\d)(\d)', '19740ash93010uru')
    print(ret)  # --> a match object (None if nothing matched)
    if ret:
        print(ret.group())   # --> 974
        print(ret.group(1))  # --> 7
        print(ret.group(2))  # --> 4

    # findall returns everything that matches, giving priority to group contents.
    # search returns only the first match; the result is a match object.
    # result.group() is exactly the same as result.group(0).
    # result.group(n) returns the content matched by the nth group.

    # The parentheses extract the part we actually need.
    ret = re.findall(r'(\w+)', 'askh930s02391j192agsj')
    print(ret)  # --> ['askh930s02391j192agsj']

The remaining functions are commented in detail in the code; copy it, run it step by step, and experiment.

The following covers: split, sub, subn, match, compile, finditer

    import re

    # split, sub, subn, match, compile, finditer

    # split
    res = re.split(r'\d+', "cyx123456cyxx")
    print(res)  # --> ['cyx', 'cyxx']
    res = re.split(r'(\d+)', "cyx123456cyxx")  # a group keeps the separator
    print(res)  # --> ['cyx', '123456', 'cyxx']

    # sub: replace
    # Replaces every match by default; pass a count to limit it, e.g.
    # re.sub(r'\d+', 'I replaced the number', 'cyx123456cyxxx123456', count=1)
    res = re.sub(r'\d+', 'I replaced the number', "cyx123456cyxxx123456")
    print(res)  # --> cyxI replaced the numbercyxxxI replaced the number

    # subn: replace and also report the number of replacements
    res = re.subn(r'\d+', 'I replaced the number', "cyx123456cyxxx123456")
    print(res)  # --> ('cyxI replaced the numbercyxxxI replaced the number', 2)

    # match: like search with a ^ added to the pattern;
    # mainly used to require what the string must start with
    res = re.match(r'\d+', 'cyx123456cyxxx')
    print(res)  # --> None
    res = re.match(r'\d+', '123cyx456cyxxxx')
    print(res.group())  # --> 123

    # compile: a time-saving tool
    # If the same regular expression is used many times,
    # compiling it avoids parsing the same expression repeatedly.
    ret = re.compile(r"\d+")
    res = ret.search("cyx12456cyxXX123")
    print(res.group())  # --> 12456

    # finditer: saves space
    ret = re.finditer(r"\d+", "cyx123456cyxxx125644")
    for r in ret:
        print(r.group())  # --> 123456
                          # --> 125644

    # How do we save both time and space? Combine compile and finditer.
    ret = re.compile(r'\d+')
    res = ret.finditer("cyx222231fddsf45746sdf2123sdf56456sdf10123sdf123132sdf")
    for r in res:
        print(r.group())
    # --> 222231
    # --> 45746
    # --> 2123
    # --> 56456
    # --> 10123
    # --> 123132

    # Group naming: (?P<name>pattern) and (?P=name)
    # Sometimes the content we want is embedded in content we do not want.
    # In that case, match the unwanted content too, then remove it from the result.
    # Using named groups (finding the same content in two groups):
    exp = 'asdasf54545645698asdasd00545sdfsdf'
    # ret = re.search('.*? …
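The group-naming example above is cut off, so here is a separate minimal sketch of the two constructs it names, (?P<name>pattern) and (?P=name). The HTML-like input and the group names tag and content are assumptions made for illustration, not the author's original example.

```python
import re

# Hypothetical input: the second tag pair is deliberately mismatched.
html = "<h1>title</h1><p>body</h2>"

# (?P<tag>...) names a group; (?P=tag) requires the same text to appear again.
pattern = re.compile(r"<(?P<tag>\w+)>(?P<content>.*?)</(?P=tag)>")

m = pattern.search(html)
print(m.group("tag"))      # h1
print(m.group("content"))  # title

# Only the pair whose closing tag repeats the opening tag is matched.
all_matches = [found.group(0) for found in pattern.finditer(html)]
print(all_matches)         # ['<h1>title</h1>']
```

Named groups can still be read by number, so m.group(1) and m.group("tag") return the same text here.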
