2025-01-30 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Today, the editor will share how to use the re module in Python automation. The content is detailed and the logic is clear. Many people are still not very familiar with this topic, so this article is shared for your reference. I hope you get something out of it. Let's take a look.
First, what is re?
A regular expression is a special sequence of characters that makes it easy to check whether a string matches a pattern. The re module gives Python full regular expression support.
Second, what the re module is for
By using regular expressions, you can:
Test the pattern within the string. For example, you can test the input string to see if a phone number pattern or a credit card number pattern appears within the string. This is called data validation.
Replace text. You can use regular expressions to identify specific text in a document, then delete it completely or replace it with other text.
Extract a substring from a string based on pattern matching. You can find specific text within a document or in an input field.
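The three uses above can be sketched with standard re functions. The phone pattern and the sample strings are made up for illustration and are not a complete validator.

```python
import re

# Hypothetical pattern for illustration: 3 digits, a dash, then 4 digits.
PHONE = r'\d{3}-\d{4}'

# 1. Validation: fullmatch succeeds only if the whole string fits the pattern.
is_valid = re.fullmatch(PHONE, '555-1234') is not None
is_invalid = re.fullmatch(PHONE, '555-12345') is not None

# 2. Replacement: sub swaps every match for the replacement text.
masked = re.sub(PHONE, '***-****', 'call 555-1234 or 555-9876')

# 3. Extraction: findall returns every matching substring.
found = re.findall(PHONE, 'call 555-1234 or 555-9876')

print(is_valid, is_invalid)  # True False
print(masked)                # call ***-**** or ***-****
print(found)                 # ['555-1234', '555-9876']
```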
Third, the use of the re module
1. Common methods
findall(): finds all matches in the string and returns them as a list
match(): matches only at the start of the string, and returns None if there is no match there
search(): scans the string and returns the first match found
finditer(): finds all matches in the string and returns them as an iterator
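As a quick side-by-side, here is a minimal sketch that runs all four methods on the same string. The sample string and patterns are chosen only for illustration.

```python
import re

s = 'hello python'

all_hits = re.findall('h.', s)        # list of matched strings
at_start = re.match('he', s)          # Match object: 'he' is at position 0
not_at_start = re.match('py', s)      # None: 'py' is not at position 0
first_hit = re.search('py', s)        # Match object for the first 'py'
spans = [m.span() for m in re.finditer('h.', s)]  # iterate over Match objects

print(all_hits)           # ['he', 'ho']
print(at_start.group())   # he
print(not_at_start)       # None
print(first_hit.span())   # (6, 8)
print(spans)              # [(0, 2), (9, 11)]
```

Note that match() and search() return a Match object (or None), while findall() returns plain strings.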
2. Metacharacters
. matches any single character (except \n). For example, h. matches h followed by any one character.
    import re
    res = 'h.'
    s = 'hello python'
    result = re.findall(res, s)
    print(result)  # ['he', 'ho']
[] matches any one of the characters inside the brackets. Each character of the string is tested in turn, and all matches are returned as a list.
    import re
    res2 = '[hon]'
    s = 'hello python'
    result = re.findall(res2, s)
    print(result)  # ['h', 'o', 'h', 'o', 'n']
\d matches the digits 0-9
    import re
    res2 = r'[\d]'  # raw string, so the backslash reaches the regex engine unchanged
    s = 'hell666o pyt999hon'
    result = re.findall(res2, s)
    print(result)  # ['6', '6', '6', '9', '9', '9']
\D matches non-digit characters, including spaces
    import re
    res2 = r'[\D]'
    s = 'hello 3334 python 88'
    result = re.findall(res2, s)
    print(result)  # ['h', 'e', 'l', 'l', 'o', ' ', ' ', 'p', 'y', 't', 'h', 'o', 'n', ' ']
\s matches whitespace characters
    import re
    res2 = r'[\s]'
    s = 'hello 3334 python 88'
    result = re.findall(res2, s)
    print(result)  # [' ', ' ', ' ']
\S matches non-whitespace characters
    import re
    res2 = r'[\S]'
    s = 'hello 3334 python 88'
    result = re.findall(res2, s)
    print(result)  # ['h', 'e', 'l', 'l', 'o', '3', '3', '3', '4', 'p', 'y', 't', 'h', 'o', 'n', '8', '8']
\ w matches non-special characters, that is, amurz, Amurz, 0-9, _, Chinese characters
Import reres2 ='[\ w]'s = 'hello#&_ aa 8python China' result = re.findall (res2, s) print (result) # ['hogwash,' eBay, 'lager,' lager, 'oasis,' _', 'ajar,' 8mm, 'pinch,' yearly, 'taper,' hype, 'oasis,' nasty, 'medium', 'country']
\ W matching special characters (- ~ @ # $& *) spaces are also special characters
Import reres2 ='[\ W]'s ='- hello#&_ aa 8python China 'result = re.findall (res2, s) print (result) # [' -','#','&',''] 3, multi-character matching
(1) *: matches the previous character 0 or more times (greedy mode). Because 0 repetitions is allowed, every position without an h produces an empty match.
    import re
    res2 = 'h*'
    s = '-hhello hhh python'
    result = re.findall(res2, s)
    print(result)  # ['', 'hh', '', '', '', '', '', 'hhh', '', '', '', '', 'h', '', '', '']
(2) +: matches the previous character 1 or more times
    import re
    res2 = 'h+'
    s = '-hhello hhh python'
    result = re.findall(res2, s)
    print(result)  # ['hh', 'hhh', 'h']
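(3) ?: matches the previous character 0 or 1 times. On its own it is greedy; placed after another quantifier, it switches that quantifier to non-greedy mode.
    import re
    res2 = 'h?'
    s = '-hhello hhh python'
    result = re.findall(res2, s)
    print(result)  # ['', 'h', 'h', '', '', '', '', '', 'h', 'h', 'h', '', '', '', '', 'h', '', '', '']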
(4) {n}: matches the previous character exactly n times in a row
    import re
    res2 = 'https{2}'
    s = '-hhello-httpssss-python'
    result = re.findall(res2, s)
    print(result)  # ['httpss']  the previous character s is matched twice in a row
{n,m}: matches the previous character n to m times
    import re
    res2 = 'https{1,3}'
    s = '-hhello-httpssss-python'
    result = re.findall(res2, s)
    print(result)  # ['httpsss']  greedy, so s is matched the maximum 3 times
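For comparison, here is a small sketch that runs the counted quantifiers side by side on the same string, including the open-ended form {n,} (at least n times), which the list above does not cover.

```python
import re

s = '-hhello-httpssss-python'

exactly_two = re.findall(r'https{2}', s)     # 's' exactly 2 times
one_to_three = re.findall(r'https{1,3}', s)  # 's' 1 to 3 times, as many as possible
two_or_more = re.findall(r'https{2,}', s)    # 's' at least 2 times, no upper limit

print(exactly_two)   # ['httpss']
print(one_to_three)  # ['httpsss']
print(two_or_more)   # ['httpssss']
```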
(5) Greedy mode and non-greedy mode
Regular expressions are commonly used to find matching strings. Greedy mode always tries to match as many characters as possible; non-greedy mode, on the contrary, always tries to match as few characters as possible. Adding ? after "*", "?", "+", or "{m,n}" turns a greedy quantifier into a non-greedy one.
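The difference is easiest to see on a string with several possible end points. A minimal sketch, using a made-up tag-like string:

```python
import re

s = '<b>bold</b> and <i>italic</i>'

greedy = re.findall(r'<.+>', s)  # .+ grabs as much as possible
lazy = re.findall(r'<.+?>', s)   # .+? stops at the first possible '>'

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```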
(6) |: matches either of two alternatives (an "or" relationship)
    import re
    res2 = 'he|ll'
    s = 'hello python'
    result = re.findall(res2, s)
    print(result)  # ['he', 'll']
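One thing to watch with | is that it splits the whole pattern unless a group limits its scope. The sketch below uses a non-capturing group (?:...) for that; the sample words are chosen only for illustration.

```python
import re

s = 'gray grey green'

# Without a group, | splits the whole pattern: 'gra' OR 'ey'.
loose = re.findall(r'gra|ey', s)

# A non-capturing group (?:...) limits | to the letters inside it.
scoped = re.findall(r'gr(?:a|e)y', s)

print(loose)   # ['gra', 'ey']
print(scoped)  # ['gray', 'grey']
```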
(7) Boundaries:
^: matches at the start of the string
    import re
    res2 = '^he'
    s = 'hello python'
    result = re.findall(res2, s)
    print(result)  # ['he']
$: matches at the end of the string
    import re
    res2 = 'on$'
    s = 'hello python'
    result = re.findall(res2, s)
    print(result)  # ['on']
4. Group matching
(): only the part of the match inside () is returned
    import re
    res2 = r'#(\w.+?)#'
    s = "{'mobile_phone':'#mobile_phone#','pwd':'Aa123456'}"
    result = re.findall(res2, s)
    print(result)  # ['mobile_phone']
5. Using the match() method
    text = "www.runoob.com"
    print(re.match('www', text).span())  # matches at the start; returns the span (0, 3)
    print(re.match('com', text))  # not at the start, so the result is None
6. search(): searches the string and returns the first match
    text = "www.runoob.com"
    print(re.search('www', text).span())  # matches at the start; returns the span (0, 3)
    print(re.search('com', text).span())  # not at the start, but still found: (11, 14)
re.match only matches at the beginning of the string. If the string does not match the regular expression at the beginning, the match fails and the function returns None. re.search, by contrast, scans the entire string until a match is found.
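Because both functions return None on failure, calling .span() directly on the result raises AttributeError when nothing matches. A minimal sketch of a guard, where span_or_none is a hypothetical helper and not part of the re module:

```python
import re

text = 'www.runoob.com'

def span_or_none(match):
    """Return the match span, or None when nothing matched."""
    return match.span() if match is not None else None

print(span_or_none(re.match('www', text)))    # (0, 3)
print(span_or_none(re.match('com', text)))    # None
print(span_or_none(re.search('com', text)))   # (11, 14)

# Anchoring a search with ^ behaves like match for a single-line string.
print(span_or_none(re.search('^com', text)))  # None
```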
7. finditer():
Similar to findall, it finds all the substrings in the string that match the regular expression, but returns them as an iterator of match objects instead of a list.
    import re
    res = 'h.'
    s = 'hello python'
    result = re.finditer(res, s)
    for m in result:
        print(m.group())
    # he
    # ho
That is all the content of this article, "how to use the re Module of python Automation". Thank you for reading! I hope you have gained something from it.