What are the methods contained in regular expressions

2025-01-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02

This article walks through the functions that Python's re module provides for working with regular expressions, followed by the basic pattern syntax. The content is short and example-driven, and I hope it clears up any doubts you have about the question "what methods are included in regular expressions?"

Functions provided by the re module:

(1) match(pattern, string, flags=0)

    def match(pattern, string, flags=0):
        """Try to apply the pattern at the start of the string, returning
        a match object, or None if no match was found."""
        return _compile(pattern, flags).match(string)

As the docstring says, match() tries to apply the pattern at the start of the string. It returns a match object if the pattern matches there, or None if it does not.

Key points: (1) matching always starts at the beginning of the string; (2) if there is no match, None is returned.

Let's look at a few examples:

    import re

    string = "abcdef"
    m = re.match("abc", string)    # the pattern matches at the start of the string
    print(m)                       # (1) the returned match object
    print(m.group())               # (2) the matched text
    n = re.match("abcf", string)   # the pattern does not occur in the string
    print(n)                       # (3)
    l = re.match("bcd", string)    # the pattern occurs only in the middle of the string
    print(l)                       # (4)

The running results are as follows (the match object is printed as re.Match in Python 3.7 and later; older versions print _sre.SRE_Match):

    <re.Match object; span=(0, 3), match='abc'>    (1)
    abc     (2)
    None    (3)
    None    (4)

As output (1) shows, match() returns a match object. To see the matched text, call group() on it, as in (2). If the pattern does not occur in the string, None is returned (3). Because match() only ever matches at the beginning of the string, a pattern that occurs only in the middle also returns None (4).
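Because match() returns None on failure, calling group() directly on the result raises AttributeError when nothing matched. A minimal sketch of the usual guard follows; first_word is a hypothetical helper name introduced only for this illustration.

```python
import re

def first_word(text):
    """Return the leading run of lowercase letters, or None if there is none.

    first_word is a hypothetical helper, not part of the re module.
    """
    m = re.match("[a-z]+", text)
    # match() returns None when the pattern does not match at position 0,
    # so test the result before calling group().
    if m is None:
        return None
    return m.group()

print(first_word("abc123"))   # abc
print(first_word("123abc"))   # None
```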

(2) fullmatch(pattern, string, flags=0)

    def fullmatch(pattern, string, flags=0):
        """Try to apply the pattern to all of the string, returning
        a match object, or None if no match was found."""
        return _compile(pattern, flags).fullmatch(string)

Note from the docstring above: fullmatch() tries to apply the pattern to all of the string, returning a match object, or None if no match was found...
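The original gives no example for fullmatch(), so here is a small sketch that contrasts it with match() on the same "abcdef" string used earlier. The variable names are chosen only for this illustration.

```python
import re

string = "abcdef"

# match() only anchors at the start, so matching a prefix is enough.
prefix = re.match("abc", string)
# fullmatch() requires the pattern to cover the whole string.
partial = re.fullmatch("abc", string)
whole = re.fullmatch("[a-f]+", string)

print(prefix.group())   # abc
print(partial)          # None
print(whole.group())    # abcdef
```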

(3) search(pattern, string, flags=0)

    def search(pattern, string, flags=0):
        """Scan through string looking for a match to the pattern, returning
        a match object, or None if no match was found."""
        return _compile(pattern, flags).search(string)

As the docstring says, search() scans through the string looking for the first place where the pattern matches. It returns a match object if one is found, or None if not.

Key points: (1) the match can be found at any position in the string, unlike match(), which only matches at the beginning; (2) None is returned if nothing is found.

    import re

    string = "ddafsadadfadfafdafdadfasfdafafda"
    m = re.search("a", string)   # the first match is in the middle of the string
    print(m)                     # (1)
    print(m.group())             # (2)
    n = re.search("N", string)   # the pattern does not occur in the string
    print(n)                     # (3)

The running results are as follows:

    <re.Match object; span=(2, 3), match='a'>    (1)
    a       (2)
    None    (3)

As the results show: (1) search() can match at any position in the string, which makes it more widely applicable than match(), and it also returns a match object; (2) to display the matched text, call group() on the match object; (3) if nothing is found, None is returned.

(4) sub(pattern, repl, string, count=0, flags=0)

    def sub(pattern, repl, string, count=0, flags=0):
        """Return the string obtained by replacing the leftmost
        non-overlapping occurrences of the pattern in string by the
        replacement repl.  repl can be either a string or a callable;
        if a string, backslash escapes in it are processed.  If it is
        a callable, it's passed the match object and must return
        a replacement string to be used."""
        return _compile(pattern, flags).sub(repl, string, count)

sub() performs search and replace: it looks for occurrences of pattern in string and replaces each one with repl. count limits how many occurrences are replaced; the default of 0 replaces all of them. The example is as follows:

    import re

    string = "ddafsadadfadfafdafdadfasfdafafda"
    m = re.sub("a", "A", string)            # no count given: replace every match (1)
    print(m)
    n = re.sub("a", "A", string, count=2)   # replace only the first two matches (2)
    print(n)
    l = re.sub("F", "B", string)            # the pattern does not occur in the string (3)
    print(l)

The running results are as follows:

    ddAfsAdAdfAdfAfdAfdAdfAsfdAfAfdA    (1)
    ddAfsAdadfadfafdafdadfasfdafafda    (2)
    ddafsadadfadfafdafdadfasfdafafda    (3)

In the code above, (1) does not specify a count, so every match is replaced by default; (2) specifies a count, so only that many matches are replaced; and in (3) the pattern does not occur in the string, so the original string is returned unchanged.

Key points: (1) you can limit the number of replacements with count, otherwise all matches are replaced; (2) if there is no match, the original string is returned.
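The docstring for sub() mentions that repl may be a callable or a string containing backslash escapes, which the example above does not exercise. The following sketch shows both forms; the "10-20" input and the variable names are made up for this illustration.

```python
import re

string = "ddafsadadfadfafdafdadfasfdafafda"

# repl as a callable: it receives each match object and returns the replacement text.
upper_first_two = re.sub("a", lambda m: m.group().upper(), string, count=2)
print(upper_first_two)   # ddAfsAdadfadfafdafdadfasfdafafda

# repl as a string: \1 and \2 refer to the first and second capturing groups.
swapped = re.sub(r"(\d+)-(\d+)", r"\2-\1", "10-20")
print(swapped)           # 20-10
```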

(5) subn(pattern, repl, string, count=0, flags=0)

    def subn(pattern, repl, string, count=0, flags=0):
        """Return a 2-tuple containing (new_string, number).
        new_string is the string obtained by replacing the leftmost
        non-overlapping occurrences of the pattern in the source
        string by the replacement repl.  number is the number of
        substitutions that were made.  repl can be either a string or a
        callable; if a string, backslash escapes in it are processed.
        If it is a callable, it's passed the match object and must
        return a replacement string to be used."""
        return _compile(pattern, flags).subn(repl, string, count)

Note "Return a 2-tuple containing (new_string, number)" above: subn() returns a tuple holding the new string after substitution and the number of substitutions made.

    import re

    string = "ddafsadadfadfafdafdadfasfdafafda"
    m = re.subn("a", "A", string)            # replace every match (1)
    print(m)
    n = re.subn("a", "A", string, count=3)   # replace only the first three matches (2)
    print(n)
    l = re.subn("F", "A", string)            # the pattern does not occur in the string (3)
    print(l)

The running results are as follows:

    ('ddAfsAdAdfAdfAfdAfdAdfAsfdAfAfdA', 11)    (1)
    ('ddAfsAdAdfadfafdafdadfasfdafafda', 3)     (2)
    ('ddafsadadfadfafdafdadfasfdafafda', 0)     (3)

As the output shows, sub() and subn() perform the same substitution and differ only in what they return: sub() returns the new string, while subn() returns a tuple containing the new string and the number of substitutions.

(6) split(pattern, string, maxsplit=0, flags=0)

    def split(pattern, string, maxsplit=0, flags=0):
        """Split the source string by the occurrences of the pattern,
        returning a list containing the resulting substrings.  If
        capturing parentheses are used in pattern, then the text of all
        groups in the pattern are also returned as part of the resulting
        list.  If maxsplit is nonzero, at most maxsplit splits occur,
        and the remainder of the string is returned as the final element
        of the list."""
        return _compile(pattern, flags).split(string, maxsplit)

split() divides a string wherever the pattern matches and returns a list containing the resulting substrings. An example is as follows:

    import re

    string = "ddafsadadfadfafdafdadfasfdafafda"
    m = re.split("a", string)               # split on every match (1)
    print(m)
    n = re.split("a", string, maxsplit=3)   # limit the number of splits (2)
    print(n)
    l = re.split("F", string)               # the pattern does not occur in the string (3)
    print(l)

The running results are as follows:

    ['dd', 'fs', 'd', 'df', 'df', 'fd', 'fd', 'df', 'sfd', 'f', 'fd', '']    (1)
    ['dd', 'fs', 'd', 'dfadfafdafdadfasfdafafda']                            (2)
    ['ddafsadadfadfafdafdadfasfdafafda']                                     (3)

As (1) shows, if the string begins or ends with a match of the pattern, the list contains an empty string '' at that end; (2) the number of splits can be limited with maxsplit; (3) if the pattern does not occur in the string, the list contains just the original string.
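The split() docstring also says that text matched by capturing parentheses is included in the result, which is easy to overlook. A short sketch, using a made-up input string "a1b22c":

```python
import re

text = "a1b22c"

plain = re.split(r"\d+", text)               # separators are dropped
captured = re.split(r"(\d+)", text)          # separators are kept because of the group
limited = re.split(r"\d+", text, maxsplit=1) # at most one split

print(plain)      # ['a', 'b', 'c']
print(captured)   # ['a', '1', 'b', '22', 'c']
print(limited)    # ['a', 'b22c']
```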

(7) findall(pattern, string, flags=0)

    def findall(pattern, string, flags=0):
        """Return a list of all non-overlapping matches in the string.
        If one or more capturing groups are present in the pattern, return
        a list of groups; this will be a list of tuples if the pattern
        has more than one group.
        Empty matches are included in the result."""
        return _compile(pattern, flags).findall(string)

findall() returns all non-overlapping matches of the pattern, stored in a list. The example is as follows:

    import re

    string = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"
    m = re.findall("[a-z]", string)   # match every lowercase letter, return a list (1)
    print(m)
    n = re.findall("[0-9]", string)   # match every digit, return a list (2)
    print(n)
    l = re.findall("[ABC]", string)   # nothing matches (3)
    print(l)

The running results are as follows:

    ['d', 'd', 'a', 'd', 'f', 'a', 'd', 'f', 'a', 'f', 'd', 'a', 'f', 'd', 'a', 'd', 'f', 'a', 's', 'f', 'd', 'a', 'f', 'a', 'f', 'd', 'a']    (1)
    ['1', '2', '3', '2', '4', '6', '4', '6', '5', '1', '6', '4', '8', '1', '5', '6', '4', '1', '2', '7', '1', '1', '3', '0', '0', '2', '5', '8']    (2)
    []    (3)

In the code above, (1) matches every letter in the string, one character per match; (2) matches every digit in the string and returns them in a list; and (3) returns an empty list because nothing matches.

Key points: (1) an empty list is returned when there is no match; (2) a character class without a quantifier, such as [a-z], matches a single character at a time, so each list element is one character.
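The findall() docstring notes that the shape of the result changes when the pattern contains capturing groups. A small sketch of the three cases, using a made-up input string "x=1, y=22":

```python
import re

text = "x=1, y=22"

no_group = re.findall(r"\w=\d+", text)        # whole matches
one_group = re.findall(r"\w=(\d+)", text)     # only the group's text
two_groups = re.findall(r"(\w)=(\d+)", text)  # one tuple per match

print(no_group)     # ['x=1', 'y=22']
print(one_group)    # ['1', '22']
print(two_groups)   # [('x', '1'), ('y', '22')]
```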

(8) finditer(pattern, string, flags=0)

    def finditer(pattern, string, flags=0):
        """Return an iterator over all non-overlapping matches in the
        string.  For each match, the iterator returns a match object.
        Empty matches are included in the result."""
        return _compile(pattern, flags).finditer(string)

finditer() searches for the pattern and returns an iterator over all non-overlapping matches in the string. For each match, the iterator yields a match object.

The code is as follows:

    import re

    string = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"
    m = re.finditer("[a-z]", string)
    print(m)    # (1)
    n = re.finditer("AB", string)
    print(n)    # (2)

The running results are as follows (the memory addresses vary from run to run):

    <callable_iterator object at 0x...>    (1)
    <callable_iterator object at 0x...>    (2)

As the running results show, finditer() returns an iterator object, even when the pattern does not occur in the string; in that case the iterator simply yields nothing.
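Printing the iterator itself is not very informative, so in practice it is looped over. A minimal sketch, using a shortened input string chosen for this illustration:

```python
import re

string = "dd12a32d"

# Each item produced by finditer() is a match object,
# so both the matched text and its position are available.
found = [(m.group(), m.span()) for m in re.finditer("[a-z]+", string)]
print(found)     # [('dd', (0, 2)), ('a', (4, 5)), ('d', (7, 8))]

# When nothing matches, the iterator is simply empty.
nothing = list(re.finditer("AB", string))
print(nothing)   # []
```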

(9) compile(pattern, flags=0)

    def compile(pattern, flags=0):
        "Compile a regular expression pattern, returning a pattern object."
        return _compile(pattern, flags)

(10) purge()

    def purge():
        "Clear the regular expression caches"
        _cache.clear()
        _cache_repl.clear()

(11) template(pattern, flags=0)

    def template(pattern, flags=0):
        "Compile a template pattern, returning a pattern object"
        return _compile(pattern, flags | T)

Note that template() was undocumented, was deprecated in Python 3.11 and was removed in Python 3.13, so it should not be used in new code.

Regular expression:

Syntax:

    import re

    string = "dd12a32d46465fad1648fa1564fda127fd11ad30fa02sfd58afafda"
    p = re.compile("[a-z]+")   # first compile the pattern with compile()
    m = p.match(string)        # then match using the compiled pattern
    print(m.group())

The compile and match steps can also be merged into a single call to the module-level function, for example:

    m = re.match("^[0-9]", "14534Abc")

The effect is the same. The difference is that the first form compiles (parses) the pattern once in advance, so no compilation work is needed at match time, while the second form has to look up or compile the pattern on every call. The re module does cache recently compiled patterns, which narrows the gap, but if you need to find all the lines that start with a digit in a 50,000-line file, it is still recommended to compile the pattern first and reuse it, since that avoids the per-call overhead.
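The "compile once, match many times" advice can be sketched as follows. The lines list is made-up sample data standing in for the lines of a large file, and starts_with_digit is a name chosen only for this illustration.

```python
import re

# Hypothetical sample data standing in for the lines of a large file.
lines = ["14534Abc", "abc123", "9 lives", ""]

# Compile once, outside the loop, and reuse the pattern object.
starts_with_digit = re.compile("^[0-9]")

matching = [line for line in lines if starts_with_digit.match(line)]
print(matching)   # ['14534Abc', '9 lives']
```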

Matching format:

(1) ^ matches the beginning of the string

    import re

    string = "dd12a32d41648f27fd11a0sfdda"
    # ^ matches the beginning of the string; here we use search() to do the matching
    m = re.search("^[0-9]", string)    # does the string begin with a digit? (1)
    print(m)
    n = re.search("^[a-z]+", string)   # does the string begin with letters? (2)
    print(n.group())

The running results are as follows:

    None
    dd

In (1) above we use ^ to check whether the string starts with a digit. Because the string starts with letters, not digits, the match fails and None is returned. In (2) we check whether the string starts with letters; it does, so the match succeeds and the leading letters are returned. Used this way, search() with ^ behaves much like match(), which also matches only at the beginning.

(2) $ matches the end of the string

    import re

    string = "15111252598"
    # ^ anchors the start of the string and $ anchors the end
    m = re.match("^[0-9]{11}$", string)
    print(m.group())

The running results are as follows:

    15111252598

re.match("^[0-9]{11}$", string) requires the string to begin with a digit, end with a digit, and consist of exactly 11 digits in total.
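One detail worth knowing when using $ for validation is that it also matches just before a trailing newline. The sketch below illustrates this with the same 11-digit pattern; the test inputs are made up, and fullmatch() is shown as one possible stricter alternative rather than a required fix.

```python
import re

pattern = "^[0-9]{11}$"

exact = bool(re.match(pattern, "15111252598"))
too_short = bool(re.match(pattern, "1511125259"))
# $ also matches immediately before a newline at the very end of the string.
trailing_newline = bool(re.match(pattern, "15111252598\n"))
# fullmatch() requires the whole string, including the newline, to match.
strict = bool(re.fullmatch("[0-9]{11}", "15111252598\n"))

print(exact, too_short, trailing_newline, strict)   # True False True False
```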

(3) The dot (.) matches any character except the newline character. When the re.DOTALL flag is specified, it matches any character, including newlines

    import re

    string = "1511\n1252598"
    # the dot (.) matches any character except a newline
    m = re.match(".", string)    # no quantifier: matches a single character (1)
    print(m.group())
    n = re.match(".+", string)   # .+ matches one or more characters, except newlines (2)
    print(n.group())

The running results are as follows:

    1
    1511

From the running results we can see that (1) the dot (.) matches any single character; (2) .+ matches as many characters as possible, but because the string contains a newline in the middle, only the content before the newline is matched and the rest is not.

Key points: (1) the dot (.) matches any character except a newline; (2) .+ matches one or more characters other than newlines.
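The re.DOTALL flag mentioned in the heading is not demonstrated above, so here is a short sketch on the same string. The variable names are chosen only for this illustration.

```python
import re

string = "1511\n1252598"

without_flag = re.match(".+", string).group()
# With re.DOTALL the dot also matches the newline, so .+ consumes the whole string.
with_flag = re.match(".+", string, flags=re.DOTALL).group()

print(repr(without_flag))   # '1511'
print(repr(with_flag))      # '1511\n1252598'
```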

(4) [...] matches any one of the listed characters; for example, [abc] matches "a", "b" or "c"

[...] matches any single character contained in the brackets. [A-Za-z0-9] matches any character in A-Z, a-z or 0-9.

    import re

    string = "1511\n125dadfadf2598"
    # [] matches any one of the characters inside the brackets
    m = re.findall("[5fd]", string)   # match every 5, f and d in the string
    print(m)

The running results are as follows:

    ['5', '5', 'd', 'd', 'f', 'd', 'f', '5']

In the above code, we match every 5, f and d in the string and get them back in a list.

(5) [^...] matches any character not listed; for example, [^abc] matches any character except a, b and c

    import re

    string = "1511\n125dadfadf2598"
    # [^] matches any character except those inside the brackets
    m = re.findall("[^5fd]", string)   # match every character except 5, f and d
    print(m)

The running results are as follows:

    ['1', '1', '1', '\n', '1', '2', 'a', 'a', '2', '9', '8']

In the above code, we match every character other than 5, f and d; [^...] matches characters other than those in the brackets.

(6) * matches 0 or more occurrences of the preceding expression

    import re

    string = "1511\n125dadfadf2598"
    # * matches 0 or more occurrences of the preceding expression
    m = re.findall(r"\d*", string)   # match 0 or more digits
    print(m)

The running results are as follows:

    ['1511', '', '125', '', '', '', '', '', '', '', '2598', '']

From the results you can see that * matches 0 or more occurrences. Here we match 0 or more digits, so at every position where there is no digit an empty string ('') is returned, and the final element is also an empty string, matched at the very end of the string.

(7) + matches one or more occurrences of the preceding expression

    import re

    string = "1511\n125dadfadf2598"
    # + matches one or more occurrences of the preceding expression
    m = re.findall(r"\d+", string)   # match one or more digits
    print(m)

The running results are as follows:

    ['1511', '125', '2598']

The plus sign (+) matches one or more occurrences of the preceding expression. Above, \d+ matches runs of one or more digits, so at least one digit is required and no empty strings appear in the result.

(8) ? matches 0 or 1 occurrence of the preceding expression; placed after another quantifier (as in *? or +?), it makes that quantifier non-greedy

    import re

    string = "1511\n125dadfadf2598"
    # ? matches 0 or 1 occurrence of the preceding expression
    m = re.findall(r"\d?", string)   # match 0 or 1 digit
    print(m)

The running results are as follows:

    ['1', '5', '1', '1', '', '1', '2', '5', '', '', '', '', '', '', '', '2', '5', '9', '8', '']

The question mark (?) matches 0 or 1 occurrence of the preceding expression. Each digit is therefore matched on its own, and an empty string ('') is returned at every position where there is no digit.
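The "non-greedy" remark in the heading refers to ? used after another quantifier. A minimal sketch of the difference, using a made-up tag-like input string:

```python
import re

text = "<b>bold</b>"

greedy = re.findall("<.+>", text)    # .+ takes as much as it can
lazy = re.findall("<.+?>", text)     # .+? takes as little as it can
one_digit = re.match(r"\d+?", "1511").group()

print(greedy)      # ['<b>bold</b>']
print(lazy)        # ['<b>', '</b>']
print(one_digit)   # 1
```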

(9) {n} matches exactly n occurrences of the preceding expression

(10) {n,m} matches from n to m occurrences of the preceding expression
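No example is given for the counted quantifiers, so here is a brief sketch on the same sample string used in the neighbouring sections. The variable names are chosen only for this illustration.

```python
import re

string = "1511\n125dadfadf2598"

exactly_four = re.findall(r"\d{4}", string)    # runs of exactly 4 digits
one_to_two = re.findall(r"\d{1,2}", string)    # 1 or 2 digits, preferring 2

print(exactly_four)   # ['1511', '2598']
print(one_to_two)     # ['15', '11', '12', '5', '25', '98']
```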

(11) \w matches letters, digits and the underscore

\w matches the letters and digits (and the underscore) in the string. The code is as follows:

    import re

    string = "1511\n125dadfadf2598"
    # \w matches a single letter, digit or underscore
    m = re.findall(r"\w", string)
    print(m)

The running results are as follows:

    ['1', '5', '1', '1', '1', '2', '5', 'd', 'a', 'd', 'f', 'a', 'd', 'f', '2', '5', '9', '8']

As you can see from the above code, \w matches the alphanumeric characters in the string; only the newline is left out.

(12) \W (uppercase W) matches any character that is not a letter, digit or underscore, the opposite of lowercase \w

Examples are as follows:

    import re

    string = "1511\n125dadfadf2598"
    # \W matches the characters in the string that are not letters or digits
    m = re.findall(r"\W", string)
    print(m)

The running results are as follows:

    ['\n']

In the above code, \W matches characters that are not letters or digits, so only the newline character is matched.

(13) \s matches any whitespace character; for ASCII text it is equivalent to [ \t\n\r\f\v]

Examples are as follows:

    import re

    string = "1511\n125d\ta\rdf\fadf2598"
    # \s matches any whitespace character in the string
    m = re.findall(r"\s", string)
    print(m)

The running results are as follows:

    ['\n', '\t', '\r', '\x0c']

From the running result we can see that \s matches any whitespace character: the newline, tab, carriage return and form feed (printed as '\x0c') are all matched.

(14) \S matches any non-whitespace character

Examples are as follows:

    import re

    string = "1511\n125d\ta\rdf\fadf2598"
    # \S matches any non-whitespace character in the string
    m = re.findall(r"\S", string)
    print(m)

The running results are as follows:

    ['1', '5', '1', '1', '1', '2', '5', 'd', 'a', 'd', 'f', 'a', 'd', 'f', '2', '5', '9', '8']

As you can see from the above code, \S matches any non-whitespace character, so every character except the newline, tab, carriage return and form feed is returned.

(15) \d matches any digit; for ASCII text it is equivalent to [0-9]

(16) \D matches any non-digit character
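For completeness, here is a short sketch of \d and \D on the same sample string used in the earlier sections. The variable names are chosen only for this illustration.

```python
import re

string = "1511\n125dadfadf2598"

digits = re.findall(r"\d", string)            # every single digit
non_digits = re.findall(r"\D", string)        # every single non-digit
non_digit_runs = re.findall(r"\D+", string)   # runs of non-digits

print(digits)           # ['1', '5', '1', '1', '1', '2', '5', '2', '5', '9', '8']
print(non_digits)       # ['\n', 'd', 'a', 'd', 'f', 'a', 'd', 'f']
print(non_digit_runs)   # ['\n', 'dadfadf']
```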

That is all for the article "what are the methods contained in regular expressions?" Thank you for reading! I hope it has given you a working understanding of the functions in the re module and of the basic matching syntax.
