2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how Python implements regular expressions through its built-in re module. The editor finds it very practical and shares it here as a reference.
1. re.sub and replace:
"sub" is short for "substitute", which means replacement. str.replace also performs replacement, but the two are used differently: replace swaps one fixed substring for another, while re.sub replaces whatever a regular expression matches. Here is an example to illustrate their similarities and differences:
>>> import re
>>> str1 = 'Hello 111 is 222'
>>> str2 = str1.replace('111', '222')
>>> print(str2)
Hello 222 is 222
This is a simple example. Now suppose every number in the string should be changed to 222. That is awkward with replace, which needs one call per distinct number, but simple with the sub function of the re module. (For more complex patterns, replace may not be able to do the job at all.)
>>> import re
>>> str1 = 'Hello 123 is 456'
>>> str2 = re.sub(r'\d+', '222', str1)
>>> print(str2)
Hello 222 is 222
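As a further sketch of the difference, re.sub also accepts an optional count argument and a function as the replacement, which str.replace cannot imitate for a pattern. The sample string and the helper name double_number below are made up for illustration.

```python
import re

text = 'Hello 123 is 456'

# str.replace only swaps one fixed substring for another.
fixed = text.replace('123', '222')

# re.sub replaces every run of digits matched by the pattern.
all_digits = re.sub(r'\d+', '222', text)

# count=1 limits the substitution to the first match.
first_only = re.sub(r'\d+', '222', text, count=1)


def double_number(match):
    """Hypothetical helper: return the matched number multiplied by two."""
    return str(int(match.group()) * 2)


# A function can compute each replacement from the match object.
doubled = re.sub(r'\d+', double_number, text)

print(fixed)       # Hello 222 is 456
print(all_digits)  # Hello 222 is 222
print(first_only)  # Hello 222 is 456
print(doubled)     # Hello 246 is 912
```

The function form is useful when the new text depends on what was matched, not only on where it was found.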
2. re.search() and re.match():
match: tries to match the regular expression only at the beginning of the string. It returns a match object if the match succeeds, otherwise None.
search: scans through the string for the first position where the regular expression matches. It returns a match object if such a position is found, otherwise None.
The following example illustrates the similarities and differences between match and search. It also shows why search is the one more often used in practice:
import re
str1 = 'helloword,i am alex'
# 'word' is not at the start of the string, so match returns None
if not re.match('word', str1):
    print('cannot match')
print(re.match('hello', str1).group())
print(re.search('word', str1).group())
# displayed result:
# cannot match
# hello
# word
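Because both functions return None on failure, calling .group() on the result without checking it raises AttributeError. A minimal sketch of a safer pattern, reusing the sample string from the example (the variable names are illustrative):

```python
import re

text = 'helloword,i am alex'

# re.match only tries the pattern at position 0.
at_start = re.match('word', text)

# re.search scans forward until it finds the first match.
anywhere = re.search('word', text)

# Test each result for None before asking for the matched text.
for label, result in (('match', at_start), ('search', anywhere)):
    if result is None:
        print(label, '-> no match')
    else:
        print(label, '->', result.group(), 'at', result.span())
```

This prints "match -> no match" followed by "search -> word at (5, 9)".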
3. re.split:
In Python, to split a string you only need to call the split method of str, but that method splits on a single fixed separator. It cannot split on several different separators at the same time.
Fortunately, the re module also provides a split function. It is more powerful because the separator is a regular expression, so several characters can act as separators at once. Let's look at the difference between str's split and re's split:
import re
str1 = 'helloword,i;am\nalex'
str2 = str1.split(',')
print(str2)
# inside [], | is a literal character, so the separators are simply listed
str3 = re.split(r'[,;\n]', str1)
print(str3)
# the two outputs differ:
# ['helloword', 'i;am\nalex']
# ['helloword', 'i', 'am', 'alex']
The output confirms the point: str's split breaks the string only at the comma, while re's split breaks it at every listed separator.
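Two further options are often handy with re.split, sketched here on a made-up variant of the sample string: a + after the character class treats a run of separators as one split point, and maxsplit limits how many splits are made.

```python
import re

line = 'helloword, i;  am\nalex'

# A character class followed by + treats any run of separators as one split point.
parts = re.split(r'[,;\s]+', line)

# maxsplit stops after the given number of splits; the rest stays intact.
head_and_rest = re.split(r'[,;\s]+', line, maxsplit=1)

print(parts)          # ['helloword', 'i', 'am', 'alex']
print(head_and_rest)  # ['helloword', 'i;  am\nalex']
```

Without the +, adjacent separators such as ", " would produce empty strings in the result.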
4. findall:
The findall method is often used together with the compile function. Their usage is:
First, compile converts the string form of a regular expression into a pattern object. Then the pattern object's findall method is called, which returns a list of all non-overlapping matches as strings (not a match object). Before combining them in an example, let's look at the predefined meaning of the special characters in regular expressions:
\d matches any decimal digit; it is equivalent to the class [0-9].
\D matches any non-digit character; it is equivalent to the class [^0-9].
\s matches any whitespace character; it is equivalent to the class [ \t\n\r\f\v].
\S matches any non-whitespace character; it is equivalent to the class [^ \t\n\r\f\v].
\w matches any alphanumeric character or underscore; it is equivalent to the class [a-zA-Z0-9_].
\W matches any character that is not alphanumeric or underscore; it is equivalent to the class [^a-zA-Z0-9_].
(These equivalences are exact for ASCII text or when the re.ASCII flag is set; for Python 3 str patterns the classes also cover the corresponding Unicode characters.)
After reading the meaning of these special characters, let's give another example to illustrate the above argument:
import re
str1 = 'asdf12dvdve4gb4'
pattern1 = re.compile(r'\d')
pattern2 = re.compile('[0-9]')
mch2 = pattern1.findall(str1)
mch3 = pattern2.findall(str1)
print('mch2:\t%s' % mch2)
print('mch3:\t%s' % mch3)
# output result:
# mch2:   ['1', '2', '4', '4']
# mch3:   ['1', '2', '4', '4']
The two patterns give the same result, which shows that the special character \d is indeed equivalent to [0-9]. If you do not want each digit to become a separate list element and would rather get 12 as a whole, add a + sign after \d. The + makes the pattern match one or more consecutive decimal digits as a single result:
import re
str1 = 'asdf12dvdve4gb4'
pattern1 = re.compile(r'\d+')
pattern2 = re.compile('[0-9]')
mch2 = pattern1.findall(str1)
mch3 = pattern2.findall(str1)
print('mch2:\t%s' % mch2)
print('mch3:\t%s' % mch3)
# output result:
# mch2:   ['12', '4', '4']
# mch3:   ['1', '2', '4', '4']
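When the position of each match is needed as well, the pattern object's finditer method can be used in place of findall. The following is a small sketch on the same sample string; the variable names are illustrative.

```python
import re

text = 'asdf12dvdve4gb4'
digits = re.compile(r'\d+')  # compiled once, reusable for several calls

# findall returns the matched text as plain strings.
found = digits.findall(text)
print(found)  # ['12', '4', '4']

# finditer yields match objects, which also carry the match positions.
spans = [(m.group(), m.span()) for m in digits.finditer(text)]
print(spans)  # [('12', (4, 6)), ('4', (11, 12)), ('4', (14, 15))]

# The module-level function gives the same list without an explicit compile step.
same = re.findall(r'\d+', text) == found
print(same)  # True
```

Compiling explicitly mainly pays off when the same pattern is applied many times.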
Here is another small example that combines a special character with re's sub function to remove all whitespace from a string:
import re
str1 = 'asd\tf12d vdve4gb4'
# replace every run of whitespace (tab, space, ...) with nothing
new_str = re.sub(r'\s+', '', str1)
print(new_str)
# output result:
# asdf12dvdve4gb4
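The six special characters listed earlier in this section can be checked side by side. This is a minimal sketch on a made-up sample string; each class and its uppercase counterpart split the characters of the string between them.

```python
import re

# Sample characters: a, 1, space, underscore, b, tab, 2, !
sample = 'a1 _b\t2!'

classes = [r'\d', r'\D', r'\s', r'\S', r'\w', r'\W']

# Map each special character to the list of single characters it matches.
results = {p: re.findall(p, sample) for p in classes}

for p in classes:
    print(p, results[p])
```

For instance, \w picks up the underscore along with the letters and digits, while \W is left with the space, the tab and the exclamation mark.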
5. Metacharacters:
What we usually call metacharacters are: . ^ $ * + ? { } [ ] \ | ( )
The first metacharacters we examine are "[" and "]". They specify a character class, which is the set of characters you want to match. Characters can be listed individually, or a range can be given as two characters separated by a "-" sign. For example, [abc] matches any one of "a", "b", or "c"; the range [a-c] expresses the same character set. To match only lowercase letters, the RE would be written as [a-z]. Most metacharacters are not active inside a class. For example, [akm$] matches any of the characters "a", "k", "m", or "$"; "$" is normally a metacharacter, but inside a character class it loses its special nature and is treated as an ordinary character.
[]: the metacharacters [] define a character class. Inside it, only the characters ^, -, ] and \ keep a special meaning. The character \ still escapes, the character - defines a character range, and the character ^ placed at the very beginning negates the class. (This was already seen in the special character list above, for example [^0-9].)
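The behaviour of characters inside a class can be sketched with a few findall calls. The sample strings and variable names below are made up for illustration.

```python
import re

# Inside [], $ is an ordinary character.
literal_dollar = re.findall(r'[akm$]', 'make $5')
print(literal_dollar)  # ['m', 'a', 'k', '$']

# A range and an explicit list describe the same set.
same_set = re.findall(r'[a-c]', 'abcd') == re.findall(r'[abc]', 'abcd')
print(same_set)  # True

# A leading ^ negates the class.
non_digits = re.findall(r'[^0-9]', 'a1b2')
print(non_digits)  # ['a', 'b']

# A backslash keeps its escaping role, so \] and \- are literal characters.
escaped = re.findall(r'[\]\-]', 'x]-y')
print(escaped)  # [']', '-']
```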
+ matches the item before the + sign 1 or more times
? matches the item before the ? sign 0 or 1 times
{m} matches the preceding item exactly m times
{m,n} matches the preceding item m to n times
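The four quantifiers above can be compared on one input. This is a small sketch with a made-up string in which "a" is followed by zero to four "b" characters.

```python
import re

text = 'a ab abb abbb abbbb'

# + needs at least one b, so the lone 'a' is skipped.
one_or_more = re.findall(r'ab+', text)

# ? takes at most one b after each a.
zero_or_one = re.findall(r'ab?', text)

# {2} needs exactly two b characters.
exactly_two = re.findall(r'ab{2}', text)

# {2,3} takes two or three, preferring three when available.
two_to_three = re.findall(r'ab{2,3}', text)

print(one_or_more)   # ['ab', 'abb', 'abbb', 'abbbb']
print(zero_or_one)   # ['a', 'ab', 'ab', 'ab', 'ab']
print(exactly_two)   # ['abb', 'abb', 'abb']
print(two_to_three)  # ['abb', 'abbb', 'abbb']
```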
The following small example illustrates the quantifiers above. Two points deserve attention. The first is the meaning of the ? sign placed after \d+: it makes the + non-greedy, so it matches as few digits as possible. The second is the character r placed before the pattern string, which makes it a raw string. In this example the result is the same with or without it, because \d is not a Python string escape, although newer Python versions emit a warning for such a sequence in a non-raw string.
>>> import re
>>> print(re.findall(r"a(\d+?)", "a123b"))
['1']
>>> print(re.findall(r"a(\d+)", "a123b"))
['123']
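Both points can be seen more sharply on other inputs. The sketch below uses made-up sample strings: the first pair contrasts greedy and non-greedy matching, the second shows a case where the r prefix does change the result.

```python
import re

markup = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ runs to the last > in the string.
greedy = re.findall(r'<.+>', markup)

# Non-greedy: .+? stops at the first > it can.
lazy = re.findall(r'<.+?>', markup)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# '\b' in a normal string is a backspace character, while r'\b'
# reaches the re module as a word boundary.
without_raw = re.findall('\bis\b', 'this is it')
with_raw = re.findall(r'\bis\b', 'this is it')

print(without_raw)  # []
print(with_raw)     # ['is']
```

Writing every pattern as a raw string avoids having to remember which backslash sequences Python itself interprets.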