What are the knowledge points of regular expressions in Python

2025-03-30 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article introduces the main knowledge points of regular expressions in Python. The content is detailed and easy to follow, and the operations are simple and quick to try, so it should serve as a useful reference. Let's take a look.

1.1 regular expression

A regular expression is a string that describes a pattern in text. Python comes with a regular expression module that can find, extract, and replace text that follows such a pattern. It is difficult to find one particular person among 10,000 people, but easy to find a very "characteristic" one: a person with green skin who is three meters tall can be spotted at a glance even in a crowd of ten thousand. This process of "finding" is called "matching" in regular expressions. In program development, regular expressions let a program find what it needs in a large piece of text. Using a regular expression involves the following steps.

(1) Look for the pattern.

(2) Express the pattern with regular expression symbols.

(3) Extract the information.
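
The three steps above can be sketched as follows. This is only an illustration: the sample text and the "order id" wording are made up for the example and do not come from the original article.

```python
import re

# Hypothetical sample text (an assumption for this sketch).
text = "order id: A1001; order id: B2002;"

# Step 1: look for the pattern. Every value sits between "order id: " and ";".
# Step 2: express the pattern with regular expression symbols.
pattern = "order id: (.*?);"

# Step 3: extract the information.
ids = re.findall(pattern, text)
print(ids)  # ['A1001', 'B2002']
```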

1.2 basic symbols of regular expressions

1.2.1 dot "."

A dot matches any single character except a newline, including but not limited to English letters, digits, Chinese characters, English punctuation, and Chinese punctuation.

1.2.2 asterisk "*"

An asterisk matches the subexpression in front of it (an ordinary character, another regular expression symbol, or several regular expression symbols) zero or more times, with no upper limit.

The asterisk always applies to the expression immediately before it.
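
As a rough illustration (the sample string is an assumption), the asterisk below applies only to the "b" that precedes it, so "b" may appear zero, one, or many times:

```python
import re

# Hypothetical sample text for this sketch.
sample = "ac abc abbbc adc"

# "b*" means zero or more "b" characters between "a" and "c".
star_matches = re.findall("ab*c", sample)
print(star_matches)  # ['ac', 'abc', 'abbbc']  ("adc" does not match)
```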

1.2.3 dot + asterisk ".*"

The dot matches any character that is not a newline, and the asterisk matches the expression before it zero or more times. So ".*" matches a string of any length, including an empty one, as long as it contains no newline.

For example, a pattern that places ".*" between "such" and "ha" means that any number of characters other than newlines may appear between "such" and "ha".
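
A minimal sketch of ".*" between two fixed pieces of text. The words "start" and "end" and both sample strings are stand-ins chosen for this example, not the strings used in the original article.

```python
import re

# Hypothetical samples: one keeps everything on a single line,
# the other puts a newline between the two markers.
same_line = "start: anything at all :end"
split_lines = "start\nend"

one_line_result = re.findall("start.*end", same_line)
two_line_result = re.findall("start.*end", split_lines)

print(one_line_result)  # ['start: anything at all :end']
print(two_line_result)  # []  (the dot does not match the newline)
```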

1.2.4 question mark "?"

The question mark matches the subexpression in front of it zero times or once. Note that this is the English (ASCII) question mark "?", not the full-width Chinese one.
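
For instance, in the sketch below (sample text assumed for illustration) the "u" is optional, so both spellings match, while a doubled "u" does not:

```python
import re

# Hypothetical sample text.
sample = "color colour colouur"

# "u?" means the letter "u" appears zero times or once.
optional_matches = re.findall("colou?r", sample)
print(optional_matches)  # ['color', 'colour']
```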

1.2.5 dot + asterisk + question mark ".*?" (most commonly used)

Combined, ".*?" matches the shortest string that meets the requirements, whereas ".*" matches the longest one. The difference can be summarized as follows.

① ".*": greedy mode, which gets the longest string that satisfies the condition.

② ".*?": non-greedy mode, which gets the shortest string that satisfies the condition.
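
The difference between the two modes is easiest to see side by side. The HTML-like sample string below is an assumption made for this sketch.

```python
import re

# Hypothetical sample with two tagged words.
sample = "<b>one</b> and <b>two</b>"

# Greedy: ".*" runs to the LAST "</b>" it can reach.
greedy = re.findall("<b>.*</b>", sample)

# Non-greedy: ".*?" stops at the FIRST "</b>" it can reach.
non_greedy = re.findall("<b>.*?</b>", sample)

print(greedy)      # ['<b>one</b> and <b>two</b>']
print(non_greedy)  # ['<b>one</b>', '<b>two</b>']
```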

1.2.6 parentheses "()"

Parentheses extract a portion of the matched string.

Suppose a string contains a password that has an English colon on its left and the Chinese character for "you" on its right. A regular expression made of the colon, ".*?", and that character matches the password, but the result also includes the colon and the character.

However, the colon and the Chinese character are not part of the password, so to get only "12345abcde", wrap ".*?" in parentheses: "(.*?)".

The result then contains only the password.

1.2.7 backslash "\"

In regular expressions, many symbols have special meanings, such as question marks, asterisks, curly braces, square brackets, and parentheses. A backslash is used together with another character: it turns a special symbol into an ordinary one (for example, "\*" matches a literal asterisk) and turns certain ordinary characters into special ones (for example, "\d" matches a digit).
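
A short sketch of escaping, with an assumed sample string. The patterns are written as raw strings (the r prefix), which is a choice made here so that Python itself leaves the backslash untouched; the article does not mention it.

```python
import re

# Hypothetical sample text.
sample = "3*4=12 and 3334"

# Escaped: "\*" is an ordinary asterisk, so only the literal "3*4" matches.
escaped = re.findall(r"3\*4", sample)

# Unescaped: "3*" means zero or more "3" characters followed by "4".
unescaped = re.findall(r"3*4", sample)

print(escaped)    # ['3*4']
print(unescaped)  # ['4', '3334']
```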

1.2.8 number "\d"

"\d" is used in regular expressions to represent a single digit.

To extract two digits, use "\d\d"; to extract three digits, use "\d\d\d". When the number of digits is not known in advance, add the "*" symbol to represent an arbitrary number of digits.

Digit strings of any length can then all be represented by the regular expression "\d*".
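
A minimal sketch of "\d" with fixed and arbitrary digit counts. The sample text is an assumption. Because "\d*" can also match zero digits, the second pattern pins it to the word "year " so that only the intended number is captured.

```python
import re

# Hypothetical sample text.
sample = "room 7, floor 12, year 2024"

# Exactly two digits at a time: "2024" is split into two pairs.
pairs = re.findall(r"\d\d", sample)

# Any number of digits after the fixed text "year ".
year = re.findall(r"year (\d*)", sample)

print(pairs)  # ['12', '20', '24']
print(year)   # ['2024']
```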

1.3 use regular expressions

Python's regular expression module is called "re", short for "regular expression". The module must be imported before it is used. The import statement is:

import re

1.3.1 findall method

Python's regular expression module contains a findall method that returns all strings that meet the requirements in the form of a list.

The function prototype of findall is:

re.findall(pattern, string, flags=0)

pattern is the regular expression, string is the original string, and flags holds flags for some special functions. findall returns a list of all matching results; if there is no match, it returns an empty list.

To extract only part of a match, enclose that part in parentheses so that irrelevant information is left out. What is returned if the pattern contains more than one "(.*?)"? As shown in figure 3-2, a list is still returned, but its elements become tuples: the first element of each tuple is the account and the second is the password.
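
The behavior with two pairs of parentheses can be sketched like this. The account and password values and the exact layout of the sample text are made up for illustration; the figure from the original article is not available here.

```python
import re

# Hypothetical sample text with two account/password pairs.
text = "account: alice, password: a1b2c3; account: bob, password: x9y8z7;"

# Two groups, so each list element is a (account, password) tuple.
pairs = re.findall("account: (.*?), password: (.*?);", text)
print(pairs)  # [('alice', 'a1b2c3'), ('bob', 'x9y8z7')]

# No match at all gives an empty list, not None.
nothing = re.findall("token: (.*?);", text)
print(nothing)  # []
```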

The function prototype has a flags parameter. It can be omitted; when given, it provides auxiliary behavior such as ignoring case or letting matches span newline characters.

Take newlines as an example: for "." to match newline characters as well, use the "re.S" flag.

The newline symbol "\n" then appears in the matching result, but that is better than getting no result at all. The newline characters can be replaced later, when the data is cleaned.
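
A short sketch of the "re.S" flag on text that spans two lines. The sample string and the final clean-up step with str.replace are assumptions added for illustration.

```python
import re

# Hypothetical sample: the wanted content contains a newline.
text = "<p>line one\nline two</p>"

# Without re.S the dot cannot cross the newline, so nothing matches.
without_flag = re.findall("<p>(.*?)</p>", text)

# With re.S the dot also matches "\n".
with_flag = re.findall("<p>(.*?)</p>", text, re.S)

# Later data cleaning: replace the newline with a space.
cleaned = [item.replace("\n", " ") for item in with_flag]

print(without_flag)  # []
print(with_flag)     # ['line one\nline two']
print(cleaned)       # ['line one line two']
```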

1.3.2 search method

search() is used in the same way as findall(), but it returns only the first match. Once it finds something that meets the requirements, it stops looking. This is especially useful when only the first match is needed from a very large text, because it can greatly improve the running efficiency of the program.

The function prototype of search() is:

re.search(pattern, string, flags=0)

If the match succeeds, the result is a match object; if nothing matches, it is None.

To get the matched text, call the .group() method on the match object.

The content of the parentheses in the regular expression is returned only when the argument to .group() is 1; with no argument (or 0), .group() returns the entire match.

The argument to .group() cannot exceed the number of parentheses in the regular expression. An argument of 1 reads the contents of the first pair of parentheses, an argument of 2 reads the second, and so on.

(Note that search() is used here, not findall().)
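
The search() workflow can be sketched as follows. The sample text and its account/password values are made up for this example.

```python
import re

# Hypothetical sample text with two account/password pairs.
text = "account: alice, password: a1b2c3; account: bob, password: x9y8z7;"

match = re.search("account: (.*?), password: (.*?);", text)

# search() returns None when nothing matches, so check before using it.
if match is not None:
    whole = match.group()    # the entire first match
    first = match.group(1)   # contents of the first parentheses
    second = match.group(2)  # contents of the second parentheses
    print(whole)   # account: alice, password: a1b2c3;
    print(first)   # alice
    print(second)  # a1b2c3

# Asking for a group that does not exist raises IndexError.
try:
    match.group(3)
    too_many = False
except IndexError:
    too_many = True

missing = re.search("token: (.*?);", text)
print(missing)  # None
```

Only the first pair ("alice") is returned, even though the text contains a second one.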

1.3.3 compile method

re.findall() already includes the functionality of re.compile(): it compiles the pattern internally, so calling re.compile() explicitly is not required.
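
For comparison, a minimal sketch showing that both routes give the same result. The sample text is an assumption.

```python
import re

# Hypothetical sample text.
sample = "room 7, floor 12, year 2024"

# Route 1: compile first, then call findall on the compiled pattern.
compiled = re.compile(r"\d\d")
via_compile = compiled.findall(sample)

# Route 2: let re.findall compile the pattern itself.
direct = re.findall(r"\d\d", sample)

print(via_compile == direct)  # True
```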

1.4 regular expression extraction skills

1.4.1 grasp the big first and then the small: secondary extraction
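
One way to read "grasp the big first and then the small" is to extract the enclosing block first and then extract the individual items from that block. The sketch below follows that reading; the page text, the "<list>" markers, and the user names are all assumptions made for illustration.

```python
import re

# Hypothetical page: only the users inside <list>...</list> are wanted.
page = """header: user: nobody;
<list>
user: alice;
user: bob;
</list>
footer: user: admin;"""

# A single pass over the whole page also picks up unwanted users.
single_pass = re.findall("user: (.*?);", page)

# First extraction (the big part): grab the block between the markers.
block = re.search("<list>(.*?)</list>", page, re.S).group(1)

# Second extraction (the small part): pull the users out of that block only.
two_pass = re.findall("user: (.*?);", block)

print(single_pass)  # ['nobody', 'alice', 'bob', 'admin']
print(two_pass)     # ['alice', 'bob']
```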

1.4.2 inside parentheses and outside parentheses

Parentheses can contain ordinary characters in addition to regular expression symbols.

The specific impact is shown in the figure below.

If there are other ordinary characters in parentheses, then these ordinary characters will appear in the obtained result.
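
The effect of moving ordinary characters inside the parentheses can be sketched with an assumed sample string:

```python
import re

# Hypothetical sample text.
sample = "id=42;"

# "id=" outside the parentheses: it guides the match but is not returned.
outside = re.findall("id=(.*?);", sample)

# "id=" inside the parentheses: it becomes part of the result.
inside = re.findall("(id=.*?);", sample)

print(outside)  # ['42']
print(inside)   # ['id=42']
```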

This concludes the article on "What are the knowledge points of regular expressions in Python?" Thank you for reading.
