
A detailed explanation of the examples of using Python regular expressions

Updated 2025-01-17. From: SLTechnology News&Howtos (shulou), Internet Technology

Shulou(Shulou.com)06/02 Report--

This article explains how to use regular expressions in Python through detailed examples. The methods it introduces are simple, fast and practical.

As a concept, regular expressions are not unique to Python. However, there are some small differences in the actual use of regular expressions in Python.

This article is part of a series of articles about Python regular expressions. In the first article in this series, we will focus on how to use regular expressions in Python and highlight some unique features in Python.

We will introduce some of the ways to search for strings in Python. Then we'll discuss how to use grouping to work with the sub-parts of the match objects we find.

The Python module we use for regular expressions is named 're'.

>>> import re

1. Raw strings in Python

The Python compiler uses '\' (backslash) to introduce escape sequences in string literals.

If the backslash is followed by a character sequence that the compiler recognizes, the whole escape sequence is replaced with the corresponding special character (for example, '\n' is replaced with a newline character).

This poses a problem for regular expressions in Python, because the 're' module also uses backslashes to escape characters that are special in regular expressions (such as * and +).

The mix of the two means that sometimes you have to escape the escape character itself (when the sequence is recognized by both the Python compiler and the regular expression compiler), while at other times you don't (when the sequence is recognized only by the Python compiler).

Instead of working out how many backslashes are needed, we can use raw strings.

A raw string is created simply by placing the character 'r' before the opening quotation mark of a normal string. When a string is raw, the Python compiler does not perform any escape replacement in it. In essence, you are telling the compiler not to interfere with your string at all.

>>> string = 'This is a\nnormal string'
>>> rawString = r'and this is a\nraw string'
>>> print(string)
This is a
normal string
>>> print(rawString)
and this is a\nraw string
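
The backslash-counting problem described above can be made concrete with a small sketch. The sample value `path` is an assumption chosen for illustration; the point is only that a regular expression matching one literal backslash needs four backslashes in a normal string literal but two in a raw string.

```python
import re

normal = 'a\nb'      # \n becomes a single newline character
raw = r'a\nb'        # the backslash and the 'n' stay as two characters

print(len(normal))   # 3
print(len(raw))      # 4

# Hypothetical sample text: C:\temp (the literal below escapes the backslash).
path = 'C:\\temp'

# The regex engine needs \\ to match one literal backslash.
match_normal = re.search('\\\\', path)   # normal string: four backslashes
match_raw = re.search(r'\\', path)       # raw string: two backslashes

print(match_normal.group(0) == match_raw.group(0))   # True
```

Both calls hand the same two-character pattern to the 're' module; the raw string just avoids the extra layer of escaping.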

Searching with regular expressions in Python

The 're' module provides several methods for searching an input string. The methods we will discuss are:

re.match()

re.search()

re.findall()

Each method receives a regular expression and a string to find a match. Let's take a closer look at each of these methods to figure out how they work and how they are different.

2. Using re.match: matching at the start

Let's first look at the match() method. match() finds a match only if the pattern matches at the beginning of the searched string.

For example, calling match() on the string 'dog cat dog' with the pattern 'dog' finds a match:

>>> re.match(r'dog', 'dog cat dog')
<re.Match object; span=(0, 3), match='dog'>
>>> match = re.match(r'dog', 'dog cat dog')
>>> match.group(0)
'dog'

We'll talk more about the group() method later. For now, we just need to know that calling it with 0 as its argument returns the text matched by the pattern.

I have also skipped over the returned match object for the moment (displayed as re.Match in Python 3.7 and later, and as SRE_Match in older versions); we'll come back to it shortly.

However, if we call match() on the same string with the pattern 'cat', no match is found:

>>> re.match(r'cat', 'dog cat dog')
>>>
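
Because match() returns None when nothing matches, calling group() directly on the result of a failed match raises an AttributeError. A minimal sketch of guarding against that follows; the helper name `starts_with` is hypothetical and not part of the 're' module.

```python
import re

text = 'dog cat dog'


def starts_with(pattern, s):
    """Return the matched text if s begins with pattern, otherwise None.

    Hypothetical helper for illustration only.
    """
    m = re.match(pattern, s)
    # re.match returns None on failure, so test before calling group().
    return m.group(0) if m is not None else None


print(starts_with(r'dog', text))   # dog
print(starts_with(r'cat', text))   # None
```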

3. Using re.search: matching anywhere

The search() method is similar to match(), but it does not restrict the match to the beginning of the string, so searching for 'cat' in our sample string finds a match:

>>> match = re.search(r'cat', 'dog cat dog')
>>> match.group(0)
'cat'

However, search() stops looking after it finds the first match, so searching for 'dog' in our sample string with search() finds only its first occurrence:

>>> match = re.search(r'dog', 'dog cat dog')
>>> match.group(0)
'dog'
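
To see the difference between the two methods side by side, the following sketch runs both on the same sample string and uses start() (covered in section 5) to show where each match was found.

```python
import re

text = 'dog cat dog'

# match() only looks at the start of the string, so 'cat' is not found.
print(re.match(r'cat', text))            # None

# search() scans forward and finds 'cat' at index 4.
found = re.search(r'cat', text)
print(found.group(0), found.start())     # cat 4

# search() reports only the first 'dog', the one at index 0.
first_dog = re.search(r'dog', text)
print(first_dog.start())                 # 0
```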

4. Using re.findall: all matches

The lookup method I use most often by far is findall(). Calling findall() returns a list of all the matching substrings rather than a match object (we'll talk more about match objects next), which I find easier to work with. Calling findall() on the sample string gives:

>>> re.findall(r'dog', 'dog cat dog')
['dog', 'dog']
>>> re.findall(r'cat', 'dog cat dog')
['cat']
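
One practical consequence of getting a plain list back is that a failed search yields an empty list rather than None, so the result can be counted or tested directly. A short sketch, where the pattern 'bird' is just an assumed example of something absent from the string:

```python
import re

text = 'dog cat dog'

dogs = re.findall(r'dog', text)
print(len(dogs))        # 2

# No match produces an empty list, not None.
birds = re.findall(r'bird', text)
print(birds)            # []

if not birds:
    print('no birds found')
```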

5. Using the match.start and match.end methods

So what exactly is the 'match object' that the search() and match() methods returned to us earlier?

Instead of simply returning the matching part of the string, search() and match() return a "match object", which is a wrapper around the matched substring.

You saw earlier that I can get the matched substring by calling the group() method (as we'll see in the next section, match objects are very useful when dealing with groups), but a match object also contains more information about the matched substring.

For example, the match object can tell us where the match begins and ends in the original string:

>>> match = re.search(r'dog', 'dog cat dog')
>>> match.start()
0
>>> match.end()
3

Knowing this information is sometimes very useful.
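
As one illustration of where the positions help, the sketch below slices the original string with start() and end(), and rebuilds the string with the matched region replaced. The replacement text 'CAT' is an arbitrary choice for the example.

```python
import re

text = 'dog cat dog'
m = re.search(r'cat', text)

start, end = m.start(), m.end()
print(start, end)          # 4 7

# end() is exclusive, so a slice recovers the matched text exactly.
print(text[start:end])     # cat

# span() returns both positions as one tuple.
print(m.span())            # (4, 7)

# Rebuild the string around the match.
replaced = text[:start] + 'CAT' + text[end:]
print(replaced)            # dog CAT dog
```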

6. Using match.group to group by number

As I mentioned earlier, match objects are very handy when dealing with grouping.

Grouping is the ability to address specific sub-parts of an entire regular expression match. We can define a group as part of the whole regular expression, and then retrieve the text matched by that part on its own.

Let's take a look at how it works:

>>> contactInfo = 'Doe, John: 555-1212'

The string I just created is similar to a fragment taken from someone's address book. We can match this line with such a regular expression:

>>> re.search(r'\w+, \w+: \S+', contactInfo)
<re.Match object; span=(0, 19), match='Doe, John: 555-1212'>

By surrounding parts of the regular expression with parentheses, we can group the content and then work with the individual groups:

>>> match = re.search(r'(\w+), (\w+): (\S+)', contactInfo)

These groups can be retrieved with the group() method of the match object. They are addressed by number, in the order they appear from left to right in the regular expression (starting at 1):

>>> match.group(1)
'Doe'
>>> match.group(2)
'John'
>>> match.group(3)
'555-1212'

The reason group numbering starts at 1 is that group 0 is reserved for the entire match (we saw it earlier when studying the match() and search() methods).

>>> match.group(0)
'Doe, John: 555-1212'
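
When all the groups are needed at once, the match object's groups() method returns them as a tuple, which can be unpacked directly. This is a small sketch using the same contact string; the variable names are illustrative.

```python
import re

contact_info = 'Doe, John: 555-1212'
m = re.search(r'(\w+), (\w+): (\S+)', contact_info)

# groups() returns every numbered group (1..N) as a tuple.
print(m.groups())            # ('Doe', 'John', '555-1212')

# Tuple unpacking gives each group a readable name.
last, first, phone = m.groups()
print(first, last, phone)    # John Doe 555-1212

# group() also accepts several numbers and returns a tuple.
print(m.group(1, 3))         # ('Doe', '555-1212')
```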

7. Using match.group to group by alias

Sometimes, especially when a regular expression contains many groups, addressing them by the order in which they appear becomes impractical. Python also allows you to give a group a name with the following syntax:

>>> match = re.search(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)', contactInfo)

We can still get the contents of a group with the group() method, but now we pass the group name we specified instead of the group number we used before.

>>> match.group('last')
'Doe'
>>> match.group('first')
'John'
>>> match.group('phone')
'555-1212'

This greatly enhances the clarity and readability of the code. You can imagine that as regular expressions become more and more complex, it will become more and more difficult to understand what a grouping captures. Naming your group will clearly tell you and your readers your intentions.
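
Named groups can also be collected in one step: the match object's groupdict() method returns a dictionary keyed by group name. The sketch below assumes the same contact string and group names used above; subscript access such as `m['phone']` requires Python 3.6 or later.

```python
import re

contact_info = 'Doe, John: 555-1212'
m = re.search(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)', contact_info)

# groupdict() maps each group name to the text it captured.
info = m.groupdict()
print(info['first'], info['last'])   # John Doe

# Subscript access is shorthand for m.group('phone') (Python 3.6+).
print(m['phone'])                    # 555-1212

# Named groups keep their numbers too, so both styles can be mixed.
print(m.group(1) == m.group('last')) # True
```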

Although findall() does not return match objects, it can also use groups. In that case, findall() returns a list of tuples, where the Nth element of each tuple corresponds to the Nth group in the regular expression.

>>> re.findall(r'(\w+), (\w+): (\S+)', contactInfo)
[('Doe', 'John', '555-1212')]

However, group names are not accessible in findall() results: named groups are still returned by position in each tuple.
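
If every match is needed and access by group name matters, one option not covered in this article is re.finditer(), which yields a match object for each match instead of a tuple. The sketch below uses a made-up two-line address book purely as sample data.

```python
import re

# Made-up sample data for illustration.
book = 'Doe, John: 555-1212\nRoe, Jane: 555-3434'
pattern = r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)'

# findall() accepts named groups but returns plain positional tuples.
print(re.findall(pattern, book))
# [('Doe', 'John', '555-1212'), ('Roe', 'Jane', '555-3434')]

# finditer() yields match objects, so group names remain usable.
names = [m.group('first') + ' ' + m.group('last')
         for m in re.finditer(pattern, book)]
print(names)   # ['John Doe', 'Jane Roe']
```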
