What is the regular expression character set?

2025-01-20 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains what a regular expression character set is, in a way that is meant to be clear and easy to follow. Let's work through it step by step.

A regular expression character set is a set of characters enclosed in square brackets "[]". With a character set, you tell the regular expression engine to match exactly one of several characters. To match an "a" or an "e", use <<[ae]>>. You can use <<gr[ae]y>> to match gray or grey. This is especially useful when you are not sure whether the text you are searching uses American or British spelling. Conversely, <<gr[ae]y>> will not match graay or graey. The order of the characters inside a character set does not matter; the result is the same.
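As a minimal sketch of the paragraph above, the following uses Python's standard `re` module (the choice of Python and the sample word list are assumptions made for illustration, not part of the original article). It checks which spellings <<gr[ae]y>> accepts and that the order inside the brackets makes no difference.

```python
import re

# gr[ae]y: "gr", then exactly one character that is "a" or "e", then "y".
pattern = re.compile(r"gr[ae]y")

words = ["gray", "grey", "graay", "graey"]
matches = [w for w in words if pattern.fullmatch(w)]
print(matches)

# The order of characters inside the brackets is irrelevant:
# [ae] and [ea] accept exactly the same words.
same = all(
    bool(re.fullmatch(r"gr[ae]y", w)) == bool(re.fullmatch(r"gr[ea]y", w))
    for w in words
)
print(same)
```

Only the two single-vowel spellings are accepted, because the set consumes exactly one character.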

You can use the hyphen "-" to define a range of characters inside a character set. <<[0-9]>> matches a single digit between 0 and 9. You can use more than one range: <<[0-9a-fA-F]>> matches a single hexadecimal digit, in either upper or lower case. You can also combine ranges with individual characters: <<[0-9a-fxA-FX]>> matches a hexadecimal digit or the letter X. Again, the order of the characters and ranges has no effect on the result.
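The range examples above can be tried out with a short sketch, again assuming Python's `re` module; the input strings are made up for illustration.

```python
import re

digit = re.compile(r"[0-9]")             # one decimal digit
hex_digit = re.compile(r"[0-9a-fA-F]")   # one hexadecimal digit, either case
hex_or_x = re.compile(r"[0-9a-fxA-FX]")  # a hexadecimal digit or the letter x/X

digits_found = digit.findall("a1b22")
hex_found = hex_digit.findall("0xBeG")
hex_or_x_found = hex_or_x.findall("0xBeG")

print(digits_found)
print(hex_found)
print(hex_or_x_found)
```

Each match is a single character, since a character set on its own always consumes exactly one character.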

◆ Some applications of regular expression character sets

Find a word that may be misspelled, such as <<sep[ae]r[ae]te>> or <<li[cs]en[cs]e>>.

Find an identifier in a programming language: <<[A-Za-z_][A-Za-z_0-9]*>>. (* means repeat zero or more times.)

Find a C-style hexadecimal number: <<0[xX][A-Fa-f0-9]+>>. (+ means repeat one or more times.)
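The three applications listed above can be sketched in Python as follows. The sample strings are invented for illustration; note that `findall` searches anywhere in the text, so a real tokenizer would likely add anchors or word boundaries.

```python
import re

misspelled = re.compile(r"sep[ae]r[ae]te")          # tolerant of a/e mix-ups
identifier = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")  # letter or _, then letters, digits, _
c_hex = re.compile(r"0[xX][A-Fa-f0-9]+")            # 0x or 0X plus at least one hex digit

spellings = misspelled.findall("separate seperate seperete sepirate")
identifiers = identifier.findall("x1 = _tmp + 2")
hex_numbers = c_hex.findall("mask = 0xFF | 0X1f | 0x")

print(spellings)
print(identifiers)
print(hex_numbers)
```

The bare "0x" at the end of the last input is skipped because "+" requires at least one hexadecimal digit after the prefix.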

◆ Negated regular expression character sets

A caret "^" immediately after the left square bracket "[" negates the character set. The negated set matches any character that is not listed between the square brackets. Unlike ".", a negated character set can also match line-break characters (carriage return and newline).
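The difference between "." and a negated set at a line break can be seen in a small sketch. It assumes Python's `re` module with default flags, where "." does not match a newline; other engines or flags may behave differently.

```python
import re

text = "a\nb"  # "a", a newline, "b"

# By default "." matches any character except a newline, so this fails.
dot_match = re.search(r"a.b", text)

# [^x] matches any character that is not "x", and that includes the newline.
negated_match = re.search(r"a[^x]b", text)

print(dot_match)
print(repr(negated_match.group()))
```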

It is important to remember that a negated character set must still match a character. <<q[^u]>> does not mean "a q that is not followed by a u". It means "a q followed by a character that is not u". So it does not match the q in "Iraq", but it does match the q and the following space in "Iraq is a country". The space is part of the match because it is "a character that is not u".
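The "Iraq" example above can be checked directly. This is a minimal sketch in Python using a lowercase q, on the assumption that the capital Q in some renderings of this text is a transcription artifact.

```python
import re

pattern = re.compile(r"q[^u]")  # a "q" followed by one character that is not "u"

# In "Iraq" nothing follows the q, so the negated set has nothing to match.
end_of_string = pattern.search("Iraq")

# In the sentence, the space after the q satisfies [^u] and becomes part of the match.
in_sentence = pattern.search("Iraq is a country")

print(end_of_string)
print(repr(in_sentence.group()))
```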

If you only want to match the q itself, on the condition that it is not followed by a u, you can use lookahead, which is covered later.
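As a brief preview, and only as a sketch since lookahead is not explained in this article, Python's `re` module writes negative lookahead as `(?!...)`. The pattern below is an illustrative assumption about how the problem could be solved, not the article's own solution.

```python
import re

# q(?!u): a "q" at a position where the next character is not "u".
# The lookahead checks what follows but does not consume it.
pattern = re.compile(r"q(?!u)")

at_end = pattern.search("Iraq")
in_sentence = pattern.search("Iraq is a country")
before_u = pattern.search("quit")

print(at_end.group())
print(in_sentence.group())
print(before_u)
```

In both "Iraq" inputs the match is the q alone, without the trailing space.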

◆ Metacharacters in regular expression character sets

Note that only 4 characters have a special meaning inside a character set: "]", "\", "^" and "-". "]" ends the character set definition, "\" escapes the next character, "^" negates the set, and "-" defines a range. The other usual metacharacters are ordinary characters inside a character set and do not need to be escaped. For example, to search for an asterisk or a plus sign, use <<[+*]>>. The regular expression still works if you escape these metacharacters anyway, but readability suffers.
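A short sketch of this point, assuming Python's `re` module and a made-up input string: inside a set, "+", "*" and "." are plain characters, and escaping them gives the same result.

```python
import re

text = "2*3+4.5"

unescaped = re.findall(r"[+*]", text)   # "+" and "*" are literal inside a set
escaped = re.findall(r"[\+\*]", text)   # escaping is allowed but unnecessary
dot_in_set = re.findall(r"[.]", text)   # "." is a literal dot inside a set

print(unescaped)
print(escaped)
print(dot_in_set)
```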

To use the backslash "\" inside a character set as a literal character rather than as the escape character, escape it with another backslash. <<[\\x]>> matches a backslash or an x. The characters "]", "^" and "-" can either be escaped with a backslash or placed where they cannot take on their special meaning. We recommend the latter because it is more readable. The caret "^" is literal anywhere except immediately after the left square bracket "[", so <<[x^]>> matches an x or a ^. A "]" placed right after the opening bracket is literal, so <<[]x]>> matches a "]" or an x (some flavors, such as JavaScript, require the "]" to be escaped instead). A hyphen placed first or last is literal, so <<[-x]>> or <<[x-]>> matches a "-" or an x.
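The placement rules above can be verified with a small sketch. It assumes Python's `re` module, which accepts an unescaped "]" at the start of a set; other regex flavors may not.

```python
import re

# Raw strings keep the backslashes intact: the subject below is a, \, b, x.
backslash_or_x = re.findall(r"[\\x]", r"a\bx")

caret_or_x = re.findall(r"[x^]", "x^y")    # "^" is literal when it is not first
bracket_or_x = re.findall(r"[]x]", "a]x")  # "]" is literal when it is first
hyphen_first = re.findall(r"[-x]", "a-x")  # "-" is literal when it is first
hyphen_last = re.findall(r"[x-]", "a-x")   # ... or last

print(backslash_or_x)
print(caret_or_x)
print(bracket_or_x)
print(hyphen_first, hyphen_last)
```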

◆ Shorthand for regular expression character sets

Because some character sets are used so often, shorthand notations exist for them.

<<\d>> stands for <<[0-9]>>.

<<\w>> stands for a "word character". Exactly which characters this covers depends on the regular expression implementation. In most implementations it includes at least <<[A-Za-z0-9_]>>.

<<\s>> stands for a "whitespace character". This also depends on the implementation. In most implementations it includes the space and tab characters as well as the carriage return and newline characters <<\r\n>>.

Shorthand character sets can be used both inside and outside square brackets. <<\s\d>> matches a whitespace character followed by a digit. <<[\s\d]>> matches a single character that is either whitespace or a digit. <<[\da-fA-F]>> matches a hexadecimal digit.
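The shorthand examples can be sketched in Python as follows. One assumption to note: for text patterns, Python 3 lets `\d`, `\w` and `\s` match Unicode characters by default, so the `re.ASCII` flag is used here to get the narrower ASCII-only meanings described above. The input strings are invented for illustration.

```python
import re

# re.ASCII restricts \d, \w and \s to their ASCII meanings.
ws_then_digit = re.findall(r"\s\d", "a 1\t2b3", re.ASCII)  # whitespace, then a digit
ws_or_digit = re.findall(r"[\s\d]", "a 1b", re.ASCII)      # one whitespace char or digit
hex_digits = re.findall(r"[\da-fA-F]", "9fZ", re.ASCII)    # shorthand mixed with ranges
words = re.findall(r"\w+", "foo_bar-42", re.ASCII)         # runs of word characters

print(ws_then_digit)
print(ws_or_digit)
print(hex_digits)
print(words)
```

The underscore counts as a word character, which is why "foo_bar" stays in one piece while the hyphen splits the input.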

Negated shorthand character sets

<<[\S]>> = <<[^\s]>>
<<[\W]>> = <<[^\w]>>
<<[\D]>> = <<[^\d]>>
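The equivalences above can be checked with a quick sketch in Python; the sample text is arbitrary and chosen only to contain a mix of letters, digits, whitespace and punctuation.

```python
import re

text = "a1 _-\t"

# Each uppercase shorthand should behave like the negated lowercase one.
pairs = [(r"\S", r"[^\s]"), (r"\W", r"[^\w]"), (r"\D", r"[^\d]")]
equivalent = all(
    re.findall(shorthand, text) == re.findall(negated, text)
    for shorthand, negated in pairs
)

non_word = re.findall(r"\W", "a1 _-")  # everything that is not a word character

print(equivalent)
print(non_word)
```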

◆ Repetition of regular expression character sets

If you repeat a character set with the "?", "*" or "+" operator, the whole character set is repeated, not just the character it happened to match. The regular expression <<[0-9]+>> matches 837 as well as 222.
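A minimal sketch of this behavior, assuming Python's `re` module: each repetition of the set is free to match a different digit.

```python
import re

# [0-9]+ repeats the set itself, so each position may hold a different digit.
numbers = re.findall(r"[0-9]+", "837 and 222")
print(numbers)
```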

If you want to repeat only the character that was matched, rather than the set, you can use a backreference. Backreferences are covered later.
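As a short preview, and only as a sketch since backreferences are not explained in this article, Python's `re` module lets `\1` refer back to whatever the first group matched. The pattern below is an illustrative assumption, not the article's own example.

```python
import re

# ([0-9]) captures one digit; \1+ then requires that same digit one or more times.
pattern = re.compile(r"([0-9])\1+")

repeated = pattern.fullmatch("222")  # the same digit three times
mixed = pattern.fullmatch("837")     # three different digits

print(repeated.group())
print(mixed)
```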

That is all for "What is the regular expression character set?". Thank you for reading; I hope it has been helpful.

© 2024 shulou.com SLNews company. All rights reserved.
