What are the characters supported by regular expressions 07/19 Update SLTechnology News&Howtos


2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces the characters supported by regular expressions. It is meant as a reference: the tables and examples below cover the character classes, grouping constructs and options, followed by Oracle's regular expression functions.

1. Character classes

\d: any digit from 0 to 9. Example: \d\d matches 72, but does not match aa or 7a.

\D: any non-digit character. Example: \D\D\D matches abc, but does not match 123.

\w: any word character, including A-Z, a-z, 0-9 and the underscore. Example: \w\w\w\w matches Ab_2, but does not match ∑$%* or Ab_@.

\W: any non-word character. Example: \W matches @, but does not match a.

\s: any white space character, including tabs, line feeds, carriage returns, form feeds and vertical tabs. It matches all the traditional white space characters defined in HTML, XML and other standards.

\S: any non-white-space character. Example: every character of A%&g3 is matched by \S.

. (period): any character except the newline character, unless the SingleLine option is set.

[...]: any character in the brackets. Example: [abc] matches a single character a, b or c; [a-z] matches any character from a to z.

[^...]: any character not in the brackets. Example: [^abc] matches a single character other than a, b or c, which may be A, B or C; [^a-z] matches any character outside a to z, so it does match all uppercase letters.

2. Repetition characters

{n}: matches the preceding character exactly n times. Example: x{2} matches xx, but does not match x or, as a whole string, xxx.

{n,}: matches the preceding character at least n times. Example: x{2,} matches 2 or more x, such as xx, xxx, xxxx.

{n,m}: matches the preceding character at least n times and at most m times. If n is 0, the character is optional. Example: x{2,4} matches xx, xxx and xxxx, but does not match xxxxx.

?: matches the preceding character 0 or 1 times, which makes it optional. Example: x? matches x or zero x.

+: matches the preceding character 1 or more times. Example: x+ matches x, xx, or any larger number of x.

*: matches the preceding character 0 or more times. Example: x* matches 0 or more x.

3. Positioning characters

^: the pattern that follows must be at the beginning of the string. For multiline text (a string containing line breaks) with the Multiline flag set, it must be at the beginning of a line.

$: the pattern that precedes must be at the end of the string. For multiline text with the Multiline flag set, it must be at the end of a line.

\A: the pattern that follows must be at the beginning of the string; the Multiline flag is ignored.

\z: the pattern that precedes must be at the end of the string; the Multiline flag is ignored.

\Z: the pattern that precedes must be at the end of the string, or in front of a final newline character.

\b: matches a word boundary, that is, the point between a word character and a non-word character, such as the beginning or end of a word. Remember that a word character is a character in [a-zA-Z0-9_].

\B: matches a position that is not a word boundary, for example a position in the middle of a word.

Note: positioning characters can be applied to a single character or to a combination of characters, and are placed at the left or right end of the pattern.
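
The three tables above can be tried out with a short sketch. It uses Python's re module as a stand-in for the .NET engine the article describes, which is an assumption; the helper name whole is hypothetical, and Python's \Z behaves like the \z described above (end of string only).

```python
import re


def whole(pattern: str, text: str) -> bool:
    """Hypothetical helper: True when the pattern accounts for the entire text."""
    return re.fullmatch(pattern, text) is not None


# Character classes: digits, word characters, and a negated range.
digits = (whole(r"\d\d", "72"), whole(r"\d\d", "7a"))
words = (whole(r"\w\w\w\w", "Ab_2"), whole(r"\w\w\w\w", "Ab_@"))
not_lower = (whole(r"[^a-z]", "A"), whole(r"[^a-z]", "a"))

# Repetition: x{2,4} accepts two to four x characters, nothing shorter or longer.
counts = [whole(r"x{2,4}", "x" * n) for n in range(1, 6)]

# Positioning: with MULTILINE, ^ anchors at every line start; \A never does.
text = "first line\nsecond line"
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
string_start = re.findall(r"\A\w+", text, re.MULTILINE)

# \b requires a word boundary on both sides, so "liner" and "inline" are skipped.
whole_words = re.findall(r"\bline\b", "line liner inline line")

print(digits, words, not_lower)
print(counts)
print(line_starts, string_start, whole_words)
```

Using fullmatch in the helper makes the "matches xx but not xxx" style of claim testable, because an unanchored search would happily find xx inside xxx.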

4. Grouping characters

(): groups the characters matched by the pattern in parentheses. It is a capture group, which means the characters matched by the pattern are captured as a group, unless the ExplicitCapture option is set, in which case unnamed groups are not captured.

Example: the input string is ABC1DEF2XY.

A regular expression that matches 3 characters from A to Z followed by 1 digit: ([A-Z]{3})\d

There will be two matches: Match 1 = ABC1; Match 2 = DEF2. Each match has one group: the first group of Match 1 = ABC; the first group of Match 2 = DEF.
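
The capture group example can be reproduced with a minimal sketch. It assumes the pattern ([A-Z]{3})\d, with the digit outside the parentheses, and uses Python's re module rather than the .NET classes, so group(1) stands in for the first entry of a GroupCollection.

```python
import re

# Three uppercase letters are captured as group 1; the digit is matched but not captured.
pattern = re.compile(r"([A-Z]{3})\d")

# Each tuple is (whole match, first group).
matches = [(m.group(0), m.group(1)) for m in pattern.finditer("ABC1DEF2XY")]

for whole_match, first_group in matches:
    print(f"match={whole_match} group1={first_group}")
```

The trailing XY is ignored because it is neither three letters long nor followed by a digit.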

With a backreference, you can access the group by its number inside the regular expression, and from C# through the Group and GroupCollection classes. If the ExplicitCapture option is set, the content captured by the group is not available.

(?:): groups the characters matched by the pattern in parentheses. It is a non-capture group, which means the characters of the pattern are not captured as a group, but they still form part of the final match result. It is basically the same as the group type above with the ExplicitCapture option set.

Example: the input string is 1A BB SA1 C.

A regular expression that matches a digit or a letter from A to Z, followed by any word character: (?:\d|[A-Z])\w

It produces 3 matches: Match 1 = 1A; Match 2 = BB; Match 3 = SA. No group is captured.

(?<name>): groups the characters matched by the pattern in parentheses and names the group with the value specified in the angle brackets. In a regular expression, you can use the name for a backreference instead of a number. It is a capture group whether or not the ExplicitCapture option is set. This means that backreferences can use the characters matched by the group, and the group can be accessed through the Group class.

Example: the input string is: Characters in Seinfeld included Jerry Seinfeld, Elaine Benes, Cosmo Kramer and George Costanza

A regular expression that matches their names and captures the last name in a group called lastName: \b[A-Z][a-z]+ (?<lastName>[A-Z][a-z]+)\b

It produced four matches: First Match=Jerry Seinfeld; Second Match=Elaine Benes; Third Match=Cosmo Kramer; Fourth Match=George Costanza

Each match corresponds to a lastName group:

First match: lastName group=Seinfeld

Second match: lastName group=Benes

Third match: lastName group=Kramer

Fourth match: lastName group=Costanza
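
The non-capture and named group examples can be checked with a small sketch. It is written for Python's re module, which is an assumption: Python spells a named group (?P<lastName>...) where .NET uses (?<lastName>...), and the non-capture pattern is assumed to be (?:\d|[A-Z])\w.

```python
import re

# Non-capture group: alternation is grouped, but nothing is stored as a group.
non_capture = re.compile(r"(?:\d|[A-Z])\w")
pairs = non_capture.findall("1A BB SA1 C")
group_count = non_capture.groups  # number of capture groups in the pattern

# Named group: Python syntax is (?P<name>...), the .NET syntax is (?<name>...).
text = ("Characters in Seinfeld included Jerry Seinfeld, Elaine Benes, "
        "Cosmo Kramer and George Costanza")
named = re.compile(r"\b[A-Z][a-z]+ (?P<lastName>[A-Z][a-z]+)\b")
full_names = [m.group(0) for m in named.finditer(text)]
last_names = [m.group("lastName") for m in named.finditer(text)]

print(pairs, group_count)
print(full_names)
print(last_names)
```

"Seinfeld included" is not reported as a name because the second word starts with a lowercase letter.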

Regardless of whether the ExplicitCapture option is set, this group is captured.

(?=): positive lookahead assertion. The pattern specified in parentheses must appear immediately to the right of the assertion. This pattern does not form part of the final match.

Example: the regular expression is \S+(?=.NET) and the input string to match is: The languages were Java, C#.NET, VB.NET, C, JScript.NET, Pascal

The following matches will be generated: C#, VB and JScript.

(?!): negative lookahead assertion. The pattern specified in parentheses must not appear immediately to the right of the assertion. This pattern does not form part of the final match.

Example: the regular expression is \d{3}(?![A-Z]) and the input string to match is: 123A 456 789111C

The following matches will be generated: 456 and 789.

(?>): non-backtracking group. It prevents the regex engine from backtracking into the group, which can prevent a match.

Example: suppose you want to match text ending in "ing". The input string is: He was very trusting

The regular expression is: .*ing

It achieves one match, which ends with the word trusting. ".*" matches any characters, including, of course, "ing", so it first consumes the whole string. The regex engine then backtracks, stops at the second "t" of trusting, and matches the specified pattern "ing". However, if you disable backtracking: (?>.*)ing

It achieves 0 matches. ".*" consumes all the characters, including "ing", and cannot give them back, so nothing is left for "ing" to match and the match fails.

5. Decision characters

(?(regex)yes_regex|no_regex): if the expression regex matches, an attempt is made to match the expression yes; otherwise the expression no is tried. The expression no is optional. Note that the pattern used to make the decision has zero width, which means that the expression yes or no starts matching from the same position as the regex expression.

Example: the regular expression is (?(\d)\dA|[A-Z]B) and the input string to match is: 1A CB3A5C 3B

The matches it achieves are: 1A, CB and 3A.

(?(group name or number)yes_regex|no_regex): if the group has matched, an attempt is made to match the regular expression yes; otherwise the regular expression no is tried. The expression no is optional.

Example: the regular expression is (\d7)?-(?(1)\d\d[A-Z]|[A-Z][A-Z]) and the input string to match is: 77-77A 69-AA 57-B

The match it achieves is:

77-77A

-AA
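
The group-based decision can be reproduced with a short sketch, assuming the input string is 77-77A 69-AA 57-B. Python's re module is used as a stand-in; it supports the (?(1)yes|no) form with a group number or name, but not the form whose condition is itself an expression.

```python
import re

# If group 1 (a digit followed by 7) took part in the match, require two digits
# and a letter after the hyphen; otherwise require two letters.
pattern = re.compile(r"(\d7)?-(?(1)\d\d[A-Z]|[A-Z][A-Z])")

found = [m.group(0) for m in pattern.finditer("77-77A 69-AA 57-B")]
print(found)
```

The token 57-B yields nothing: with the group set, B is not two digits and a letter, and with the group skipped, a single B is not two letters.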

Note: the characters listed in the table above force the processor to make an if-else decision
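
Returning to the lookahead assertions (?=) and (?!) described in section 4, the sketch below checks both examples. It assumes the language list is written with a space after each comma, and it uses Python's re module, whose lookahead syntax is the same as the one shown in the article.

```python
import re

# Positive lookahead: keep the run of non-space characters that is followed by
# any character plus "NET"; the lookahead text itself is not part of the match.
languages = "The languages were Java, C#.NET, VB.NET, C, JScript.NET, Pascal"
before_dotnet = re.findall(r"\S+(?=.NET)", languages)

# Negative lookahead: three digits that are NOT followed by an uppercase letter.
not_before_letter = re.findall(r"\d{3}(?![A-Z])", "123A 456 789111C")

print(before_dotnet)
print(not_before_letter)
```

In the second example 123 and 111 are rejected because an uppercase letter follows them, while 789 is accepted because the next character is a digit.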

6. Substitution characters

$group: substitutes the group specified by the group number.

${name}: substitutes the last substring matched by the (?<name>) group.

$$: substitutes a literal $ character.

$&: substitutes the entire match.

$`: substitutes all the text of the input string before the match.

$': substitutes all the text of the input string after the match.

$+: substitutes the last captured group.

$_: substitutes the entire input string.

Note: the above are the commonly used substitution characters; the list is not complete.

7. Escape sequences

\\, \., \*, \+, \?, \|, \(, \), \{, \}, \^, \$: each matches the literal character that follows the backslash, that is "\", ".", "*", "+", "?", "|", "(", ")", "{", "}", "^" and "$".

\n: matches the newline character.

\r: matches the carriage return character.

\t: matches the tab character.

\v: matches the vertical tab character.

\f: matches the form feed character.

\nnn: matches the ASCII character specified by the octal number nnn. For example, \103 matches uppercase C.

\xnn: matches the ASCII character specified by the hexadecimal number nn. For example, \x43 matches uppercase C.

\unnnn: matches the Unicode character specified by the 4-digit hexadecimal number nnnn.

\cV: matches a control character. For example, \cV matches Ctrl-V.

8. Option flags

Flag: name

I: IgnoreCase

M: Multiline

N: ExplicitCapture

S: SingleLine

X: IgnorePatternWhitespace
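
The option flag letters listed above can be tried inline with a rough sketch. It uses Python's re module as a stand-in, which is an assumption: Python accepts the lowercase forms (?i), (?m), (?s) and (?x) at the start of a pattern, and has no counterpart of N (ExplicitCapture), where (?:...) has to be written by hand.

```python
import re

text = "First line\nsecond LINE"

# I / IgnoreCase: case-insensitive matching.
ignore_case = re.findall(r"(?i)line", text)

# M / Multiline: ^ matches at the start of every line.
multiline = re.findall(r"(?m)^\w+", text)

# S / SingleLine: "." also matches the newline (re.DOTALL in Python).
without_s = re.search(r"First.*LINE", text)
with_s = re.search(r"(?s)First.*LINE", text)

# X / IgnorePatternWhitespace: unescaped whitespace is ignored, # starts a comment.
verbose = re.findall(r"""(?x)
    \d{3}   # three digits
    -       # a literal hyphen
    \d{4}   # four digits
""", "call 555-1234 now")

print(ignore_case, multiline)
print(without_s, with_s is not None)
print(verbose)
```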

Note: the meaning of each option is shown in the following table:

IgnoreCase: makes pattern matching case-insensitive. By default, matching is case-sensitive.

RightToLeft: searches the input string from right to left. The default is left to right, which suits reading habits such as English, but not Arabic or Hebrew.

None: sets no flag. This is the default option.

Multiline: specifies that ^ and $ match the beginning and end of each line, as well as the beginning and end of the string. This means that each line separated by a newline character can be matched. However, the character "." still does not match the newline character.

SingleLine: specifies that the special character "." matches any character, including the newline character. By default, "." does not match newline characters. It is often used together with the Multiline option.

ECMAScript: ECMA (European Computer Manufacturers Association) has defined how regular expressions should be implemented, and this definition is part of the ECMAScript specification, the standards-based JavaScript. This option can only be used with the IgnoreCase and Multiline flags (the .NET documentation also allows Compiled). Used with any other flag, ECMAScript generates an exception.

IgnorePatternWhitespace: removes all unescaped white space characters from the regular expression pattern. It lets an expression span multiple lines of text, but you must make sure that every white space character the pattern needs is escaped. If this option is set, you can also use the "#" character to comment the expression.

Compiled: compiles the regular expression into code that is closer to machine code. This is fast, but the compiled expression cannot be changed afterwards.

9. Oracle's regular expressions

Test data:

CREATE TABLE test (mc VARCHAR2(60));
INSERT INTO test VALUES ('112233445566778899');
INSERT INTO test VALUES ('22113344 5566778899');
INSERT INTO test VALUES ('33112244 5566778899');
INSERT INTO test VALUES ('44112233 5566 778899');
INSERT INTO test VALUES ('5511 2233 4466778899');
INSERT INTO test VALUES ('661122334455778899');
INSERT INTO test VALUES ('771122334455668899');
INSERT INTO test VALUES ('881122334455667799');
INSERT INTO test VALUES ('991122334455667788');
INSERT INTO test VALUES ('aabbccddee');
INSERT INTO test VALUES ('bbaaaccddee');
INSERT INTO test VALUES ('ccabbddee');
INSERT INTO test VALUES ('ddaabbccee');
INSERT INTO test VALUES ('eeaabbccdd');
INSERT INTO test VALUES ('ab123');
INSERT INTO test VALUES ('123xy');
INSERT INTO test VALUES ('007ab');
INSERT INTO test VALUES ('abcxy');
INSERT INTO test VALUES ('The final test is is is how to find duplicate words.');
COMMIT;

Function descriptions

REGEXP_LIKE

Syntax: REGEXP_LIKE(source_string, pattern[, match_parameter])

Description: a condition that is true when source_string matches the pattern, so a query using it returns the rows that satisfy the pattern. It is equivalent to an enhanced LIKE.

source_string specifies the source character expression.

pattern specifies the regular expression.

match_parameter specifies a text literal that changes the default matching behavior.

The match_parameter parameter is optional. Examples:

SELECT * FROM test WHERE REGEXP_LIKE(mc, '^a{1,3}');
SELECT * FROM test WHERE REGEXP_LIKE(mc, 'a{1,3}');
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^a.*e$');
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^[[:lower:]]|[[:digit:]]');
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^[[:lower:]]');
SELECT mc FROM test WHERE REGEXP_LIKE(mc, '[^[:digit:]]');
SELECT mc FROM test WHERE REGEXP_LIKE(mc, '^[^[:digit:]]');
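
To see what such filters select without an Oracle instance, here is a rough Python analogue over a few of the test rows. It is only a sketch: the function regexp_like is a hypothetical stand-in, not Oracle's implementation, and because Python's re module has no POSIX classes, [0-9] replaces [[:digit:]].

```python
import re

# A subset of the test rows inserted above.
rows = ["aabbccddee", "bbaaaccddee", "ccabbddee", "ab123", "123xy", "007ab", "abcxy"]


def regexp_like(value: str, pattern: str) -> bool:
    """Hypothetical stand-in for REGEXP_LIKE: True if the pattern is found anywhere."""
    return re.search(pattern, value) is not None


starts_with_a = [r for r in rows if regexp_like(r, r"^a{1,3}")]
a_to_e = [r for r in rows if regexp_like(r, r"^a.*e$")]
has_non_digit = [r for r in rows if regexp_like(r, r"[^0-9]")]
starts_non_digit = [r for r in rows if regexp_like(r, r"^[^0-9]")]

print(starts_with_a)
print(a_to_e)
print(len(has_non_digit), starts_non_digit)
```

The last two filters show why the position of ^ matters: outside the brackets it anchors the match to the start of the string, inside the brackets it negates the character set.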

REGEXP_INSTR

Syntax: REGEXP_INSTR(source_string, pattern[, start_position[, occurrence[, return_option[, match_parameter]]]])

Description: this function searches for pattern and returns the position of its first occurrence. You can optionally specify the start_position at which the search should begin.

The occurrence parameter defaults to 1; set it to a higher value to find a later occurrence of the pattern.

The default value of return_option is 0, which returns the starting position of the match; a value of 1 returns the position of the character that follows the match. Examples:

SELECT REGEXP_INSTR(mc, '[[:digit:]]$') FROM test;
SELECT REGEXP_INSTR(mc, '[[:digit:]]+$') FROM test;
SELECT REGEXP_INSTR('The price is $400.', '\$[[:digit:]]+') FROM DUAL;
SELECT REGEXP_INSTR('onetwothree', '[^[:lower:]]') FROM DUAL;
SELECT REGEXP_INSTR(',', '[^,]*') FROM DUAL;
SELECT REGEXP_INSTR(',', '[^,]') FROM DUAL;
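
The position arithmetic can be illustrated with a small Python sketch. The function regexp_instr below is a hypothetical approximation that covers only the default start position and occurrence; it assumes 1-based positions with 0 meaning "no match", and uses [0-9] and [a-z] in place of the POSIX classes.

```python
import re


def regexp_instr(source: str, pattern: str, return_option: int = 0) -> int:
    """Hypothetical stand-in for REGEXP_INSTR (first occurrence, search from position 1)."""
    m = re.search(pattern, source)
    if m is None:
        return 0  # no match
    # return_option 0: position of the match; 1: position of the character after it.
    zero_based = m.end() if return_option else m.start()
    return zero_based + 1


last_digit = regexp_instr("ab123", r"[0-9]$")
trailing_digits = regexp_instr("ab123", r"[0-9]+$")
price_start = regexp_instr("The price is $400.", r"\$[0-9]+")
price_after = regexp_instr("The price is $400.", r"\$[0-9]+", return_option=1)
no_match = regexp_instr("onetwothree", r"[^a-z]")

print(last_digit, trailing_digits, price_start, price_after, no_match)
```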

REGEXP_SUBSTR

Syntax: REGEXP_SUBSTR(source_string, pattern[, position[, occurrence[, match_parameter]]])

Description: returns the substring that matches the pattern. It is equivalent to an enhanced SUBSTR.

source_string specifies the source character expression.

pattern specifies the regular expression.

position specifies the position at which the search starts.

occurrence specifies which occurrence of the pattern to return.

match_parameter specifies a text literal that changes the default matching behavior.

The position, occurrence and match_parameter parameters are optional.

The values of match_parameter are as follows:

'c' makes the match case-sensitive (default).

'i' makes the match case-insensitive.

'n' allows the period (.), the operator that matches any character, to match the newline character.

'm' treats the source string as a string containing multiple lines.

Examples:

SELECT REGEXP_SUBSTR(mc, '[a-z]+') FROM test;
SELECT REGEXP_SUBSTR(mc, '[0-9]+') FROM test;
SELECT REGEXP_SUBSTR('aababcde', '^a.*b') FROM DUAL;
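
A rough Python analogue shows what these queries extract. The function regexp_substr is a hypothetical stand-in limited to the first occurrence, and re.IGNORECASE is assumed to play the role of the 'i' match parameter.

```python
import re
from typing import Optional


def regexp_substr(source: str, pattern: str, flags: int = 0) -> Optional[str]:
    """Hypothetical stand-in for REGEXP_SUBSTR: first matching substring, or None."""
    m = re.search(pattern, source, flags)
    return m.group(0) if m else None


letters = regexp_substr("ab123", r"[a-z]+")
numbers = regexp_substr("ab123", r"[0-9]+")

# ".*" is greedy, so the match runs to the LAST b that still lets the pattern succeed.
greedy = regexp_substr("aababcde", r"^a.*b")

# Case sensitivity is the default; the 'i' behaviour is approximated by IGNORECASE.
case_sensitive = regexp_substr("AB123", r"[a-z]+")
case_insensitive = regexp_substr("AB123", r"[a-z]+", re.IGNORECASE)

print(letters, numbers, greedy, case_sensitive, case_insensitive)
```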

REGEXP_REPLACE

Syntax: REGEXP_REPLACE(source_string, pattern[, replace_string[, position[, occurrence[, match_parameter]]]])

Description: string replacement function. It is equivalent to an enhanced REPLACE.

source_string specifies the source character expression.

pattern specifies the regular expression.

replace_string specifies the string to use as the replacement.

position specifies the position at which the search starts.

occurrence specifies which occurrence of the pattern is replaced.

match_parameter specifies a text literal that changes the default matching behavior.

The replace_string, position, occurrence and match_parameter parameters are optional. Examples:

SELECT REGEXP_REPLACE('Joe   Smith', '( ){2,}', ',') AS RX_REPLACE FROM DUAL;
SELECT REGEXP_REPLACE('aa bb cc', '(.*) (.*) (.*)', '\3, \2, \1') FROM DUAL;
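
The two replacements can be sketched in Python with re.sub. This assumes the first input contains a run of several spaces and that the second replacement string uses the backreferences \3, \2 and \1; Python happens to use the same backslash-number syntax in replacement strings.

```python
import re

# Collapse a run of two or more spaces into a single comma.
collapsed = re.sub(r"( ){2,}", ",", "Joe   Smith")

# Capture three space-separated parts and emit them in reverse order.
reordered = re.sub(r"(.*) (.*) (.*)", r"\3, \2, \1", "aa bb cc")

print(collapsed)
print(reordered)
```

Because the input has exactly two spaces, each greedy (.*) is forced to take exactly one of the three words.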

Special characters:

'^' matches the start of the input string. Used inside a bracket expression, it means that the listed characters are not accepted.

'$' matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches before '\n' or '\r'.

'.' matches any single character except the newline character \n.

'?' matches the preceding subexpression zero or one time.

'*' matches the preceding subexpression zero or more times.

'+' matches the preceding subexpression one or more times.

'()' marks the start and end position of a subexpression.

'[]' marks a bracket expression.

'{m,n}' an exact range of occurrences, m =
