
How to learn regular expressions

2025-07-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces the basics of regular expressions and is meant as a reference for readers who are learning them.

To begin with, ^ and $ match the beginning and the end of a string, respectively. Here are some examples.

"^The": matches a string that begins with "The"

"of despair$": matches a string that ends with "of despair"

"^abc$": matches a string that begins and ends with "abc"; in practice, only "abc" itself

"notice": matches any string that contains "notice"

As the last example shows, if you use neither of these two characters, the pattern (regular expression) can match anywhere in the string being checked; it is not anchored to either end.
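The anchoring behavior can be tried out with a short script. This is a minimal sketch, assuming Python's built-in re module (the article does not name an engine); the helper name matches is made up for illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in text."""
    # re.search scans the whole string, so only ^ and $ pin a match to the ends.
    return re.search(pattern, text) is not None

print(matches(r"^The", "The end"))                   # True
print(matches(r"^The", "At The end"))                # False
print(matches(r"of despair$", "depths of despair"))  # True
print(matches(r"^abc$", "abc"))                      # True
print(matches(r"^abc$", "abcabc"))                   # False
print(matches(r"notice", "please take notice now"))  # True
```

Using re.search rather than re.match is deliberate here: re.match always starts at the beginning of the string, which would hide the effect of ^.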

Next come '*', '+', and '?'.

They specify how many times the preceding character may appear. They mean, respectively:

"zero or more", equivalent to {0,}

"one or more", equivalent to {1,}

"zero or one", equivalent to {0,1}

Here are some examples:

"ab*": same as "ab{0,}"; matches an a followed by zero or more b ("a", "ab", "abbb", etc.)

"ab+": same as "ab{1,}"; as above, but at least one b is required ("ab", "abbb", etc.)

"ab?": same as "ab{0,1}"; matches an a followed by zero or one b

"a?b+$": matches a string that ends with zero or one a followed by one or more b

The main point: '*', '+', and '?' apply only to the character immediately in front of them.
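To see which strings each quantifier accepts, here is a small sketch, again assuming Python's re module. The helper full is hypothetical; it uses re.fullmatch so that the whole string must fit the pattern.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

for pattern in (r"ab*", r"ab+", r"ab?"):
    accepted = [s for s in ("a", "ab", "abbb") if full(pattern, s)]
    print(pattern, accepted)
# ab* ['a', 'ab', 'abbb']
# ab+ ['ab', 'abbb']
# ab? ['a', 'ab']

# The quantifier binds only to the character right before it.
print(full(r"ab+", "abab"))  # False: + repeats b, not "ab"
```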

You can also set the number of repetitions explicitly in curly braces, for example:

"ab{2}": a followed by exactly two b ("abb")

"ab{2,}": a followed by two or more b ("abb", "abbbb", etc.)

"ab{3,5}": a followed by three to five b ("abbb", "abbbb", or "abbbbb")

Now let's group several characters in parentheses, so that a quantifier applies to the whole group:

"a(bc)*": matches a followed by zero or more "bc"

"a(bc){1,5}": matches a followed by one to five "bc"

There is also the character '|', which acts as an OR operator:

"hi|hello": matches a string containing "hi" or "hello"

"(b|cd)ef": matches a string containing "bef" or "cdef"

"(a|b)*c": matches a string containing any number (including zero) of a or b, followed by a c
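The alternation examples can be checked as follows. This is an illustrative sketch assuming Python's re module; the helper names contains and full are not from the article.

```python
import re

def contains(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

print(contains(r"hi|hello", "say hello"))  # True
print(contains(r"(b|cd)ef", "xcdef"))      # True
print(contains(r"(b|cd)ef", "cef"))        # False: neither "bef" nor "cdef"
print(full(r"(a|b)*c", "c"))               # True: zero repetitions are allowed
print(full(r"(a|b)*c", "abbac"))           # True
print(full(r"(a|b)*c", "abd"))             # False
```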

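A dot ('.') stands for any single character except "\n".

What if you want to match any single character, including "\n"?

Note that '[\n.]' does not do this in most engines, because a dot inside square brackets is a literal dot. Use a class such as '[\s\S]', or the engine's "dot matches newline" option, instead.

"a.[0-9]": an a, then any one character, then a digit from 0 to 9

"^.{3}$": a string of exactly three characters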

The content enclosed in square brackets matches only a single character:

"[ab]": matches a single a or b (same as "a|b")

"[a-d]": matches a single character from 'a' to 'd' (same effect as "a|b|c|d" and "[abcd]"); [a-zA-Z] is the usual way to specify any uppercase or lowercase English letter

"^[a-zA-Z]": matches a string that begins with a letter

"[0-9]%": matches a string containing a digit followed by a percent sign

",[a-zA-Z0-9]$": matches a string ending with a comma followed by a digit or letter

You can also list the characters you don't want in square brackets: just put '^' as the first character inside the brackets.
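A quick sketch of character classes and negated classes, assuming Python's re module. The helper contains is hypothetical, and the last pattern, "%[^a-zA-Z]%", is the negated-class example given further down in this article.

```python
import re

def contains(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

print(contains(r"^[a-zA-Z]", "hello1"))      # True
print(contains(r"^[a-zA-Z]", "1hello"))      # False
print(contains(r"[0-9]%", "up 5% today"))    # True
print(contains(r",[a-zA-Z0-9]$", "a,b"))     # True
print(contains(r",[a-zA-Z0-9]$", "a,b,"))    # False
# A leading ^ inside the brackets negates the class.
print(contains(r"%[^a-zA-Z]%", "%1%"))       # True
print(contains(r"%[^a-zA-Z]%", "%a%"))       # False
```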

^ and $ match the beginning and the end of the string, respectively.

Example 1: a pattern that starts with ^ matches only strings that begin with the text that follows it.

Example 2: a pattern that ends with $ matches only strings that end with the text that precedes it.

Example 3: "^abc$" matches a string that begins and ends with abc; in fact, only "abc" matches.

Example 4: "abc", with neither symbol, matches any string containing abc.

*, + and ? specify how many times the preceding character may appear. Respectively:

Example 1: * is the same as {0,}. "ab{0,}" matches an a followed by zero or more b ("a", "ab", "abb", "abbbbbbbbbbbbbbbbb", with no upper limit).

Example 2: + is the same as {1,}. "ab{1,}" matches an a followed by one or more b ("ab", "abb", "abbbbbbbbbbbbbbbbb", with no upper limit).

Example 3: ? is the same as {0,1}. "ab{0,1}" matches an a followed by zero or one b ("a", "ab").

Example 4: "a{0,1}b$" matches a string that ends with zero or one a followed by a b ("b", "ab").

Note (two ways of writing the same pattern):

"ab{0,}" can also be written as "ab*"

"ab{1,}" can also be written as "ab+"

"ab{0,1}" can also be written as "ab?"

"a{0,1}b$" can also be written as "a?b$"

(1) Main points

1. '*', '+', and '?' only control how many times the character in front of them appears.

2. {n,m} means from n to m times; {0} means zero times.

3. The numbers inside {} cannot be negative.

(2) The number of repetitions can be set explicitly

Example 5: "ab{2}" requires a to be followed by exactly two b ("abb").

Example 6: "ab{2,}" requires a to be followed by two or more b ("abb", "abbbb", etc.).

Example 7: "ab{3,5}" requires a to be followed by three to five b ("abbb", "abbbb", or "abbbbb").
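The three brace forms can be compared side by side. A minimal sketch assuming Python's re module; the helper accepted and the candidate list are made up for illustration.

```python
import re

def accepted(pattern: str, candidates: list[str]) -> list[str]:
    """Hypothetical helper: the candidates that fit the pattern in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

# One a followed by 1, 2, 3, 5 and 6 b respectively.
candidates = ["ab", "abb", "abbb", "abbbbb", "abbbbbb"]

print(accepted(r"ab{2}", candidates))    # ['abb']
print(accepted(r"ab{2,}", candidates))   # ['abb', 'abbb', 'abbbbb', 'abbbbbb']
print(accepted(r"ab{3,5}", candidates))  # ['abbb', 'abbbbb']
```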

(3) Quantifying several characters with ()

Example 8: "a(bc)*" matches a followed by zero or more "bc"; you can also write "a(bc){0,}".

Example 9: "a(bc){1,5}" matches a followed by one to five "bc".
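The effect of the parentheses is easiest to see next to the same pattern without them. This is a sketch assuming Python's re module, with a hypothetical helper full.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

print(full(r"a(bc)*", "a"))                    # True: zero "bc" groups
print(full(r"a(bc)*", "abcbc"))                # True: two "bc" groups
print(full(r"abc*", "abcbc"))                  # False: without (), * repeats only c
print(full(r"a(bc){1,5}", "a"))                # False: at least one "bc" is needed
print(full(r"a(bc){1,5}", "a" + "bc" * 5))     # True
print(full(r"a(bc){1,5}", "a" + "bc" * 6))     # False: six is over the limit
```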

| is equivalent to OR and separates two or more alternatives

Example 1: "A|B" matches a string containing "A" or "B"

Example 2: "(A|B)C" matches a string containing "AC" or "BC"

Example 3: "(A|B)*C" matches a string containing any number (including zero) of A or B, followed by a C

. can represent any single character

. does not match "\n". A dot inside square brackets is a literal dot in most engines, so "[\n.]" matches only a newline or a dot; to match any character including a newline, use a class such as "[\s\S]".

Example 1: "a.[0-9]" matches an a, then any one character, then a digit from 0 to 9

Example 2: "^.{3}$" matches a string of exactly three characters
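How the dot treats newlines depends on the engine. The sketch below assumes Python's re module, where re.DOTALL is the flag that lets '.' match "\n"; other engines name this option differently.

```python
import re

print(bool(re.fullmatch(r"a.[0-9]", "ax7")))       # True
print(bool(re.search(r"^.{3}$", "abc")))           # True
print(bool(re.search(r"^.{3}$", "abcd")))          # False: four characters

# By default the dot stops at a newline.
print(bool(re.search(r"a.b", "a\nb")))             # False
print(bool(re.search(r"a.b", "a\nb", re.DOTALL)))  # True
print(bool(re.search(r"a[\s\S]b", "a\nb")))        # True

# Inside brackets the dot is literal, so [\n.] is not "any character".
print(bool(re.fullmatch(r"[\n.]", "x")))           # False
print(bool(re.fullmatch(r"[\n.]", ".")))           # True
```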

The content enclosed in square brackets, as in '[ab]', matches only a single character

Example 1: "[ab]" matches a single a or b (same as "a|b")

Example 2: "[a-d]" matches a single character from 'a' to 'd' (same effect as "a|b|c|d" and "[abcd]"); [a-zA-Z] is the usual way to specify any uppercase or lowercase English letter.

Example 3: "^[a-zA-Z]" matches a string that begins with a letter

Example 4: "[0-9]%" matches a string containing a digit followed by a percent sign

Example 5: ",[a-zA-Z0-9]$" matches a string ending with a comma followed by a digit or letter

Example 6: "%[^a-zA-Z]%" matches a string containing two percent signs with a non-letter between them.

You can also list the characters you don't want in square brackets: just put '^' as the first character inside the brackets.

Point 1: ^[content]: a ^ outside, in front of [], means the string begins with one of the listed characters.

Point 2: [^content]: a ^ as the first character inside [] excludes the listed characters (^ means "not").

Point 3: Inside square brackets, special characters lose their special meaning, so "[*\+?{}.]" matches a string containing any one of these characters. Brackets still match only a single character.

Point 4: To include a literal ']' in [], make it the first character in the list (after a '^', if there is one).

Point 5: To include a literal '-' in [], put it first or last, or make it the second end point of a range. How a '-' in the middle, as in [a-d-0-9], is treated can vary between engines.
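These bracket rules come from POSIX-style engines, and details differ elsewhere. The sketch below shows how Python's re module, used here only as one example engine, handles the same cases.

```python
import re

# Special characters are literal inside brackets.
print(re.findall(r"[*\+?{}.]", "a*b+c?"))       # ['*', '+', '?']

# A ']' placed first in the list is taken literally.
print(bool(re.fullmatch(r"[]a]", "]")))         # True
print(bool(re.fullmatch(r"[^]a]", "]")))        # False: ']' is excluded
print(bool(re.fullmatch(r"[^]a]", "b")))        # True

# A '-' placed first or last is taken literally.
print(bool(re.fullmatch(r"[a-d-]", "-")))       # True
print(bool(re.fullmatch(r"[-a-d]", "-")))       # True
print(bool(re.fullmatch(r"[a-d-]", "e")))       # False
```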

\b and \B: \b matches a word boundary; \B matches a non-word boundary

Example 1: "ve\b" matches the ve in love but not the ve in very

Example 2: "ov\B" matches the ov in love, because another word character follows it, but not an ov at the end of a word
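A short check of both boundary assertions, assuming Python's re module. Raw strings (r"...") matter here, because in a normal Python string "\b" would be a backspace character.

```python
import re

print(bool(re.search(r"ve\b", "love")))  # True: ve sits at the end of the word
print(bool(re.search(r"ve\b", "very")))  # False: ve is followed by r
print(bool(re.search(r"ve\B", "very")))  # True: no boundary after ve
print(bool(re.search(r"ve\B", "love")))  # False: a boundary follows ve
print(bool(re.search(r"ov\B", "love")))  # True: ov is followed by e
```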

\d and \D

Example 1: \d matches a digit character. Equivalent to [0-9].

Example 2: \D matches a non-digit character. Equivalent to [^0-9].

\w and \W

Example 1: \w matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.

Example 2: \W matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
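The shorthand classes can be tried like this. A sketch assuming Python's re module: for text strings, Python's \d and \w also match non-ASCII digits and letters by default, so the re.ASCII flag is what makes them equivalent to the bracket forms listed above.

```python
import re

print(re.findall(r"\d", "a1b22"))          # ['1', '2', '2']
print(re.findall(r"\D+", "a1b22"))         # ['a', 'b']
print(re.findall(r"\w+", "user_name-42"))  # ['user_name', '42']
print(re.findall(r"\W", "user_name-42"))   # ['-']

# With re.ASCII, \w is limited to [A-Za-z0-9_].
print(bool(re.fullmatch(r"\w", "é")))            # True
print(bool(re.fullmatch(r"\w", "é", re.ASCII)))  # False
```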

Matching non-printing characters

Character and meaning

\cx matches the control character indicated by x. For example, \cM matches Control-M, the carriage return. The value of x must be in A-Z or a-z; otherwise, c is treated as a literal 'c' character.

\f matches a form feed. Equivalent to \x0c and \cL.

\n matches a newline character. Equivalent to \x0a and \cJ.

\r matches a carriage return. Equivalent to \x0d and \cM.

\s matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].

\S matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t matches a tab. Equivalent to \x09 and \cI.

\v matches a vertical tab. Equivalent to \x0b and \cK.
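The whitespace escapes can be checked directly. This sketch assumes Python's re module and leaves out the \cx form, since not every engine supports it.

```python
import re

text = "a \t b\nc"

print(re.split(r"\s+", text))    # ['a', 'b', 'c']
print(re.findall(r"\S+", text))  # ['a', 'b', 'c']

# Each escape names one specific character.
print(bool(re.fullmatch(r"\f", "\x0c")))  # True: form feed
print(bool(re.fullmatch(r"\n", "\x0a")))  # True: newline
print(bool(re.fullmatch(r"\r", "\x0d")))  # True: carriage return
print(bool(re.fullmatch(r"\t", "\x09")))  # True: tab
print(bool(re.fullmatch(r"\v", "\x0b")))  # True: vertical tab

# \s covers all of them, plus the plain space.
print(all(re.fullmatch(r"\s", ch) for ch in " \f\n\r\t\v"))  # True
```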

Examples

Regular expression that matches leading and trailing whitespace: ^\s*|\s*$

Regular expression that matches an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*

Regular expression that matches a URL: [a-zA-Z]+://[^\s]*

Check whether an account name is valid (must start with a letter, 5-16 characters, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$

Match a Chinese domestic phone number: \d{3}-\d{8}|\d{4}-\d{7} (matches forms such as 0511-4405222 or 021-87888822)

Match a Tencent QQ number: [1-9][0-9]{4,} (QQ numbers start from 10000)

Match a Chinese postal code: [1-9]\d{5}(?!\d) (Chinese postal codes are 6 digits)

Match an ID card number: \d{15}|\d{18} (Chinese ID card numbers are 15 or 18 digits)

Match an IP address: \d+\.\d+\.\d+\.\d+ (useful when extracting IP addresses)
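A few of these patterns are exercised below as a rough sketch, assuming Python's re module. The constant names and sample inputs are illustrative only, and the patterns check the shape of the text rather than its validity (the IP pattern, for instance, would also accept 999.999.999.999).

```python
import re

# Hypothetical names for patterns taken from the list above.
TRIM = r"^\s*|\s*$"
ACCOUNT = r"^[a-zA-Z][a-zA-Z0-9_]{4,15}$"
PHONE = r"\d{3}-\d{8}|\d{4}-\d{7}"
POSTAL = r"[1-9]\d{5}(?!\d)"
IP = r"\d+\.\d+\.\d+\.\d+"

print(repr(re.sub(TRIM, "", "  hi  ")))            # 'hi'
print(bool(re.search(ACCOUNT, "alice_01")))        # True
print(bool(re.search(ACCOUNT, "1alice")))          # False: starts with a digit
print(bool(re.search(ACCOUNT, "abcd")))            # False: only 4 characters
print(bool(re.fullmatch(PHONE, "0511-4405222")))   # True
print(bool(re.fullmatch(PHONE, "021-87888822")))   # True
print(bool(re.search(POSTAL, "100080")))           # True
print(bool(re.search(POSTAL, "010000")))           # False: leading zero
print(re.findall(IP, "from 192.168.0.1 to 10.0.0.255"))
# ['192.168.0.1', '10.0.0.255']
```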

^ $ // start and end of the string

+ // one or more consecutive occurrences (that is, {1,})

-? // an optional minus sign, for negative and non-negative numbers (that is, -{0,1})

[0-9]* // zero or more digits (that is, [0-9]{0,})

\.? // an optional decimal point

[^...] // any character not listed inside the brackets

[a-z] // matches any lowercase letter

[A-Z] // matches any uppercase letter

[a-zA-Z] // matches any letter

[0-9] // matches any digit from 0 to 9

[0-9\.\-] // matches any digit, period, or minus sign

^[a-zA-Z0-9_]+$ // all strings made up of one or more letters, digits, or underscores, for example aA0_A001a_

^[0-9]+$ // all non-negative integers, for example 345500687008099900999

^-?[0-9]+$ // all integers (negative and non-negative), for example -43443 or 43443

^-?[0-9]*\.?[0-9]*$ // all decimals (positive or negative, with any number of digits before and after the decimal point), for example -10.00 or 100000.0000

Without a decimal point there can be no digits after it, so the pattern uses \.? to make the point optional. Arguably the ? is superfluous here.

This pattern is meant specifically for decimals, and a number without a decimal point can hardly be called a decimal.

[^a-z] // all characters except lowercase letters

[^\/\^] // all characters except "/" and "^"

[^\"\'] // all characters except double quotation marks (") and single quotation marks (')
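The number patterns above can be tested as follows, assuming Python's re module. Note that the decimal pattern is very permissive, since every part of it is optional; STRICT_DECIMAL is a stricter variant added here purely as an illustration and is not from the article.

```python
import re

NON_NEGATIVE = r"^[0-9]+$"
INTEGER = r"^-?[0-9]+$"
DECIMAL = r"^-?[0-9]*\.?[0-9]*$"
STRICT_DECIMAL = r"^-?[0-9]+\.[0-9]+$"  # assumption: digits required on both sides

print(bool(re.search(NON_NEGATIVE, "345")))    # True
print(bool(re.search(NON_NEGATIVE, "-345")))   # False
print(bool(re.search(INTEGER, "-43443")))      # True
print(bool(re.search(DECIMAL, "-10.00")))      # True
print(bool(re.search(DECIMAL, "1.2.3")))       # False
print(bool(re.search(DECIMAL, "")))            # True: everything is optional
print(bool(re.search(STRICT_DECIMAL, "")))     # False
print(bool(re.search(STRICT_DECIMAL, "100000.0000")))  # True

# A negated class is handy for stripping unwanted characters.
print(re.sub(r"[^a-z]", "", "aB3c!"))          # ac
```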

Thank you for reading this article carefully. I hope the article "how to learn regular expressions" shared by the editor will be helpful to everyone. At the same time, I also hope that you will support and pay attention to the industry information channel. More related knowledge is waiting for you to learn!
