Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What are the common knowledge points of regular expressions?

2025-02-23 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/02 Report--

Xiaobian to share with you what are the common regular expression knowledge points, I believe that most people do not know much, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!

1. Regularities are just used to process strings: matching, capturing

Match: verify that the current string conforms to our rules (each rule is a rule)

Capture: in the whole string, the characters that conform to the rules are obtained in turn-> exec, match, replace.

2. Composition of regularities: metacharacters and modifiers

Metacharacters:

Metacharacters with special meaning:

\ d matching a 0-9 number is equivalent to [0-9], opposite to it.

\ d matching any character except 0-9 is equivalent to []

\ w matches a number or character of 0-9, Amurz, Amurz, which is equivalent to [0-9aMurz Amurz],

\ s matches a white space character (space, tab. )

\ b matches the boundary of a word "w100 w000"

\ t match a tab

\ nmatch a newline

. Match any character except\ n

^ begins with a metacharacter

$ends with a metacharacter

\ translated characters

X | y x or one of y

[xyz] any one of x, y, z,

[^ xyz] except for any one of x, y, z

[amurz]-> matches any character in amurz

[^ amurz]-> match except for any character in amurz

Grouping in () regularization

Quantifier:

* 0 to multiple

+ 1 to multiple

? 0 to 1

? It has more meaning in the rule.

Putting it after a non-quantifier metacharacter represents 0-1 occurrences such as / ^\ dwords / 0-9 direct numbers 0-1.

Put it after a quantifier metacharacter, cancel the greed of capture / ^\ dwords capture / capture only the first captured number to get "2015"-> 2

(?:) packet value matching is not captured

(?) forward pre-check

(?) Negative pre-check

The role of ()

1) change the default priority

2) packet capture can be carried out

3) grouping references

{n} appear n times

{n,} appear n to multiple times

{n ~ m} appears n to m times.

Ordinary metacharacter

Any character in the rule is an ordinary metacharacter that represents its own meaning except that the above has a special meaning.

Modifier:

I: ignore the case of letters

M:multiline multiline matching

G:global global matching

Regularities that are often used in projects

1) the judgment is the regularity of the valid number.

Significant numbers are positive, negative, zero, and decimal

Part one: there may be addition or subtraction or no

The second part: one digit can be 0, and multiple digits cannot start with 0.

The third part: there can be decimals or no decimals, but once there is a decimal point, it is followed by at least one digit.

Var reg = / ^ [+ -]? (\ d | [1-9]\ d +) (\.\ d +)? $/

Valid positive integer (including 0): / ^ [+]? (\ d | [1-9]\ d +) $/

Valid negative integer (including 0): / ^-(\ d | [1-9]\ d +) $/

Determine the mobile phone number (simple version):

Var reg=/ ^ 1\ d {10} $/

Judge mailbox

Part I: numbers, letters, underscores,-one to many digits

Part two: @

Part III: numbers, letters, one or more digits

Part IV: (. Two to four) .com .cn .net.. .com.cn

Var reg = / ^ [0-9a muraza Amurz Zhuan -] + @ [0-9a murz Amurz Z -] + (\. [a-zA-Z] {2jue 4}) {1jue 2} $/

Judge those between the ages of 18 and 65

18-19, 20-59, 60-65.

Var reg = / ^ ((18 | 19) | ([2-5]\ d) | (6 [0-5])) $/

True and valid 2-4-digit Chinese characters of the names of the people's Republic of China

Var reg = / ^ [\ u4e00 -\ u9fa5] {2pr 4} $/

ID number

The top six are province-> city-> county (district)

Four years, two months, two days.

Simple version

Var reg = / ^\ d {17} (\ d | X) $/

130828199012040617

Complex version

Var reg = / ^ (\ d {2}) (\ d {4}) (\ d {4}) (\ d {2}) (\ d {2}) (?:\ d {2}) (\ d) (?:\ d | X) $/

Detailed knowledge points

Any character that appears in it represents its own meaning, for example: [.] In the "." It represents a decimal point instead of any character except\ n.

18 is not the number 18 but 1 or 8. For example, [18-65] is either 1 or 8-6 or 5.

1. Exec regular capture method-> match first, and then capture the matching content

If the string does not match this rule, the captured return result is null

If it matches the regular, the result returned is an array

Examples

Var str = "2015zhufeng2016peixun"

Var reg = /\ dwells /

The first item is what we captured.

Index: the index position where the captured content begins in the meta string

Input: the original string captured

2. Regular capture is lazy

Every regular capture starts with the lastIndex value. On the first capture, lastIndex=0 looks for the capture from the location where the original string index is 0, but by default, when the first capture is completed, the value of lastIndex does not change, or 0, so the second capture starts from the original string index of 0, so the content captured for the first time is found.

Solution: after adding the global modifier g-- > plus g, after the first capture is completed, the value of lastIndex changes to the starting index of the first character after the first capture, and the second capture continues to look backwards.

Question: is it not possible to manually modify the value of lastIndex after each capture without the global modifier g?

No, although the lastIndex has been manually modified and the value of lastIndex has been changed, the regular search starts with index 0.

Var str = "zhufeng2015peixun2016"; var reg = /\ dhammer _ g

Examples

In order to prevent the endless loop caused by the absence of the global modifier g, we manually add a g to those that do not add g before processing

RegExp.prototype.myExecAll = function myExecAll () {var _ this = this, str = arguments [0], ary = [], res = null;! _ this.global? _ this = eval (_ this.toString () + "g"): null; res = _ this.exec (str); while (res) {ary [ary.length] = res [0]; res = _ this.exec (str);} return ary;} Var ary = reg.myExecAll (str); console.log (ary); console.log (reg.lastIndex); / /-> 0 var res = reg.exec (str); console.log (res); console.log (reg.lastIndex); / /-> 11 res = reg.exec (str); console.log (res); console.log (reg.lastIndex); / /-> 21 res = reg.exec (str); console.log (res); / /-> null

3. Match: a method called match in the capture string can also capture, and as long as we eliminate the regular laziness, we can capture everything by executing the match method once.

Var str = "zhufeng2015peixun2016"; var reg = /\ dhammer G; console.log (str.match (reg))

Question: how nice would it be that we all replace exec with match?

4. Regular packet capture

In each capture, not only the content of the large regular match can be captured, but also the content of each small packet (sub-regular) can be captured separately.

Var str = "zhufeng [2015] peixun [2016]"; var reg = /\ [(\ d) (\ d+)\] / g; var res = reg.exec (str); console.log (res); ["[2015]", "2", "015", index: 7, input: "zhufeng [2015] peixun [2016]"]

The first item is the content captured by the big regular res [0]

The second item is the first packet captured content res [1]

The third item is the content captured by the second packet rex [2]

.

The matching of the grouping is only matched but not captured: if we perform matching the contents of the grouping without capturing it, we only need to add?: in front of the grouping.

Var str = "zhufeng [2015] peixun [2016]"; var reg = /\ [(?:\ d) (\ d+)\] / g; var res = reg.exec (str); console.log (res); ["[2015]", "015".]

The first item in the array is the content captured by the big regular res [0]

The second item in the array is the content captured by the second packet res [1]

The first grouping is added?:, so only match but not capture

5. The difference between exec and match

Match can only capture the content of large regular matching, but it is impossible to obtain the content of packet matching in packet capture, so if you do not need to capture the content of the packet, it is more convenient for us to use match directly. If you need to capture the content of the packet, we can only use exec to capture it one by one.

Var str = "zhufeng [2015] peixun [2016]"; var reg = /\ [(\ d+)\] / g; / / console.log (str.match (reg)); / /-> ["[2015]", "[2016]"] var ary = []; var res = reg.exec (str); while (res) {/ / ary.push (res [1]); ary.push (RegExp.$1) / / RegExp.$1 gets the content of the current regular first packet capture (may not be captured in some IE browsers) res = reg.exec (str);} console.log (ary)

6. Regular greed: in each capture, it is always captured according to the longest result of regular matching.

Var str = "zhufeng2015peixun2016"; var reg = /\ drabbit str g; console.log (reg.myExecAll (str)); / /-- > ["2015", "2016"] var str = "zhufeng2015peixun2016"; var reg = /\ dhammer Universe g; console.log (reg.myExecAll (str)); / /-- > ["2", "0", "1", "5", "2", "0", "1", "6"]

7. Group reference

\ 2 represents exactly the same content as the second group.

\ 1 represents exactly the same content as the first group.

Var reg=/ ^ (\ w) (\ w)\ 2\ 1 $/; "woow", "1221"

8. String method-> replace: replace a character in a string with new content

1) without using regularization

Only one replace can be replaced at a time, and multiple replacements are also required.

Var str = "zhufeng2015 zhufeng2016"; "zhufeng"-> "Everest" str = str.replace ("zhufeng", "Mount Qomolangma"). Replace ("zhufeng", "Mount Qomolangma")

Sometimes a replacement cannot be achieved even if it is executed multiple times.

"zhufeng"-> zhufengpeixun "str = str.replace (" zhufeng "," zhufengpeixun ") .replace (" zhufeng "," zhufengpeixun ")

[the first parameter can be a regular] replace everything that matches the regular (but it is lazy by default, just like capture, only with the global modifier g)

Var str = "zhufeng2015 zhufeng2016"; str = str.replace (/ zhufeng/g, "zhufengpeixun"); console.log (str)

1) execution and number of execution

In fact, the principle is exactly the same as that of exec capture.

For example, if our second argument passes a function, the current function is executed every time the rule captures the current function in the string-> twice in this question, so the function is executed twice.

Var str = "zhufeng2015 zhufeng2016"; str = str.replace (/ zhufeng/g, function () {

2) Parameter problem

Console.dir (arguments)

Not only execute the function, but also pass parameters to our function, and the parameters passed are exactly the same as those captured by each exec

If this is the first exec capture-> ["zhufeng", index:0,input: "original string"]

Execute the parameters in the function for the first time

Arguments [0]-> "zhufeng" / *

Arguments [1]-> 0 is equivalent to the index location captured by index in exec

Arguments [2]-> "original string" is equivalent to input in exec

3) return value problem

What return returns is equivalent to replacing the current capture with something.

Return "zhufengpeixun";}); console.log (str); these are all the contents of the article "what are the knowledge points of common regular expressions?" Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report