2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article covers "what are the knowledge points of javascript regular expressions". Many people get stuck on these points when working on real cases, so let's walk through how to handle them. Read it carefully and you should come away with something useful!
Regular expression basics
1. Metacharacters
Everything is built from basic elements, and regular expressions are no exception. Metacharacters are the basic building blocks of a regular expression.
Let's first memorize a few commonly used metacharacters:
Metacharacter   Description
.    matches any character except a newline
\w   matches a letter, digit, or underscore (ASCII only by default in Java and JavaScript)
\s   matches any whitespace character
\d   matches a digit
\b   matches the beginning or end of a word
^    matches the beginning of a string
$    matches the end of a string
With these metacharacters we can already write some simple regular expressions.
For example:
Match a string that begins with abc:
\babc or ^abc
Match an 8-digit QQ number:
^\d\d\d\d\d\d\d\d$
Match an 11-digit mobile phone number that starts with 1:
^1\d\d\d\d\d\d\d\d\d\d$
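As a quick check of the three patterns above, here is a minimal Java sketch. The class name MetacharDemo, the helper found, and the sample inputs are made up for illustration. Note that inside a Java string literal every backslash has to be doubled.

```java
import java.util.regex.Pattern;

class MetacharDemo {
    // \b is a word boundary, so this finds a word that starts with "abc".
    static final Pattern STARTS_WITH_ABC = Pattern.compile("\\babc");
    // Exactly eight digits between the start and the end of the string.
    static final Pattern QQ_8_DIGITS = Pattern.compile("^\\d\\d\\d\\d\\d\\d\\d\\d$");
    // A leading 1 followed by ten more digits (eleven digits in total).
    static final Pattern MOBILE_11_DIGITS =
            Pattern.compile("^1\\d\\d\\d\\d\\d\\d\\d\\d\\d\\d$");

    // Hypothetical helper: true if the pattern is found anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found(STARTS_WITH_ABC, "abcdef"));        // true
        System.out.println(found(STARTS_WITH_ABC, "xabc"));          // false
        System.out.println(found(QQ_8_DIGITS, "12345678"));          // true
        System.out.println(found(QQ_8_DIGITS, "1234567"));           // false
        System.out.println(found(MOBILE_11_DIGITS, "13812345678"));  // true
    }
}
```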
2. Repetition quantifiers
With metacharacters you can write a lot of regular expressions, but if you look closely you may notice that other people's expressions are short and clear, while your own are a mess of repeated metacharacters. Doesn't regex provide a way to deal with all this repetition?
The answer is yes!
To handle repetition, regular expressions let you replace the repeated parts with a suitable quantifier. Let's take a look at the quantifiers:
Syntax   Description
*        repeat zero or more times
+        repeat one or more times
?        repeat zero or one time
{n}      repeat exactly n times
{n,}     repeat n or more times
{n,m}    repeat n to m times
With these quantifiers, we can rewrite the previous regular expressions, for example:
Match an 8-digit QQ number:
^\d{8}$
Match an 11-digit mobile phone number that starts with 1:
^1\d{10}$
Match a bank card number of 14 to 18 digits:
^\d{14,18}$
Match a string that begins with a and ends with zero or more b:
^ab*$
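The four quantified patterns can be tried out with a short Java sketch like the one below. The class name QuantifierDemo and the sample strings are illustrative assumptions, not part of the original article.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    static final Pattern QQ = Pattern.compile("^\\d{8}$");             // exactly 8 digits
    static final Pattern MOBILE = Pattern.compile("^1\\d{10}$");       // 1 plus 10 digits
    static final Pattern BANK_CARD = Pattern.compile("^\\d{14,18}$");  // 14 to 18 digits
    static final Pattern A_THEN_BS = Pattern.compile("^ab*$");         // a, then zero or more b

    // Hypothetical helper: true if the input satisfies the anchored pattern.
    static boolean ok(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(ok(QQ, "12345678"));               // true
        System.out.println(ok(MOBILE, "13812345678"));        // true
        System.out.println(ok(BANK_CARD, "12345678901234"));  // true  (14 digits)
        System.out.println(ok(BANK_CARD, "1234567890123"));   // false (13 digits)
        System.out.println(ok(A_THEN_BS, "a"));               // true  (zero b)
        System.out.println(ok(A_THEN_BS, "abbb"));            // true
        System.out.println(ok(A_THEN_BS, "abab"));            // false (* applies to b only)
    }
}
```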
3. Grouping
As example (4) above shows, a quantifier applies only to the single character immediately to its left. So here is the question: what if I want a and b to be quantified together?
Regular expressions group with parentheses (): whatever is inside the parentheses is treated as one unit.
So when we want to match ab repeated several times, we can do this.
For example, to match a string that begins with zero or more repetitions of ab:
^(ab)*
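To see the difference grouping makes, the following Java sketch compares the grouped pattern with the ungrouped one on the same inputs. GroupingDemo and leadingMatch are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupingDemo {
    // * applies to the whole group "ab".
    static final Pattern AB_RUN = Pattern.compile("^(ab)*");
    // Without parentheses, * applies to "b" only.
    static final Pattern A_THEN_BS = Pattern.compile("^ab*");

    // Hypothetical helper: the text matched at the start of the input,
    // or the empty string if nothing matched.
    static String leadingMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(leadingMatch(AB_RUN, "ababab"));     // ababab
        System.out.println(leadingMatch(A_THEN_BS, "ababab"));  // ab
        System.out.println(leadingMatch(AB_RUN, "abbb"));       // ab
        System.out.println(leadingMatch(A_THEN_BS, "abbb"));    // abbb
    }
}
```

Because * also allows zero repetitions, ^(ab)* succeeds with an empty match on a string such as "xyz"; add $ or use + when at least one ab is required.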
4. Escaping
We have seen that regular expressions group with parentheses, so here's the problem:
If the string you want to match itself contains parentheses, doesn't that conflict? What should I do?
For this case, regular expressions provide escaping, which turns a metacharacter, quantifier, or keyword into an ordinary character. The method is simple: put a backslash (\) in front of the character to be escaped.
For example, to match a string that begins with zero or more (ab):
^(\(ab\))*
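A small Java sketch of escaping, under the assumption that the literal text to match is "(ab)". The class and helper names are hypothetical. Pattern.quote is a standard java.util.regex method that escapes a whole literal for you, shown here as an alternative to writing the backslashes by hand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // \( and \) are literal parentheses; the outer ( ) still form a group
    // that the * quantifier repeats.
    static final Pattern ESCAPED = Pattern.compile("^(\\(ab\\))*");
    // Same idea, letting Pattern.quote escape the literal "(ab)".
    static final Pattern QUOTED =
            Pattern.compile("^(" + Pattern.quote("(ab)") + ")*");

    // Hypothetical helper: the text matched at the start of the input.
    static String leadingMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(leadingMatch(ESCAPED, "(ab)(ab)c"));  // (ab)(ab)
        System.out.println(leadingMatch(QUOTED, "(ab)(ab)c"));   // (ab)(ab)
        // No literal parentheses in the input, so only the empty match remains.
        System.out.println(leadingMatch(ESCAPED, "abab").isEmpty());  // true
    }
}
```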
5. Alternation ("or")
Going back to the mobile phone number example: domestic numbers in China come from three major carriers, and each carrier has its own number ranges. China Unicom, for example, has 130/131/132/155/156/185/186/145/176. If we are asked to match a Unicom number, the rules we have learned so far give us no way to start, because these prefixes are parallel alternatives, that is, an "or" condition. So how is "or" expressed in a regular expression?
Regular expressions use the symbol | for "or", also called a branch condition. The match succeeds as soon as any one of the branches is satisfied.
Then we can use the or condition to deal with this problem:
^(130|131|132|155|156|185|186|145|176)\d{8}$
6. Ranges (character classes)
Looking at the example above, do you see a pattern? Do you feel the urge to simplify it?
Actually, you can.
Regular expressions provide square brackets [] to express a range condition.
To limit a character to 0 through 9, write [0-9]
To limit a character to a through z, write [a-z]
To limit a character to certain digits, write [165]
We can rewrite the expression above like this:
^((13[0-2])|(15[56])|(18[5-6])|145|176)\d{8}$
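One way to convince yourself that the bracketed version says the same thing as a plain list of alternatives is to run both over the same inputs. The Java sketch below does that; the class name, the helper, and the sample numbers are invented for illustration, and the prefixes are simply the ones listed in this article.

```java
import java.util.regex.Pattern;

class UnicomPrefixDemo {
    // Every prefix spelled out with |.
    static final Pattern WITH_ALTERNATION =
            Pattern.compile("^(130|131|132|155|156|185|186|145|176)\\d{8}$");
    // The same prefixes, shortened with character classes.
    static final Pattern WITH_CLASSES =
            Pattern.compile("^((13[0-2])|(15[56])|(18[5-6])|145|176)\\d{8}$");

    // Hypothetical helper: true if the number satisfies the given pattern.
    static boolean matches(Pattern p, String number) {
        return p.matcher(number).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "13012345678", "15612345678", "18612345678",
            "14512345678", "13312345678", "15712345678"
        };
        for (String s : samples) {
            // Both columns should always agree.
            System.out.println(s + " " + matches(WITH_ALTERNATION, s)
                    + " " + matches(WITH_CLASSES, s));
        }
    }
}
```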
Well, that's all for the basic usage of regular expressions. There are many more features and metacharacters; only some of the metacharacters and syntax are listed here. The aim is a quick entry-level tutorial for readers who don't know regular expressions, or who want to learn them but find the documentation hard to read. After this tutorial, even if you can't write sophisticated expressions, you should at least be able to write simple ones and understand those written by others.
Advanced regular expression topics
1. Zero-width assertions
Both "zero-width" and "assertion" sound strange.
So let's explain these two terms first.
Assertion: in everyday speech, an assertion is "I judge that something is so". In a regular expression, an assertion states that content matching a given rule must appear before or after a given position.
In other words, a regex can make judgments the way a person does. Take "ss1aa2bb3": with assertions, a regex can check that aa2 is preceded by ss1, or that aa2 is followed by bb3.
Zero-width: having no width. An assertion only matches a position and consumes no characters, so the asserted content itself is not returned in the match result.
The meaning is clear now, so what is it good for?
Let's take an example:
Suppose we want a crawler to grab the read count of an article on CSDN. Looking at the page source, the read count sits inside markup like this (the span element and its class name are assumed here for illustration):
"<span class="read-count">readings: 641</span>"
Here 641 is a variable: different articles have different values. Once we have this string, there are many ways to extract the 641, but how would a regular expression match it?
Let's start with the types of assertion:
Positive lookahead:
Syntax: (?=pattern)
Function: matches the content immediately before the pattern expression, without returning the pattern itself.
Still confused? Going back to the example: to get the read count, the regular expression has to match the digits in front of the closing tag </span>.
By the definition above, a positive lookahead matches the content before the expression, so (?=</span>) matches whatever comes before </span>.
What does it match? If you want all of the preceding content, it is:
    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    String reg = ".+(?=</span>)";
    String test = "<span class=\"read-count\">readings: 641</span>";
    Pattern pattern = Pattern.compile(reg);
    Matcher mc = pattern.matcher(test);
    while (mc.find()) {
        System.out.println("matching results:");
        System.out.println(mc.group());
    }
    // matching results:
    // <span class="read-count">readings: 641
But all we want is the number in front of the tag, and that's easy. Match digits with \d, and it becomes:
    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    String reg = "\\d+(?=</span>)";
    String test = "<span class=\"read-count\">readings: 641</span>";
    Pattern pattern = Pattern.compile(reg);
    Matcher mc = pattern.matcher(test);
    while (mc.find()) {
        System.out.println(mc.group());
    }
    // prints: 641
Done!
Positive lookbehind:
Syntax: (?<=pattern)
Function: matches the content immediately after the pattern expression, without returning the pattern itself.
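In Java a positive lookbehind is written (?<=pattern), and both directions also have negated forms: (?!pattern) for "not followed by" and (?<!pattern) for "not preceded by". The sketch below tries each of them; the class name LookaroundDemo, the helper findAll, and the sample strings are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: collect every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Positive lookbehind: digits that come right after "readings: ".
        System.out.println(findAll("(?<=readings: )\\d+", "readings: 641"));  // [641]
        // Negative lookahead: an "a" that is NOT followed by "b".
        System.out.println(findAll("a(?!b)", "ab ac ad"));                    // [a, a]
        // Negative lookbehind: a whole number NOT preceded by "$".
        System.out.println(findAll("(?<!\\$)\\b\\d+", "$100 200"));           // [200]
    }
}
```

In each case the asserted text ("readings: ", "b", "$") is checked but never included in the result, which is what zero-width means.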
2. Capturing and non-capturing
Capturing here means matching an expression, but the term is usually tied to grouping, hence "capture group".
Capture group: matches the content of a sub-expression and saves the result in memory under a number or an explicitly given name. Groups are numbered in the order of their opening parentheses, and the saved results can then be used by number or by name.
By naming method, capture groups fall into two kinds:
Numbered capture group:
Syntax: (exp)
Explanation: counting from the left of the expression, the content between each left parenthesis and its matching right parenthesis is one group. Group 0 is the whole expression, and group 1 is the first parenthesized group.
For example, take the landline number 020-85653333.
Its regular expression is: (0\d{2})-(\d{8})
Following the order of the left parentheses, this expression is grouped as follows:
No.   Group               Content
0     (0\d{2})-(\d{8})    020-85653333
1     (0\d{2})            020
2     (\d{8})             85653333
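Let's verify it with Java:
    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    String test = "020-85653333";
    String reg = "(0\\d{2})-(\\d{8})";
    Pattern pattern = Pattern.compile(reg);
    Matcher mc = pattern.matcher(test);
    if (mc.find()) {
        System.out.println("the number of groups is: " + mc.groupCount());
        // Group 0 is the whole match; groups 1..groupCount() are the parentheses.
        for (int i = 0; i <= mc.groupCount(); i++) {
            System.out.println("group " + i + " is: " + mc.group(i));
        }
    }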
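The definition above also mentions explicitly named groups. In Java a group is named with (?<name>exp) and read back with Matcher.group(String), while (?:exp) groups without capturing. A minimal sketch follows, assuming the same landline format; the group names area and number and the class name are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Same landline pattern, with each group given a name.
    static final Pattern NAMED =
            Pattern.compile("(?<area>0\\d{2})-(?<number>\\d{8})");
    // (?:...) groups the area code without capturing it,
    // so only one capture group remains.
    static final Pattern NON_CAPTURING =
            Pattern.compile("(?:0\\d{2})-(\\d{8})");

    // Hypothetical helper: {area, number}, or null if the input does not match.
    static String[] split(String phone) {
        Matcher m = NAMED.matcher(phone);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("area"), m.group("number") };
    }

    public static void main(String[] args) {
        String[] parts = split("020-85653333");
        System.out.println(parts[0]);  // 020
        System.out.println(parts[1]);  // 85653333
        System.out.println(NON_CAPTURING.matcher("020-85653333").groupCount());  // 1
    }
}
```

Named groups are still numbered as well, so m.group(1) and m.group("area") return the same text here.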