2025-03-29 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/02 Report
This article explains how to improve the efficiency of regular expressions. The advice is practical, so it is shared here for reference.
If you write a regular expression purely to test your skills, or to pull off a special effect (such as using regular expressions to find prime numbers or solve linear equations), efficiency is not a concern. If the expression will only run once or twice, or a few dozen times, optimizing it makes little difference. However, if it will run thousands or millions of times, efficiency becomes a major problem.
For the convenience of writing, first define two concepts.
Mismatch: the regular expression matches more than the required range. Some text clearly does not meet the requirements, yet the expression still "hits" it. For example, if you use \d{11} to match an 11-digit mobile phone number, \d{11} will match not only valid mobile phone numbers but also strings such as 98765432100 that are obviously not mobile phone numbers. We call such a match a mismatch.
Missing match: the regular expression matches too narrow a range. Some text really is wanted, but the expression as written does not cover it. For example, if you use \d{18} to match an 18-digit ID number, you will miss ID numbers that end with the letter X.
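The two definitions above can be checked with a short Python sketch. The sample ID string and the wider pattern `\d{17}[\dX]` are illustrative assumptions, not part of the original article, and real validation of phone or ID numbers would need more rules than this.

```python
import re

# Mismatch: \d{11} accepts any 11 digits, including strings
# that are obviously not mobile phone numbers.
loose_phone = re.compile(r"\d{11}")
print(bool(loose_phone.fullmatch("98765432100")))  # True

# Missing match: \d{18} rejects an ID number that ends with X.
strict_id = re.compile(r"\d{18}")
sample_id = "11010519491231002X"  # made-up sample used only for illustration
print(bool(strict_id.fullmatch(sample_id)))  # False

# One possible widening (an assumption): 17 digits, then a digit or X.
wider_id = re.compile(r"\d{17}[\dX]")
print(bool(wider_id.fullmatch(sample_id)))  # True
```

Widening the ID pattern removes the missing match, but it still accepts any 17 digits, so mismatches remain possible.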
A regular expression may produce only mismatches (its conditions are so loose that it covers more than the target text), only missing matches (it describes just one of the many forms the target text can take), or both. For example, using \w+\.com to match domain names ending in .com will mismatch a string such as abc_.com (legal domain names do not contain underscores, but \w matches underscores) and will also miss domain names such as ab-c.com (legal domain names can contain hyphens, but \w does not match hyphens).
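The domain-name case can be reproduced in Python as follows. The `tighter_domain` pattern is a hypothetical refinement added for illustration only; it handles a single label before .com and is not a complete domain-name validator.

```python
import re

# The loose pattern from the text.
loose_domain = re.compile(r"\w+\.com")

print(bool(loose_domain.fullmatch("abc_.com")))  # True: a mismatch
print(bool(loose_domain.fullmatch("ab-c.com")))  # False: a missing match

# Hypothetical tighter pattern: one label made of ASCII letters, digits
# and hyphens, neither starting nor ending with a hyphen, then ".com".
tighter_domain = re.compile(r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.com")

print(bool(tighter_domain.fullmatch("abc_.com")))  # False
print(bool(tighter_domain.fullmatch("ab-c.com")))  # True
```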
An accurate regular expression produces neither mismatches nor missing matches. In practice, though, we often see only a limited sample of text, write rules based on that sample, and then apply those rules to a much larger body of text. In this case the goal is to eliminate mismatches and missing matches as far as possible, even if not completely, while also improving running efficiency. The experience presented in this article is aimed mainly at this situation.
Master the details of the syntax. Regular expression syntax is broadly the same across languages but differs in the details. Knowing the details of the regular expression syntax of the language you are using is the basis for writing correct and efficient regular expressions. For example, in Perl the range matched by \w is equivalent to [a-zA-Z0-9_] (for ASCII text); Perl regular expressions do not support variable repetition inside positive lookbehind, such as (?
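The same kind of detail can be checked in Python, used here only as an illustration since the text above discusses Perl. This minimal sketch relies on the standard `re` module; the helper `compiles` and the sample patterns are assumptions introduced for the example.

```python
import re

# In Python 3, \w on str patterns matches the underscore and, by default,
# non-ASCII word characters as well; re.ASCII narrows it to [a-zA-Z0-9_].
print(bool(re.fullmatch(r"\w", "_")))                        # True
print(bool(re.fullmatch(r"\w", "\u00e9")))                   # True (e with acute accent)
print(bool(re.fullmatch(r"\w", "\u00e9", flags=re.ASCII)))   # False

def compiles(pattern):
    """Return True if the re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# The re module accepts fixed-width lookbehind but rejects
# variable repetition inside lookbehind.
print(compiles(r"(?<=ab)c"))   # True
print(compiles(r"(?<=a+)c"))   # False
```

Running small probes like these against the engine you actually use is a quick way to confirm which syntax details apply.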