Example Analysis of matching rules in regular expressions

2025-03-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article walks through examples of the matching rules used in regular expressions. It is intended as a practical reference.

Basic pattern matching

It all starts with the basics. Patterns, the most basic elements of regular expressions, are a set of characters that describe the characteristics of a string. Patterns can be simple, made up of ordinary strings, or very complex, often using special characters to represent a range of characters, repetition, or context. For example:

^once

This pattern contains the special character ^, which means the pattern matches only strings that begin with "once". For example, it matches the string "once upon a time" but not "There once was a man from New York". Just as ^ marks the beginning, the $ symbol matches strings that end with a given pattern.

bucket$

This pattern matches "Who kept all of his cash in a bucket" but does not match "buckets". When ^ and $ are used together, they indicate an exact match (the string is identical to the pattern). For example:

^bucket$

matches only the string "bucket". If a pattern includes neither ^ nor $, it matches any string that contains the pattern. For example, the pattern

once

matches the string

There once was a man from New York
Who kept all of his cash in a bucket.
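The anchoring rules above can be tried out with a short sketch. It uses Python's standard `re` module purely for illustration (the article does not tie these patterns to a language), and the helper name `matches` is an assumption introduced here.

```python
import re

# The two lines of the limerick quoted in the text.
line1 = "There once was a man from New York"
line2 = "Who kept all of his cash in a bucket"


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern is found somewhere in text."""
    return re.search(pattern, text) is not None


starts_with_once = matches(r"^once", "once upon a time")  # anchored at the start
once_mid_line = matches(r"^once", line1)       # "once" is present, but not at the start
ends_with_bucket = matches(r"bucket$", line2)  # anchored at the end
exact_bucket = matches(r"^bucket$", "bucket")  # the whole string is "bucket"
exact_buckets = matches(r"^bucket$", "buckets")  # the trailing "s" breaks the match
unanchored = matches(r"once", line1)           # no anchors: found anywhere
```

One Python-specific detail: by default `$` also matches just before a single trailing newline, so `re.fullmatch` may be the safer choice when validating a whole string.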

The letters in this pattern (o-n-c-e) are literal characters: each stands for the letter itself, and digits work the same way. Other, slightly more complex characters, such as punctuation and whitespace characters (spaces, tabs, and so on), use escape sequences. All escape sequences start with a backslash (\). The escape sequence for a tab is \t. So to test whether a string begins with a tab, we can use this pattern:

^\t

Similarly, use \n for a newline and \r for a carriage return. Other special symbols can be escaped by putting a backslash in front of them: the backslash itself is written \\, a period is written \., and so on.
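As a rough illustration of escape sequences, the sketch below again assumes Python's `re` module. Raw strings (`r"..."`) are used so that the backslashes reach the regular expression engine unchanged; the sample strings are made up for this example.

```python
import re

# \t inside the pattern stands for a tab character.
begins_with_tab = re.search(r"^\t", "\tindented line") is not None
no_leading_tab = re.search(r"^\t", "plain line") is not None

# An unescaped dot matches any character; an escaped dot matches only a period.
loose_dot = re.search(r"^a.c$", "abc") is not None
escaped_dot_on_abc = re.search(r"^a\.c$", "abc") is not None
escaped_dot_on_literal = re.search(r"^a\.c$", "a.c") is not None

# A literal backslash is written as \\ in the pattern.
has_backslash = re.search(r"\\", "C:\\temp") is not None
```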

Character cluster

In Internet programs, regular expressions are usually used to validate user input. When a user submits a form, ordinary literal characters are not enough to determine whether the entered phone number, address, email address, credit card number and so on are valid.

So we need a more flexible way to describe the pattern we want: the character cluster. To create a character cluster that represents all vowel characters, place all the vowels in square brackets:

[AaEeIiOoUu]

This pattern matches any vowel character, but can only represent one character. A hyphen can be used to indicate a range of characters, such as:

[a-z] // matches all lowercase letters
[A-Z] // matches all uppercase letters
[a-zA-Z] // matches all letters
[0-9] // matches all digits
[0-9\.\-] // matches all digits, periods and minus signs
[ \f\r\t\n] // matches all whitespace characters

Again, each of these represents only one character, which is an important point. If you want to match a string consisting of one lowercase letter followed by one digit, such as "z2", "t6", or "g7", but not "ab2", "r2d3", or "b52", use this pattern:

^[a-z][0-9]$

Although [a-z] represents a range of 26 letters, here it matches only a single character: the first character of the string must be a lowercase letter.
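The point that a character cluster stands for exactly one character can be checked with a small sketch, again assuming Python's `re` module. The sample strings are the lowercase-letter-plus-digit examples discussed above.

```python
import re

# One lowercase letter followed by one digit, and nothing else.
pattern = re.compile(r"^[a-z][0-9]$")

samples = ["z2", "t6", "g7", "ab2", "r2d3", "b52"]

# Map each sample to whether the pattern accepts it.
results = {s: pattern.search(s) is not None for s in samples}

accepted = [s for s, ok in results.items() if ok]
rejected = [s for s, ok in results.items() if not ok]
```

Here `accepted` ends up holding the two-character strings and `rejected` the longer ones, because each bracketed cluster consumes a single character.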

It was mentioned earlier that ^ represents the beginning of a string, but it has another meaning. When ^ is the first character inside a set of square brackets, it means "not" or "exclude" and is often used to rule out certain characters. Continuing the previous example, suppose we require that the first character not be a digit:

^[^0-9][0-9]$

This pattern matches "&5", "G7" and "-2", but does not match "12" or "66". Here are a few more examples of excluding specific characters:

[^a-z] // all characters except lowercase letters
[^\\\/\^] // all characters except (\), (/) and (^)
[^\"\'] // all characters except double quotes (") and single quotes (')
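A minimal sketch of negated clusters, assuming Python's `re` module. The quote-checking pattern and its sample strings are additions made for this example, not part of the original article.

```python
import re

# First character: anything except a digit. Second character: a digit.
non_digit_then_digit = re.compile(r"^[^0-9][0-9]$")

samples = ["&5", "G7", "-2", "12", "66"]
results = {s: non_digit_then_digit.search(s) is not None for s in samples}

# A string made up only of characters that are neither " nor '.
# The triple-quoted raw string avoids having to escape the quotes in Python.
no_quotes = re.compile(r"""^[^"']*$""")

plain_ok = no_quotes.search("plain text") is not None
quoted_ok = no_quotes.search('say "hi"') is not None
```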

The special character "." (dot, period) is used in regular expressions to represent any character except a newline. So the pattern "^.5$" matches any two-character string that ends with the digit 5 and begins with any character other than a newline. The pattern "." matches any string except an empty string and a string that consists only of newlines.
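The behaviour of the dot can be sketched as follows, assuming Python's `re` module with its default flags (the `re.DOTALL` flag would change the result by letting the dot match newlines as well).

```python
import re

dot_five = re.compile(r"^.5$")

two_chars_ending_in_5 = dot_five.search("a5") is not None
digit_first = dot_five.search("55") is not None        # the dot also accepts a digit
newline_first = dot_five.search("\n5") is not None     # the dot refuses a newline
three_chars = dot_five.search("ab5") is not None       # too long for the pattern

# A lone dot needs at least one non-newline character somewhere in the string.
dot_on_empty = re.search(r".", "") is not None
dot_on_newline = re.search(r".", "\n") is not None
dot_on_letter = re.search(r".", "x") is not None
```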

PHP's regular expressions have some built-in universal character clusters, which are listed below:

Character cluster    Description
[[:alpha:]]          any letter
[[:digit:]]          any digit
[[:alnum:]]          any letter or digit
[[:space:]]          any whitespace character
[[:upper:]]          any uppercase letter
[[:lower:]]          any lowercase letter
[[:punct:]]          any punctuation mark
[[:xdigit:]]         any hexadecimal digit, equivalent to [0-9a-fA-F]

Determining repetition

By now you know how to match a single letter or digit, but more often you will want to match a word or a group of digits. A word consists of several letters, and a number consists of several single digits. Curly braces ({}) placed after a character or character cluster specify how many times the preceding item repeats.

Pattern              Description
^[a-zA-Z_]$          any single letter or underscore
^[[:alpha:]]{3}$     all three-letter words
^a$                  the letter a
^a{4}$               aaaa
^a{2,4}$             aa, aaa or aaaa
^a{1,3}$             a, aa or aaa
^a{2,}$              strings consisting of two or more a's
^a{2,}               strings starting with two or more a's, such as aardvark and aaab, but not apple
a{2,}                strings containing two or more a's in a row, such as baad and aaa, but not Nantucket
\t{2}                two tabs
.{2}                 any two characters

These examples show three different uses of curly braces. A single number, {x}, means "the preceding character or character cluster appears exactly x times"; a number followed by a comma, {x,}, means "the preceding item appears x or more times"; two numbers separated by a comma, {x,y}, means "the preceding item appears at least x times but no more than y times". We can extend these patterns to more words or numbers:
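The three brace forms can be compared side by side in a short sketch. It assumes Python's `re` module; `re.fullmatch` is used so that the whole string has to match, and the helper name `full` is introduced here for illustration only.

```python
import re

candidates = ["a", "aa", "aaa", "aaaa", "aaaaa"]


def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


exactly_four = [s for s in candidates if full(r"a{4}", s)]     # {x}: exactly x times
two_to_four = [s for s in candidates if full(r"a{2,4}", s)]    # {x,y}: between x and y
two_or_more = [s for s in candidates if full(r"a{2,}", s)]     # {x,}: x or more

# Without anchors, the repetition only has to occur somewhere in the string.
baad_has_run = re.search(r"a{2,}", "baad") is not None
nantucket_has_run = re.search(r"a{2,}", "Nantucket") is not None
```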

^[a-zA-Z0-9_]{1,}$ // all strings consisting of one or more letters, digits or underscores
^[0-9]{1,}$ // all non-negative integers
^\-{0,1}[0-9]{1,}$ // all integers
^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$ // all decimals

The last example is not easy to understand, is it? Look at it this way: it starts with an optional minus sign (^\-{0,1}), followed by zero or more digits ([0-9]{0,}), an optional decimal point (\.{0,1}), zero or more further digits ([0-9]{0,}), and nothing else ($). Simpler ways of writing this are shown below.
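To see how the integer and decimal patterns behave, here is a minimal sketch assuming Python's `re` module. The sample inputs are made up for this example. Because every part of the decimal pattern is optional, it also accepts an empty string and a lone minus sign, which a real validator would probably want to reject.

```python
import re

integer = re.compile(r"^\-{0,1}[0-9]{1,}$")
decimal = re.compile(r"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Hypothetical helper: True if the compiled pattern matches the text."""
    return pattern.search(text) is not None


integer_results = {s: accepts(integer, s) for s in ["42", "-42", "4.2"]}
decimal_results = {s: accepts(decimal, s) for s in ["3.14", "-0.5", ".5", "1.2.3", "abc"]}

# Edge cases: every part of the decimal pattern may match nothing at all.
decimal_accepts_empty = accepts(decimal, "")
decimal_accepts_lone_minus = accepts(decimal, "-")
```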

The special character "?" is equivalent to {0,1}; both mean "zero or one of the preceding item", that is, "the preceding item is optional". So the example can be simplified as follows:

^\-?[0-9]{0,}\.?[0-9]{0,}$

The special character "*" is equivalent to {0,}; both mean "zero or more of the preceding item". Finally, the character "+" is equivalent to {1,}, meaning "one or more of the preceding item", so the four examples above can be written as:

^[a-zA-Z0-9_]+$ // all strings consisting of one or more letters, digits or underscores
^[0-9]+$ // all non-negative integers
^\-?[0-9]+$ // all integers
^\-?[0-9]*\.?[0-9]*$ // all decimals
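The equivalence between the brace forms and the shorthand characters ?, * and + can be spot-checked with a sketch like the one below. It assumes Python's `re` module, includes the underscore in the first cluster (as the description of that pattern implies), and uses a handful of made-up sample strings. Agreement on samples is a sanity check, not a proof.

```python
import re

# (brace form, shorthand form) pairs for the four number and word patterns.
pairs = [
    (r"^[a-zA-Z0-9_]{1,}$", r"^[a-zA-Z0-9_]+$"),
    (r"^[0-9]{1,}$", r"^[0-9]+$"),
    (r"^\-{0,1}[0-9]{1,}$", r"^\-?[0-9]+$"),
    (r"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$", r"^\-?[0-9]*\.?[0-9]*$"),
]

samples = ["abc_123", "42", "-42", "3.14", "-.5", "", "1.2.3", "a b"]


def agree(long_form: str, short_form: str) -> bool:
    """Hypothetical helper: True if both patterns give the same verdict on every sample."""
    return all(
        (re.search(long_form, s) is not None) == (re.search(short_form, s) is not None)
        for s in samples
    )


all_agree = all(agree(long_form, short_form) for long_form, short_form in pairs)

minus_42_is_integer = re.search(r"^\-?[0-9]+$", "-42") is not None
minus_42_is_unsigned = re.search(r"^[0-9]+$", "-42") is not None
```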
