Updated 2025-04-06. Source: Shulou (Shulou.com)
This article explains what regular expressions are used for in PHP. The content is meant to be clear and easy to follow, so let's work through it together.
1. Introduction
In a nutshell, regular expressions are a powerful tool for pattern matching and substitution. They can be found in almost all UNIX-based tools, such as the vi editor, the Perl and PHP scripting languages, and the awk and sed programs. Client-side scripting languages like JavaScript also support them. Regular expressions have thus outgrown any one language or system and become a widely accepted concept and feature.
Regular expressions let users build a matching pattern from a series of special characters, compare that pattern with a target object such as a data file, program input, or form input on a web page, and then run the appropriate code depending on whether the target contains the pattern.
For example, one of the most common applications of regular expressions is checking that an email address entered by a user online is in the correct format. If the address passes the regular expression check, the form submitted by the user is processed normally; if it does not match the pattern, a prompt asks the user to re-enter a correct email address. Regular expressions therefore play an important role in the logic of web applications.
2. Basic syntax
Now that we have a preliminary understanding of what regular expressions do, let's look at their syntax.
Regular expressions generally take the following form:
/love/
The part between the "/" delimiters is the pattern to be matched in the target object. Users simply put the pattern they want to find between the "/" delimiters. To allow more flexible patterns, regular expressions provide special "metacharacters": characters with a special meaning in a regular expression, which define how the leading character (the character immediately before the metacharacter) may occur in the target object.
The more commonly used metacharacters include "+", "*", and "?". The "+" metacharacter requires its leading character to appear one or more times in a row in the target object, the "*" metacharacter requires its leading character to appear zero or more times in a row, and the "?" metacharacter requires its leading character to appear zero times or once.
Next, let's take a look at the specific application of regular expression metacharacters.
/fo+/
Because the above regular expression contains the "+" metacharacter, it matches strings in the target object such as "fool", "fo", or "football", where one or more letters o follow the letter f.
/eg*/
Because the above regular expression contains the "*" metacharacter, it matches strings in the target object such as "easy", "ego", or "egg", where zero or more letters g follow the letter e.
/Wil?/
Because the above regular expression contains the "?" metacharacter, it matches strings in the target object such as "Win" or "Wilson", where zero or one letter l follows "Wi".
In addition to metacharacters, users can specify exactly how many times a pattern may appear in the matching object. For example,
/jim{2,6}/
The above regular expression specifies that the character m must appear 2 to 6 times in a row in the matching object, so it matches strings such as "jimmy" or "jimmmmmy".
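The quantifier examples above can be checked with a short script. This is a minimal sketch that assumes PHP's preg_match() from the PCRE extension; the helper name hasMatch is made up for illustration.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function hasMatch(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(hasMatch('/fo+/', 'football'));   // bool(true): f followed by two o
var_dump(hasMatch('/fo+/', 'fly'));        // bool(false): no o after the f
var_dump(hasMatch('/eg*/', 'easy'));       // bool(true): e followed by zero g
var_dump(hasMatch('/Wil?/', 'Win'));       // bool(true): "Wi" followed by zero l
var_dump(hasMatch('/jim{2,6}/', 'jimmy')); // bool(true): two m in a row
var_dump(hasMatch('/jim{2,6}/', 'jim'));   // bool(false): only one m
```

Because none of these patterns is anchored, each one succeeds as soon as the pattern is found anywhere in the subject.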
Now that we have a preliminary understanding of how to use regular expressions, let's take a look at how several other important metacharacters are used.
\s: matches a single whitespace character, including tab and newline characters
\S: matches any single character that is not whitespace
\d: matches a digit from 0 to 9
\w: matches a letter, digit, or underscore character
\W: matches any character that \w does not match
.: matches any character except a newline
(Note: \s and \S, and \w and \W, can be thought of as inverses of each other.)
Next, let's take a look at how to use the above metacharacters in regular expressions through an example.
/\s+/
The above regular expression matches one or more whitespace characters in the target object.
/\d000/
If we have a complex financial statement on hand, the above regular expression makes it easy to find all amounts in round thousands of yuan.
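As a rough illustration of these two patterns, the sketch below uses preg_replace() and preg_match_all(); the function names and the sample text are assumptions made up for this example.

```php
<?php
// Collapse each run of whitespace (spaces, tabs, newlines) into one space.
function collapseWhitespace(string $text): string
{
    return preg_replace('/\s+/', ' ', $text);
}

// Return every occurrence of "a digit followed by three zeros".
function findThousands(string $text): array
{
    preg_match_all('/\d000/', $text, $found);
    return $found[0];
}

echo collapseWhitespace("a \t b\n\nc"), "\n"; // a b c

// Finds "3000" and "2000"; the second one sits inside 12000.
print_r(findThousands('rent 3000, food 850, car 12000'));
```

The second result shows that /\d000/ also matches inside longer numbers, so a real report scan would probably need word boundaries or anchors around it.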
In addition to the metacharacters introduced above, regular expressions have another special kind of character, the locator (also called an anchor). A locator specifies where the matching pattern must appear in the target object.
The more commonly used locators include "^", "$", "\b" and "\B". The "^" locator requires the pattern to appear at the beginning of the target string, and the "$" locator requires it to appear at the end. The "\b" locator requires the pattern to appear at a word boundary, that is, at the beginning or end of a word, while the "\B" locator requires it to appear away from a word boundary, at neither the beginning nor the end of a word. We can think of "^" and "$" as one complementary pair of locators, and "\b" and "\B" as a pair of opposites. For example:
/^hell/
Because the above regular expression contains the "^" locator, it matches target strings that begin with "hell", such as "hell", "hello" or "hellhound".
/ar$/
Because the above regular expression contains the "$" locator, it matches target strings that end with "ar", such as "car", "bar" or "ar".
/\bbom/
Because the above pattern begins with a "\b" locator, it matches words in the target object that begin with "bom", such as "bomb" or "bom".
/man\b/
Because the above pattern ends with a "\b" locator, it matches words in the target object that end with "man", such as "human", "woman" or "man".
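The locator examples can be tried out as follows. This is a small sketch assuming preg_match(); the helper name locatorMatch and the sample strings are illustrative only.

```php
<?php
// Illustrative helper: true when $pattern matches $subject.
function locatorMatch(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(locatorMatch('/^hell/', 'hello world'));  // bool(true): subject starts with "hell"
var_dump(locatorMatch('/^hell/', 'say hello'));    // bool(false): "hell" is not at the start
var_dump(locatorMatch('/ar$/', 'my car'));         // bool(true): subject ends with "ar"
var_dump(locatorMatch('/\bbom/', 'a bomb'));       // bool(true): "bom" starts a word
var_dump(locatorMatch('/\bbom/', 'abomb'));        // bool(false): "bom" is inside a word
var_dump(locatorMatch('/man\b/', 'a woman here')); // bool(true): "man" ends a word
var_dump(locatorMatch('/man\b/', 'manual'));       // bool(false): a letter follows "man"
```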
To make matching patterns more flexible, regular expressions let the user specify a range of characters rather than a single specific character. For example:
/[A-Z]/
The above regular expression will match any uppercase letter in the range A to Z.
/[a-z]/
The above regular expression will match any lowercase letter in the range a to z.
/[0-9]/
The above regular expression will match any digit in the range from 0 to 9.
/([a-z][A-Z][0-9])+/
The above regular expression matches any string made up of a lowercase letter, an uppercase letter, and a digit, in that order, repeated one or more times, such as "aB0". Note that "()" can be used to group parts of a pattern; everything inside the "()" must appear together in the target object. The above expression therefore does not match a string such as "abc", because its second character is not an uppercase letter and its last character is a letter rather than a digit.
To express an "OR" similar to the one in programming logic, that is, to match any one of several alternative patterns, use the pipe character "|". For example:
/to|too|2/
The above regular expression will match "to", "too", or "2" in the target object.
Another common operator in regular expressions is negation, "[^]". Unlike the "^" locator introduced earlier, "[^]" specifies that the matched character must not be any of the characters listed in the brackets. For example:
/[^A-C]/
The above pattern matches any character in the target object except A, B, and C. In general, "^" is a negation operator when it appears first inside "[]", and a locator when it appears outside "[]".
Finally, the escape character "\" is used when a metacharacter itself must appear in the pattern and be matched literally. For example:
/Th\*/
The above regular expression matches "Th*" in the target object, not "The".
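To see which part of the target each of these patterns actually picks out, here is a minimal sketch assuming preg_match() with its third (matches) argument; firstMatch is a hypothetical helper name.

```php
<?php
// Hypothetical helper: the first matched substring, or null when nothing matches.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

var_dump(firstMatch('/([a-z][A-Z][0-9])+/', 'xx aB0cD1 yy')); // "aB0cD1": the group repeated twice
var_dump(firstMatch('/([a-z][A-Z][0-9])+/', 'abc'));          // NULL: no uppercase letter or digit
var_dump(firstMatch('/[^A-C]/', 'ABCD'));                     // "D": first character other than A, B, C
var_dump(firstMatch('/Th\*/', 'The Th* case'));               // "Th*": the escaped * is literal
var_dump(firstMatch('/to|too|2/', 'too'));                    // "to": alternatives are tried left to right
```

The last line is worth noting: with preg_match(), the first alternative that succeeds wins, so "too" is reported as "to" unless the longer alternative is listed first.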
3. Usage examples
① In PHP, the ereg() function can be used for pattern matching. Its format is as follows:
ereg(pattern, string)
Here pattern is the regular expression pattern and string is the target object to search. Taking email validation as the example again, the PHP code is as follows:
<?php
// Note: ereg() was removed in PHP 7.0, so this runs only on older versions.
// Check $email against a simple "name@host.domain" pattern.
if (ereg("^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+", $email)) {
    echo "Your email address is correct!";
} else {
    echo "Please try again!";
}
?>
② JavaScript 1.2 has a powerful RegExp() object that can be used to match regular expressions. Its test() method checks whether the target object contains the matching pattern and returns true or false accordingly.
In the same way, a JavaScript script can validate the email address entered by the user.
Regular expressions give a lot of people a headache. Today, drawing on my own understanding and some articles on the Internet, I hope to explain them in a way ordinary people can understand, and to share what I have learned.
To begin with, ^ and $ are used to match the beginning and the end of a string, respectively. Here are some examples:
"^The": matches a string that begins with "The"
"of despair$": matches a string that ends with "of despair"
So:
"^abc$": a string that both begins and ends with abc, which in practice only "abc" itself matches
"notice": matches any string that contains notice
As the last example shows, if you use neither of these two characters, the pattern (regular expression) can appear anywhere in the string being checked; it is not locked to either end.
Next, let's talk about '*', '+' and '?'.
They indicate how many times a character may appear. Respectively, they mean:
"zero or more", equivalent to {0,}
"one or more", equivalent to {1,}
"zero or one", equivalent to {0,1}
Here are some examples:
"ab*": same as ab{0,}; matches an a followed by zero or more b ("a", "ab", "abbb", etc.)
"ab+": same as ab{1,}; as above, but at least one b must be present ("ab", "abbb", etc.)
"ab?": same as ab{0,1}; there may be no b or exactly one b
"a?b+$": matches a string ending with zero or one a followed by one or more b
Main point: '*', '+' and '?' apply only to the character immediately before them.
You can also put an exact count or a range in curly braces, for example:
"ab{2}": a must be followed by exactly two b, no more and no fewer ("abb")
"ab{2,}": a must be followed by two or more b ("abb", "abbbb", etc.)
"ab{3,5}": a must be followed by three to five b ("abbb", "abbbb", or "abbbbb")
Now let's group several characters in parentheses, for example:
"a(bc)*": matches a followed by zero or more "bc"
"a(bc){1,5}": matches a followed by one to five "bc"
There is also the character '|', which works as an OR operation:
"hi|hello": matches a string containing "hi" or "hello"
"(b|cd)ef": matches a string containing "bef" or "cdef"
"(a|b)*c": matches a string containing any number (including zero) of a or b followed by a c
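A dot ('.') stands for any single character except "\n".
What if you want to match any single character, including "\n"?
The pattern '[\n.]' is sometimes suggested, but inside brackets '.' is a literal dot, so that class matches only a newline or a dot. In Perl-compatible (preg) patterns, use '[\s\S]' or add the s modifier instead.
"a.[0-9]": an a, then any character, then a digit from 0 to 9
"^.{3}$": a string of exactly three arbitrary characters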
A bracket expression matches only a single character:
"[ab]": matches a single a or b (same as "a|b")
"[a-d]": matches a single character from 'a' to 'd' (same effect as "a|b|c|d" and "[abcd]")
Usually we use [a-zA-Z] to specify an English letter of either case:
"^[a-zA-Z]": matches a string that begins with an uppercase or lowercase letter
"[0-9]%": matches a string containing a digit followed by a percent sign, of the form x%
",[a-zA-Z0-9]$": matches a string that ends with a comma followed by a digit or letter
You can also list the characters you don't want in square brackets; you just need to put '^' first inside the brackets.
"%[^a-zA-Z]%": matches a string containing a non-letter between two percent signs
Main point: when '^' is the first character inside brackets, it excludes the characters listed in the brackets.
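The behavior of the dot and of bracket expressions can be checked with a few calls. This is a sketch assuming Perl-compatible patterns and preg_match(); isMatch is a made-up helper name.

```php
<?php
// Illustrative helper: true when $pattern matches $subject.
function isMatch(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(isMatch('/a.c/', "a\nc"));          // bool(false): "." does not match "\n"
var_dump(isMatch('/a.c/s', "a\nc"));         // bool(true): the s modifier lets "." match "\n"
var_dump(isMatch('/a[\s\S]c/', "a\nc"));     // bool(true): a class that covers every character
var_dump(isMatch('/a[\n.]c/', 'abc'));       // bool(false): inside brackets "." is a literal dot
var_dump(isMatch('/^[a-zA-Z]/', '9 lives')); // bool(false): the first character is not a letter
var_dump(isMatch('/%[^a-zA-Z]%/', '%7%'));   // bool(true): a non-letter between two percent signs
var_dump(isMatch('/%[^a-zA-Z]%/', '%x%'));   // bool(false): x is a letter
```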
For the pattern to be interpreted correctly, you have to put a backslash ('\') in front of special characters that should be matched literally, that is, escape them.
Don't forget that bracket expressions are the exception to this rule: inside brackets, all special characters, including the backslash ('\'), lose their special meaning. For example, "[*\+?{}.]" matches a string containing any one of these characters. (This holds for the POSIX syntax used by ereg(); in Perl-compatible patterns the backslash still escapes inside brackets.)
And, as the regex manual tells us: if the list contains ']', it is best to make it the first character in the list (possibly right after '^'). If it contains '-', it is best to put it first or last, or to make it the second endpoint of a range; for example, the '-' in the middle of [a-d-0-9] is valid.
After the examples above, you should understand {n,m}. Note that neither n nor m can be negative, and n must not be greater than m. The preceding item is then matched at least n times and at most m times. For example, "p{1,5}" matches at most five p in a row: in "pvpppppp" it first matches the single leading "p", and after the v it matches the next five p.
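To see what a bounded quantifier such as {n,m} actually consumes, one option is to list every match in the subject. The sketch below assumes preg_match_all(); allMatches is a hypothetical helper name.

```php
<?php
// Hypothetical helper: every non-overlapping match of $pattern, left to right.
function allMatches(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $found);
    return $found[0];
}

// "pvpppppp" is one p, a v, then six p in a row.
// The matches are "p", then "ppppp" (capped at five), then the leftover "p".
print_r(allMatches('/p{1,5}/', 'pvpppppp'));
```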
Now let's talk about the sequences that begin with \.
\b: the book says it matches a word boundary. For example, 've\b' matches the ve in love but not the ve in very.
\B is exactly the opposite of \b above. I will not give an example.
You can go to http://www.phpv.net/article.php/251 to see the other syntax that begins with \.
OK, let's work through an application: how to build a pattern that matches an amount of money entered by a user.
We want a matching pattern that checks whether the input is a number representing money. We consider four ways to write an amount: "10000.00" and "10,000.00", or, without the fractional part, "10000" and "10,000". Now let's start building the pattern:
^[1-9][0-9]*$
This requires every value to start with a digit other than 0. But it also means that a single "0" fails the test. Here is the fix:
^(0|[1-9][0-9]*)$
This matches only 0 and numbers that do not start with 0. We can also allow a minus sign before the number:
^(0|-?[1-9][0-9]*)$
This means: 0, or a number that does not begin with 0 and may be preceded by a minus sign. All right, now let's be less strict and allow a leading 0. Let's also drop the minus sign, because we don't need it when representing money. We now extend the pattern to match the decimal part:
^[0-9]+(\.[0-9]+)?$
This implies that the matching string must begin with at least one digit. Note, however, that with the above pattern "10." does not match; only "10" and "10.2" do. Do you know why?
^[0-9]+(\.[0-9]{2})?$
Here we specify that there must be exactly two digits after the decimal point. If you think this is too strict, you can change it to:
^[0-9]+(\.[0-9]{1,2})?$
This allows one or two digits after the decimal point. Now let's add commas (every three digits) to improve readability:
^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?$
Don't forget that '+' can be replaced by '*' if you want to allow an empty string as input. Also don't forget that the backslash '\' can cause errors inside PHP strings (a common mistake).
Now that we can validate the string, we can strip all the commas with str_replace(",", "", $money), treat the result as a double, and then do math with it.
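The validate-then-convert steps above might be sketched as follows. The function name parseMoney is hypothetical, and the pattern is an assumption: it combines the plain-digit form and the comma-grouped form with an alternation so that all four spellings mentioned earlier are accepted, since the comma pattern on its own rejects "10000".

```php
<?php
// Hypothetical helper: the amount as a float, or null when the input is invalid.
function parseMoney(string $money): ?float
{
    // Plain digits OR comma-grouped digits, then an optional 1-2 digit fraction.
    $pattern = '/^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$/';
    if (preg_match($pattern, $money) !== 1) {
        return null;
    }
    // Strip the thousands separators before converting to a number.
    return (float) str_replace(',', '', $money);
}

var_dump(parseMoney('10,000.00')); // float(10000)
var_dump(parseMoney('10000'));     // float(10000)
var_dump(parseMoney('10.'));       // NULL: a dot must be followed by digits
var_dump(parseMoney('1,00'));      // NULL: a comma group needs three digits
```

The pattern is written in single quotes so that PHP passes the backslash in \. through to the regex engine unchanged, which sidesteps the string-escaping mistake mentioned above.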
One more example:
Constructing a regular expression to check an email address
A full email address has three parts:
1. The user name (everything to the left of '@')
2. The '@'
3. The server name (the rest)
The user name can contain uppercase and lowercase letters, digits, periods ('.'), minus signs ('-') and underscores ('_'). The server name follows the same rules, except that underscores are not allowed.
Neither the user name nor the server name may begin or end with a period, and two periods may not appear in a row: there must be at least one character between them. Now let's see how to write a matching pattern for the user name:
^[_a-zA-Z0-9-]+$
This does not yet allow periods. Let's add them:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*$
The above means: the string begins with at least one valid character (other than '.'), followed by zero or more groups that each start with a period.
To keep it simple, we can use eregi() instead of ereg(). eregi() is case-insensitive, so we don't need to specify the two ranges "a-z" and "A-Z"; one is enough:
^[_a-z0-9-]+(\.[_a-z0-9-]+)*$
The server name is handled the same way, but without the underscore:
^[a-z0-9-]+(\.[a-z0-9-]+)*$
Good. Now we just need to join the two parts with "@":
^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$
This is the complete email validation pattern. Just call:
eregi("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$", $email)
to find out whether the value is an email address or not.
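Since ereg() and eregi() were removed in PHP 7.0, the same pattern can be tried on current PHP with preg_match() and the i (case-insensitive) modifier. This is only a sketch; looksLikeEmail is a made-up name, and the pattern is the simplified one built above, not a full address validator.

```php
<?php
// Hypothetical helper built around the pattern from the text.
function looksLikeEmail(string $email): bool
{
    // The i modifier plays the role of eregi(): a-z also covers A-Z.
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i';
    return preg_match($pattern, $email) === 1;
}

var_dump(looksLikeEmail('John.Doe@example.com'));  // bool(true)
var_dump(looksLikeEmail('.john@example.com'));     // bool(false): starts with a period
var_dump(looksLikeEmail('john..doe@example.com')); // bool(false): two periods in a row
var_dump(looksLikeEmail('john_doe@exa_mple.com')); // bool(false): underscore in the server name
var_dump(looksLikeEmail('john@example'));          // bool(true): the pattern does not require a dot in the server name
```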
Other uses of regular expressions
Extracting strings
ereg() and eregi() have a feature that lets users extract part of a string through a regular expression (see the manual for details). For example, to extract the file name from a path or URL, the following code is what you need:
ereg("([^\/]*)$", $pathOrUrl, $regs);
echo $regs[1];
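On PHP 7 and later, where ereg() is no longer available, the same extraction could be written with preg_match(), which also fills an array of captured groups. The sketch below is an assumption along those lines; fileNameOf is a hypothetical name and the sample paths are invented.

```php
<?php
// Hypothetical helper: everything after the last slash or backslash.
function fileNameOf(string $pathOrUrl): string
{
    // '#' is used as the delimiter so '/' needs no escaping.
    // [^\\\\/] in PHP source becomes the regex class [^\\/]: not a backslash, not a slash.
    preg_match('#([^\\\\/]*)$#', $pathOrUrl, $regs);
    return $regs[1];
}

echo fileNameOf('http://example.com/docs/readme.txt'), "\n"; // readme.txt
echo fileNameOf('C:\\temp\\a.php'), "\n";                    // a.php
```

As in the ereg() version, index 1 of the result array holds the text captured by the first pair of parentheses.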
Advanced substitution
ereg_replace() and eregi_replace() are also very useful. Suppose we want to replace every run of separator whitespace with a comma:
ereg_replace("[ \n\r\t]+", ",", trim($str));
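A present-day equivalent can be sketched with preg_replace(), because ereg_replace() was removed in PHP 7.0 along with the other ereg functions. The function name commaSeparate and the sample input are assumptions for illustration.

```php
<?php
// Hypothetical helper: turn each run of spaces, newlines, carriage returns
// or tabs into a single comma, after trimming both ends.
function commaSeparate(string $str): string
{
    return preg_replace('/[ \n\r\t]+/', ',', trim($str));
}

echo commaSeparate("  one two\t three\nfour "), "\n"; // one,two,three,four
```

Calling trim() first keeps the result from starting or ending with a comma.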
Finally, I'll leave you another regular expression for checking email addresses to analyze:
'^[-!#$%&\'*+\\./0-9=?A-Z^_`a-z{|}~]+' . '@' . '[-!#$%&\'*+\\/0-9=?A-Z^_`a-z{|}~]+\.' . '[-!#$%&\'*+\\./0-9=?A-Z^_`a-z{|}~]+$'