
Basic knowledge points for getting started with Java regular expressions

2025-02-27 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article walks through the basic knowledge points for getting started with Java regular expressions. Many readers are not very familiar with them, so the material is shared here for reference. Let's go through it together.

1. Basic knowledge of regular expressions

1.1 period symbol

Suppose you are playing Scrabble and want to find three-letter words that start with the letter "t" and end with the letter "n". Suppose also that you have an English dictionary whose entire contents you can search with regular expressions. To construct the expression, you can use a wildcard: the period symbol ".". The complete expression is "t.n", which matches "tan", "ten", "tin" and "ton", but also "t#n", "tpn", even "t n", and many other meaningless combinations. This is because the period matches any single character, including spaces and Tab characters. In Java, the period does not match line terminators such as the newline character unless the Pattern.DOTALL flag is set.
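The behaviour of the period can be checked with a small sketch using the standard java.util.regex package. The class name DotDemo and its helper methods are hypothetical, chosen only for illustration. Note that in java.util.regex, matching a newline with the period requires the DOTALL flag.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotDemo {
    // 't', then any single character, then 'n'
    static final Pattern T_DOT_N = Pattern.compile("t.n");
    // Same expression, but DOTALL lets '.' match line terminators as well
    static final Pattern T_DOT_N_DOTALL = Pattern.compile("t.n", Pattern.DOTALL);

    // True if the whole string matches "t.n" with default flags
    static boolean matches(String s) {
        return T_DOT_N.matcher(s).matches();
    }

    // True if the whole string matches "t.n" with DOTALL enabled
    static boolean matchesDotAll(String s) {
        return T_DOT_N_DOTALL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"tan", "ten", "t#n", "t n", "toon", "t\nn"};
        for (String s : samples) {
            System.out.println("default=" + matches(s) + " dotall=" + matchesDotAll(s));
        }
    }
}
```

Only the last sample, which contains a newline, gives different results for the two patterns; "toon" fails in both cases because the period stands for exactly one character.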

1.2 square bracket symbol

To solve the problem that the period matches too broadly, you can list the characters you consider meaningful inside square brackets ("[]"). Only the characters listed inside the brackets take part in the match. That is, the regular expression "t[aeio]n" matches only "tan", "ten", "tin" and "ton". "toon" does not match, because a bracket expression matches exactly one character.
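As a minimal sketch of the bracket expression, assuming the standard java.util.regex package (the class name BracketDemo and the method matches are hypothetical):

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class BracketDemo {
    // 't', then exactly one of a, e, i, o, then 'n'
    static final Pattern T_VOWEL_N = Pattern.compile("t[aeio]n");

    // True if the whole string matches the bracket expression
    static boolean matches(String s) {
        return T_VOWEL_N.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"tan", "ten", "tin", "ton", "tun", "toon"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

"tun" is rejected because "u" is not listed in the brackets, and "toon" is rejected because the brackets consume only one character.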

1.3 "or" symbol

If you want to match "toon" in addition to all the words above, you can use the "|" operator. The basic meaning of the "|" operator is the "OR" operation. To match "toon", use the regular expression "t(a|e|i|o|oo)n". Square brackets cannot be used here because they only match a single character; parentheses "()" must be used instead. Parentheses can also be used for grouping.
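The alternation can be tried out as follows. This is only a sketch using java.util.regex; the class AlternationDemo and the helper middle are hypothetical names. The helper also shows that the parentheses form a group whose matched text can be read back.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AlternationDemo {
    // 't', then one of the alternatives a, e, i, o or oo, then 'n'
    static final Pattern WORD = Pattern.compile("t(a|e|i|o|oo)n");

    // True if the whole string matches the expression
    static boolean matches(String s) {
        return WORD.matcher(s).matches();
    }

    // Returns the text captured by the parenthesized group, or null if there is no match
    static String middle(String s) {
        Matcher m = WORD.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"tan", "ton", "toon", "teen"}) {
            System.out.println(s + " -> " + matches(s) + ", group 1 = " + middle(s));
        }
    }
}
```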

1.4 symbol indicating the number of matches

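The following table shows the syntax of the symbols that indicate the number of matches:

Table 1.1 regular expression syntax

*: the preceding element appears 0 or more times
+: the preceding element appears 1 or more times
?: the preceding element appears 0 or 1 time
{n}: the preceding element appears exactly n times
{n,m}: the preceding element appears from n to m times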

Suppose we want to search a text file for American Social Security numbers. The format of this number is 999-99-9999. The regular expression used to match it is shown in figure 1. Inside square brackets, a hyphen ("-") has a special meaning: it denotes a range, such as 0 to 9. Therefore, when matching the literal hyphens in a Social Security number, each one is preceded by the escape character "\", giving "[0-9]{3}\-[0-9]{2}\-[0-9]{4}". Outside square brackets the escape is optional but harmless.

Suppose you want the hyphens to be optional when searching, that is, both 999-99-9999 and 999999999 are in the correct format. In that case, add the "?" quantifier after each hyphen.

One format of American license plates is four digits followed by two letters. Its regular expression consists of the numeric part "[0-9]{4}" followed by the alphabetic part "[A-Z]{2}".
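Both quantifier examples can be sketched with java.util.regex as below. The class QuantifierDemo and its methods are hypothetical, and the sample inputs in main are made up for illustration. Note that inside a Java string literal the backslash itself must be doubled.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class QuantifierDemo {
    // Three digits, optional hyphen, two digits, optional hyphen, four digits
    static final Pattern SSN = Pattern.compile("[0-9]{3}\\-?[0-9]{2}\\-?[0-9]{4}");
    // Four digits followed by two upper-case letters
    static final Pattern PLATE = Pattern.compile("[0-9]{4}[A-Z]{2}");

    // True if the whole string is in Social Security number format
    static boolean isSsn(String s) {
        return SSN.matcher(s).matches();
    }

    // True if the whole string is in the license plate format
    static boolean isPlate(String s) {
        return PLATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSsn("999-99-9999"));
        System.out.println(isSsn("999999999"));
        System.out.println(isSsn("99-999-9999"));
        System.out.println(isPlate("8836KV"));
        System.out.println(isPlate("883KV"));
    }
}
```

Because each "?" applies only to the hyphen directly before it, a number with just one of the two hyphens is also accepted by this expression.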

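1.5 "Not" symbol

The "^" symbol is called the "not" symbol. When it is the first character inside square brackets, "^" indicates the characters that you do not want to match. For example, the regular expression in figure 4 matches all words, except those that begin with the letter "X". Outside square brackets, "^" has a different meaning: it anchors the match to the start of the input.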
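The figure with the expression is not reproduced here, so the sketch below assumes an expression of the form "[^X][a-z]+": one character that is not "X", followed by one or more lower-case letters. The class NegationDemo and the method matches are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the expression itself is an assumption, not taken from the figure.
class NegationDemo {
    // Any single character except 'X', then one or more lower-case letters
    static final Pattern NOT_X_WORD = Pattern.compile("[^X][a-z]+");

    // True if the whole string matches the negated expression
    static boolean matches(String s) {
        return NOT_X_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Apple", "Xray", "xray", "box"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

The negation is case-sensitive: "Xray" is rejected, while "xray" is accepted because only the upper-case "X" is excluded.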

1.6 parentheses and white space symbols

The "\s" symbol is the white space symbol; it matches any white space character, including the Tab character. Once a date string has been matched correctly, how do you then extract the month part? Just put parentheses around the month to create a group, and then use the ORO API to retrieve its value.
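The text retrieves the group through the ORO API; the sketch below uses the standard java.util.regex package instead, where Matcher.group plays the same role. The class GroupDemo, the method extractMonth, and the sample date "June 26, 1951" are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the sample date are illustrative only.
class GroupDemo {
    // Group 1 is the month name; then white space, a 1-2 digit day, a comma,
    // optional white space, and a 4-digit year
    static final Pattern DATE = Pattern.compile("([a-zA-Z]+)\\s+[0-9]{1,2},\\s*[0-9]{4}");

    // Returns the month part of a date such as "June 26, 1951", or null if the format is wrong
    static String extractMonth(String date) {
        Matcher m = DATE.matcher(date);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractMonth("June 26, 1951"));
        System.out.println(extractMonth("26/06/1951"));
    }
}
```

Group numbering starts at 1; group(0) always refers to the entire match.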

1.7 other symbols

For convenience, there are shortcut symbols for commonly used expressions, as listed below:

\t: Tab character, equivalent to \u0009

\n: newline character, equivalent to \u000A

\d: a digit, equivalent to [0-9]

\D: a non-digit, equivalent to [^0-9]

\s: a white space character, such as a newline or Tab

\S: a non-white-space character

\w: a word character, equivalent to [a-zA-Z_0-9]

\W: a non-word character, equivalent to [^\w]

For example, in the earlier Social Security number example, we can use "\d" wherever "[0-9]" appears.
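A short sketch of the shortcut symbols in Java follows. In regular expression syntax they are written with a backslash, and inside a Java string literal that backslash must be doubled (for example "\\d"). The class ShorthandDemo and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ShorthandDemo {
    // Social Security number format written with \d instead of [0-9]
    static final Pattern SSN = Pattern.compile("\\d{3}-\\d{2}-\\d{4}");
    // One or more word characters: letters, digits or underscore
    static final Pattern WORD = Pattern.compile("\\w+");

    // True if the whole string is in 999-99-9999 format
    static boolean isSsn(String s) {
        return SSN.matcher(s).matches();
    }

    // True if the whole string consists only of word characters
    static boolean isWord(String s) {
        return WORD.matcher(s).matches();
    }

    // Splits the text on runs of white space (spaces, tabs, newlines)
    static String[] splitOnWhitespace(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(isSsn("123-45-6789"));
        System.out.println(isWord("user_01"));
        System.out.println(isWord("user-01"));
        System.out.println(splitOnWhitespace("a\tb  c\nd").length);
    }
}
```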

2. A sample program, organized for reference:

package org.luosijin.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {

    public static void main(String[] args) {
        // "b*g": zero or more 'b' followed by 'g'
        Pattern pattern = Pattern.compile("b*g");
        Matcher matcher = pattern.matcher("bbg");
        System.out.println(matcher.matches());
        System.out.println(Pattern.matches("b*g", "bbg"));

        // verify the zip code
        System.out.println(Pattern.matches("[0-9]{6}", "200038"));
        System.out.println(Pattern.matches("\\d{6}", "200038"));

        // verify the phone number
        System.out.println(Pattern.matches("[0-9]{3,4}\\-?[0-9]+", "021789799"));

        getDate("Nov 10, 2009");
        charReplace();

        // verify ID card: determine whether a string is an ID number,
        // that is, whether it is a 15- or 18-digit number
        System.out.println(Pattern.matches("\\d{15}|\\d{18}", "123456789009876"));

        getString("D:/dir1/test.txt");
        getChinese("welcome to china,江西奉新,welcome,你!");
        validateEmail("luosijin123@163.com");
    }

    /** Date extraction: extract the month. */
    public static void getDate(String str) {
        String regEx = "([a-zA-Z]+)\\s+[0-9]{1,2},\\s*[0-9]{4}";
        Pattern pattern = Pattern.compile(regEx);
        Matcher matcher = pattern.matcher(str);
        if (!matcher.find()) {
            System.out.println("date format error!");
            return;
        }
        // Group indices start at 1, so the first group is group(1);
        // group(0) is the entire match.
        System.out.println(matcher.group(1));
    }

    /**
     * Character substitution: every run of one or more consecutive "a"
     * in the string is replaced with "A".
     */
    public static void charReplace() {
        String regex = "a+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher("okaaaa LetmeAseeaaa aa booa");
        String s = matcher.replaceAll("A");
        System.out.println(s);
    }

    /** String extraction: extract the file name from a path. */
    public static void getString(String str) {
        String regex = ".+/(.+)$";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        if (!matcher.find()) {
            System.out.println("incorrect file path format!");
            return;
        }
        System.out.println(matcher.group(1));
    }

    /**
     * Chinese extraction
     * @param str
     * @author Luo Sijin
     * @date 2009-11-10 12:27:17
     */
    public static void getChinese(String str) {
        // [\u4E00-\u9FFF] is the range of Chinese characters
        String regex = "[\\u4E00-\\u9FFF]+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        StringBuilder sb = new StringBuilder();
        while (matcher.find()) {
            sb.append(matcher.group());
        }
        System.out.println(sb);
    }

    /** Email validation. */
    public static void validateEmail(String email) {
        String regex = "[0-9a-zA-Z]+@[0-9a-zA-Z]+\\.[0-9a-zA-Z]+";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(email);
        if (matcher.matches()) {
            System.out.println("this is legal Email");
        } else {
            System.out.println("this is illegal Email");
        }
    }
}

That covers the basic knowledge points for getting started with Java regular expressions. Thank you for reading; I hope the content shared here is helpful.
