Source: Shulou (Shulou.com), SLTechnology News&Howtos, Development. Updated 2025-02-22.
This article walks through "Java Pattern and Matcher string matching example analysis". The content is simple and clearly organized, and should help clear up common doubts about how the two classes work together.
Pattern class definition
public final class Pattern extends Object implements Serializable
The compiled representation of a regular expression. It is used to compile regular expressions and create matching patterns.
Regular expressions specified as strings must first be compiled as instances of this class. The resulting pattern can then be used to create a Matcher object that matches any sequence of characters according to the regular expression. All states involved in performing matching reside in matchers, so multiple matchers can share the same pattern.
Thus, the typical invocation sequence is:
Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();
For the case where a regular expression is used only once, this class defines a convenience method, matches. It compiles the expression and matches the input sequence against it in a single call. The statement
boolean b = Pattern.matches("a*b", "aaaaab");
is equivalent to the three statements above, although it is less efficient for repeated matches because it does not allow the compiled pattern to be reused.
Instances of this class are immutable and are safe for use by multiple concurrent threads. Instances of the Matcher class are not safe for such use.
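The difference between reusing a compiled pattern and calling Pattern.matches each time can be sketched as follows. This is a minimal illustration, not code from the original article: the class name PatternReuseSketch and its helper methods are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the article.
class PatternReuseSketch {

    // Compiled once. Pattern is immutable, so one instance can be shared freely.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Each call creates a fresh Matcher, because the Matcher holds the mutable match state.
    static boolean matchesReused(String input) {
        return A_STAR_B.matcher(input).matches();
    }

    // Recompiles the expression on every call; acceptable for one-off checks.
    static boolean matchesOneOff(String input) {
        return Pattern.matches("a*b", input);
    }

    public static void main(String[] args) {
        String[] inputs = {"aaaaab", "b", "aab1"};
        for (String s : inputs) {
            System.out.println(s + " -> " + matchesReused(s) + " / " + matchesOneOff(s));
        }
    }
}
```

Both helpers return the same result for any input; the reused variant simply avoids recompiling the expression on every call.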
Pattern methods in detail
1. static Pattern compile(String regex): Compiles the regular expression and creates a Pattern instance.
The constructor of Pattern is private, so instances cannot be created directly. They are created through the static factory method compile(String regex), which compiles the given regular expression into a Pattern.
2. String pattern(): Returns the string form of the regular expression.
This is the regex argument that was passed to Pattern.compile(String regex). For example:
String regex = "\\?|\\*";
Pattern pattern = Pattern.compile(regex);
String patternStr = pattern.pattern(); // returns \?|\*
3. static Pattern compile(String regex, int flags).
This method does the same as compile(String regex) but takes an additional flags parameter, which controls the matching behavior of the regular expression. The available flags are:
Pattern.CANON_EQ: Enables canonical equivalence. Two characters are considered to match if and only if their full canonical decompositions match. By default, canonical equivalence is not taken into account.
Pattern.CASE_INSENSITIVE: Enables case-insensitive matching. By default, case-insensitive matching assumes that only characters in the US-ASCII character set are being matched. To make matching of Unicode characters case-insensitive as well, combine UNICODE_CASE with this flag.
Pattern.COMMENTS: Permits whitespace and comments in the pattern. In this mode, whitespace in the pattern is ignored (literal spaces, tabs, carriage returns and so on, not the "\s" escape), and comments starting with # are ignored until the end of the line. This mode can also be enabled with the embedded flag expression (?x).
Pattern.DOTALL: Enables dotall mode. In this mode, the expression '.' matches any character, including a line terminator. By default, '.' does not match line terminators.
Pattern.LITERAL: Enables literal parsing of the pattern. The pattern string is treated as a sequence of literal characters, so metacharacters and escape sequences have no special meaning.
Pattern.MULTILINE: Enables multiline mode. In this mode, '^' and '$' match just after and just before a line terminator, respectively, in addition to the beginning and end of the input. By default, these two expressions match only at the beginning and end of the entire input.
Pattern.UNICODE_CASE: Enables Unicode-aware case folding. When this flag is combined with CASE_INSENSITIVE, case-insensitive matching is applied to Unicode characters. By default, case-insensitive matching assumes that only US-ASCII characters are being matched.
Pattern.UNIX_LINES: Enables Unix lines mode. In this mode, only '\n' is recognized as a line terminator in the behavior of '.', '^', and '$'.
4. int flags(): Returns the match flags that were specified when this Pattern was compiled.
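A few of the flags above, and the flags() accessor, can be illustrated with a short sketch. The class name PatternFlagsSketch, the helper method names, and the sample expressions are assumptions made for illustration; only the Pattern and Matcher calls come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the article.
class PatternFlagsSketch {

    // CASE_INSENSITIVE: "hello" matches regardless of ASCII letter case.
    static boolean matchesIgnoringCase(String input) {
        return Pattern.compile("hello", Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // MULTILINE: '^' matches at the start of every line, not only at the start of the input.
    static int countLinesStartingWithItem(String input) {
        Matcher m = Pattern.compile("^item", Pattern.MULTILINE).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // DOTALL: '.' also matches line terminators when the flag is set.
    static boolean dotMatches(String input, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    // LITERAL: the '.' in "a.b" is an ordinary character, not a wildcard.
    static boolean literalMatch(String input) {
        return Pattern.compile("a.b", Pattern.LITERAL).matcher(input).matches();
    }

    // Flags are bit masks combined with '|'; flags() reports them back.
    static int combinedFlags() {
        return Pattern.compile("x", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE).flags();
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("HeLLo"));
        System.out.println(countLinesStartingWithItem("item 1\nitem 2\nother item"));
        System.out.println(dotMatches("a\nb", true) + " / " + dotMatches("a\nb", false));
        System.out.println(literalMatch("a.b") + " / " + literalMatch("axb"));
        System.out.println(combinedFlags() == (Pattern.CASE_INSENSITIVE | Pattern.MULTILINE));
    }
}
```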
5. String[] split(CharSequence input).
Pattern has a split(CharSequence input) method that splits the input around matches of the pattern and returns a String[]. The overload String[] split(CharSequence input, int limit) works the same way, with the limit parameter controlling how many pieces are produced: a positive limit caps the length of the resulting array, so the last element holds the rest of the input.
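The effect of the limit parameter can be sketched as follows. Besides the positive limit described above, the standard library treats a limit of zero like split(input) (trailing empty strings are discarded) and a negative limit as "no cap, keep trailing empty strings". The class SplitLimitSketch and its split helper are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the article.
class SplitLimitSketch {

    // Thin wrapper so each call shows the expression, the input, and the limit side by side.
    static String[] split(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        String separators = "\\?|\\*"; // a literal '?' or a literal '*'

        // limit 0 behaves like split(input): as many pieces as possible
        System.out.println(Arrays.toString(split(separators, "123?123*456*456", 0)));

        // limit 2: at most two pieces, the second one holds the unsplit remainder
        System.out.println(Arrays.toString(split(separators, "123?123*456*456", 2)));

        // limit 0 drops trailing empty strings, a negative limit keeps them
        System.out.println(Arrays.toString(split(",", "a,b,,", 0)));
        System.out.println(Arrays.toString(split(",", "a,b,,", -1)));
    }
}
```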
6. static boolean matches(String regex, CharSequence input).
A static convenience method for quick matching, suitable when the expression is used only once and must match the entire input. It compiles the given regular expression and attempts to match the whole input against it. Only a single match is performed, and the caller does not need to create a Matcher instance.
7. Matcher matcher(CharSequence input).
Pattern.matcher(CharSequence input) returns a Matcher object. The constructor of the Matcher class is also private, so a Matcher cannot be created directly; an instance can only be obtained through Pattern.matcher(CharSequence input). The Pattern class on its own supports only simple matching operations. For more powerful and convenient matching, Pattern and Matcher are used together: the Matcher class adds support for capturing groups and for matching the same input repeatedly.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
// m.pattern() returns p, that is, the Pattern object that created this Matcher
m.pattern();
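The grouping and repeated-matching support mentioned above can be sketched with a find() loop. The key=value expression, the class name FindGroupsSketch, and the extractPairs helper are assumptions chosen for illustration and do not appear in the original article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the article.
class FindGroupsSketch {

    // Group 1 captures a lowercase key, group 2 captures a run of digits.
    private static final Pattern KEY_VALUE = Pattern.compile("([a-z]+)=(\\d+)");

    // Collects every match as "key:value@startIndex".
    static List<String> extractPairs(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = KEY_VALUE.matcher(input);
        // Each successful find() advances to the next match and refreshes the group state.
        while (m.find()) {
            pairs.add(m.group(1) + ":" + m.group(2) + "@" + m.start());
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [width:10@0, height:20@10, depth:5@21]
        System.out.println(extractPairs("width=10, height=20; depth=5"));
    }
}
```

Pattern.matches or Matcher.matches could not do this, since both require the whole input to match; find() is the operation that walks through successive matches.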
Example of Pattern:
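package com.zxt.regex;

import java.util.regex.Pattern;

public class PatternTest {

    public static void main(String[] args) {
        // Compile a regular expression with Pattern.compile to create a matching pattern
        Pattern pattern = Pattern.compile("\\?|\\*");

        // pattern() returns the string form of the regular expression: \?|\*
        String patternStr = pattern.pattern();
        System.out.println(patternStr);

        // flags() returns the match flags of this Pattern; none were set here, so it prints 0
        int flag = pattern.flags();
        System.out.println(flag);

        // split() splits the string around matches of the pattern
        // 123 123 456 456
        String[] splitStrs = pattern.split("123?123*456*456");
        for (int i = 0; i < splitStrs.length; i++) {
            System.out.print(splitStrs[i] + " ");
        }
        System.out.println();

        // With limit 2 the result has at most two pieces
        // 123 123*456*456
        String[] splitStrs2 = pattern.split("123?123*456*456", 2);
        for (int i = 0; i < splitStrs2.length; i++) {
            System.out.print(splitStrs2[i] + " ");
        }
        System.out.println();

        // Split on runs of digits
        Pattern p = Pattern.compile("\\d+");
        String[] str = p.split("My QQ is: 456456, my phone is: 0532214, my email is: aaa@aaa.com");
        for (int i = 0; i < str.length; i++) {
            System.out.printf("str[%d] = %s\n", i, str[i]);
        }
        System.out.println();

        // Pattern.matches matches the string against the given pattern once
        // (it returns true only when the whole string matches)
        System.out.println("Pattern.matches(\"\\\\d+\",\"2223\") is " + Pattern.matches("\\d+", "2223"));
        // Returns false: the whole string must match, and "aa" cannot be matched
        System.out.println("Pattern.matches(\"\\\\d+\", \"2223aa\") is " + Pattern.matches("\\d+", "2223aa"));
        // Returns false: the whole string must match, and "bb" cannot be matched
        System.out.println("Pattern.matches(\"\\\\d+\",\"22bb23\") is " + Pattern.matches("\\d+", "22bb23"));
    }
}

Matcher class definition
public final class Matcher extends Object implements MatchResult
A matcher is created from a pattern by invoking the pattern's matcher method. Once created, a matcher can be used to perform three different kinds of match operations:
1. The matches method attempts to match the entire input sequence against the pattern.
2. The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
3. The find method scans the input sequence looking for the next subsequence that matches the pattern.
Each of these methods returns a boolean indicating success or failure. More information about a successful match can be obtained by querying the state of the matcher.
A matcher finds matches in a subset of its input called the region. By default, the region contains all of the matcher's input. The region can be modified via the region method and queried via the regionStart and regionEnd methods. The way the region boundaries interact with some pattern constructs can be changed.
This class also defines methods for replacing matched subsequences with new strings whose contents can, if desired, be computed from the match result. The appendReplacement and appendTail methods can be used in tandem to collect the result into an existing string buffer, or the more convenient replaceAll method can be used to create a string in which every matching subsequence in the input sequence is replaced.
The explicit state of a matcher includes the start and end indices of the most recent successful match. It also includes the start and end indices of the input subsequence captured by each capturing group in the pattern, as well as a total count of such subsequences. As a convenience, methods are also provided for returning these captured subsequences in string form.
The explicit state of a matcher is initially undefined; attempting to query any part of it before a successful match causes an IllegalStateException to be thrown. The explicit state is recomputed by every match operation. The implicit state of a matcher includes the input character sequence and the append position, which is initially zero and is updated by the appendReplacement method.
A matcher may be reset explicitly by invoking its reset() method or, if a new input sequence is desired, its reset(CharSequence) method. Resetting a matcher discards its explicit state information and sets the append position to zero.
Instances of this class are not safe for use by multiple concurrent threads.

Matcher methods in detail
1. The Matcher class provides three match operations. All three return a boolean: true if a match is found, false otherwise.
boolean matches(): the most commonly used method. It attempts to match the entire target string and returns true only when the whole string matches.
boolean lookingAt(): matches against the beginning of the string and returns true only when the match starts at the very first character.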
boolean find(): scans the string for a match, which may be at any position.
2. Methods that return the matcher's explicit state: int start() returns the index in the target string of the first character of the current match; int end() returns the index just past the last character of the current match; String group() returns the matched substring.
3. start(), end(), and group() each have an overload for working with capturing groups: int start(int i), int end(int i), and String group(int i). The Matcher class also has groupCount(), which returns the number of capturing groups.
4. The Matcher class also provides four methods for replacing matched substrings with a given string:
1) String replaceAll(String replacement): replaces every substring of the target string that matches the pattern with the given string.
2) String replaceFirst(String replacement): replaces the first substring of the target string that matches the pattern with the given string.
3) The two methods Matcher appendReplacement(StringBuffer sb, String replacement) and StringBuffer appendTail(StringBuffer sb) are also important. appendReplacement works incrementally, one match at a time rather than once or for all matches: it appends the input between the previous match and the current one to the StringBuffer, followed by the replacement. appendTail then appends the remaining unmatched input to the StringBuffer.
5. Some other methods:
Matcher reset(): resets this Matcher.
Matcher reset(CharSequence input): resets this Matcher with a new target string.
Matcher region(int start, int end): sets the limits of this matcher's region.

Matcher usage example:
package com.zxt.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherTest {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d+");

        // matches() matches against the entire string
        // Returns false: "bb" cannot be matched by \d+, so the whole-string match fails
        Matcher m = p.matcher("22bb23");
        System.out.println(m.matches());
        m = p.matcher("2223");
        // Returns true: \d+ matches the entire string
        System.out.println(m.matches());

        // lookingAt() matches against the beginning of the string
        m = p.matcher("22bb23");
        // Returns true: \d+ matches the leading "22"
        System.out.println(m.lookingAt());
        m = p.matcher("aa2223");
        // Returns false: \d+ cannot match the leading "aa"
        System.out.println(m.lookingAt());

        // find() scans the string; the match may be at any position
        m = p.matcher("22bb23");
        System.out.println(m.find()); // true
        m = p.matcher("aa2223");
        System.out.println(m.find()); // true
        m = p.matcher("aabb");
        System.out.println(m.find()); // false
        // After a failed match, the state methods throw IllegalStateException, e.g. m.start();

        m = p.matcher("aa2223bb");
        System.out.println(m.find());  // true
        System.out.println(m.start()); // 2
        System.out.println(m.end());   // 6
        System.out.println(m.group()); // 2223

        p = Pattern.compile("([a-z]+)(\\d+)");
        m = p.matcher("aaa2223bb");
        // Matches aaa2223
        m.find();
        // Returns 2, because the pattern has 2 groups
        System.out.println(m.groupCount());
        // Returns 0, the start index of the substring captured by group 1
        System.out.println(m.start(1));
        // Returns 3, the start index of the substring captured by group 2
        System.out.println(m.start(2));
        // Returns 3, the index just past the last character captured by group 1
        System.out.println(m.end(1));
        // Returns 2223, the substring captured by group 2
        System.out.println(m.group(2));
    }
}

Application examples
1. A simple email validation program
package com.zxt.regex;

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/*
 * A simple email address checking program
 */
public class EmailMatch {

    public static void main(String[] args) throws Exception {
        Scanner sc = new Scanner(System.in);
        while (sc.hasNextLine()) {
            String input = sc.nextLine();

            // Check whether the address starts with the illegal character "@"
            Pattern p = Pattern.compile("^@");
            Matcher m = p.matcher(input);
            if (m.lookingAt()) {
                System.out.println("An email address cannot start with '@'");
            }

            // Check whether the address starts with "www." (the dot is escaped so it is literal)
            p = Pattern.compile("^www\\.");
            m = p.matcher(input);
            if (m.lookingAt()) {
                System.out.println("An email address cannot start with 'www.'");
            }

            // Check for illegal characters ('-' is placed last so it is not read as a range)
            p = Pattern.compile("[^A-Za-z0-9.@_~#-]+");
            m = p.matcher(input);
            StringBuffer sb = new StringBuffer();
            boolean result = m.find();
            boolean deletedIllegalChars = false;
            while (result) {
                // Remember that illegal characters were found
                deletedIllegalChars = true;
                // Drop the illegal characters (colons, double quotes and so on)
                // and append the text before them to sb
                m.appendReplacement(sb, "");
                result = m.find();
            }
            // appendTail reads the input from the append position onward and adds it to the buffer.
            // It is called after one or more appendReplacement calls to copy the rest of the input.
            m.appendTail(sb);
            if (deletedIllegalChars) {
                System.out.println("The email address contains illegal characters such as colons or commas; please correct it");
                System.out.println("Your input was: " + input);
                System.out.println("A corrected, legal address would look like: " + sb.toString());
            }
        }
        sc.close();
    }
}

2. Validate an ID card number: it has either 15 or 18 characters, the last of which may be a letter. Also write a program that extracts the year, month, and day from it.
Regular expressions can describe complex string formats: (\d{17}[0-9a-zA-Z]|\d{14}[0-9a-zA-Z]) can be used to check whether a string is a valid 15- or 18-character ID number. In both 15- and 18-character ID numbers the birth date starts at the 7th digit (YYMMDD in digits 7 to 12 of a 15-digit number, YYYYMMDD in digits 7 to 14 of an 18-digit number). This makes it possible to design a more precise pattern that extracts the date information from the ID number.
package com.zxt.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdentityMatch {

    public static void main(String[] args) {
        // Candidate ID numbers to test
        String[] id_cards = { "130681198712092019", "13068119871209201x", "13068119871209201",
                "123456789012345", "12345678901234x", "1234567890123" };

        // Pattern that tests whether a string is a valid ID number
        Pattern pattern = Pattern.compile("(\\d{17}[0-9a-zA-Z]|\\d{14}[0-9a-zA-Z])");
        // Pattern that extracts the birthday string. It always captures eight digits,
        // so the year/month/day split below is only meaningful for 18-character numbers.
        Pattern pattern1 = Pattern.compile("\\d{6}(\\d{8}).*");
        // Pattern that breaks the birthday string into year, month, and day
        Pattern pattern2 = Pattern.compile("(\\d{4})(\\d{2})(\\d{2})");

        Matcher matcher = pattern.matcher("");
        for (int i = 0; i < id_cards.length; i++) {
            matcher.reset(id_cards[i]);
            boolean valid = matcher.matches();
            System.out.println(id_cards[i] + " is id cards:" + valid);

            // If it is a valid ID number, extract the date of birth
            if (valid) {
                Matcher matcher1 = pattern1.matcher(id_cards[i]);
                if (matcher1.lookingAt()) {
                    String birthday = matcher1.group(1);
                    Matcher matcher2 = pattern2.matcher(birthday);
                    if (matcher2.find()) {
                        System.out.println("Date of birth: " + matcher2.group(1) + "-"
                                + matcher2.group(2) + "-" + matcher2.group(3));
                    }
                }
            }
            System.out.println();
        }
    }
}

That is all for "Java Pattern and Matcher string matching example analysis". Thank you for reading!
© 2024 shulou.com SLNews company. All rights reserved.