How to use grouping and substitution in Java regular expressions

Updated 2025-02-25. From: SLTechnology News&Howtos


This article shows how to use grouping and substitution in Java regular expressions.

Subexpressions (groups) in regular expressions are not easy to understand, but they are a powerful text-processing tool.

1 Regular expression warm-up

Matching phone numbers

    // Mobile number match
    // Only the prefixes 13xxx, 15xxx, 17xxx and 18xxx are accepted
    System.out.println("18304072984".matches("1[3578]\\d{9}")); // true

    // Landline numbers: 010-65784236, 0316-3312617, 022-12465647, 03123312336
    String regex = "0\\d{2}-?\\d{8}|0\\d{3}-?\\d{7}";
    String telStr = "010-43367458";
    System.out.println(telStr.matches(regex)); // true

Matching email addresses

    String mail = "i@jiaobuchong.com.cn";
    String reg = "[a-zA-Z_0-9]+@[a-zA-Z0-9]+(\\.[a-zA-Z]+){1,2}";
    System.out.println(mail.matches(reg)); // true

Special character substitution

Replace every character that is not a Chinese character with an empty string:

    String input = "狄仁&*%$杰之四大天王@bdfbdbdfdgds23532";
    String reg = "[^\\u4e00-\\u9fa5]";
    input = input.replaceAll(reg, "");
    System.out.println(input); // 狄仁杰之四大天王

The range \u4e00-\u9fa5 covers the commonly used Chinese characters in Unicode.
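As an alternative to hard-coding the \u4e00-\u9fa5 range, java.util.regex also accepts Unicode script classes. The following is a minimal sketch, assuming Java 7 or later; the class name HanFilter and the method keepHan are hypothetical names chosen for illustration. The script class \p{IsHan} covers more characters than the range above (for example, the extension blocks), so the two approaches can differ on rare characters. The sample string uses the escapes \u4e2d and \u6587, which are the two characters of 中文.

```java
import java.util.regex.Pattern;

final class HanFilter {
    // \P{IsHan} (capital P) matches any character whose Unicode script is NOT Han.
    private static final Pattern NON_HAN = Pattern.compile("\\P{IsHan}");

    // Hypothetical helper: removes every non-Han character from the input.
    static String keepHan(String input) {
        return NON_HAN.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Only the two Han characters survive; symbols, letters and digits are dropped.
        System.out.println(keepHan("\u4e2d&*%$\u6587@abc123"));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the expression on every call, which String.replaceAll would otherwise do.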

2 Grouping

A group is a part of a regular expression enclosed in parentheses, and it can be referred to by its number. Group 0 represents the entire expression, group 1 represents the group enclosed by the first pair of parentheses, and so on.

The Javadoc for Pattern describes it as follows:

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:

1. ((A)(B(C)))

2. (A)

3. (B(C))

4. (C)

For example, A(B(C))D has three groups: group 0 is ABCD, group 1 is BC, and group 2 is C.

You can tell how many groups there are by counting the left parentheses; the expression inside each pair of parentheses is called a subexpression. Only capturing parentheses count: a non-capturing group such as (?:...) gets no number.
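The numbering rule can be checked directly. The sketch below lists group 0 through groupCount() for a full match; the class name GroupNumbering and the helper groups are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupNumbering {
    // Returns groups 0..groupCount() for a full match of regex against input,
    // or an empty list when the input does not match.
    static List<String> groups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four left parentheses: groups 1..4, plus group 0 for the whole match.
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
        // Two left parentheses: group 1 is BC, group 2 is C.
        System.out.println(groups("A(B(C))D", "ABCD"));   // [ABCD, BC, C]
        // (?:...) is non-capturing, so only (C) receives a number.
        System.out.println(groups("A(?:B(C))D", "ABCD")); // [ABCD, C]
    }
}
```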

Eg1:

The Matcher object provides many methods:

groupCount() returns the number of capturing groups in the pattern, which equals the number of capturing left parentheses; group 0 is not included in this count

group(int i) returns the text matched by the given group, or null if the match succeeded but that group matched nothing

start(int group) returns the start index of the text matched by the given group

end(int group) returns the index of the last character matched by the given group, plus one

    // Requires java.util.regex.Matcher and java.util.regex.Pattern
    // This regular expression has two groups:
    // group(0) is \$\{([^{}]+)\}
    // group(1) is ([^{}]+)
    String regex = "\\$\\{([^{}]+)\\}";
    Pattern pattern = Pattern.compile(regex);
    String input = "${name}-babalala-${age}-${address}";
    Matcher matcher = pattern.matcher(input);
    System.out.println(matcher.groupCount());
    // find() moves forward through the input string like an iterator
    while (matcher.find()) {
        System.out.println(matcher.group(0) + ", pos: " + matcher.start() + "-" + (matcher.end() - 1));
        System.out.println(matcher.group(1) + ", pos: " + matcher.start(1) + "-" + (matcher.end(1) - 1));
    }

Output:

    1
    ${name}, pos: 0-6
    name, pos: 2-5
    ${age}, pos: 17-22
    age, pos: 19-21
    ${address}, pos: 24-33
    address, pos: 26-32

group() or group(0) returns whatever the entire regular expression matched on the current call to find().

group(1) returns what the first pair of parentheses (the subexpression) matched.
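The method list above notes that group(int) returns null when a group takes no part in an otherwise successful match. The following sketch illustrates that case with an optional group; the pattern, the class name OptionalGroupDemo, and the helper valueOrDefault are assumptions made for this example and do not come from the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupDemo {
    // Group 1 is the key; group 2 is the value and sits inside an optional,
    // non-capturing wrapper, so it may not take part in the match.
    private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)(?:=(\\w+))?");

    // Hypothetical helper: returns the value part, or fallback when it is absent.
    static String valueOrDefault(String token, String fallback) {
        Matcher m = KEY_VALUE.matcher(token);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a key or key=value token: " + token);
        }
        String value = m.group(2); // null when "=value" is missing
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Matcher m = KEY_VALUE.matcher("debug");
        if (m.matches()) {
            System.out.println(m.group(1)); // debug
            System.out.println(m.group(2)); // null
            System.out.println(m.start(2)); // -1
        }
        System.out.println(valueOrDefault("level=3", "none")); // 3
    }
}
```

Checking for null before using a group avoids a NullPointerException when the pattern contains optional parts or alternatives.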

Eg2:

To see the groups more clearly, wrap the regular expression from Eg1 in one more pair of parentheses:

    String regex = "(\\$\\{([^{}]+)\\})";
    Pattern pattern = Pattern.compile(regex);
    String input = "${name}-babalala-${age}-${address}";
    Matcher matcher = pattern.matcher(input);
    // find() scans the input repeatedly; each successful match exposes
    // several groups, from which we can extract the results we want
    while (matcher.find()) {
        System.out.println(matcher.group(0) + ", pos: " + matcher.start());
        System.out.println(matcher.group(1) + ", pos: " + matcher.start(1));
        System.out.println(matcher.group(2) + ", pos: " + matcher.start(2));
    }

Output:

    ${name}, pos: 0
    ${name}, pos: 0
    name, pos: 2
    ${age}, pos: 17
    ${age}, pos: 17
    age, pos: 19
    ${address}, pos: 24
    ${address}, pos: 24
    address, pos: 26

The output shows that each additional pair of parentheses adds one group, so the number of groups can be determined by counting the left parentheses.

Using group() to retrieve the text matched by a group is a widely used technique.

In one of my projects, this feature made a very neat wildcard replacement possible.
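The author's project code is not shown, so the following is only a sketch of how such a placeholder replacement might look. It reuses the ${...} pattern from Eg1 together with Matcher.appendReplacement and Matcher.appendTail; the class name PlaceholderFiller, the method fill, and the sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderFiller {
    // Same pattern as Eg1: group 1 captures the name inside ${...}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^{}]+)\\}");

    // Replaces each ${key} with values.get(key); unknown keys are left untouched.
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String replacement = values.getOrDefault(key, matcher.group(0));
            // quoteReplacement keeps '$' and '\' in the replacement from being
            // interpreted as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "Tom");
        values.put("age", "20");
        // ${address} has no value, so it stays as it is.
        System.out.println(fill("${name}-babalala-${age}-${address}", values));
        // prints Tom-babalala-20-${address}
    }
}
```

Leaving unknown placeholders in place is a design choice made for this sketch; a real application might prefer to throw an exception or substitute an empty string.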

Eg3 (extract the desired data by grouping):

    // This regular expression extracts the "number" and "letter" parts of the string
    Pattern pattern = Pattern.compile("([0-9]+).*?([a-zA-Z]+)");
    String input = "that's 20200719. Sunny. 122432 what should be taken to compete with tears twinkle";
    Matcher matcher = pattern.matcher(input);
    // Number of groups in each matched substring
    int group = matcher.groupCount();
    // If the input contains several matching substrings, find() succeeds several times
    while (matcher.find()) {
        System.out.println("matched substring: " + matcher.group()); // the matched substring
        // The source text is cut off inside this loop header; printing each
        // group in turn is an assumed completion
        for (int i = 1; i <= group; i++) {
            System.out.println("group " + i + ": " + matcher.group(i));
        }
    }
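Groups can also be referenced inside a replacement string, where $1, $2 and so on stand for the text captured by the corresponding group. The sketch below is an illustration under assumed names (GroupReferenceDemo, dashDates) and an assumed date pattern; it rewrites an eight-digit date such as the 20200719 in the Eg3 input.

```java
import java.util.regex.Pattern;

final class GroupReferenceDemo {
    // Three groups: year (4 digits), month (2 digits), day (2 digits).
    private static final Pattern DATE = Pattern.compile("(\\d{4})(\\d{2})(\\d{2})");

    // Rewrites every yyyyMMdd run as yyyy-MM-dd using group references.
    static String dashDates(String input) {
        return DATE.matcher(input).replaceAll("$1-$2-$3");
    }

    public static void main(String[] args) {
        // 122432 has only six digits, so it is not touched.
        System.out.println(dashDates("that's 20200719. Sunny. 122432"));
        // prints that's 2020-07-19. Sunny. 122432
    }
}
```

The pattern only checks the digit count, not whether the digits form a valid calendar date.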
