2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shares a way to handle nested structures when searching strings in Java. It is a practical technique, and I hope you find it useful.
When parsing HTML text in Java, you cannot extract the content between nested nodes with a regular expression alone, because the regular expressions built into Java (java.util.regex) cannot describe nested structures, although those of Perl, .NET, and PHP can. Instead, first use a regular expression to find the position of every start and end node in the string, then pair the nodes up and take the content between each matched pair. This is enough to handle the nesting.
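To make the limitation concrete, here is a small sketch of what goes wrong with a single pattern. The class name NaiveRegexDemo, the div tags, and the sample input are assumptions chosen for illustration; they do not come from the original source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveRegexDemo {
    // Returns the first capture of a reluctant <div>...</div> match, or null if none.
    static String firstMatch(String data) {
        Matcher m = Pattern.compile("<div>(.*?)</div>").matcher(data);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical nested input: the outer node contains an inner node.
        String data = "<div>a<div>b</div>c</div>";
        // The reluctant quantifier stops at the first </div>, which closes the
        // inner node, so the captured text is cut off in the middle.
        System.out.println(firstMatch(data)); // prints a<div>b
    }
}
```

The pattern has no way to count how many start nodes are still open, which is why the approach described above moves the pairing out of the regular expression and into ordinary code.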
For example, given the string
data=abcd1234
you want the contents of the nodes returned as two strings:
abcd and 1234.
The source code is as follows:
To record each node's value and its position in the string, define a class that holds this information:
    public class Tag {

        public Tag(String value, int beginPos, int endPos) {
            super();
            this.value = value;
            this.beginPos = beginPos;
            this.endPos = endPos;
        }

        private String value;   // the matched node text
        private int beginPos;   // index of the node's first character
        private int endPos;     // index just past the node's last character

        public String getValue() {
            return value;
        }

        public void setValue(String value) {
            this.value = value;
        }

        public int getBeginPos() {
            return beginPos;
        }

        public void setBeginPos(int beginPos) {
            this.beginPos = beginPos;
        }

        public int getEndPos() {
            return endPos;
        }

        public void setEndPos(int endPos) {
            this.endPos = endPos;
        }
    }
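As a rough illustration of what these fields end up holding, the sketch below scans a string and records every tag hit together with its offsets. The class name TagScanDemo, the scan helper, the immutable stand-in for Tag, and the sample input are all assumptions made for this example and are not part of the original source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagScanDemo {
    // Minimal stand-in for the article's Tag class, kept immutable for brevity.
    static final class Tag {
        final String value;
        final int beginPos;
        final int endPos;

        Tag(String value, int beginPos, int endPos) {
            this.value = value;
            this.beginPos = beginPos;
            this.endPos = endPos;
        }

        @Override
        public String toString() {
            return value + "[" + beginPos + "," + endPos + ")";
        }
    }

    // Scans data for either tag and records each hit with its offsets.
    static List<Tag> scan(String data, String stag, String etag) {
        // Pattern.quote treats each tag as a literal string.
        Matcher m = Pattern.compile(Pattern.quote(stag) + "|" + Pattern.quote(etag)).matcher(data);
        List<Tag> tags = new ArrayList<>();
        while (m.find()) {
            // m.start() is inclusive, m.end() is exclusive.
            tags.add(new Tag(m.group(), m.start(), m.end()));
        }
        return tags;
    }

    public static void main(String[] args) {
        // Hypothetical input and tag strings.
        System.out.println(scan("<b><b>x</b></b>", "<b>", "</b>"));
        // prints [<b>[0,3), <b>[3,6), </b>[7,11), </b>[11,15)]
    }
}
```

Because endPos is exclusive, the content of a node runs from the start tag's endPos up to the end tag's beginPos, which are the two offsets a substring call needs.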
The function to get the content between nodes from a string is as follows:
    // Imports needed by the enclosing class
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Stack;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * Gets the content between the start and end node strings.
     * If nodes are nested, returns the content of the outermost nodes.
     *
     * @param data the string to search
     * @param stag start node string
     * @param etag end node string
     * @return the content between each outermost pair of nodes, excluding the nodes themselves
     */
    public List<String> get(String data, String stag, String etag) {
        // Holds start nodes that are still waiting for their end node
        Stack<Tag> work = new Stack<Tag>();
        // All start and end nodes, in order of appearance
        List<Tag> allTags = new ArrayList<Tag>();

        // Precede each regex metacharacter in the node strings with a backslash
        String nstag = stag.replaceAll("([\\\\*.+()\\[\\]?{}^$|])", "\\\\$1");
        String netag = etag.replaceAll("([\\\\*.+()\\[\\]?{}^$|])", "\\\\$1");

        String reg = "((?:" + nstag + ")|(?:" + netag + "))";
        Pattern p = Pattern.compile(reg, Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(data);
        while (m.find()) {
            Tag tag = new Tag(m.group(0), m.start(), m.end());
            allTags.add(tag);
        }

        // Content between matched start and end nodes, excluding the nodes
        List<String> result = new ArrayList<String>();
        for (Tag t : allTags) {
            if (stag.equalsIgnoreCase(t.getValue())) {
                work.push(t);
            } else if (etag.equalsIgnoreCase(t.getValue())) {
                // An end node with an empty stack has no start node to match
                if (work.empty()) {
                    throw new RuntimeException("pos " + t.getBeginPos() + ": end tag has no matching start tag.");
                }
                Tag otag = work.pop();
                // If the stack is now empty, this end node closes an outermost node
                if (work.empty()) {
                    String sub = data.substring(otag.getEndPos(), t.getBeginPos());
                    result.add(sub);
                }
            }
        }

        // Anything left on the stack is a start node that was never closed
        if (!work.empty()) {
            Tag t = work.pop();
            throw new RuntimeException("tag " + t.getValue() + " has no matching end tag.");
        }
        return result;
    }
The function returns a list of the content strings between the nodes.
For example, calling get(data, "", "") returns a list of two elements:
abcd, 1234
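For a call that can be run end to end, here is a condensed, self-contained variant of the same stack-matching idea. It is a sketch, not the article's code: the class name OutermostDemo, the outermost method, the div tags, and the sample string are all assumptions, and unlike get it compares tags case-sensitively and keeps only offsets on the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutermostDemo {
    // Returns the content of each outermost stag...etag node in data.
    static List<String> outermost(String data, String stag, String etag) {
        Matcher m = Pattern.compile(Pattern.quote(stag) + "|" + Pattern.quote(etag)).matcher(data);
        Deque<Integer> open = new ArrayDeque<>(); // end offsets of unmatched start tags
        List<String> result = new ArrayList<>();
        while (m.find()) {
            if (m.group().equals(stag)) {
                open.push(m.end());
            } else {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("unmatched end tag at " + m.start());
                }
                int contentStart = open.pop();
                if (open.isEmpty()) { // this end tag just closed an outermost node
                    result.add(data.substring(contentStart, m.start()));
                }
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("unmatched start tag");
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input: one nested node followed by one flat node.
        String data = "<div>ab<div>cd</div></div><div>1234</div>";
        System.out.println(outermost(data, "<div>", "</div>"));
        // prints [ab<div>cd</div>, 1234]
    }
}
```

Note that the inner node stays inside the first result untouched. Only the outermost pair is stripped, which matches the behaviour described for get.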
Note that if a node string contains regular-expression metacharacters, each metacharacter must be preceded by the escape character \ before the string is used in the pattern. The get function does this with the two replaceAll calls that build nstag and netag.
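The effect of the escaping step is easy to check in isolation. The sketch below applies a backslash-escaping replaceAll to a tag that contains metacharacters and compares it with the standard library's Pattern.quote, which reaches the same goal by quoting the whole string. The class name EscapeDemo, the escapeManually helper, and the BBCode-style tag are assumptions made for this example.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Precedes each regex metacharacter in tag with a backslash.
    static String escapeManually(String tag) {
        return tag.replaceAll("([\\\\*.+()\\[\\]?{}^$|])", "\\\\$1");
    }

    public static void main(String[] args) {
        String tag = "[b]";     // hypothetical tag containing metacharacters
        String data = "x[b]y";

        System.out.println(escapeManually(tag)); // prints \[b\]

        // Both the manual escape and Pattern.quote match the literal tag.
        System.out.println(Pattern.compile(escapeManually(tag)).matcher(data).find()); // prints true
        System.out.println(Pattern.compile(Pattern.quote(tag)).matcher(data).find());  // prints true

        // Unescaped, "[b]" is a character class and matches a lone 'b'.
        System.out.println(Pattern.compile(tag).matcher("b").find()); // prints true
    }
}
```

Pattern.quote is usually the simpler choice when the whole node string should be literal, since it does not depend on keeping a list of metacharacters complete.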
That is how to handle nested structures when searching strings in Java. The technique comes up in everyday work whenever paired markers can contain one another, and I hope this article helps you apply it.