2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail "how to use python to extract strings in Chinese and English": how to pull the Chinese, English, and numeric parts out of a mixed string. The steps are laid out one at a time, and I hope they help resolve your questions.
I. Sub function in re
Python's re module provides re.sub, which replaces the matches of a pattern in a string.
re.sub(pattern, repl, string, count=0)
Parameter description:
pattern: the regular expression pattern string
repl: the replacement string
string: the original string in which matches are searched for and replaced
count: the maximum number of substitutions to make. If omitted, the default is 0, which means all matches are replaced.
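As a small illustration of the count parameter described above, the sketch below removes digits from a made-up sample string. The string and the variable names are assumptions chosen only for demonstration.

```python
import re

# Hypothetical sample string, chosen only for illustration.
text = "a1b2c3"

# count=0 (the default) replaces every match.
all_removed = re.sub(r"[0-9]", "", text)

# count=2 replaces only the first two matches.
first_two_removed = re.sub(r"[0-9]", "", text, count=2)

print(all_removed)        # abc
print(first_two_removed)  # abc3
```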
1.1 extract Chinese
Think of it this way: we can just replace characters that are not in Chinese with empty ones.
For example
Import restr = "world" str = re.sub ("[A-Za-z0-9,.]" Str) print (str) output: children of God are singing
1.2 extract English
Import restr = "world" str = re.sub ("[u4e00-u9fa5-9,.]" , ", str) print (str) output: helloHworld
1.3 extract numbers
Import restr = "world" str = re.sub ("[A-Za-zu4e00-u9fa5,.]]" , str) print (str) output: 123 II. Findall function in re
findall finds all the substrings that the regular expression matches in the string and returns them as a list, or an empty list if no match is found.
The syntax format (for a compiled pattern object) is:
pattern.findall(string[, pos[, endpos]])
Parameters:
string: the string to match.
pos: optional parameter that specifies the position in the string where matching starts. The default is 0.
endpos: optional parameter that specifies the position in the string where matching ends. The default is the length of the string.
The module-level form re.findall(pattern, string, flags=0) takes the pattern as its first argument and does not accept pos or endpos.
A typical use is finding all the numbers in a string.
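A minimal sketch of finding all the numbers in a string with a compiled pattern, including the optional pos and endpos arguments. The sample string is hypothetical and not taken from the original article.

```python
import re

# Compile a pattern that matches one or more consecutive digits.
pattern = re.compile(r"\d+")

# Hypothetical sample string.
text = "abc123def456"

# Search the whole string.
all_numbers = pattern.findall(text)

# Search only text[0:8], i.e. "abc123de"; pos and endpos are passed positionally.
numbers_in_slice = pattern.findall(text, 0, 8)

# A string without digits gives an empty list.
no_numbers = pattern.findall("abcdef")

print(all_numbers)       # ['123', '456']
print(numbers_in_slice)  # ['123']
print(no_numbers)        # []
```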
Extension: re also provides match and search. Each of them returns at most one match, whereas findall returns all matches. For more information, please see the rookie tutorial.
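The difference between the three functions can be sketched as follows. The sample string is an assumption made for illustration.

```python
import re

text = "abc123def456"  # hypothetical sample string

# match succeeds only if the pattern matches at the very start of the string.
at_start = re.match(r"\d+", text)

# search scans forward and returns the first match anywhere in the string.
first = re.search(r"\d+", text)

# findall returns every non-overlapping match as a list.
every = re.findall(r"\d+", text)

print(at_start)       # None
print(first.group())  # 123
print(every)          # ['123', '456']
```

Because match and search return a match object or None, it is safer to check the result before calling group().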
2.2 extract English
Popular writing method
Import string# provides the lowercase letter dd = "Child of God hello sings in H, world" # prepares the English character temp= "" letters=string.ascii_lowercase# contains the lowercase letter of Amurz for word in dd:#for loop takes out a single word if word.lower () in letters:# to determine whether the English temp+=word# is added to make up the English word print (temp) output: helloHworld
Regular pattern
# A-Za-zimport redd = "world out of 123jianghu hello" result = '.join (re.findall (r' [A-Za-z]', dd)) print (result) output: helloHworld
2.3 extract numbers
# 0-9 pay attention to not being able to precede this number, otherwise he will even count import redd = "hello of God 123 is singing H songs., world" result = '.join (re.findall (r' [0-9]', dd)) print (result) output: 123 III. Compile function in re
The compile function compiles a regular expression and generates a regular expression (Pattern) object for use by other functions.
The syntax format is:
re.compile(pattern[, flags])
Parameters:
pattern: a regular expression in string form
flags: optional, specifies the matching mode, such as ignoring case or multiline mode. The specific flags are:
re.I ignores case
re.L makes the special character sets \w, \W, \b, \B, \s, \S depend on the current locale
re.M multiline mode
re.S makes . match any character, including a newline (by default . does not match a newline)
re.U makes the special character sets \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character property database
re.X ignores whitespace in the pattern and allows comments after # to increase readability
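A short sketch of how some of the flags listed above change matching. The patterns and sample strings are assumptions chosen for illustration; several flags can be combined with the | operator.

```python
import re

# re.I: ignore case.
word = re.compile(r"hello", re.I)
ignore_case = word.findall("Hello HELLO hello")

# re.S: "." also matches a newline character.
without_s = re.findall(r"a.b", "a\nb")
with_s = re.findall(r"a.b", "a\nb", re.S)

# re.M: "^" matches at the start of every line, not only at the start of the string.
line_starts = re.findall(r"^\w+", "one\ntwo", re.M)

# re.X: whitespace and "#" comments inside the pattern are ignored.
decimal = re.compile(r"""
    \d+   # integer part
    \.    # decimal point
    \d+   # fractional part
""", re.X)
decimals = decimal.findall("pi is 3.14, e is 2.718")

print(ignore_case)  # ['Hello', 'HELLO', 'hello']
print(without_s)    # []
print(with_s)       # ['a\nb']
print(line_starts)  # ['one', 'two']
print(decimals)     # ['3.14', '2.718']
```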
That concludes the article "how to use python to extract strings in Chinese and English". To really master this material, you still need to practice and use it yourself. If you want to read more related articles, welcome to follow the industry information channel.