2025-10-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article works through examples of three everyday Python topics: character encoding, regular expressions, and os operations. Each section gives a short, easy-to-use recipe for a problem that commonly causes confusion.
You have probably run into two problems while coding:
1. Why does Python produce garbled output, so that text that is clearly Chinese is displayed as "\xe4\xb8\xad\xe6\x96\x87"?
2. Why do you get the error "UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)"?
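Both symptoms are easy to reproduce. The following is a minimal sketch in Python 3 (an assumption; the article's own snippets use Python 2 syntax), and the variable names are chosen only for illustration.

```python
# Python 3 sketch (assumed version): reproduce the two symptoms above.
text = "中文"

# Symptom 1: UTF-8 encoded bytes are displayed as escape sequences,
# not as the characters they stand for.
raw = text.encode("utf-8")
print(repr(raw))  # b'\xe4\xb8\xad\xe6\x96\x87'

# Symptom 2: the ASCII codec cannot represent non-ASCII characters,
# so encoding with it raises UnicodeEncodeError.
try:
    text.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)
print(error_message)
```

In both cases the underlying cause is the same: bytes and text are being mixed up, or the wrong codec is being applied.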
Python uses Unicode as its internal representation of text, so Unicode serves as the intermediate form for any conversion between encodings: first decode the string from its source encoding into Unicode, then encode the Unicode string into the target encoding. The snippets in this article use Python 2 syntax (the unicode type, ur"" literals, and the print statement).
decode converts a string in some other encoding into Unicode: str1.decode('gb2312') converts the gb2312-encoded string str1 to Unicode. encode converts a Unicode string into another encoding: str2.encode('gb2312') converts the Unicode string str2 to gb2312. So before transcoding, work out which encoding the string is currently in, decode it to Unicode, and only then encode it to the target encoding.
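The decode-then-encode rule can be sketched as a small helper. This is an illustration in Python 3 (an assumption), where bytes holds encoded data and str is already Unicode; the function name transcode is hypothetical and not part of any library.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded bytes from one encoding to another via Unicode."""
    text = data.decode(source)   # bytes in the source encoding -> Unicode str
    return text.encode(target)   # Unicode str -> bytes in the target encoding


# Start from gb2312 bytes, as a file or socket might deliver them.
gb_bytes = "中文".encode("gb2312")
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
print(utf8_bytes)
```

Passing the wrong source encoding usually fails with UnicodeDecodeError or silently yields the wrong characters, which is why the current encoding has to be known first.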
#!/usr/bin/env python
#coding=utf-8
s = "中文"
if isinstance(s, unicode):
    # s = u"中文"
    print s.encode('gb2312')
else:
    # s = "中文"
    print s.decode('utf-8').encode('gb2312')
Filtering emoji with a regular expression
import re
text = '9'
# Negated character class: matches anything that is NOT a letter, digit,
# whitespace, Chinese character, or listed Chinese/English punctuation mark.
myre = re.compile(ur"[^A-Za-z0-9\s\r\t\n\u4e00-\u9fa5\uff08\u3008\u300e\ufe43\u2026\uff5e\uffe5\uff0c\uff1f\uff1a\u201c\u2018\uff09\u3009\u300b\u300d\u300f\ufe44\u3015\u2014\ufe4f\u3001\u3011\u3002\uff01\uff1b\u201d\u2019\[\]\(\){}\|\"\:~`!@#$%&*?./:]")
cleanEmoji = myre.sub(u'[emoji]', text)
print cleanEmoji
# output: 9
The example above is used to filter out emoji before storing data in a sqlite3 database, because for a long time we could not find a way to store emoji there. In other databases, such as MySQL, the character set can be changed to utf8mb4 so that emoji can be stored. The regular expression matches any character other than digits, letters, whitespace, Chinese characters, and Chinese and English punctuation marks; each match is replaced with "[emoji]".
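The same whitelist idea can be sketched in Python 3 (an assumption; there the re module understands \uXXXX escapes itself, so a plain raw string is enough). The pattern below is deliberately shortened to a handful of punctuation marks, and the names ALLOWED and strip_unsupported are hypothetical.

```python
import re

# Negated whitelist: ASCII letters and digits, whitespace, Chinese characters
# in U+4E00..U+9FA5, and a small sample of Chinese and English punctuation.
# Any character outside this set (emoji included) is matched.
ALLOWED = re.compile(
    r"[^A-Za-z0-9\s\u4e00-\u9fa5\uff0c\u3002\uff01\uff1f,.!?:;()\[\]-]"
)


def strip_unsupported(text: str, placeholder: str = "[emoji]") -> str:
    """Replace every character outside the whitelist with the placeholder."""
    return ALLOWED.sub(placeholder, text)


cleaned = strip_unsupported("9\U0001F600 你好!")
print(cleaned)  # 9[emoji] 你好!
```

A whitelist is stricter than matching emoji ranges directly: it also removes any symbol that was simply left out of the class, so the class has to list every character that should survive.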
os operations
os.path.join(str1, str2): joins two path components, automatically inserting \ on Windows and / on Linux, which avoids broken paths when the code runs on different operating systems.
os.path.exists(path): checks whether a path exists; typically used to check whether a file exists.
os.system: runs a terminal command.
os.remove: deletes a file.
os.removedirs: deletes empty folders.
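The calls above can be combined as follows. This is a minimal Python 3 sketch that works inside a temporary directory; the directory and file names are made up for illustration, and os.system is left out because the command to run depends on the platform.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing outside it is touched.
base = tempfile.mkdtemp()
keep = os.path.join(base, "keep.txt")   # sentinel so that base stays non-empty
with open(keep, "w") as handle:
    handle.write("sentinel")

# os.path.join picks the right separator for the current operating system.
nested = os.path.join(base, "a", "b")
os.makedirs(nested)
file_path = os.path.join(nested, "demo.txt")
with open(file_path, "w") as handle:
    handle.write("hello")

existed_before = os.path.exists(file_path)   # True: the file was just created
os.remove(file_path)                         # delete the file
existed_after = os.path.exists(file_path)    # False: the file is gone

# os.removedirs deletes the empty leaf directory, then walks upward removing
# each parent that has become empty; it stops at base, which still holds keep.txt.
os.removedirs(nested)
pruned = not os.path.exists(os.path.join(base, "a"))

# Clean up the rest of the temporary directory.
os.remove(keep)
os.rmdir(base)
```

Note that os.removedirs keeps pruning empty parent directories, so it can remove more than the single folder passed to it.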
# Delete a non-empty folder
import shutil
shutil.rmtree(path)
That concludes this look at Python character encoding, regular expressions, and os operations. Combining theory with practice is the best way to learn, so go and try the examples yourself.