
How to Use the Python String Encoding Conversion Methods encode() and decode()

Updated: 2025-04-06. From: SLTechnology News&Howtos (shulou), Development


This article explains how to use Python's string encoding conversion methods, encode() and decode(). The material is short and straightforward, so follow along step by step.

Foreword:

ASCII is one of the earliest character encodings. It covers only the 10 digits, the 26 letters in upper and lower case, and some punctuation and control characters. Standard ASCII uses 7 bits and defines 128 characters; 8-bit extended variants can represent at most 256 symbols. Either way, each character occupies only 1 byte.

As information technology developed, the writing systems of every country needed to be encoded, so encodings such as GB2312, GBK, and UTF-8 emerged. GB2312 and GBK are Chinese encoding standards in which English letters occupy 1 byte and Chinese characters occupy 2 bytes. UTF-8 is an internationally adopted encoding of Unicode, which covers the characters needed by all countries. In UTF-8, English letters occupy 1 byte and most Chinese characters occupy 3 bytes.
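The byte counts mentioned above can be checked directly. The following is a minimal sketch; the sample characters and the `sizes` dictionary are illustrative choices, not part of the original article.

```python
# Measure how many bytes one character occupies in each encoding.
samples = ["A", "中"]
codecs_to_try = ["ascii", "gb2312", "gbk", "utf-8"]

sizes = {}
for ch in samples:
    for codec in codecs_to_try:
        try:
            sizes[(ch, codec)] = len(ch.encode(codec))
        except UnicodeEncodeError:
            # This codec has no representation for the character.
            sizes[(ch, codec)] = None

for (ch, codec), size in sizes.items():
    print(f"{ch!r} in {codec}: {size}")
```

The letter occupies 1 byte in every encoding, the Chinese character occupies 2 bytes in GB2312 and GBK and 3 bytes in UTF-8, and ASCII cannot encode it at all.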

Python 3.x uses UTF-8 by default: source files are read as UTF-8 unless declared otherwise, and encode() and decode() default to UTF-8. This avoids many of the garbled-text problems with Chinese characters.

Python has two commonly used string-like types, str and bytes. A str represents Unicode text, and a bytes object represents binary data. To convert between the two, use the encode() and decode() methods.
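As a quick illustration of the difference between the two types, the sketch below encodes a short string and inspects both objects. The variable names are illustrative only.

```python
text = "中文"           # str: a sequence of Unicode characters
data = text.encode()    # bytes: the UTF-8 encoded form of the same text

print(type(text).__name__, len(text))   # str 2   (two characters)
print(type(data).__name__, len(data))   # bytes 6 (three bytes per character)
print(data[0])                          # 228: indexing bytes yields integers

restored = data.decode()                # back to str
print(restored == text)                 # True
```

Note that len() counts characters for str but bytes for bytes, which is why the two lengths differ.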

1. Python encode() method

The encode() method is a method of the string type (str). It converts a str to bytes, a process known as "encoding".

The syntax of the encode() method is as follows:

str.encode([encoding="utf-8"][, errors="strict"])

Note: parameters enclosed in [] are optional; you may supply or omit them when calling the method.

The meaning of each parameter of this method is shown in Table 1.

Table 1. Parameters of encode() and their meanings

Parameter: meaning

str: the string to be converted.

encoding="utf-8": the character encoding to use when encoding; the default is UTF-8. For example, to encode simplified Chinese text you can pass "gb2312". When this is the only argument, you can omit "encoding=" and pass the encoding name directly, as in str.encode("UTF-8").

errors="strict": how encoding errors are handled. Its possible values include:

strict: raise a UnicodeEncodeError when a character cannot be encoded.

ignore: silently drop characters that cannot be encoded.

replace: replace each character that cannot be encoded with "?".

xmlcharrefreplace: replace each character that cannot be encoded with its XML character reference.

The default value of this parameter is strict.
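To see how the error handlers differ, the sketch below encodes a mixed string to ASCII, which cannot represent Chinese characters. The sample string and variable names are illustrative assumptions.

```python
text = "Python 中文"

# strict (the default): an unencodable character raises UnicodeEncodeError.
try:
    text.encode("ascii")
    strict_outcome = "no error"
except UnicodeEncodeError:
    strict_outcome = "UnicodeEncodeError"

ignored = text.encode("ascii", errors="ignore")              # characters dropped
replaced = text.encode("ascii", errors="replace")            # each becomes b"?"
xml_refs = text.encode("ascii", errors="xmlcharrefreplace")  # &#NNNN; references

print(strict_outcome)  # UnicodeEncodeError
print(ignored)         # b'Python '
print(replaced)        # b'Python ??'
print(xml_refs)        # b'Python &#20013;&#25991;'
```

Only strict reports the problem; the other handlers lose or rewrite data silently, so they are best reserved for cases where that loss is acceptable.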

Note: encode() does not modify the original string. Strings are immutable, so the method returns a new bytes object. If you want the variable to hold the encoded result, assign the return value back to it.
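A small sketch of this behavior, using illustrative variable names:

```python
text = "中文"
encoded = text.encode("gbk")

print(text)      # 中文 (the original str is unchanged)
print(encoded)   # b'\xd6\xd0\xce\xc4'

# To make a variable hold the encoded form, re-assign the return value.
value = "中文"
value = value.encode("gbk")
print(type(value).__name__)   # bytes
```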

[Example 1] Convert the str "C语言中文网" to bytes.

>>> text = "C语言中文网"
>>> text.encode()
b'C\xe8\xaf\xad\xe8\xa8\x80\xe4\xb8\xad\xe6\x96\x87\xe7\xbd\x91'

This method uses UTF-8 encoding by default, or you can specify other encoding formats manually, such as:

>>> text = "C语言中文网"
>>> text.encode('GBK')
b'C\xd3\xef\xd1\xd4\xd6\xd0\xce\xc4\xcd\xf8'

2. Python decode() method

In contrast to encode(), the decode() method converts binary data of type bytes to type str, a process known as "decoding".

The syntax of the decode() method is as follows:

bytes.decode([encoding="utf-8"][, errors="strict"])

The meaning of the parameters in this method is shown in Table 2.

Table 2. Parameters of decode() and their meanings

Parameter: meaning

bytes: the binary data to be converted.

encoding="utf-8": the character encoding to use when decoding; the default is UTF-8. When this is the only argument, you can omit "encoding=" and pass the encoding name directly. Note that bytes data should be decoded with the same encoding that was used to encode it.

errors="strict": how decoding errors are handled. Its possible values include:

strict: raise a UnicodeDecodeError when a byte sequence cannot be decoded.

ignore: silently drop bytes that cannot be decoded.

replace: replace each undecodable sequence with the Unicode replacement character U+FFFD (rather than "?", which is used only when encoding).

xmlcharrefreplace: applies only when encoding; it is not supported by decode().

The default value of this parameter is strict.
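The decoding error handlers can be compared in the same way. The sketch below uses a byte string ending in 0xff, a byte that is never valid in UTF-8; the data and variable names are illustrative assumptions.

```python
data = b"abc\xff"   # 0xff can never appear in valid UTF-8

# strict (the default): invalid input raises UnicodeDecodeError.
try:
    data.decode("utf-8")
    strict_outcome = "no error"
except UnicodeDecodeError:
    strict_outcome = "UnicodeDecodeError"

ignored = data.decode("utf-8", errors="ignore")     # invalid byte dropped
replaced = data.decode("utf-8", errors="replace")   # invalid byte becomes U+FFFD

print(strict_outcome)    # UnicodeDecodeError
print(repr(ignored))     # 'abc'
print(repr(replaced))    # 'abc\ufffd'
```

Because ignore and replace hide the fact that the input was not valid UTF-8, they do not recover the original text; decoding with the correct encoding is the actual fix.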

[Example 2] Encode a str to bytes, then decode it back to str.

>>> text = "C语言中文网"
>>> data = text.encode()
>>> data.decode()
'C语言中文网'

Note: if the data was not encoded with the default UTF-8, you must decode it with the same encoding that was used to encode it; otherwise an exception is typically raised. For example:

>>> text = "C语言中文网"
>>> data = text.encode("GBK")
>>> data.decode()  # decodes as UTF-8 by default and raises the following exception
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 1: invalid continuation byte
>>> data.decode("GBK")
'C语言中文网'

That concludes this introduction to the Python string encoding conversion methods encode() and decode(). Try the examples yourself to confirm how they behave in practice.
