What are the differences between character sets ASCII, GBK, UNICODE and UTF when storing characters?

Updated 2025-01-15. From: SLTechnology News&Howtos (Shulou)

Shulou(Shulou.com)05/31 Report--

This article explains how the character sets ASCII, GBK, UNICODE and UTF differ in the way they store characters. It is meant as a short reference for interested readers.

ASCII (American Standard Code for Information Interchange) is a 7-bit code with 128 values (0 to 127). Each value is stored in one byte and represents an English letter or another half-width character.

GBK (Guo Biao Kuozhan, the GB extension) uses one byte for a character in the ASCII range and two bytes for each Chinese or full-width character.

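UCS (Universal Multiple-Octet Coded Character Set) is commonly known as UNICODE. In its two-byte form (UCS-2), every character occupies two bytes: an ASCII character keeps its value and is padded with a leading zero byte, while other characters are assigned new codes. Characters outside the Basic Multilingual Plane do not fit in two bytes.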

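UTF (UCS Transformation Format) defines how UNICODE values are stored as bytes. In UTF-8, a character in the ASCII range is represented by one byte and a typical Chinese character occupies three bytes. The UTF-8 bytes are not a direct copy of the UNICODE value; its bits are redistributed across the bytes according to a fixed rule.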
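To show why a UNICODE value and its UTF-8 bytes differ, here is a minimal sketch of the UTF-8 bit-distribution rule. The function name `utf8_encode` is hypothetical, and the sketch skips validation (for example, it does not reject surrogate code points), so it is an illustration rather than a replacement for `str.encode`.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 bytes (no validation)."""
    if code_point < 0x80:
        # 0xxxxxxx: ASCII range, stored unchanged in one byte
        return bytes([code_point])
    if code_point < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx: most Chinese characters land here
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Compare the sketch with Python's built-in encoder.
for ch in ("a", "阿"):
    print(ch, utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```

The leading bits of each byte (`1110`, `10`, `10` for a three-byte character) are markers, which is why the stored bytes look unrelated to the original code value.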

Comparison table of decimal codes for Chinese and English character sets

Character  ASCII  GBK    UNICODE  UTF8
a          97     97     97       97
阿         -      45218  38463    15308991

Comparison table of binary codes for Chinese and English character sets

Character  ASCII     GBK               UNICODE           UTF8
a          01100001  01100001          0000000001100001  01100001
阿         -         1011000010100010  1001011000111111  111010011001100010111111
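The values in the two comparison tables can be reproduced with a short sketch. It assumes the UNICODE column corresponds to Python's `utf-16-be` codec (two bytes, big-endian, which matches the code value for these characters) and that the Chinese character in the tables is 阿, which is what the GBK, UNICODE and UTF8 values decode to. The helper name `encoded_value` is hypothetical.

```python
from typing import Optional, Tuple


def encoded_value(ch: str, encoding: str) -> Optional[Tuple[int, str]]:
    """Return (decimal value, binary string) of a character's encoded bytes.

    Returns None when the character cannot be represented in the encoding.
    """
    try:
        raw = ch.encode(encoding)
    except UnicodeEncodeError:
        return None
    value = int.from_bytes(raw, "big")
    # Pad to a whole number of bytes so leading zeros stay visible.
    return value, format(value, "b").zfill(8 * len(raw))


for ch in ("a", "阿"):
    for encoding in ("ascii", "gbk", "utf-16-be", "utf-8"):
        print(ch, encoding, encoded_value(ch, encoding))
```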

From the first table, we can see that the code values of English characters (or, more accurately, the characters of the ASCII character set) are the same in every encoding, while Chinese characters are assigned a different value in each one.

From the second table, we can see that ASCII, GBK and UTF8 each use one byte for an English character, while UNICODE uses two. For a Chinese character, GBK and UNICODE use two bytes and UTF8 uses three.
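The same byte counts determine how much space a whole string takes. The following sketch measures a mixed English and Chinese string in each encoding; the function name `storage_sizes` and the sample text are illustrative assumptions, and `utf-16-be` again stands in for the two-byte UNICODE form.

```python
from typing import Dict, Optional


def storage_sizes(text: str) -> Dict[str, Optional[int]]:
    """Return the number of bytes needed to store text in each encoding.

    An encoding that cannot represent the text maps to None.
    """
    sizes = {}
    for encoding in ("ascii", "gbk", "utf-16-be", "utf-8"):
        try:
            sizes[encoding] = len(text.encode(encoding))
        except UnicodeEncodeError:
            sizes[encoding] = None
    return sizes


print(storage_sizes("Hello, 世界"))
print(storage_sizes("Hello"))
```

For text that is mostly English, UTF-8 and GBK are the most compact; for text that is mostly Chinese, GBK and the two-byte UNICODE form take less space than UTF-8.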
