Why web pages should use UTF-8 instead of GBK or GB2312

Updated 2025-04-07. From: SLTechnology News&Howtos

This article analyzes why web pages should be encoded in UTF-8 rather than GBK or GB2312, and explains how the three encodings differ.

If you have a choice, you should use UTF-8.

In fact, Windows itself has long since moved to Unicode internally, while GBK was only a stopgap measure for meeting Chinese standards.

GBK is a variable-width encoding: ASCII characters such as English letters occupy one byte, and each Chinese character occupies two bytes. To distinguish the two, the first byte of every double-byte character has its highest bit set to 1.

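UTF-8 is a variable-width encoding of Unicode, designed to handle the characters of all languages. It uses 8 bits (one byte) for each English (ASCII) character and 24 bits (three bytes) for most Chinese characters; rare characters outside the Basic Multilingual Plane take four bytes. English text is therefore the same size in UTF-8 as in GBK, so a forum with mostly English content pays no storage penalty for choosing UTF-8.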
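The byte counts can be checked with Python's built-in codecs. This is a minimal sketch; the helper names encoded_size and lead_byte_has_high_bit are made up for illustration and are not part of any library.

```python
# Compare how many bytes one English letter and one Chinese character
# occupy in GBK and in UTF-8, using Python's built-in codecs.
# The helper names below are hypothetical, chosen for this illustration.

def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def lead_byte_has_high_bit(text: str, encoding: str) -> bool:
    """Return True if the first encoded byte has its highest bit set to 1."""
    return bool(text.encode(encoding)[0] & 0x80)


for sample in ("A", "中"):
    for encoding in ("gbk", "utf-8"):
        print(sample, encoding, encoded_size(sample, encoding), "byte(s)",
              "high bit set:", lead_byte_has_high_bit(sample, encoding))
```

An English letter comes out as one byte in both encodings, while the Chinese character takes two bytes in GBK and three in UTF-8.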

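GBK covers about 21,000 Chinese characters, both simplified and traditional, but not every Chinese character that exists in Unicode.

UTF-8 can encode every character in Unicode, which covers the characters that all countries in the world need to use.

GBK is an extension of the national standard GB2312. GBK itself is not a national standard; it was issued as a technical specification guidance document.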

UTF-8-encoded text can be displayed by any browser that supports UTF-8, whatever the country or language version.

For example, a UTF-8 page containing Chinese can be displayed in an English version of IE without requiring the user to download IE's Chinese language support package.

Storage differs only for Chinese text: an English (ASCII) character takes one byte in both GBK and UTF-8, while a Chinese character takes two bytes in GBK and three bytes in UTF-8.

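Please note: although the UTF-8 version has good international compatibility, Chinese text in it needs 50% more database storage space than in the GBK/BIG5 version (three bytes per character instead of two). Where the content is almost entirely Chinese and storage is the main concern, the UTF-8 version has therefore traditionally been recommended only for users with special requirements for international compatibility.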

To put it simply:

For forums with more Chinese, GBK saves database space.

For forums with more English, UTF-8 costs no extra database space and adds international compatibility.
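The storage trade-off can be estimated for any sample of text. The sketch below is only illustrative: the function name utf8_to_gbk_ratio and the sample strings are assumptions, and a real database adds its own overhead on top of the raw byte counts.

```python
# Estimate how much larger a piece of text is in UTF-8 than in GBK.
# utf8_to_gbk_ratio is a hypothetical helper, not a library function.

def utf8_to_gbk_ratio(text: str) -> float:
    """Return (size in UTF-8) / (size in GBK) for `text`."""
    return len(text.encode("utf-8")) / len(text.encode("gbk"))


samples = {
    "pure Chinese": "中文论坛",
    "pure English": "forum",
    "mixed": "forum 论坛",
}

for label, text in samples.items():
    print(label, round(utf8_to_gbk_ratio(text), 2))
```

Pure Chinese text gives a ratio of 1.5 (the 50% overhead), pure English gives 1.0, and mixed text falls in between.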

What are the differences between GBK and GB2312?

First, what are GBK and GB2312? Both are character encodings, and they are only two of the many character encodings that exist.

And character coding can be understood like this:

A computer stores only binary values, 0s and 1s.

Eight bits make up one byte, which is often written in hexadecimal.

So how do we see the characters we want on the screen instead of a string of zeros and ones?

The computer has to convert the stored values into the corresponding characters, whether English, Chinese, or another language, and then output them to the screen.

A character encoding, then, is a set of rules that specifies which of the values stored in the computer correspond to which characters displayed on the screen.

To sum up, GBK and GB2312 are both character encodings.
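The mapping from stored values to characters can be seen directly in Python. In this minimal sketch the helper name decode_hex is an assumption made for illustration; it turns a hexadecimal string into bytes and interprets them under a given encoding.

```python
# Show that an encoding is just a rule mapping stored byte values to characters.
# decode_hex is a hypothetical helper written for this example.

def decode_hex(hex_string: str, encoding: str) -> str:
    """Interpret the bytes written as `hex_string` using `encoding`."""
    return bytes.fromhex(hex_string).decode(encoding)


# The same character is stored as different byte values under different rules.
print(decode_hex("d6d0", "gbk"))      # two bytes under the GBK rules
print(decode_hex("e4b8ad", "utf-8"))  # three bytes under the UTF-8 rules
print(decode_hex("41", "ascii"))      # one byte under the ASCII rules
```

The first two calls produce the same Chinese character from different byte values, which is exactly the sense in which each encoding is its own set of rules.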

Let's talk more about their differences and similarities:

Similarities:

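1. Both GBK and GB2312 encode each Chinese character in two bytes (16 bits), while ASCII characters remain one byte.

2. Both can be declared as the page encoding in the charset of a web page's meta tag.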

Differences:

1. GBK supports both simplified and traditional Chinese.

The full name of GBK is the "Chinese Internal Code Specification" (the K in GBK is the first letter of kuozhan, the Hanyu Pinyin for "extended"). It was formulated by the National Information Technology Standardization Technical Committee of the People's Republic of China on December 1, 1995. On December 15, 1995, the Standardization Department of the State Bureau of Technical Supervision and the Department of Science, Technology and Quality Supervision of the Ministry of Electronics Industry jointly issued it, in the form of a Technical Supervision letter, as a technical specification guidance document.

2. GB2312 only supports simplified Chinese.

The "Coded Character Set of Chinese Characters for Information Exchange" is a national standard issued by the State Administration of Standards of China in 1980 and implemented on May 1, 1981. Its standard number is GB 2312-1980.

The GB 2312 standard contains 6,763 Chinese characters, of which 3,755 are level-one characters and 3,008 are level-two characters. It also includes 682 full-width non-Chinese characters, among them the Latin alphabet, the Greek alphabet, Japanese hiragana and katakana, and the Russian Cyrillic alphabet.
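The difference in coverage can be probed with Python's built-in gb2312 and gbk codecs. This is a sketch only: can_encode is a hypothetical helper, and the two sample characters (the simplified and traditional forms of "dragon") were chosen for illustration.

```python
# Check which encodings can represent a given character.
# can_encode is a hypothetical helper written for this example.

def can_encode(char: str, encoding: str) -> bool:
    """Return True if `char` can be represented in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


simplified = "龙"   # simplified form
traditional = "龍"  # traditional form of the same character

for char in (simplified, traditional):
    for encoding in ("gb2312", "gbk", "utf-8"):
        print(char, encoding, can_encode(char, encoding))
```

The traditional form is rejected by the gb2312 codec but accepted by gbk and utf-8, while the simplified form is accepted by all three.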

If your web page is mainly for Chinese speakers, GB2312 or GBK works well and keeps text storage small, which is an advantage. If your web page is aimed at the whole world, however, some browsers may not support GB2312 or GBK, and the Chinese content of your page will turn into unreadable garbled text.
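The garbling happens whenever bytes written under one encoding are read under another. The following minimal sketch simulates it in Python by decoding GBK bytes as UTF-8; the variable names are illustrative, and a real browser's fallback behaviour may differ from what Python does here.

```python
# Simulate garbled text: bytes written as GBK but read as UTF-8.
# Variable names are chosen for this illustration only.

gbk_bytes = "中文".encode("gbk")

# A strict UTF-8 decoder rejects these bytes outright.
try:
    gbk_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# A lenient decoder substitutes replacement characters instead,
# which is what a reader sees as garbled text.
garbled = gbk_bytes.decode("utf-8", errors="replace")

# Reading the bytes with the encoding they were written in restores the text.
restored = gbk_bytes.decode("gbk")

print("strict UTF-8 decode succeeded:", decoded_ok)
print("lenient UTF-8 decode:", garbled)
print("GBK decode:", restored)
```

Declaring the correct charset, or using UTF-8 throughout, avoids the mismatch.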
