2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail how many bytes a Chinese character occupies in MySQL. The editor finds the topic very practical and shares it here as a reference; hopefully you will take something away from it.
In MySQL, the number of bytes a Chinese character occupies depends on the character encoding: with GBK, a Chinese character takes 2 bytes; with UTF8, a commonly used Chinese character takes 3 bytes and an English letter takes 1 byte. (Rare characters outside the Basic Multilingual Plane take 4 bytes in UTF-8 and need the utf8mb4 character set in MySQL.)
How many bytes does a Chinese character occupy in MySQL?
1. The number of bytes per Chinese character depends on the encoding:
UTF8: one Chinese character = 3 bytes
GBK: one Chinese character = 2 bytes
UTF8: one English letter = 1 byte
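The byte counts listed above can be checked outside the database. The following is a minimal sketch using Python's built-in codecs rather than MySQL itself; the helper name `byte_length` is made up for illustration.

```python
# Byte length of a single character under a given encoding.
# `byte_length` is a hypothetical helper, not a MySQL or Python built-in.
def byte_length(ch: str, encoding: str) -> int:
    """Return how many bytes `ch` occupies once encoded with `encoding`."""
    return len(ch.encode(encoding))


# Compare one Chinese character and one English letter under both encodings.
for ch in ("中", "a"):
    for enc in ("utf-8", "gbk"):
        print(ch, enc, byte_length(ch, enc))
```

The English letter is one byte under both encodings, while the Chinese character is three bytes in UTF-8 and two in GBK.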
2. How many Chinese characters can varchar(n) store?
varchar(n) holds up to n characters. Whether they are Chinese characters or English letters, MySQL can store n of them; only the actual byte length differs.
3. How do you check the length (number of bytes) in MySQL?
The LENGTH function in SQL returns the length in bytes:
SELECT LENGTH(fieldname) FROM tablename;
Description:
UTF-8 (Unicode Transformation Format, 8-bit) allows a BOM but usually does not contain one. It is a multi-byte encoding designed to handle international text: it uses 8 bits (one byte) for English letters and 24 bits (three bytes) for commonly used Chinese characters. UTF-8 covers the characters needed by all countries, which makes it an international encoding with strong versatility. UTF-8-encoded text can be displayed in any browser that supports the UTF8 character set, regardless of country. For example, Chinese text encoded as UTF8 displays correctly in an English-language copy of IE without the user having to download IE's Chinese language support package.
GBK is an extension of the national standard GB2312 and remains compatible with it. GBK encodes each Chinese character in two bytes, while ASCII characters such as English letters remain one byte. To tell the two apart, the highest bit of the first byte of a two-byte character is set to 1. GBK covers the Chinese characters in common use, but it is a national encoding and is less versatile than UTF8; on the other hand, Chinese text stored as UTF8 takes up more database space than the same text stored as GBK.
GBK, GB2312, and similar encodings can only be converted to and from UTF8 by going through Unicode:
GBK, GB2312 -> Unicode -> UTF8
UTF8 -> Unicode -> GBK, GB2312
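The two-step conversion above maps directly onto decode and encode calls in Python, where a `str` plays the role of the intermediate Unicode form. This is a minimal sketch; the helper name `transcode` is an assumption made for illustration.

```python
# Convert bytes from one encoding to another by way of Unicode.
# `transcode` is a hypothetical helper name.
def transcode(data: bytes, source: str, target: str) -> bytes:
    text = data.decode(source)   # source bytes -> Unicode string
    return text.encode(target)   # Unicode string -> target bytes


gbk_bytes = "中文".encode("gbk")                    # 2 characters, 2 bytes each
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")   # GBK -> Unicode -> UTF8
round_trip = transcode(utf8_bytes, "utf-8", "gbk")  # UTF8 -> Unicode -> GBK

print(len(gbk_bytes), len(utf8_bytes), round_trip == gbk_bytes)
```

Decoding with the wrong source encoding is the usual cause of garbled text, so the source encoding has to be known before converting.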
GB2312 is a subset of GBK, and GBK is a subset of GB18030.
GBK is a large character set that covers the ideographs shared by Chinese, Japanese, and Korean.
To avoid most garbled-text problems, UTF-8 is the recommended choice, and it also makes future internationalization much easier.
UTF8 can be thought of as one large character set that includes encodings for most of the world's writing systems.
One benefit of using UTF8 is that users in other regions (such as Hong Kong and Taiwan) can read your text without garbling and without installing simplified Chinese support.
Summary:
GB2312 is a simplified Chinese encoding
GBK supports both simplified and traditional Chinese
Big5 supports traditional Chinese
UTF8 supports almost all characters
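The coverage differences in this summary can be spot-checked with Python's standard codecs. The sketch below is only an illustration: `can_encode` is a hypothetical helper, and the two sample characters (simplified 龙 and traditional 龍) are chosen for the example, not taken from the article.

```python
# Check whether a piece of text can be represented in a given encoding.
# `can_encode` is a hypothetical helper name.
def can_encode(text: str, encoding: str) -> bool:
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


simplified = "龙"   # simplified form of "dragon"
traditional = "龍"  # traditional form of "dragon"

for enc in ("gb2312", "gbk", "big5", "utf-8"):
    print(enc, "encodes traditional 龍:", can_encode(traditional, enc))
```

The traditional character is rejected by GB2312 but accepted by GBK, Big5, and UTF-8, while the simplified character is accepted by GB2312, GBK, and UTF-8.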
That concludes this look at how many bytes Chinese characters occupy in MySQL. I hope the content above is helpful and that you learned something from it. If you found the article useful, feel free to share it so more people can see it.