Specific methods for setting the MySQL character set

2025-01-15 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article covers the specific methods of setting the MySQL character set, together with the background on character sets needed to choose one. It is intended as a practical reference.

Character set introduction

Gbk/gb2312

Both use a double-byte encoding: each Chinese character is represented by two bytes, with the highest bit set to 1 to distinguish it from single-byte ASCII characters such as English letters and digits.

Gb2312 is a subset of gbk, gbk is a subset of gb18030, and gb2312 can only store simplified Chinese characters

Gbk is a large character set that covers Chinese, Japanese and Korean ideographs

Usually the gbk character set is sufficient

Gbk is less internationally portable than utf8, but Chinese text stored as utf8 takes more space than in gbk (MySQL's utf8 uses up to three bytes per character, versus two in gbk)
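The size difference can be checked outside MySQL. The following is a minimal sketch using Python's built-in `gbk` and `utf-8` codecs, on the assumption that they match the byte widths of the MySQL character sets of the same name; the helper name `encoded_size` and the sample strings are illustrative only.

```python
# Compare how many bytes the same text occupies under GBK and UTF-8.
def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))

chinese = "数据库"   # three simplified Chinese characters ("database")
english = "MySQL"    # five ASCII letters

# Map (label, encoding) -> size in bytes.
sizes = {
    (label, enc): encoded_size(text, enc)
    for label, text in (("chinese", chinese), ("english", english))
    for enc in ("gbk", "utf-8")
}

for (label, enc), size in sizes.items():
    print(f"{label:8s} {enc:6s} {size} bytes")
```

For Chinese text the UTF-8 form is half as large again as the GBK form, while ASCII text is the same size in both.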

Utf8/utf8mb4

UTF stands for Unicode Transformation Format. It is a way of storing Unicode using a variable-length character encoding; Unicode itself is also known as the universal code

Use utf8 for the database character set wherever possible; the connection, the result set, and the final HTML pages must all be consistent with it

UTF8 stores Unicode characters in a variable number of bytes: ASCII letters still take 1 byte; accented letters, Greek letters, Cyrillic letters and the like take 2 bytes; and commonly used Chinese characters take 3 bytes. In other words, an English character occupies one byte, and a Chinese character (including traditional Chinese) occupies three bytes.
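The byte widths listed above can be confirmed with a short sketch. It uses Python's built-in `utf-8` codec; the sample characters and the names `samples`, `byte_len` and `utf8_widths` are chosen here for illustration.

```python
# One sample character per class mentioned in the text, written as escapes
# so that the exact code point is unambiguous.
samples = {
    "ASCII letter": "A",            # U+0041
    "accented letter": "\u00e9",    # U+00E9, e with acute accent
    "Greek letter": "\u03a9",       # U+03A9, capital omega
    "Cyrillic letter": "\u042f",    # U+042F, capital ya
    "Chinese character": "\u4e2d",  # U+4E2D
}

def byte_len(ch: str, encoding: str = "utf-8") -> int:
    """Number of bytes one character occupies in the given encoding."""
    return len(ch.encode(encoding))

utf8_widths = {label: byte_len(ch) for label, ch in samples.items()}

for label, width in utf8_widths.items():
    print(f"{label}: {width} byte(s) in UTF-8")
```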

Utf8mb4 stores each character in up to 4 bytes, so it can represent more characters than utf8. It is often used in projects to store emoji.
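To see why emoji need utf8mb4, the sketch below checks whether every character of a string fits in three UTF-8 bytes. It assumes that MySQL's utf8 accepts exactly the characters whose UTF-8 form is at most three bytes long; the function name `fits_three_byte_utf8` is hypothetical.

```python
def fits_three_byte_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)

emoji = "\U0001F600"  # grinning face emoji, outside the three-byte range
emoji_size = len(emoji.encode("utf-8"))

print(emoji_size)                            # bytes needed for the emoji
print(fits_three_byte_utf8("中文"))           # plain Chinese text
print(fits_three_byte_utf8("hi " + emoji))   # text containing an emoji
```

A check like this could be run on application input before writing to a column that still uses the three-byte utf8 character set.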

Latin1 is an 8-bit (1-byte) character set, but it cannot cover Asian and African languages

Unicode is an extension of latin1 that adds support for the common languages of Asia and Africa, but it still does not support every language, and ASCII text is inefficient to represent in fixed-width Unicode encodings (converting from a large character set to a small one is often accompanied by the loss of characters)

Utf8 is an encoding of Unicode

Character sets such as gbk, gb2312 and utf8 cannot be converted to each other directly; the data must pass through Unicode as the intermediate form.
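The conversion path through Unicode can be sketched as follows. This is an illustration with Python's built-in codecs rather than MySQL's own conversion routine; the helper name `transcode` is hypothetical.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one character set to another via Unicode."""
    text = data.decode(source)   # source bytes -> Unicode code points
    return text.encode(target)   # Unicode code points -> target bytes

gbk_bytes = "中文".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
round_trip = transcode(utf8_bytes, "utf-8", "gbk")

# Converting to a smaller character set loses characters: latin-1 has no
# Chinese characters, so each one is replaced by "?".
lossy = "中文".encode("latin-1", errors="replace")

print(gbk_bytes.hex(), utf8_bytes.hex(), round_trip == gbk_bytes, lossy)
```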

Recommendations for the use of character sets

1. Choose gbk / gb2312 only when you are certain that all end users are Chinese

2. To make data migration and display on multiple kinds of terminal easier, prefer utf8.

3. When comparisons do not need to be case-sensitive, use the default xx_ci collation; otherwise choose the xx_bin collation (in a production environment, try not to change the collation)

4. Before MySQL 8.0 the default character set is latin1. It stores each Chinese character split into separate bytes, so retrieval results are not accurate enough. Its advantage is that it saves space, but it is not recommended.
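The problem with latin1 in the last recommendation can be illustrated as follows. This is a sketch using Python's `latin-1` codec as a stand-in for MySQL's latin1 (the two differ in a few byte positions, which does not affect this example); the variable names are illustrative.

```python
original = "中文"                     # 2 characters
stored = original.encode("utf-8")    # the 6 bytes actually written

# A latin1 column treats every byte as one character, so the same data
# looks like 6 unrelated characters instead of 2 Chinese ones.
misread = stored.decode("latin-1")

# The bytes are intact, so the text is recoverable, but character counts,
# substring operations and comparisons all work on the wrong units.
recovered = misread.encode("latin-1").decode("utf-8")

print(len(original), len(misread), recovered == original)
```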

MySQL character set scope

Server (server) > database (database) > table (table) > column (column) > connection (connection) | result set (result)

MySQL character set priority

Connection (connection) | result set (result) > column (column) > table (table) > database (database) > server (server)

Char type

Char (N): N is the number of characters (also called the character length), not the number of bytes

Char (N): fixed-length storage. The value always occupies a fixed amount of space, and any unused part is padded with spaces; when reading a char (N) value, MySQL strips the trailing spaces before returning it.

Storage space: the storage space of the char (N) type depends on the character set. As covered in the character set introduction above, a Chinese character occupies 3 bytes in the utf8 character set and 2 bytes in the gbk character set, while digits and letters occupy one byte.

Storage mechanism: when a value is shorter than N characters, MySQL pads it with spaces (hexadecimal 20) when storing it.

Think about:

For char (30), how many letters or Chinese characters can be stored, and how much space is occupied, under different character sets?

Gbk: 30 characters can be stored, occupying up to 30*2 bytes

Utf8: 30 characters can be stored, occupying up to 30*3 bytes
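The arithmetic behind this exercise can be written out as a small sketch. It assumes the per-character maximums described earlier (2 bytes for gbk, 3 for utf8, 4 for utf8mb4) and gives an upper bound rather than the exact on-disk size of any storage engine; the names `MAX_BYTES_PER_CHAR` and `char_max_bytes` are hypothetical.

```python
# Maximum bytes per character for each character set discussed in the text.
MAX_BYTES_PER_CHAR = {"gbk": 2, "utf8": 3, "utf8mb4": 4}

def char_max_bytes(n: int, charset: str) -> int:
    """Upper bound, in bytes, for a CHAR(n) column in the given charset."""
    return n * MAX_BYTES_PER_CHAR[charset]

char30 = {cs: char_max_bytes(30, cs) for cs in MAX_BYTES_PER_CHAR}

for charset, size in char30.items():
    print(f"char(30) in {charset}: 30 characters, up to {size} bytes")
```

The character count stays at 30 in every case; only the byte budget changes with the character set.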

Varchar type

Varchar (N): N is the number of characters (also called the character length), not the number of bytes

Varchar (N): variable-length storage, using only the storage space that is needed.

Storage space: the storage space of the varchar (N) type depends on the character set. As covered in the character set introduction above, a Chinese character occupies 3 bytes in the utf8 character set and 2 bytes in the gbk character set, while digits and letters occupy one byte.

Storage mechanism: the first byte of a varchar (N) field marks whether the value is NULL, so the stored data actually starts from the second byte. The next 1 to 2 bytes record the actual length, and the rest is available for data, so the maximum usable storage is 65535 - 3 = 65532 bytes. (If the value needs no more than 255 bytes, one byte records the length; if it may need more than 255 bytes, two bytes are used.)
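From the 65532-byte figure, the largest usable N for each character set follows by division. The sketch below assumes the overhead described above (one NULL-flag byte plus a two-byte length prefix) and a table with a single varchar column; all names in it are hypothetical.

```python
ROW_LIMIT = 65535   # byte limit quoted in the text
OVERHEAD = 3        # 1 NULL-flag byte + 2 length bytes
MAX_BYTES_PER_CHAR = {"gbk": 2, "utf8": 3, "utf8mb4": 4}

def max_varchar_chars(charset: str) -> int:
    """Largest N for VARCHAR(N) when it is the only column in the row."""
    return (ROW_LIMIT - OVERHEAD) // MAX_BYTES_PER_CHAR[charset]

def length_prefix_bytes(max_byte_len: int) -> int:
    """Bytes used to record the length: 1 up to 255 bytes, otherwise 2."""
    return 1 if max_byte_len <= 255 else 2

limits = {cs: max_varchar_chars(cs) for cs in MAX_BYTES_PER_CHAR}

for charset, n in limits.items():
    print(f"{charset}: varchar({n}) at most")
print(length_prefix_bytes(255), length_prefix_bytes(256))
```

Other columns in the same table share the row limit, so the practical maximum is usually lower than these figures.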

Modify database instance character set

Take effect temporarily

mysql> SET character_set_server = 'gbk';
mysql> SET character_set_client = 'gbk';

Take effect globally

mysql> SET GLOBAL character_set_client = 'gbk';
Query OK, 0 rows affected (0.00 sec)

Take effect permanently

vim /etc/my.cnf
[mysqld]
character-set-server=utf8

That covers the specific methods of setting the MySQL character set.
