
MySQL character sets and comparison rules

Updated 2025-03-28. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/02.

This article explains MySQL character sets and comparison rules (collations): how to view them, how utf8 differs from utf8mb4, the levels at which they are set, and how character sets are converted between client and server.

I. Preface

A character set defines how characters map to the binary values that are stored. A comparison rule (collation) defines how characters are compared and sorted, for example the order that ORDER BY produces.

II. View commands

To list the supported character sets, use SHOW {CHARACTER SET | CHARSET} [LIKE 'pattern']. CHARACTER SET and CHARSET are synonyms; either can be used.

To list the supported comparison rules, use SHOW COLLATION [LIKE 'pattern'].
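As a minimal sketch of the two commands above (the patterns are only illustrative, and the rows returned depend on the MySQL version installed):

```sql
-- List character sets whose name starts with "utf8".
-- The Maxlen column shows the maximum bytes per character.
SHOW CHARACTER SET LIKE 'utf8%';

-- CHARSET is a synonym for CHARACTER SET.
SHOW CHARSET LIKE 'latin%';

-- List the comparison rules that belong to utf8mb4.
SHOW COLLATION LIKE 'utf8mb4%';
```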

Comparison rule names follow a pattern:

The name begins with the name of the character set it belongs to.

Next comes the language the rule mainly targets: utf8_polish_ci compares by Polish rules, utf8_spanish_ci by Spanish rules, and utf8_general_ci is a general-purpose rule.

The suffix says whether the rule is sensitive to accents, case, and so on. The possible suffixes are:

Suffix | Meaning | Description
_ai | accent insensitive | Accents are ignored
_as | accent sensitive | Accents are significant
_ci | case insensitive | Case is ignored
_cs | case sensitive | Case is significant
_bin | binary | Compared by binary value

For example, the commonly used utf8_general_ci comparison rule ends in _ci, so its comparisons are case-insensitive.
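The effect of the suffix can be checked directly. The sketch below assumes a server that provides utf8mb4_general_ci and utf8mb4_bin; it uses the utf8mb4 rules rather than the utf8 ones named above only so that it does not depend on the utf8 alias. The _utf8mb4 introducers keep the result independent of the connection character set.

```sql
SELECT
    -- _ci: 'a' and 'A' compare as equal
    _utf8mb4'a' COLLATE utf8mb4_general_ci = _utf8mb4'A' AS ci_equal,
    -- _bin: compared by binary value, so they differ
    _utf8mb4'a' COLLATE utf8mb4_bin = _utf8mb4'A' AS bin_equal;
-- ci_equal = 1, bin_equal = 0
```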

III. About utf8 and utf8mb4

utf8 and utf8mb4 are both commonly used character sets. What is the difference? Real UTF-8 uses 1 to 4 bytes per character, but utf8 in MySQL does not mean that: it is an alias for utf8mb3, where mb3 means a maximum of 3 bytes per character. To save space, MySQL originally shipped a cut-down UTF-8 that uses only 1 to 3 bytes, which is enough for most commonly used characters but not for 4-byte characters such as emoji. utf8mb4 is the real UTF-8 and can represent every Unicode code point.
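The byte counts can be observed with LENGTH (bytes) and CHAR_LENGTH (characters). This is a sketch only: the three sample characters are chosen here for illustration and are supplied as raw UTF-8 bytes through UNHEX so that the result does not depend on the client's character set.

```sql
SELECT 'U+0061 (a)' AS sample,
       CHAR_LENGTH(CONVERT(UNHEX('61') USING utf8mb4)) AS chars,
       LENGTH(CONVERT(UNHEX('61') USING utf8mb4)) AS bytes
UNION ALL
SELECT 'U+4E2D (CJK ideograph)',
       CHAR_LENGTH(CONVERT(UNHEX('E4B8AD') USING utf8mb4)),
       LENGTH(CONVERT(UNHEX('E4B8AD') USING utf8mb4))
UNION ALL
SELECT 'U+1F600 (emoji)',
       CHAR_LENGTH(CONVERT(UNHEX('F09F9880') USING utf8mb4)),
       LENGTH(CONVERT(UNHEX('F09F9880') USING utf8mb4));
-- chars is 1 in every row; bytes is 1, 3 and 4 respectively.
-- The 4-byte row is the kind of character a 3-byte character set cannot hold.
```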

IV. Levels of character sets and comparison rules

MySQL sets character sets and comparison rules at four levels: server, database, table, and column. For a given column, the most specific level takes priority. If you create a database, table, or column without specifying a character set and comparison rule, it inherits the settings of the level above it. The following sections show how each level is set.

Server level

System variable | Description
character_set_server | Server-level character set
collation_server | Server-level comparison rule

As the table shows, the server-level character set and comparison rule are controlled by the system variables character_set_server and collation_server; the commands for viewing and modifying system variables were described in the previous article. They can be set through startup options, in the configuration file, or changed at runtime.
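A minimal sketch of viewing and changing the server-level values at runtime. The values utf8mb4 and utf8mb4_general_ci are assumptions chosen for illustration, and SET GLOBAL needs an account with sufficient privileges.

```sql
-- View the current server-level settings.
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';

-- Change them at runtime; the change does not survive a server restart.
SET GLOBAL character_set_server = 'utf8mb4';
SET GLOBAL collation_server = 'utf8mb4_general_ci';

-- Equivalent startup options (not SQL, shown here for reference):
--   mysqld --character-set-server=utf8mb4 --collation-server=utf8mb4_general_ci
-- Equivalent configuration file entries under the [mysqld] group:
--   character-set-server = utf8mb4
--   collation-server     = utf8mb4_general_ci
```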

Database level

The system variables for the database-level character set and comparison rule are:

System variable | Description
character_set_database | Character set of the current database
collation_database | Comparison rule of the current database

To see the character set and comparison rule of the current database, read these two variables. This assumes you have selected a default database with the USE statement; if there is no default database, the variables have the same values as the corresponding server-level variables.
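For instance, with a hypothetical database named demo_db (the name is an assumption, created here only so the sketch is self-contained):

```sql
-- demo_db is a placeholder name used for illustration.
CREATE DATABASE IF NOT EXISTS demo_db;

-- Select it as the default database, then read the two variables.
USE demo_db;
SELECT @@character_set_database AS db_charset,
       @@collation_database     AS db_collation;
```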

In addition, treat these two variables as read-only: assigning to them does not change the character set or comparison rule of the database (recent MySQL versions deprecate such assignments). The database's settings can only be changed through DDL statements. The syntax format is:

CREATE DATABASE database_name [[DEFAULT] CHARACTER SET charset_name] [[DEFAULT] COLLATE collation_name];
ALTER DATABASE database_name [[DEFAULT] CHARACTER SET charset_name] [[DEFAULT] COLLATE collation_name];

Table level

Create and modify:

CREATE TABLE table_name (column_definitions) [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name];

ALTER TABLE table_name [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name];
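The database-level and table-level statements can be sketched together as follows. All object names (demo_db, t_inherit, t_explicit) and the chosen character sets are assumptions for illustration only.

```sql
-- Database level: set the defaults at creation, then change them.
CREATE DATABASE IF NOT EXISTS demo_db
    DEFAULT CHARACTER SET utf8mb4
    DEFAULT COLLATE utf8mb4_general_ci;
ALTER DATABASE demo_db
    DEFAULT CHARACTER SET utf8mb4
    DEFAULT COLLATE utf8mb4_bin;

-- Table level: with nothing specified, the table inherits from the database.
CREATE TABLE demo_db.t_inherit (note VARCHAR(50));

-- Table level: explicit settings override the database defaults.
CREATE TABLE demo_db.t_explicit (note VARCHAR(50))
    DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;

-- This changes the table default only; existing columns keep their settings.
ALTER TABLE demo_db.t_explicit
    DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- Inspect what each table ended up with.
SHOW CREATE TABLE demo_db.t_inherit;
SHOW CREATE TABLE demo_db.t_explicit;
```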

Column level

Create and modify:

CREATE TABLE table_name (column_name string_type [CHARACTER SET charset_name] [COLLATE collation_name], other columns ...);
ALTER TABLE table_name MODIFY column_name string_type [CHARACTER SET charset_name] [COLLATE collation_name];

In addition

Because a character set and its comparison rules are linked, changing only the character set switches the comparison rule to the default for the new character set, and changing only the comparison rule switches the character set to the one that the new rule belongs to.
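A column-level sketch that also shows this linkage. The database, table, and column names are hypothetical, and the default comparison rule reported for utf8mb4 varies between MySQL versions.

```sql
CREATE DATABASE IF NOT EXISTS demo_db;  -- placeholder database

CREATE TABLE demo_db.t_col (
    -- Only the character set is given: the comparison rule becomes
    -- the default rule of utf8mb4.
    name_a VARCHAR(50) CHARACTER SET utf8mb4,
    -- Only the comparison rule is given: the character set becomes
    -- latin1, the set that latin1_bin belongs to.
    name_b VARCHAR(50) COLLATE latin1_bin
);

-- Change one column, specifying both values explicitly.
ALTER TABLE demo_db.t_col
    MODIFY name_b VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

-- The Collation column shows the rule in effect for each column.
SHOW FULL COLUMNS FROM demo_db.t_col;
```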

V. Conversion of character sets in MySQL

Requests sent from the client to the server are essentially strings, and so are the results the server returns; a string is just binary data encoded with some character set. That string does not stay in a single character set from start to finish: between sending a request and receiving the result it is converted several times, and three system variables govern the process:

System variable | Description
character_set_client | The character set the server uses to decode the request
character_set_connection | When processing the request, the server converts the request string from character_set_client to character_set_connection
character_set_results | The character set the server uses to return data to the client

The figure above shows these successive conversions. Note that if a column's character set differs from character_set_connection, one more conversion is needed. In general, keep all three variables set to the character set the client actually uses, to avoid unnecessary encoding and decoding overhead.
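One common way to keep the three variables aligned is SET NAMES, which sets all of them for the current session. The sketch assumes the client really does send and expect utf8mb4-encoded bytes.

```sql
-- Inspect the three session variables.
SELECT @@character_set_client     AS cs_client,
       @@character_set_connection AS cs_connection,
       @@character_set_results    AS cs_results;

-- Set all three at once for this session.
SET NAMES utf8mb4;

-- All three now report utf8mb4.
SELECT @@character_set_client     AS cs_client,
       @@character_set_connection AS cs_connection,
       @@character_set_results    AS cs_results;
```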

That covers MySQL character sets and comparison rules. The details are best confirmed by trying them in practice.
